The GCS client library provides the following functions:
Functions
- cloudstorage.delete() deletes the specified object from the GCS bucket.
- cloudstorage.listbucket() lists the objects in the GCS bucket.
- cloudstorage.open() opens an existing object in the GCS bucket for reading or overwriting, or creates a new object, depending on the specified mode.
- cloudstorage.stat() provides metadata information about the file, such as content type, size, timestamp, MD5 digest, and GCS headers.
Once `cloudstorage.open()` is invoked to return the file-like object representing the specified GCS object, you can use the standard Python file functions, such as `write()` and `close()`, to write an object to the GCS bucket, or `read()` to read an object from the GCS bucket.
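For example, a minimal write-then-read round trip might look like the following sketch (the bucket and object names are hypothetical):

```
import cloudstorage

filename = '/my_bucket/demo/greeting.txt'  # hypothetical /bucket/object path

# Write: open() in 'w' mode returns a file-like buffer.
gcs_file = cloudstorage.open(filename, 'w', content_type='text/plain')
gcs_file.write('hello from the GCS client library\n')
gcs_file.close()  # the object is not readable or persisted until close()

# Read: open() in 'r' mode (the default) returns a readable buffer.
gcs_file = cloudstorage.open(filename)
contents = gcs_file.read()
gcs_file.close()
```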
Functions
- cloudstorage.delete(filename, retry_params=None)
-
Deletes the specified file from the GCS bucket.
Raises cloudstorage.NotFoundError if the specified GCS object doesn't exist.
Arguments
- filename (Required)
-
The full GCS file name for the object, in the format `/bucket/object_name`. Must be a full filename and can include the delimiter `/`.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Example
- To delete a file but ignore the error when the file does not exist:
-
```
import cloudstorage

try:
    cloudstorage.delete(filename)
except cloudstorage.NotFoundError:
    pass
```
- cloudstorage.listbucket(path_prefix, marker=None, max_keys=None, delimiter=None, retry_params=None)
-
Returns a bucket iterator object. This iterator returns a sorted list of objects that match all filters. Note that this function is asynchronous: it does not block unless the iterator is advanced before results have been fetched.
This function operates in two different modes, depending on whether you use the `delimiter` argument:
- Regular mode (default): Lists all files in the bucket without any concept of hierarchy. (GCS doesn't have real directory hierarchies.)
- Directory emulation mode: If you specify the `delimiter` argument, it is used as a path separator to emulate a hierarchy of directories.
Arguments
- path_prefix (Required)
-
A Google Cloud Storage path of format `/bucket` or `/bucket/prefix`, for example, `/bucket/foo/2001`. Only objects whose full path starts with `path_prefix` will be returned.
- marker=None (Optional)
-
String. Another path prefix. Only objects whose full path starts lexicographically after `marker` (exclusively) will be returned; the file used as `marker` is not returned. For example, if you want the listing to start at `superduperfoo3.txt`, specify the file that immediately precedes it, for example:
```
stat = gcs.listbucket("/my_bucket/foo", marker='/my_bucket/foo/superduperfoo2.txt')
```
One way to use this parameter is with `max_keys` to "page through" the bucket file names (see the paging sketch after this argument list).
- max_keys=None (Optional)
-
Integer. Specifies the maximum number of objects to be returned. Use it if you know how many objects you want. (Otherwise, the GCS client library automatically buffers and paginates through all results.) Can be used with `marker` to page through filenames in a bucket:
```
stats = gcs.listbucket(bucket + '/foo', max_keys=page_size, marker=stat.filename)
```
- delimiter=None (Optional)
- String. Turns on directory mode. You can specify one or multiple characters to be used as a directory separator.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
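Putting `marker` and `max_keys` together, a paging loop might look like the following sketch (the bucket name and page size are assumptions, not values from this reference):

```
import cloudstorage as gcs

page_size = 100  # assumed page size
marker = None
while True:
    stats = list(gcs.listbucket('/my_bucket/foo', max_keys=page_size, marker=marker))
    for stat in stats:
        print(stat.filename)
    if len(stats) < page_size:
        break  # a short page means there are no more objects
    marker = stats[-1].filename  # resume after the last file of this page
```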
Result Value
Returns an iterator of GCSFileStat objects over the matched files, sorted by filename. In regular mode, the returned GCSFileStat objects have the following data:
- `filename`
- `etag` (MD5 digest)
- `st_size` (content length of headers)
- `st_ctime`
- `is_dir`
Note: If the GCSFileStat object's `is_dir` property is `True`, then the only other property in the object is `filename`. If `is_dir` is `False`, then the GCSFileStat contains all of the other properties as well.
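The following sketch shows one way to iterate over a listing in each mode; the bucket name and prefix are hypothetical:

```
import cloudstorage

# Regular mode: iterate over every object whose path starts with the prefix.
for stat in cloudstorage.listbucket('/my_bucket/foo'):
    print('%s (%d bytes)' % (stat.filename, stat.st_size))

# Directory emulation mode: with delimiter='/', entries that only share a
# common "directory" prefix come back with is_dir set to True.
for stat in cloudstorage.listbucket('/my_bucket', delimiter='/'):
    if stat.is_dir:
        print('directory: %s' % stat.filename)
    else:
        print('file: %s' % stat.filename)
```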
- cloudstorage.open(filename, mode='r', content_type=None, options=None, read_buffer_size=storage_api.ReadBuffer.DEFAULT_BUFFER_SIZE, retry_params=None)
-
In read mode (`'r'`), opens the specified GCS object for read. In write mode (`'w'`), if the specified file exists, it opens it for an overwrite (append is not supported). If the file doesn't exist, it is created in the specified bucket. When you finish writing, if you want to read the file and/or store it at GCS, close the file using the `close` function. It is not an error if you don't call `close`, but the file will not be readable or persisted at GCS.
Raises:
- cloudstorage.NotFoundError if in read mode and the specified object doesn't exist.
Arguments
- filename (Required)
-
The file to open, in the format `/bucket/object`. For example, `/my_bucket/lyrics/southamerica/list5.txt`.
- mode (Optional)
-
String. Specify `'r'` to open a file for read (default). Specify `'w'` to open an existing file for overwriting or to create a new file.
- content_type (Optional)
-
String. Used only in write mode. You should specify the MIME type of the file (you can specify any valid MIME type). If you don't supply this value, GCS defaults to the type `binary/octet-stream` when it serves the object.
- options (Optional)
-
Dict. Used only in write mode. GCS controls access to objects in buckets by means of an access control list (ACL). If you don't specify an ACL, GCS uses the bucket's default ACL. If you want to supply a specific ACL, you can do so by specifying the appropriate ACL in `options`. The valid values you can supply are listed in the GCS documentation for x-goog-acl. In `options`, you can also specify custom metadata, using `x-goog-meta-` headers:
```
gcs_file = cloudstorage.open(filename,
                             'w',
                             content_type='text/plain',
                             options={'x-goog-acl': 'private',
                                      'x-goog-meta-foo': 'foo',
                                      'x-goog-meta-bar': 'bar'})
```
- read_buffer_size (Optional)
-
Integer. Used only in read mode. If you don't set this value, the default buffer size is used (recommended). When you read, you should read by `read_buffer_size` for optimum prefetch performance.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Result Value
Returns a reading or writing buffer, supporting a file-like interface on which you can invoke the standard Python `read`, `write`, and `close` functions. This buffer must be closed after you finish reading or writing.
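As a sketch of the read path, the loop below reads an object in chunks; the object path and chunk size are assumptions, and in practice the chunk size would ideally match the `read_buffer_size` passed to `open()`:

```
import cloudstorage

filename = '/my_bucket/lyrics/southamerica/list5.txt'  # hypothetical path
chunk_size = 1024 * 1024  # assumed 1 MB chunks

total_bytes = 0
gcs_file = cloudstorage.open(filename, 'r')
while True:
    chunk = gcs_file.read(chunk_size)
    if not chunk:
        break
    total_bytes += len(chunk)
gcs_file.close()
print('read %d bytes' % total_bytes)
```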
- cloudstorage.stat(filename, retry_params=None)
-
Returns a GCSFileStat object containing file metadata.
Raises:
- cloudstorage.NotFoundError if the specified bucket or object doesn't exist.
Arguments
- filename (Required)
-
The file to open, in the format `/bucket/object`. For example, `/my_bucket/lyrics/southamerica/list5.txt`.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Result Value
Returns a GCSFileStat object containing file metadata.
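A minimal usage sketch (the object path is hypothetical, and only fields named in this reference are printed):

```
import cloudstorage

filename = '/my_bucket/lyrics/southamerica/list5.txt'  # hypothetical path
try:
    stat = cloudstorage.stat(filename)
    print('%s: %d bytes, etag %s, created %s' %
          (stat.filename, stat.st_size, stat.etag, stat.st_ctime))
except cloudstorage.NotFoundError:
    print('%s does not exist' % filename)
```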