The GCS client library provides the following functions:
Functions
- cloudstorage.delete() deletes the specified object from the GCS bucket.
- cloudstorage.listbucket() lists the objects in the GCS bucket.
- cloudstorage.open() opens an existing object in the GCS bucket for reading or overwriting, or creates a new object, depending on the specified mode.
- cloudstorage.stat() provides metadata information about the file, such as content type, size, timestamp, MD5 digest, and GCS headers.
Once `cloudstorage.open()` is invoked to return the file-like object representing the specified GCS object, you can use the standard Python file functions, such as `write()` and `close()`, to write an object to the GCS bucket, or `read()` to read an object from the GCS bucket.
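For example, a minimal write-then-read round trip might look like the following sketch (the bucket and object names are hypothetical):

```
import cloudstorage

filename = '/my_bucket/demo/greeting.txt'  # hypothetical /bucket/object path

# Write: open() in 'w' mode returns a file-like buffer.
gcs_file = cloudstorage.open(filename, 'w', content_type='text/plain')
gcs_file.write('hello from the GCS client library\n')
gcs_file.close()  # the object is not readable or persisted until close()

# Read: open() in 'r' mode (the default) returns a readable buffer.
gcs_file = cloudstorage.open(filename)
contents = gcs_file.read()
gcs_file.close()
```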
Functions
- cloudstorage.delete(filename, retry_params=None)
-
Deletes the specified file from the GCS bucket.
Raises cloudstorage.NotFoundError if the specified GCS object doesn't exist.
Arguments
- filename (Required)
-
The full GCS file name for the object, in the format `/bucket/object_name`. Must be a full filename and can include the delimiter `/`.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Example
- To delete a file but ignore the error when the file does not exist:
-
```
import cloudstorage

try:
    cloudstorage.delete(filename)
except cloudstorage.NotFoundError:
    pass
```
- cloudstorage.listbucket(path_prefix, marker=None, max_keys=None, delimiter=None, retry_params=None)
-
Returns a bucket iterator object. This iterator returns a sorted list of objects that match all filters. Note that this function is asynchronous: it does not block unless the iterator is advanced before results have been fetched.
This function operates in two different modes, depending on whether you use the `delimiter` argument:
- Regular mode (default): Lists all files in the bucket without any concept of hierarchy. (GCS doesn't have real directory hierarchies.)
- Directory emulation mode: If you specify the `delimiter` argument, it is used as a path separator to emulate a hierarchy of directories.
Arguments
- path_prefix (Required)
-
A Google Cloud Storage path of format `/bucket` or `/bucket/prefix`, for example, `/bucket/foo/2001`. Only objects whose full path starts with `path_prefix` will be returned.
- marker=None (Optional)
-
String. Another path prefix. Only objects whose full path starts lexicographically after `marker` (exclusively) will be returned; the file used as `marker` is not returned. For example, if you want the listing to start at `superduperfoo3.txt`, specify the file that immediately precedes it, for example:
```
stat = gcs.listbucket("/my_bucket/foo", marker='/my_bucket/foo/superduperfoo2.txt')
```
One way to use this parameter is with `max_keys` to "page through" the bucket file names (see the paging sketch after this argument list).
- max_keys=None (Optional)
-
Integer. Specifies the maximum number of objects to be returned. Use it if you know how many objects you want. (Otherwise, the GCS client library automatically buffers and paginates through all results.) Can be used with `marker` to page through filenames in a bucket:
```
stats = gcs.listbucket(bucket + '/foo', max_keys=page_size, marker=stat.filename)
```
- delimiter=None (Optional)
- String. Turns on directory mode. You can specify one or multiple characters to be used as a directory separator.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
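Putting `marker` and `max_keys` together, a paging loop might look like the following sketch (the bucket name and page size are assumptions, not values from this reference):

```
import cloudstorage as gcs

page_size = 100  # assumed page size
marker = None
while True:
    stats = list(gcs.listbucket('/my_bucket/foo', max_keys=page_size, marker=marker))
    for stat in stats:
        print(stat.filename)
    if len(stats) < page_size:
        break  # a short page means there are no more objects
    marker = stats[-1].filename  # resume after the last file of this page
```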
Result Value
Returns an iterator of GCSFileStat objects over the matched files, sorted by filename. In regular mode, the returned GCSFileStat objects have the following data:
- `filename`
- `etag` (MD5 digest)
- `st_size` (content length of headers)
- `st_ctime`
- `is_dir`
Note: If the GCSFileStat object's `is_dir` property is `True`, then the only other property in the object is `filename`. If `is_dir` is `False`, then the GCSFileStat contains all of the other properties as well.
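The following sketch shows one way to iterate over a listing in each mode; the bucket name and prefix are hypothetical:

```
import cloudstorage

# Regular mode: iterate over every object whose path starts with the prefix.
for stat in cloudstorage.listbucket('/my_bucket/foo'):
    print('%s (%d bytes)' % (stat.filename, stat.st_size))

# Directory emulation mode: with delimiter='/', entries that only share a
# common "directory" prefix come back with is_dir set to True.
for stat in cloudstorage.listbucket('/my_bucket', delimiter='/'):
    if stat.is_dir:
        print('directory: %s' % stat.filename)
    else:
        print('file: %s' % stat.filename)
```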
- cloudstorage.open(filename, mode='r', content_type=None, options=None, read_buffer_size=storage_api.ReadBuffer.DEFAULT_BUFFER_SIZE, retry_params=None)
-
In read mode (`'r'`), opens the specified GCS object for read. In write mode (`'w'`), if the specified file exists, it opens it for an overwrite (append is not supported). If the file doesn't exist, it is created in the specified bucket. When you finish writing, if you want to read the file and/or store it at GCS, close the file using the `close` function. It is not an error if you don't call `close`, but the file will not be readable or persisted at GCS.
Raises:
- cloudstorage.NotFoundError if in read mode and the specified object doesn't exist.
Arguments
- filename (Required)
-
The file to open, in the format `/bucket/object`. For example, `/my_bucket/lyrics/southamerica/list5.txt`.
- mode (Optional)
-
String. Specify `'r'` to open a file for read (default). Specify `'w'` to open an existing file for overwriting or to create a new file.
- content_type (Optional)
-
String. Used only in write mode. You should specify the MIME type of the file (you can specify any valid MIME type). If you don't supply this value, GCS defaults to the type `binary/octet-stream` when it serves the object.
- options (Optional)
-
Dict. Used only in write mode. GCS controls access to objects in buckets by means of an access control list (ACL). If you don't specify an ACL, GCS uses the bucket's default ACL. If you want to supply a specific ACL, you can do so by specifying the appropriate ACL in `options`. The valid values you can supply are listed in the GCS documentation for x-goog-acl. In `options`, you can also specify custom metadata, using `x-goog-meta-` headers:
```
gcs_file = cloudstorage.open(filename,
                             'w',
                             content_type='text/plain',
                             options={'x-goog-acl': 'private',
                                      'x-goog-meta-foo': 'foo',
                                      'x-goog-meta-bar': 'bar'})
```
- read_buffer_size (Optional)
-
Integer. Used only in read mode. If you don't set this value, the default buffer size is used (recommended). When you read, you should read by `read_buffer_size` for optimum prefetch performance.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Result Value
Returns a reading or writing buffer, supporting a file-like interface on which you can invoke the standard Python `read`, `write`, and `close` functions. This buffer must be closed after you finish reading or writing.
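As a sketch of the read path, the loop below reads an object in chunks; the object path and chunk size are assumptions, and in practice the chunk size would ideally match the `read_buffer_size` passed to `open()`:

```
import cloudstorage

filename = '/my_bucket/lyrics/southamerica/list5.txt'  # hypothetical path
chunk_size = 1024 * 1024  # assumed 1 MB chunks

total_bytes = 0
gcs_file = cloudstorage.open(filename, 'r')
while True:
    chunk = gcs_file.read(chunk_size)
    if not chunk:
        break
    total_bytes += len(chunk)
gcs_file.close()
print('read %d bytes' % total_bytes)
```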
- cloudstorage.stat(filename, retry_params=None)
-
Returns a GCSFileStat object containing file metadata.
Raises:
- cloudstorage.NotFoundError if the specified bucket or object doesn't exist.
Arguments
- filename (Required)
-
The file to open, in the format `/bucket/object`. For example, `/my_bucket/lyrics/southamerica/list5.txt`.
- retry_params=None (Optional)
- A RetryParams object in which you supply any changes you want to make to the default timeout and retry settings for this call.
Result Value
Returns a GCSFileStat object containing file metadata.
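A minimal usage sketch (the object path is hypothetical, and only fields named in this reference are printed):

```
import cloudstorage

filename = '/my_bucket/lyrics/southamerica/list5.txt'  # hypothetical path
try:
    stat = cloudstorage.stat(filename)
    print('%s: %d bytes, etag %s, created %s' %
          (stat.filename, stat.st_size, stat.etag, stat.st_ctime))
except cloudstorage.NotFoundError:
    print('%s does not exist' % filename)
```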