


CRC32C and Installing crcmod

Overview

Google Cloud Storage provides a cyclic redundancy check (CRC) header that allows clients to verify the integrity of object contents. For non-composite objects GCS also provides an MD5 header to allow clients to verify object integrity, but for composite objects only the CRC is available. gsutil automatically performs integrity checks on all uploads and downloads. Additionally, you can use the “gsutil hash” command to calculate a CRC for any local file.

The CRC variant used by Google Cloud Storage is called CRC32C (Castagnoli), which is not available in the standard Python distribution. The implementation of CRC32C used by gsutil is provided by a third-party Python module called crcmod.

The crcmod module contains a pure-Python implementation of CRC32C, but using it results in very poor performance. A Python C extension is also provided by crcmod, which requires compiling into a binary module for use. gsutil ships with a precompiled crcmod C extension for Mac OS X; for other platforms, see the installation instructions below.
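To make the checksum concrete, the following is a minimal table-driven CRC32C sketch in pure Python. It is not crcmod's code and is not what gsutil runs; it only illustrates the byte-at-a-time loop that makes an interpreted implementation slow. The function names are hypothetical, and the base64 helper assumes the big-endian, base64-encoded form in which Google Cloud Storage reports the value.

```python
import base64
import struct

# Reflected (bit-reversed) form of the Castagnoli polynomial 0x1EDC6F41.
_POLY = 0x82F63B78


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data, crc=0):
    """Return the CRC32C of data, optionally continuing from a previous value."""
    crc ^= 0xFFFFFFFF
    for b in bytearray(data):
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32c_base64(data):
    """Encode the checksum as base64 of its 4 big-endian bytes (assumed format)."""
    return base64.b64encode(struct.pack(">I", crc32c(data))).decode("ascii")


# "123456789" is the conventional CRC test input; its CRC32C is 0xE3069283.
print(hex(crc32c(b"123456789")))
```

Because the function accepts a running value, a large file can be checksummed chunk by chunk: crc32c(b"56789", crc32c(b"1234")) gives the same result as a single call over the whole input.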

At the end of each copy operation, the gsutil cp and rsync commands validate that the checksum of the source file/object matches the checksum of the destination file/object. If the checksums do not match, gsutil will delete the invalid copy and print a warning message. This very rarely happens, but if it does, please contact gs-team@google.com.
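The validate-then-delete behavior described above can be sketched for a local file copy as follows. This is an illustration only, not gsutil's implementation: the function names are hypothetical, and zlib.crc32 (plain CRC-32) is swapped in as a stand-in because CRC32C is not in the standard library.

```python
import os
import shutil
import tempfile
import zlib


def file_checksum(path, chunk_size=1 << 20):
    """Checksum a file in chunks. zlib.crc32 is a stand-in for CRC32C."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def verified_copy(src, dst, checksum=file_checksum):
    """Copy src to dst; delete dst and return False if the checksums differ."""
    shutil.copyfile(src, dst)
    if checksum(src) != checksum(dst):
        os.remove(dst)  # never leave an invalid copy behind
        print("WARNING: checksum mismatch, removed %s" % dst)
        return False
    return True


# Small demonstration using a temporary directory.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "src.bin")
with open(src, "wb") as f:
    f.write(b"example payload")
print(verified_copy(src, os.path.join(tmp, "dst.bin")))  # True
```

The checksum function is a parameter so that a real CRC32C implementation, such as one built with crcmod, can be substituted without changing the copy logic.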

Configuration

To determine if the compiled version of crcmod is available in your Python environment, you can inspect the output of the gsutil version command for the “compiled crcmod” entry:

$ gsutil version -l
...
compiled crcmod: True
...

If your crcmod library is compiled to a native binary, this value will be True. If using the pure-Python version, the value will be False.
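For scripted checks, the same entry can be read programmatically. The sketch below assumes gsutil is on the PATH and that the output keeps the "compiled crcmod: True" line format shown above; both function names are hypothetical.

```python
import subprocess


def parse_compiled_crcmod(version_output):
    """Return True/False from the 'compiled crcmod' line, or None if absent."""
    for line in version_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "compiled crcmod":
            return value.strip() == "True"
    return None


def compiled_crcmod_available():
    """Run 'gsutil version -l' and report its 'compiled crcmod' entry."""
    output = subprocess.check_output(["gsutil", "version", "-l"])
    return parse_compiled_crcmod(output.decode("utf-8", "replace"))


# Parsing a captured sample, so this runs without gsutil installed.
sample = "...\ncompiled crcmod: True\n...\n"
print(parse_compiled_crcmod(sample))  # True
```

Returning None when the entry is missing keeps "not compiled" distinct from "could not tell", which matters if the output format ever changes.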

To control gsutil’s behavior in response to crcmod’s status, you can set the “check_hashes” configuration variable. For details on this variable, see the surrounding comments in your gsutil configuration file. If check_hashes is not present in your configuration file, rerun gsutil config to regenerate the file.
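As an illustration of inspecting this setting from a script, the sketch below reads check_hashes from configuration text with Python's configparser. The [GSUtil] section name and the sample value if_fast_else_fail are assumptions not stated in this document; confirm both against the comments in your own configuration file. The function name is hypothetical.

```python
import configparser

# Sample configuration text. The section name and value are assumptions.
SAMPLE_BOTO = """
[GSUtil]
check_hashes = if_fast_else_fail
"""


def read_check_hashes(config_text):
    """Return the check_hashes value, or None if it is not set."""
    # Interpolation is disabled so literal '%' characters do not raise errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser.get("GSUtil", "check_hashes", fallback=None)


print(read_check_hashes(SAMPLE_BOTO))  # if_fast_else_fail
```

A None result corresponds to the case described above, where the variable is absent and the file should be regenerated with gsutil config.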

Installation

CentOS, RHEL, and Fedora

To compile and install crcmod:

sudo yum install gcc python-devel python-setuptools
sudo easy_install -U pip
sudo pip uninstall crcmod
sudo pip install -U crcmod

Debian and Ubuntu

To compile and install crcmod:

sudo apt-get install gcc python-dev python-setuptools
sudo easy_install -U pip
sudo pip uninstall crcmod
sudo pip install -U crcmod

Mac OS X

gsutil distributes a pre-compiled version of crcmod for OS X, so you shouldn’t need to compile and install it yourself. If for some reason the pre-compiled version is not being detected, please let the Google Cloud Storage team know (see “gsutil help support”).

To compile manually on OS X, you will first need to install Xcode and then run:

sudo easy_install -U pip
sudo pip install -U crcmod

Windows

An installer is available for the compiled version of crcmod from the Python Package Index (PyPI) at the following URL:

https://pypi.python.org/pypi/crcmod/1.7

MSI installers are available for the 32-bit versions of Python 2.6 and 2.7.
