Please note that the contents of this offline web site may be out of date. To access the most recent documentation visit the online version .
Note that links that point to online resources are green in color and will open in a new window.
We would love it if you could give us feedback about this material by filling this form (You have to be online to fill it)



BigQuery API Quickstart

This page provides a 25 minute exercise in building a simple command-line application (in either Java or Python) with the Google BigQuery API . This simple application will run a query on one of the available Sample Datasets and display the result.

  1. Prerequisites
  2. Create a New BigQuery Project
  3. Generate a Client ID and Client Secret for your Application
  4. Import Google API Client Libraries
  5. Authorize Access to the BigQuery API
  6. Running Queries
  7. Next Steps
  8. Complete Source Code

Prerequisites

  1. Review basic BigQuery concepts and terminology

    This tutorial and the rest of the BigQuery documentation uses basic terms and concepts that you should be familiar with:

    • Projects: A project is the container for BigQuery data. Datasets and jobs are contained inside a project, billing is done per-project, and projects have ACLs that determine access to the datasets and the jobs they contain. Projects are created and managed using the Google APIs Console .

    • Jobs: Jobs are used to start and manage long running tasks, such as querying data, loading and exporting data, and copying data between tables. You can learn more about jobs in the reference section .

  2. Read about how to activate the BigQuery service from a Google APIs Console project

    If you are a member of an existing Google APIs Console project that already has BigQuery enabled, you can use that project to build your application. To learn how to sign up for BigQuery, check out our sign up process .

  3. Be able to write and run a simple application in either Java or Python

    The BigQuery service provides a REST-based API that can be programmatically accessed using Java or Python. In addition to a basic understand of how to develop Java or Python applications, you should also be able to download and install additional libraries before attempting this tutorial.

Create a New BigQuery Project

Before you can use the BigQuery service from server-side code or installed applications, you must first create a Google APIs Console project. This project provides three unique values that you must use with your application to identify it to Google and authorize it to use the BigQuery API. The first is a project number , which identifies which project you are making queries from. The other two values are the client ID and a client secret . These values identify your application to Google to facilitate authorization of the BigQuery API. They also enable per-project accounting for quotas and billing.

  1. Visit the Google APIs Console , create a new APIs Console project, and name it "BigQuery App QuickStart."
  2. Under the Services tab, turn on the BigQuery API .
  3. Finally, note the number of the project you just created. The project number is available in the URL of the project in the API console ( https://code.google.com/apis/console/#project: 12345XXXXXXX ). We will be using this project number to make calls to the BigQuery API.

Generate a Client ID and Client Secret for your Application

In order to use OAuth 2.0 for web server applications , your application will need a client ID and client secret.

  1. Visit the Google APIs Console .
  2. Create a new project, or select an existing one.
  3. Under the Services tab, make sure that the BigQuery API service is activated.
  4. Click on API Access on the left, then Create an OAuth2.0 client ID .
  5. Enter a product name for your application. You may leave the logo empty for development, but you should supply it for your production application.
  6. Select Installed application .
  7. Click Create client ID .
  8. Store your application's client ID and client secret in a file accessible to your application called client_secrets.json . You can use the sample below as a template - replace the client_id and client_secret values with the ones you generated above. Your application will refer to this file as part of the flow of authorizing access to the BigQuery API.
{
    "installed": {
        "client_id": "XXXXXXXXXXXXXXXXXXXX.apps.googleusercontent.com",
        "client_secret": "XXXXXXXXXXXXXXXXXXXX",
        "redirect_uris": ["urn:ietf:wg:oauth:2.0:oob"],
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://accounts.google.com/o/oauth2/token"
    }
}

For more about using the client_secrets.json format, visit this page .

Import Google API Client Libraries

Our samples use the Google APIs Client libraries to make calls to the BigQuery API. Download and install the appropriate library from the links below:

If you are developing with Java, Maven is a useful way to manage and organize dependencies. The Google APIs Client Libraries for Java are available in the central Maven repository.

Authorize Access to the BigQuery API

To authorize access to BigQuery, our first step is to write the code necessary to read the values from the client_secrets.json file. The application then prompts the user to authorize the application to access the BigQuery API, and uses the resulting authorization code to create a new BigQuery client. Much of the complexity of these tasks is handled by methods available in the Google APIs Client libraries.

The sample featured in this quick start guide uses an OAuth 2.0 installed application authorization flow. This is the suggested method for providing user-authorized access to the BigQuery API for installed or desktop applications. Here are the steps involved in this authorization flow:

  1. The application displays a URL to a Google API authorization page.
  2. The user authorizes access to the API from this page. If the user grants access to the API, an access code is displayed.
  3. The user pastes this access code into the command line prompt provided by the application.
  4. The application uses this access code to obtain a credential to create an authorized BigQuery API client, which can be used to run queries indefinitely unless the user revokes access via their Google account settings .

The user interface for authorizing installed applications can be improved to eliminate the manual access code copy-paste step, with additional code not covered in this tutorial. See our guide for Using OAuth 2.0 for Installed Applications to learn more. For information about other authorization flows, such as those for web applications or server-to-server authorization using service accounts, see our guide to authorizing access to the BigQuery API using OAuth 2.0 .

Java

This sample uses the Google APIs Client Library for Java .

  // Enter your Google Developer Project number or string id.
  private static final String PROJECT_ID = "name_of_project";

  // Use a Google APIs Client standard client_secrets.json OAuth 2.0 parameters file.
  private static final String CLIENTSECRETS_LOCATION = "client_secrets.json";

  // Objects for handling HTTP transport and JSON formatting of API calls.
  private static final HttpTransport HTTP_TRANSPORT = new NetHttpTransport();
  private static final JsonFactory JSON_FACTORY = new JacksonFactory();

  public static void main(String[] args) throws IOException {
    GoogleClientSecrets clientSecrets = GoogleClientSecrets.load(new JacksonFactory(),
        new InputStreamReader(ApiQuickstart.class.getResourceAsStream(CLIENTSECRETS_LOCATION)));

    Credential credential = getCredentials(clientSecrets, new Scanner(System.in));
    Bigquery bigquery = new Bigquery(HTTP_TRANSPORT, JSON_FACTORY, credential);
    String query = "SELECT TOP( title, 10) as title, COUNT(*) as revision_count "
        + "FROM [publicdata:samples.wikipedia] WHERE wp_namespace = 0;";
    runQueryRpcAndPrint(bigquery, PROJECT_ID, query, System.out);
  }

  static Credential getCredentials(GoogleClientSecrets clientSecrets, Scanner scanner)
      throws IOException {
    String authorizeUrl = new GoogleAuthorizationCodeRequestUrl(
        clientSecrets, clientSecrets.getInstalled().getRedirectUris().get(0),
        Collections.singleton(BigqueryScopes.BIGQUERY)).build();
    System.out.println(
        "Paste this URL into a web browser to authorize BigQuery Access:\n" + authorizeUrl);
    System.out.println("... and paste the code you received here: ");
    String authorizationCode = scanner.nextLine();

    // Exchange the auth code for an access token.
    GoogleAuthorizationCodeFlow flow = new GoogleAuthorizationCodeFlow.Builder(
        HTTP_TRANSPORT, JSON_FACTORY, clientSecrets, Arrays.asList(BigqueryScopes.BIGQUERY))
        .build();
    GoogleTokenResponse response = flow.newTokenRequest(authorizationCode)
        .setRedirectUri(clientSecrets.getInstalled().getRedirectUris().get(0)).execute();
    return flow.createAndStoreCredential(response, null);
  }

Python

This sample uses the Google APIs Client Library for Python .

# Enter your Google Developer Project number
PROJECT_NUMBER = '12345XXXXXXX'

FLOW = flow_from_clientsecrets('client_secrets.json',
                               scope='https://www.googleapis.com/auth/bigquery')

storage = Storage('bigquery_credentials.dat')
credentials = storage.get()

if credentials is None or credentials.invalid:
  from oauth2client import tools
  # Run oauth2 flow with default arguments.
  credentials = tools.run_flow(FLOW, storage, tools.argparser.parse_args([]))

http = httplib2.Http()
http = credentials.authorize(http)

bigquery_service = build('bigquery', 'v2', http=http)
  

Running Queries

BigQuery provides two API methods for running queries. The synchronous query method involves a single API call, and will wait to provide a response until the query is complete (unless provided with an optional timeout value). The asynchronous query method will "insert" a query job, and immediately return an ID for that job. You then use this job ID to poll for the status of the query, and retrieve the query result if complete. This example uses the synchronous query method. For more information about different ways to query using BigQuery, see querying data .

Running the Query

To run a synchronous query, the application makes an API call that passes the query itself (as a string), along with the project number that the query will be run under for billing and quota purposes. The query in the example below finds the top 10 most revised articles on Wikipedia from a 350 million row dataset. BigQuery uses a SQL-like syntax, which is described in our query reference guide.

Java

This sample uses the Google APIs Client Library for Java .

  /**
   * Runs a synchronous BigQuery query and displays the result.
   *
   * @param bigquery An authorized BigQuery client
   * @param projectId The current project id
   * @param query A String containing a BigQuery SQL statement
   * @param out A PrintStream for output, normally System.out
   */
  static void runQueryRpcAndPrint(
      Bigquery bigquery, String projectId, String query, PrintStream out) throws IOException {
    QueryRequest queryRequest = new QueryRequest().setQuery(query);
    QueryResponse queryResponse = bigquery.jobs().query(projectId, queryRequest).execute();
    if (queryResponse.getJobComplete()) {
      printRows(queryResponse.getRows(), out);
      if (null == queryResponse.getPageToken()) {
        return;
      }
    }
    // This loop polls until results are present, then loops over result pages.
    String pageToken = null;
    while (true) {
      GetQueryResultsResponse queryResults = bigquery.jobs()
          .getQueryResults(projectId, queryResponse.getJobReference().getJobId())
          .setPageToken(pageToken).execute();
      if (queryResults.getJobComplete()) {
        printRows(queryResults.getRows(), out);
        pageToken = queryResults.getPageToken();
        if (null == pageToken) {
          return;
        }
      }
    }
  }

Python

This sample uses the Google APIs Client Library for Python .

# Create a query statement and query request object
query_data = {'query':'SELECT TOP(title, 10) as title, COUNT(*) as revision_count FROM [publicdata:samples.wikipedia] WHERE wp_namespace = 0;'}
query_request = bigquery_service.jobs()

# Make a call to the BigQuery API
query_response = query_request.query(projectId=PROJECT_NUMBER,
                                     body=query_data).execute()
  

Displaying the query result

Once the query has completed, the API returns the result set as a JSON object. In addition to the actual query results, the JSON response contains metadata about the query job, including a unique job ID and the schema of the result set. The application parses the query response and displays the resulting values.

Java

This sample uses the Google APIs Client Library for Java .

  private static void printRows(List<TableRow> rows, PrintStream out) {
    if (rows != null) {
      for (TableRow row : rows) {
        for (TableCell cell : row.getF()) {
          // Data.isNull() is the recommended way to check for the 'null object' in TableCell.
          out.printf("%s, ", Data.isNull(cell.getV()) ? "null" : cell.getV().toString());
        }
        out.println();
      }
    }
  }

Python

This sample uses the Google APIs Client Library for Python .

# Retrieve and print the result of the query API response
query_response = query_request.query(projectId=PROJECT_NUMBER,
                                     body=query_data).execute()
print 'Query Results:'
for row in query_response['rows']:
  result_row = []
  for field in row['f']:
    result_row.append(field['v'])
  print ('\t').join(result_row)
  

Next Steps

This tutorial covers only the most basic steps necessary to make calls to the BigQuery API from a command-line application. The BigQuery API also provides methods for running asynchronous queries, creating tables and datasets, listing projects, and more.

Complete Source Code

Here is the complete source code for the examples in this quick start guide, with relevant import statements, exception handling, and in the case of Java, authorization flow steps broken out into helper methods.

Java

This sample uses the Google APIs Client Library for Java .

package com.google.cloud.helix.samples;

import com.google.api.client.auth.oauth2.Credential;
import com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow;
import com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeRequestUrl;
import com.google.api.client.googleapis.auth.oauth2.GoogleClientSecrets;
import com.google.api.client.googleapis.auth.oauth2.GoogleTokenResponse;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.client.util.Data;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.GetQueryResultsResponse;
import com.google.api.services.bigquery.model.QueryRequest;
import com.google.api.services.bigquery.model.QueryResponse;
import com.google.api.services.bigquery.model.TableCell;
import com.google.api.services.bigquery.model.TableRow;

import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Scanner;

/**
 * Api Quickstart sample code. Runs queries using client_secrets.json authentication.
 */
public class ApiQuickstart {

  // Enter your Google Developer Project number or string id.
  private static final String PROJECT_ID = "name_of_project";

  // Use a Google APIs Client standard client_secrets.json OAuth 2.0 parameters file.
  private static final String CLIENTSECRETS_LOCATION = "client_secrets.json";

  // Objects for handling HTTP transport and JSON formatting of API calls.
  private static final HttpTransport HTTP_TRANSPORT = new NetHttpTransport();
  private static final JsonFactory JSON_FACTORY = new JacksonFactory();

  public static void main(String[] args) throws IOException {
    GoogleClientSecrets clientSecrets = GoogleClientSecrets.load(new JacksonFactory(),
        new InputStreamReader(ApiQuickstart.class.getResourceAsStream(CLIENTSECRETS_LOCATION)));

    Credential credential = getCredentials(clientSecrets, new Scanner(System.in));
    Bigquery bigquery = new Bigquery(HTTP_TRANSPORT, JSON_FACTORY, credential);
    String query = "SELECT TOP( title, 10) as title, COUNT(*) as revision_count "
        + "FROM [publicdata:samples.wikipedia] WHERE wp_namespace = 0;";
    runQueryRpcAndPrint(bigquery, PROJECT_ID, query, System.out);
  }

  static Credential getCredentials(GoogleClientSecrets clientSecrets, Scanner scanner)
      throws IOException {
    String authorizeUrl = new GoogleAuthorizationCodeRequestUrl(
        clientSecrets, clientSecrets.getInstalled().getRedirectUris().get(0),
        Collections.singleton(BigqueryScopes.BIGQUERY)).build();
    System.out.println(
        "Paste this URL into a web browser to authorize BigQuery Access:\n" + authorizeUrl);
    System.out.println("... and paste the code you received here: ");
    String authorizationCode = scanner.nextLine();

    // Exchange the auth code for an access token.
    GoogleAuthorizationCodeFlow flow = new GoogleAuthorizationCodeFlow.Builder(
        HTTP_TRANSPORT, JSON_FACTORY, clientSecrets, Arrays.asList(BigqueryScopes.BIGQUERY))
        .build();
    GoogleTokenResponse response = flow.newTokenRequest(authorizationCode)
        .setRedirectUri(clientSecrets.getInstalled().getRedirectUris().get(0)).execute();
    return flow.createAndStoreCredential(response, null);
  }



  /**
   * Runs a synchronous BigQuery query and displays the result.
   *
   * @param bigquery An authorized BigQuery client
   * @param projectId The current project id
   * @param query A String containing a BigQuery SQL statement
   * @param out A PrintStream for output, normally System.out
   */
  static void runQueryRpcAndPrint(
      Bigquery bigquery, String projectId, String query, PrintStream out) throws IOException {
    QueryRequest queryRequest = new QueryRequest().setQuery(query);
    QueryResponse queryResponse = bigquery.jobs().query(projectId, queryRequest).execute();
    if (queryResponse.getJobComplete()) {
      printRows(queryResponse.getRows(), out);
      if (null == queryResponse.getPageToken()) {
        return;
      }
    }
    // This loop polls until results are present, then loops over result pages.
    String pageToken = null;
    while (true) {
      GetQueryResultsResponse queryResults = bigquery.jobs()
          .getQueryResults(projectId, queryResponse.getJobReference().getJobId())
          .setPageToken(pageToken).execute();
      if (queryResults.getJobComplete()) {
        printRows(queryResults.getRows(), out);
        pageToken = queryResults.getPageToken();
        if (null == pageToken) {
          return;
        }
      }
    }
  }


  private static void printRows(List<TableRow> rows, PrintStream out) {
    if (rows != null) {
      for (TableRow row : rows) {
        for (TableCell cell : row.getF()) {
          // Data.isNull() is the recommended way to check for the 'null object' in TableCell.
          out.printf("%s, ", Data.isNull(cell.getV()) ? "null" : cell.getV().toString());
        }
        out.println();
      }
    }
  }

}

Python

This sample uses the Google APIs Client Library for Python .

import httplib2
import pprint
import sys

from apiclient.discovery import build
from apiclient.errors import HttpError

from oauth2client.client import AccessTokenRefreshError
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import run


# Enter your Google Developer Project number
PROJECT_NUMBER = '12345XXXXXXX'

FLOW = flow_from_clientsecrets('client_secrets.json',
                               scope='https://www.googleapis.com/auth/bigquery')


def main():

  storage = Storage('bigquery_credentials.dat')
  credentials = storage.get()

  if credentials is None or credentials.invalid:
    from oauth2client import tools
    # Run oauth2 flow with default arguments.
    credentials = tools.run_flow(FLOW, storage, tools.argparser.parse_args([]))

  http = httplib2.Http()
  http = credentials.authorize(http)

  bigquery_service = build('bigquery', 'v2', http=http)

  try:
    query_request = bigquery_service.jobs()
    query_data = {'query':'SELECT TOP( title, 10) as title, COUNT(*) as revision_count FROM [publicdata:samples.wikipedia] WHERE wp_namespace = 0;'}

    query_response = query_request.query(projectId=PROJECT_NUMBER,
                                         body=query_data).execute()
    print 'Query Results:'
    for row in query_response['rows']:
      result_row = []
      for field in row['f']:
        result_row.append(field['v'])
      print ('\t').join(result_row)

  except HttpError as err:
    print 'Error:', pprint.pprint(err.content)

  except AccessTokenRefreshError:
    print ("Credentials have been revoked or expired, please re-run"
           "the application to re-authorize")

if __name__ == '__main__':
  main()
  

Authentication required

You need to be signed in with Google+ to do that.

Signing you in...

Google Developers needs your permission to do that.