This page gives a quick example of using the Prediction API that you can set up and run in 15 minutes. After trying out this example, you can read the full documentation to learn how to use it for your own specific needs.
Prerequisites
- You must have a Google Account, with a Google username and password.
- You must have a Google Developers Console project with Google Prediction and Google Cloud Storage activated.
To activate an API for your project, do the following:
- Go to the Google Developers Console.
- Select a project, or create a new one.
- In the sidebar on the left, expand APIs & auth.
- Click APIs.
- In the displayed list of available APIs, find the one you want to activate, and set its status to ON.
Some APIs also prompt you to accept their Terms of Service before you can activate them.
Google Cloud Storage is required by Google Prediction if you want to train from a CSV file, which is the use case covered here. However, if you wish to train from instances passed in the request or by updating an empty model, it is sufficient to only have Google Prediction enabled.
The Problem
Imagine that your company receives emails requesting help in several different languages, and you want to route the email to someone with the appropriate language skills. The problem here is to detect whether a given phrase is English, Spanish, or French.
To do this, you must create some training data to train the prediction engine. This training data consists of several text entries, each labeled "English," "Spanish," or "French." After training the system on this data, you will be able to submit arbitrary words or phrases in any of those languages, and the prediction engine will categorize your data as being closest to one of them.
The Solution
Here's how to run Hello Prediction to determine the language of an arbitrary text snippet:
- Upload training data. This example provides a sample training data file that includes English, Spanish, and French language examples. You must upload this file to your Google Cloud Storage account.
- Train the system. Tell the Prediction API to load your training data from Google Cloud Storage and analyze it. This is an asynchronous process, so you'll have to query the server periodically to check the status of the training session. Training must be complete before you can start to send queries.
- Send queries. After training is done, you can send queries containing phrases in English, Spanish, or French, and Google Prediction will respond with the language of that text. You can run this step as many times as you want.
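Because training is asynchronous, the "query the server periodically" part of the second step amounts to a polling loop. The following is a minimal sketch of that loop only, not of the Prediction API itself: `get_status` is a hypothetical callable that you would implement to ask the server for the training status, and the `"RUNNING"` status string is a placeholder assumed for illustration.

```python
import time
from typing import Callable


def wait_for_training(get_status: Callable[[], str],
                      poll_seconds: float = 10.0,
                      max_polls: int = 60,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until training is no longer running.

    get_status is a hypothetical callable supplied by the caller; it should
    ask the server for the training status and return it as a string.
    The "RUNNING" value checked below is an assumed placeholder.
    """
    for _ in range(max_polls):
        status = get_status()
        if status != "RUNNING":  # assumed "still training" marker
            return status
        sleep(poll_seconds)  # wait before asking the server again
    raise TimeoutError("training did not finish within the polling budget")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays. In real use you would leave it at its default and only choose `poll_seconds` and `max_polls`.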
1. Upload Training Data
In this step you will upload a file of training data to your Google Cloud Storage account.
- Download this training file (language_id.txt), which contains English, French, and Spanish training data entries. The training data is a comma-separated values file with many entries and two columns: the first column is the name of the snippet's language, and the second column is a long text snippet in that language. Open the file to see what the training entries look like.
- Upload the file to Google Cloud Storage:
  - Go to the Google Developers Console.
  - Select the project under which to store the data.
  - Select the "Cloud Storage" tab.
  - Click New Bucket to create a new bucket, or select an existing bucket.
  - Click the bucket, click Upload, and upload the language_id.txt file from your computer.
- Copy the bucket/path name of your file from the path column in the Google Cloud Storage Manager. For example:
  mybucket/language_id.txt
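Before uploading, it can help to check that a training file really has the two-column layout described above. The sketch below counts entries per language label using Python's standard `csv` module. The sample rows are invented for illustration and are not taken from language_id.txt, and the use of double quotes around each field is an assumption about the file's quoting.

```python
import csv
import io
from collections import Counter

# Illustrative rows only; these are NOT taken from language_id.txt.
SAMPLE = '''"English","This is a short English sentence."
"Spanish","Esta es una frase corta en español."
"French","Ceci est une courte phrase, en français."
'''


def count_labels(csv_text: str) -> Counter:
    """Count training entries per label (the first column)."""
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        label, _text = row  # first column: language, second: snippet
        counts[label] += 1
    return counts


print(count_labels(SAMPLE))
```

The French row contains a comma inside the quoted snippet, which is why the sketch relies on a CSV parser instead of splitting each line on commas.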
2. Train the System
The next step is to train the system against the training data that you uploaded. To do this, call trainedmodels.insert(), specifying the following parameters:
- project: The Project Number listed in the Overview tab in Google Developers Console.
- id: The string id that will be used to reference the model.
- storageDataLocation: The Google Cloud Storage path where you uploaded your training data.
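As a rough illustration, the three parameters above could be assembled as follows before being handed to whatever client you use to call trainedmodels.insert(). This sketch makes no network call. The function name `build_insert_request`, the example values, and the split into a `project` argument plus a `body` dictionary are assumptions made for this example, not something this page specifies.

```python
import json


def build_insert_request(project: str, model_id: str,
                         bucket: str, object_name: str) -> dict:
    """Assemble the parameters described above for trainedmodels.insert().

    Hypothetical helper: the project/body layout is an assumption of this
    sketch, and nothing is sent to any server.
    """
    return {
        "project": project,  # Project Number from the Overview tab
        "body": {
            "id": model_id,  # string id used to reference the model
            # bucket/path name copied from Google Cloud Storage
            "storageDataLocation": f"{bucket}/{object_name}",
        },
    }


# Example values are placeholders, not real project or model names.
request = build_insert_request("123456789", "language-model",
                               "mybucket", "language_id.txt")
print(json.dumps(request, indent=2))
```

The `storageDataLocation` value has the same bucket/path form as the mybucket/language_id.txt example from the upload step.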