Contents
Data | Method | REST URI (relative to https://www.googleapis.com/prediction/v1.4) |
---|---|---|
hostedmodels Collection | Predict against a hosted model: prediction.hostedmodels.predict | POST |
trainedmodels Collection (↳ trainedmodels Resource) | Get model information: prediction.trainedmodels.get | GET |
trainedmodels Collection | Train a new model: prediction.trainedmodels.insert | POST |
trainedmodels Collection | Add streaming training: prediction.trainedmodels.update | PUT |
trainedmodels Collection | Delete a trained model: prediction.trainedmodels.delete | DELETE |
trainedmodels Collection | Predict against your own model: prediction.trainedmodels.predict | POST |
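As a rough illustration, the method list above can be kept in a small lookup table. The path templates are copied from the Request lines given for each method later in this reference; the names `METHODS` and `endpoint` are hypothetical helpers, not part of the API.

```python
from urllib.parse import quote

BASE_URL = "https://www.googleapis.com/prediction/v1.4"

# Method name -> (HTTP verb, path template relative to BASE_URL).
METHODS = {
    "prediction.hostedmodels.predict": ("POST", "/hostedmodels/{hostedModelName}/predict"),
    "prediction.trainedmodels.get": ("GET", "/trainedmodels/{id}"),
    "prediction.trainedmodels.insert": ("POST", "/trainedmodels"),
    "prediction.trainedmodels.update": ("PUT", "/trainedmodels/{id}"),
    "prediction.trainedmodels.delete": ("DELETE", "/trainedmodels/{id}"),
    "prediction.trainedmodels.predict": ("POST", "/trainedmodels/{id}/predict"),
}


def endpoint(method_name, **path_params):
    """Return (verb, absolute URL) for a method, percent-encoding path parameters."""
    verb, template = METHODS[method_name]
    escaped = {name: quote(str(value), safe="") for name, value in path_params.items()}
    return verb, BASE_URL + template.format(**escaped)
```

For example, `endpoint("prediction.trainedmodels.get", id="my-model")` yields the GET verb together with the full URL for that model.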
Standard query parameters
Query parameters that apply to all Google Prediction API operations are shown in the table below.
Notes (on API keys and auth tokens):
- The key parameter is required with every request, unless you provide an OAuth 2.0 token with the request.
- You must send an authorization token with every request that is marked (AUTHENTICATED). OAuth 2.0 is the preferred authorization protocol.
- You can provide an OAuth 2.0 token with any request in one of two ways:
  - Using the access_token query parameter like this: ?access_token=oauth2-token
  - Using the HTTP Authorization header like this: Authorization: Bearer oauth2-token
All parameters are optional except where noted.
Parameter | Meaning | Notes |
---|---|---|
access_token | OAuth 2.0 token for the current user. | |
callback | Callback function. | |
fields | Selector specifying a subset of fields to include in the response. | |
key | API key. (REQUIRED*) | |
prettyPrint | Returns response with indentations and line breaks. | |
quotaUser | Alternative to userIp. | |
userIp | IP address of the end user for whom the API call is being made. | |
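The two ways of identifying a request described in the notes (an API key in the key query parameter, or an OAuth 2.0 token in the Authorization header) might be sketched as follows. This is a minimal sketch using Python's standard library; `build_request` and its argument names are hypothetical, and the request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import urlencode


def build_request(url, api_key=None, oauth_token=None, http_method="GET", body=None):
    """Attach either an OAuth 2.0 bearer token or an API key to a request."""
    if not api_key and not oauth_token:
        raise ValueError("provide an API key or an OAuth 2.0 token")
    headers = {}
    if oauth_token:
        # Header form: "Authorization: Bearer oauth2-token".
        headers["Authorization"] = "Bearer " + oauth_token
    else:
        # Without a token, the key query parameter is required.
        url = url + "?" + urlencode({"key": api_key})
    return urllib.request.Request(url, data=body, headers=headers, method=http_method)
```

The sketch prefers the header over the access_token query parameter so that the token does not appear in the URL; either form is described as acceptable above.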
hostedmodels Collection
hostedmodels
The hostedmodels collection is a collection of publicly available trained models. These models can be free, but most have a usage fee associated with them, as described in their documentation. See a list of hosted models in the hosted model gallery.
Using hosted models is convenient when you don't have the time, resources, or expertise to build a model for a specific topic. If you have a model that you'd like to make public, follow the submission links in the hosted model gallery.
Sending a prediction request against a hosted model is nearly the same as sending a prediction against any other model; the only difference is the request URL.
prediction.hostedmodels.predict
Run a prediction request against a hosted model.
Request
POST https://www.googleapis.com/prediction/v1.4/hostedmodels/{hostedModelName}/predict
{
"input":{
"csvInstance":[ col1_value, col2_value, ... ]
}
}
Property Name | Value | Description |
---|---|---|
col1_value, col2_value, ... | Array of string or number | An array of entity features, as described by the hosted model's documentation. Note that string fields must be surrounded by escaped quotes. The array can be a mix of string and number columns. |
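A request body of the shape shown above could be produced with a JSON serializer rather than by hand. The sketch below assumes that standard JSON string quoting is what the note about quoted string fields calls for; `predict_body` is a hypothetical helper name.

```python
import json


def predict_body(features):
    """Serialize a list of string/number features into a predict request body."""
    return json.dumps({"input": {"csvInstance": list(features)}})


# Strings are quoted and escaped by the serializer; numbers are left bare.
body = predict_body(["M", 1.59, 1.51, "France"])
```

Letting the serializer handle quoting avoids malformed bodies when a feature value itself contains quotes or backslashes.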
Response
{
  "kind": "prediction#output",
  "id": string,
  "selfLink": string,
  "outputLabel": string,
  "outputMulti": [
    {
      "label": string,
      "score": double
    }
  ],
  "outputValue": double
}
Property Name | Value | Description |
---|---|---|
kind | string | What kind of resource this is. |
id | string | The unique name for the predictive model. |
selfLink | string | A URL to re-request this resource. |
outputLabel | string | [ Categorical models only ] The most likely class label. |
outputMulti[] | list | [ Categorical models only ] A list of class labels with their estimated scores. |
outputMulti[].label | string | The class label. |
outputMulti[].score | double | A score for this class label. |
outputValue | double | [ Regression models only ] The estimated regression value. |
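Since some fields appear only for categorical models and others only for regression models, a caller typically branches on which fields are present. The following is a minimal sketch, assuming the response body has already been parsed into a dict and that a categorical response always carries outputLabel; `summarize_prediction` is a hypothetical name.

```python
def summarize_prediction(output):
    """Reduce a parsed prediction#output dict to the fields a caller usually needs."""
    if "outputLabel" in output:
        # Categorical model: most likely label plus a label -> score mapping.
        scores = {
            entry["label"]: float(entry["score"])
            for entry in output.get("outputMulti", [])
        }
        return {"type": "categorical", "label": output["outputLabel"], "scores": scores}
    if "outputValue" in output:
        # Regression model: a single estimated value.
        return {"type": "regression", "value": float(output["outputValue"])}
    raise ValueError("response has neither outputLabel nor outputValue")
```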
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction
trainedmodels Collection
trainedmodels
trainedmodels Resource
Represents a trained model.
{
  "kind": "prediction#training",
  "id": string,
  "storageDataLocation": string,
  "storagePMMLLocation": string,
  "selfLink": string,
  "utility": [
    { any value: double }
  ],
  "modelInfo": {
    "numberInstances": long,
    "modelType": string,
    "numberLabels": long,
    "classificationAccuracy": double,
    "classWeightedAccuracy": double,
    "confusionMatrix": {
      any value: { any value: double }
    },
    "confusionMatrixRowTotals": { any value: double },
    "meanSquaredError": double
  },
  "trainingStatus": string,
  "dataAnalysis": { // Present only in GET requests for models with warnings or errors
    "warnings": [ string array ]
  }
}
Property Name | Value | Description |
---|---|---|
kind | string | What kind of resource this is. |
id | string | A name for the predictive model, unique within this user account. Names must be 1-255 characters long and can use any mix of digits, lowercase letters, dashes, and underscores: [0-9a-z_\-] |
storageDataLocation | string | Google Cloud Storage location of the training data file. |
storagePMMLLocation | string | Google Cloud Storage location of the preprocessing PMML file. See Importing PMML Models for details. |
selfLink | string | A URL to re-request this resource. |
utility[] | list | [ Categorical models only ] A class label weighting function, which allows the importance weights for class labels to be specified. |
modelInfo | object | Model metadata. |
modelInfo.numberInstances | long | Number of valid data instances used in the trained model. |
modelInfo.modelType | string | Type of predictive model: either CLASSIFICATION or REGRESSION. |
modelInfo.numberLabels | long | [ Categorical models only ] Number of class labels in the trained model. |
modelInfo.classificationAccuracy | double | [ Categorical models only ] A number between 0.0 and 1.0, where 1.0 is 100% accurate. This is an estimate of the prediction accuracy, based on the amount and quality of the training data. You can use this as a guide to decide whether the results are accurate enough for your needs. This estimate will be more reliable if your real input data is similar to your training data. If you are retraining an existing model, the modelInfo field will show an accuracy value even if the new training is not complete. This number will be the accuracy of the previously trained model, which is still usable, until the new model has finished training. |
modelInfo.classWeightedAccuracy | double | [ Categorical models only ] Estimated accuracy of the model, taking utility weights into account. |
modelInfo.confusionMatrix | object | [ Categorical models only ] An output confusion matrix. This shows an estimate for how accurate this model will be in actual use. See prediction.trainedmodels.get() for more information. |
modelInfo.confusionMatrixRowTotals | object | A list of the confusion matrix row totals. See prediction.trainedmodels.get() for more information. |
modelInfo.meanSquaredError | double | [ Regression models only ] An estimated mean squared error. This can be used to measure the quality of the predicted model. |
trainingStatus | string | The current status of the training job. |
dataAnalysis | object | An object that is only present if there are problems training the model. |
dataAnalysis.warnings | string array | An array of strings describing recommendations, warnings, or errors in model data, training, or other aspects of the model. |
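The naming restriction on id (1-255 characters drawn from digits, lowercase letters, dashes, and underscores) can be checked on the client before a request is sent. A minimal sketch, with `is_valid_model_id` as a hypothetical helper name:

```python
import re

# Character class taken from the id description: [0-9a-z_\-], 1 to 255 characters.
MODEL_ID_RE = re.compile(r"[0-9a-z_\-]{1,255}")


def is_valid_model_id(model_id):
    """Return True if model_id satisfies the documented naming restrictions."""
    return isinstance(model_id, str) and MODEL_ID_RE.fullmatch(model_id) is not None
```

This only checks the syntax of the name; uniqueness within the user account is something only the service can verify.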
prediction.trainedmodels.get
Returns information about a trained model, including training status, confusion matrix, and estimated error values.
Note that this will not return successfully for a new model until training has completed successfully.
- If this is an attempt to train a new model, trainedmodels.get will return HTTP 404 "No model found. Training running." until training completes, successfully or not. If training completes successfully, the method will return the trainedmodels resource; if training fails, the method will return an HTTP 404 "No model found. Model must first be trained."
- If this is an attempt to retrain an existing model, trainedmodels.get will always return a trainedmodels resource. If the retraining succeeds, the resource will be for the new model. If the retraining fails, the resource will be for the previous model, but the trainingStatus property value will be ERROR.
Important: Only the user who trained a model can call this method.
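The outcomes described above can be turned into a small decision function for code that polls this method. The sketch below is illustrative only: it assumes a successful call returns HTTP 200, that the 404 body follows the error JSON shape shown elsewhere in this reference, and that matching on the message text is acceptable. The function name and the returned state strings are hypothetical.

```python
def training_state(http_status, payload):
    """Map an HTTP status and parsed JSON body from trainedmodels.get to a coarse state."""
    if http_status == 404:
        message = payload.get("error", {}).get("message", "")
        if "Training running" in message:
            return "TRAINING"      # new model, training still in progress
        return "NO_MODEL"          # never trained, or training of a new model failed
    if http_status == 200:
        if payload.get("trainingStatus") == "ERROR":
            # Retraining failed; the resource describes the previous model.
            return "PREVIOUS_MODEL_RETRAIN_FAILED"
        return "MODEL_AVAILABLE"
    return "UNKNOWN"
```

A polling loop would call the method, pass the result through this function, and sleep between attempts while the state is TRAINING.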
Confusion Matrices
This method returns a modelInfo.confusionMatrix property that describes a confusion matrix. This matrix describes how many labels were properly and improperly guessed for each training entry during training. This is useful for evaluating the accuracy of training over your data; if the matrix indicates that specific values are often confused, you might want to change your training data structure.
Here is an example confusion matrix for a language identification model. In this model, for all entries with the label "French", 12 were properly identified as French and 0.5 were improperly identified as English. You can see the values for items labeled "Spanish" and "English" as well. Numbers can be fractions because they are averaged across multiple training runs. confusionMatrixRowTotals describes the total number of each label applied.
"confusionMatrix": {
  "French": { "French": 12.0, "English": 0.5 },
  "Spanish": { "Spanish": 6.0, "English": 1.0 },
  "English": { "French": 0.5, "Spanish": 2.0, "English": 20.0 }
},
"confusionMatrixRowTotals": {
  "French": 12.5,
  "Spanish": 7.0,
  "English": 22.5
}
Request
GET https://www.googleapis.com/prediction/v1.4/trainedmodels/{id}
Response
Returns a trainedmodels resource.
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction
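To make the confusion matrix in the returned resource easier to read, the row totals and a per-label fraction of correctly identified entries can be computed from it. The sketch below uses the language identification example from the Confusion Matrices section; the function names are hypothetical, and "fraction correct" is a derived figure, not a field returned by the API.

```python
confusion_matrix = {
    "French": {"French": 12.0, "English": 0.5},
    "Spanish": {"Spanish": 6.0, "English": 1.0},
    "English": {"French": 0.5, "Spanish": 2.0, "English": 20.0},
}


def row_totals(matrix):
    """Sum each row: the total number of entries carrying each actual label."""
    return {label: sum(row.values()) for label, row in matrix.items()}


def fraction_correct(matrix):
    """For each actual label, the share of entries that were identified correctly."""
    totals = row_totals(matrix)
    return {
        label: row.get(label, 0.0) / totals[label]
        for label, row in matrix.items()
        if totals[label] > 0
    }


totals = row_totals(confusion_matrix)       # matches confusionMatrixRowTotals in the example
per_label = fraction_correct(confusion_matrix)
```

For the example data, the computed totals agree with the confusionMatrixRowTotals values shown in the reference (12.5, 7.0, and 22.5), and French comes out at 12.0 / 12.5 = 0.96.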
prediction.trainedmodels.insert
Asynchronous call to start training a model. When training has completed successfully, prediction.trainedmodels.get() will return information about the trained model. Training can take up to several hours, depending on the complexity and size of the data, but will typically take less time.
Note that each time you call this method, it will overwrite any existing model with the same ID if training succeeds. If training fails, the existing model will not be replaced. You can continue to run queries against an existing model during the training of a new model.
To train against data stored in Google Cloud Storage, you must have read or owner rights on the Google Cloud Storage object that holds your training data. By default, when you upload a file, you are assigned owner rights. When you train against that data, your training method call must be authenticated as a user with read rights.
Note that a model can be used only by the user who calls trainedmodels.insert to create that model.
Learn more about modifying Google Cloud Storage object ACLs.
Authentication is required.
Request
POST https://www.googleapis.com/prediction/v1.4/trainedmodels
{
  "id": string,
  "storageDataLocation": string,
  "storagePMMLLocation": string, // Only used for PMML preprocessing
  "utility": [ // Optional, categorical models only
    { any value: double }
  ]
}
Property Name | Value | Description |
---|---|---|
id | string | A name for the model. The ID must be unique within this user account. Names must be 1-255 characters long and can use any mix of digits, lowercase letters, dashes, and underscores: [0-9a-z_\-] |
storageDataLocation | string | [ Optional ] The Google Cloud Storage path to your training data. If not specified, you can add examples to the empty model by calling update(). However, you will not be able to run any predictions until you add at least one example. |
storagePMMLLocation | string | [ Optional ] If you want to preprocess your data using a PMML transform, this is the location of your PMML file in Google Cloud Storage, without the gs:// prefix. |
utility | array of values | [ Optional, categorical models only ] Assigns a numeric weight to one or more categories in the training data. The purpose of this property is to prevent false positives by assigning a relative weight to specific categories, where the higher the value, the higher the associated cost with mislabeling something that is actually in that category as something else. Example: In a spam identification model, identifying some spam as non-spam is relatively lower cost than identifying some non-spam as spam. Therefore you would include a utility property that gives your non-spam label (for example, 'not_spam') a weight greater than 1.0. Unlisted labels receive a default weight of 1.0, so in that example 'spam' would receive a utility value of 1.0. |
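A request body for this method could be assembled as in the sketch below. Several things here are assumptions rather than documented facts: the helper name `insert_body`, the choice to encode each weighted label as its own single-key object inside the utility array, and the weight 10.0, which is an arbitrary illustrative number and not a value taken from this reference.

```python
import json


def insert_body(model_id, storage_data_location=None, utility=None):
    """Build the JSON body for a training request; optional fields are omitted when unset."""
    body = {"id": model_id}
    if storage_data_location is not None:
        body["storageDataLocation"] = storage_data_location
    if utility:
        # Assumed encoding: one {label: weight} object per weighted label.
        body["utility"] = [{label: float(weight)} for label, weight in utility.items()]
    return json.dumps(body)


# Hypothetical spam model: weight 'not_spam' above the default of 1.0 for unlisted labels.
example = insert_body("spam-filter", "mybucket/spam.csv", {"not_spam": 10.0})
```

Omitting storage_data_location produces a body with only the id, which corresponds to creating an empty model to be filled later through update().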
Response
Returns a trainedmodels resource if training has completed successfully, or an HTTP 404 error if training is not yet complete, or failed because of an error. Here is an example error message when training has not yet completed:
{
  "error": {
    "errors": [{
      "domain": "global",
      "reason": "notFound",
      "message": "No model found. Model must first be trained."
    }],
    "code": 404,
    "message": "No model found. Model must first be trained."
  }
}
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction
prediction.trainedmodels.update
[ Categorization models only ] Add new data to a trained model.
Adding new data to a trained model is called streaming training. Streaming training trains a previously trained model against a new example. This is useful if you have a regular stream of new information that you'd like to add to your model as it becomes available, rather than having to recompile, re-upload, and retrain the data with batches of new data. The model is not retrained each time it receives a new example; rather, it retrains after every N new examples have been added, where N is a small number.
Note that the system may weight newer streamed examples more than earlier examples. If you do not want this, you should add the examples to your training data and retrain the system against all the data by calling prediction.trainedmodels.insert().
Note: If you retrain a model against its original training data file, all the streamed data will be lost. If you want to retain the streamed data, you must store it and update the model data yourself.
Authentication is required.
Request
PUT https://www.googleapis.com/prediction/v1.4/trainedmodels/{id}
{
"label": my_label,
"csvInstance": [ col1, col2, ..., colN ]
}
Property Name | Value | Description |
---|---|---|
label | string | The category label to assign to this example. Only category examples can be streamed to an existing model. |
csvInstance | Array of string or number | The example data as an array of columns, in the same format as the CSV file. |
Response
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction
prediction.trainedmodels.delete
Delete a trained model.
Authentication is required.
Request
DELETE https://www.googleapis.com/prediction/v1.4/trainedmodels/{id}
Response
{}
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction
prediction.trainedmodels.predict
Run a prediction against your model.
Here's an example request to a model that predicts a person's height, if the model expects a string gender ("M" or "F"), two height numbers, and a string country name:
{ "input":{ "csvInstance":["M", 1.59, 1.51,"France"] } }
Authentication is required.
Request
POST https://www.googleapis.com/prediction/v1.4/trainedmodels/{id}/predict
{
"input":{
"csvInstance":[ col1_value, col2_value, ... ]
}
}
Property Name | Value | Description |
---|---|---|
col1_value, col2_value, ... | Array of string or number | An array of entity features, as described by the model's schema. Note that string fields must be surrounded by escaped quotes. The array can be a mix of string and number columns. |
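Because csvInstance may mix string and number columns but nothing else, a client can reject other value types before sending the request. This is a minimal client-side sketch; `check_csv_instance` is a hypothetical helper, and rejecting booleans is an assumption made because they would serialize as JSON true/false rather than as a string or number.

```python
from numbers import Real


def check_csv_instance(values):
    """Return the values as a list, raising TypeError for anything that is not a string or number."""
    checked = []
    for index, value in enumerate(values):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (str, Real)):
            raise TypeError(
                "column %d has unsupported type %s" % (index, type(value).__name__)
            )
        checked.append(value)
    return checked


features = check_csv_instance(["M", 1.59, 1.51, "France"])
```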
Response
{
  "kind": "prediction#output",
  "id": string,
  "selfLink": string,
  "outputLabel": string,
  "outputMulti": [
    {
      "label": string,
      "score": double
    }
  ],
  "outputValue": double
}
Property Name | Value | Description |
---|---|---|
kind | string | What kind of resource this is. |
id | string | The name of the predictive model. |
selfLink | string | A URL to re-request this resource. |
outputLabel | string | [ Categorical models only ] The most likely class label. |
outputMulti[] | list | [ Categorical models only ] A list of class labels with their estimated scores. |
outputMulti[].label | string | The class label. |
outputMulti[].score | double | A score for this label. |
outputValue | double | [ Regression models only ] The estimated regression value. |
Invoking this method requires the use of a token with access to: https://www.googleapis.com/auth/prediction