GQL Reference
GQL maps roughly to SQL: you can think of GQL
kind
as a SQL table, a GQL
entity
as a SQL row, and a GQL
property
as a SQL column. However, a SQL
row-column lookup is limited to a single value, whereas in GQL a
property can be a multiple value property.
Grammar
The GQL grammar is summarized as follows:
<query> :=
SELECT ( * | [ DISTINCT ] <aggregated-property>+, )
[ FROM <kind> ]
[ WHERE <compound-condition> ]
[ GROUP BY <property-name>+, ]
[ ORDER BY ( <property-name> [ ASC | DESC ] )+, ]
[ OFFSET <result-position> [ “+” <result-position> ]
[ LIMIT ( <result-position> |
FIRST “(” <result-position> ,
<result-position> “)” ) ]
<aggregated-property> :=
<property-name>
| aggregator “(” <property-name> “)”
<aggregator> := FIRST
<compound-condition> := <condition>+AND
<condition> :=
<property-name> IS NULL
| <property-name> <comparator> <value>
| <value> <comparator> <property-name>
<comparator> :=
=
| <
| <=
| >
| >=
| IN
| CONTAINS
| HAS ANCESTOR
| HAS DESCENDANT
<result-position> := <binding-site> | <integer-literal>
<value> :=
<binding-site>
| <synthetic-literal>
| <string-literal>
| <integer-literal>
| <double-literal>
| <boolean-literal>
| <null-literal>
<synthetic-literal> :=
KEY “(”
[ “DATASET” “(“ <string-literal> “)” “,” ]
[ “NAMESPACE” “(“ <string-literal> “)” “,” ]
<key-path-element>+, “)”
| BLOB “(” <string-literal> “)”
| BLOBKEY “(” <string-literal> “)”
| DATETIME “(” <string-literal> “)”
<key-path-element> :=
<kind> "," ( <integer-literal> | <string-literal> )
<kind> := <name>
<property-name> := <name>
In the above grammar list, note that:
-
The symbol
+,
after an expression indicates that it can be repeated, with repeated expressions separated by a comma. -
The use of
+AND
above means anAND
-separated set of conditions.
The following are some examples that return entire entities:
SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200
SELECT * FROM myKind OFFSET @startCursor LIMIT 50
SELECT * FROM myKind ORDER BY myProp DESC
Every GQL query string always begins with
SELECT <something>
, where
<something>
is one of the following:
-
*
-
<property-list>
, a comma delimited list of properties to be returned from the query. -
__key__
, which returns keys only.
Similar to SQL, GQL keywords are case insensitive. Kind, property, and binding site names are case sensitive.
A GQL query returns zero or more entity results of the requested kind, with each result consisting of an entire entity or some subset of the properties of the entity, or even just the entity key . For example,
-
SELECT * FROM myKind
-
SELECT __key__ FROM myKind
-
SELECT title, year FROM Song WHERE composer = 'Lennon, John'
A result list produced by
SELECT *
or
SELECT __key__
never contains
duplicates. A result list produced by
SELECT <property-list>
may contain
multiple results from one entity, typically when any of those properties are multiple value properties.
Tip:
SELECT __key__
or
SELECT <property-list>
queries are faster than
SELECT *
queries, and are cheaper, because they are charged at the less
expensive
small operations
rates, rather than at
the read rates. See also
projection queries
.
Clauses
The following optional
SELECT
clauses are recognized:
Clause | Description |
---|---|
DISTINCT
|
Specifies that only completely unique results will be returned. Normally used only with
projection queries
because non-projection queries return unique results. If you use
DISTINCT
in a projection query where more than one entity has the same values in the properties being projected, only the first entity is returned. Note that a query string can use either
DISTINCT
or
GROUP BY
, but not both.
|
GROUP BY
|
Groups data from a selected set of entities into a set of summary rows by the value of the specfied property or properties. |
FROM
|
Limits the result set to those entities of the given kind. A query string without a
FROM
clause is called a
kindless query
and the only filtering allowed in the
WHERE
clause is filtering by
__key__
.
|
WHERE
|
Limits the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are combined with the
AND
keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an
OR
operator.
|
ORDER BY
|
Causes results to be sorted by the specified properties. The
ORDER BY
clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right. Specify
ASC
for ascending or
DESC
for descending order. Note that the order is applied to each property. If the direction is not specified, it defaults to
ASC
. If no
ORDER BY
clause is specified, the order of the results is undefined and may change over time.
|
OFFSET
|
Specifies offsets into the result set: either a cursor, or a count, or both. If
OFFSET
has two
<result-position>
s, the left one must be a cursor and the right one must be an integer. Note that an
OFFSET
with an integer starts at the beginning or at the cursor, then discards the specified number of entities, and so you still incur the cost of reading those entities. (Note also that
OFFSET
and
LIMIT
are independent.)
|
LIMIT
|
Limits query results to a count, to results preceding a cursor, or both. Often used to page through results of a query. If LIMIT has two
<result-position>
s, one must be a cursor and the other must be an integer. (Note that
OFFSET
and
LIMIT
are independent.)
|
How to form entity and property names
Kind, property, and binding site names are formed as follows:
-
With any sequence of letters, digits, underscores, dollar signs, or unicode characters in the range from
U+0080
toU+FFFF
(inclusive), so long as the name does not begin with a digit. For example,foo
,bar17
,x_y
,big$bux
,__qux__
. -
You can also use a non-empty backquoted string:
`fig-bash`
or`x.y`
. A backquote character can be represented in a backquoted name by doubling it, for example,`silly``putty`
. A backquoted name can contain escaped characters . -
An unquoted name can match a predefined name, but must not match a keyword. A backquoted name can contain any character except a newline. (It can contain a newline via
\n
, but not as a raw newline.) -
Names are case-sensitive.
How to form literals
You can use the following literals in a comparison:
Literal Type | Description |
---|---|
string
|
Formed following these rules:
|
integer
|
A sequence of decimal digits with the following options or characteristics:
|
double
|
A sequence of decimal digits with the following options or characteristics:
|
boolean
|
Can be the values
TRUE
or
FALSE
, case-insensitive.
|
null
|
Represents
NULL
. Case insensitive.
|
Synthetic literals
A synthetic literal is a value that is constructed by a function. The following table lists the supported synthetic literals:
Literal Name | Description |
---|---|
KEY |
KEY([DATASET(<dataset>),] [NAMESPACE(<namespace>),] <key-path-element>*,)
represents a key.
If
<dataset>
and
<namespace>
are not supplied, defaults from the current query context are used. (See
partition ID
.) The
<Key-path-element>
is an
entity path
, which is an even-length comma-separated list of kinds alternating with either integer ids or string names. The integers must be greater than
0
and the strings must not be empty.
|
BLOB |
BLOB(<string>)
represents a blob encoded as
<string>
via base-64 encoding with character set
[A-Za-z0-9-_]
and no padding.
|
BLOBKEY |
BLOBKEY(<string>)
represents a blobkey with value
<string>
, which must not be empty.
|
DATETIME |
DATETIME(<string>)
represents a timestamp.
<string>
must be in the time format specified in
RFC 3339 section 5.6
. (However, the second precision is limited to microseconds and leap seconds are omitted.) This standard format is:
YYYY-MM-DDThh:mm:ss.SSSSSS+zz:ZZ
where:
|
How to escape characters
You can escape certain characters in string literals and backquoted names. The
escaped characters are case sensitive: for example
\r
is valid while
\R
is
not.
The following is a list of all the characters that can be escaped in GQL:
Character | Escaped |
---|---|
backslash character |
\\
|
null character |
\0
|
backspace character |
\b
|
newline character |
\n
|
return character |
\r
|
tab character |
\t
|
the character with decimal code 26 |
\Z
|
single quotation mark |
\’
|
double quotation mark |
\”
|
backquote character |
\`
|
\%
(2 characters, retaining the backslash, per MySQL)
|
\%
|
\_
(2 characters, retaining the backslash, per MySQL)
|
\_
|
Operators and comparisons
Comparators are either equivalence comparators:
=
,
IN
,
CONTAINS
,
= NULL
,
HAS ANCESTOR
, and
HAS DESCENDANT
, or inequality comparators:
<
,
<=
,
>
, and
>=
.
Notice that the operator
=
is another name for the
IN
and
CONTAINS
operators. For
example,
<value> = <property-name>
is the same as
<value> IN <property-name>
,
and
<property-name> = <value>
is the same as
<property-name> CONTAINS <value>
.
Also
<property-name> IS NULL
is the same as
<property-name> = NULL
.
A condition can also test whether one entity has another entity as an
ancestor, using the
HAS ANCESTOR
or
HAS DESCENDANT
operators. These
operators test ancestor relationships between keys. They
can
operate on
__key__
, but they can also operate on a key-valued property. For
HAS ANCESTOR
, the right operand cannot be a property. For
HAS DESCENDANT
,
the left operand cannot be a property. For more information on ancestor relationships, see
ancestor paths
.
Only one property may be compared with inequality operators. When a query with
an
ORDER BY
clause applies an inequality operator to a property, that
property must be the first property in the
ORDER BY
clause.
A typical property name consists of alphanumeric characters optionally mixed with
underscore (
_
) and dollar sign (
$
). In other words, they match the regular expression
[a-zA-Z0-9_$]+(.[0-9_$]+)*
. Property names
containing other printable characters must be quoted with backquotes,
for example:
`first-name`
.
Restrictions
Comparisons must be between a property name and a literal, but these can
be on either side of the operator. For example,
A < 7
or
7 > A
. Note,
however, that there is no inverse operator for IS NULL, so while you can have
<property-name> IS NULL
, you cannot have
NULL IS <property-name>
.
There is no way to determine whether an entity lacks a value for a
property (that is, whether the property has no value). If you use a condition of
the form
property = NULL
, what will occur is a check whether a null value is
explicitly stored for that property. Datastore queries that refer to a property
will never return entities that don't have a value for that property.
Examples
To find all of the entities of kind
Person
whose ages are
between 18 and 35, use this query string:
SELECT * FROM Person WHERE age >= 18 AND age <= 35
To find the three entities of kind
Person
whose ages are the
greatest, use this query string:
SELECT * FROM Person ORDER BY age DESC LIMIT 3
To return only the
name
property for each
Person
, use
this query string:
SELECT name FROM Person
To return only the
name
property for each
Person
,
ordered by
age
, use this query string:
SELECT name FROM Person ORDER BY age
To find the keys of the entities of kind
Person
that have an
age of
NULL
, use this query string:
SELECT __key__ FROM Person WHERE age = NULL
To find all the entities, regardless of kind, that are in Amy's entity group (i.e. Amy and Fred), use this query:
SELECT * WHERE __key__ HAS ANCESTOR KEY(Person, 'Amy')
Argument binding
GQL supports argument binding. For information about these, see Using GQL .
Unsupported features and behavior differences from MySQL/Python GQL
If you are familiar with MySQL or the Python GQL provided by Google App Engine, you might want to take a look at the following list, which highlights the main differences between those products and the Datastore GQL behavior.
MySQL Differences
-
MySQL binding sites are represented by
?
. Datastore GQL binding sites are represented by@<name>
or@<number>
. -
MySQL supports only
OFFSET
/LIMIT
counts . Datastore GQL also supportsOFFSET
/LIMIT
cursors . -
Datastore GQL supports an
OFFSET
without aLIMIT
, MySQL does not. -
MySQL supports an offset count via keyword
LIMIT
, Datastore GQL does not. - A MySQL literal string can contain a raw newline. A Datastore GQL literal string cannot.
-
A MySQL literal string can
\
-escape any character. A Datastore GQL literal string can only\
-escape a specified list of characters . - A MySQL name can begin with a digit. A Datastore GQL name cannot.
- A Datastore GQL name can contain null characters. A MySQL name cannot.
-
A quoted Datastore GQL name can contain
\
-escaped characters. A quoted MySQL name interprets a\
as an ordinary character. - MySQL has different operators, keywords, and predefined names than Datastore GQL.
Python GQL for App Engine Differences
-
Python GQL binding sites begin with
:
. Datastore GQL binding sites begin with@
. -
Python GQL binding argument names may match the regular expression
__.*__
. Datastore GQL reserves those names for future predefined binding arguments. Note that although binding sites can use those names, argument bindings cannot. -
Python GQL supports operators
!=
andOR
. Those operators are not yet supported by Datastore GQL. -
Datastore GQL supports the
IN
operator differently . -
Python GQL supports functions
datetime
,date
,time
,key
,user
, andgeopt
. Datastore GQL supports functiondatetime
andkey
, but with different arguments. Datastore GQL supports functionsblob
andblobkey
. Datastore GQL support for functiongeopoint
(note spelling) is expected to be added in some future release. -
Python GQL expression
ANCESTOR IS <entity-or-key-value>
is represented in Datastore GQL as the more general expression__key__ HAS ANCESTOR <key-value>
. -
Datastore GQL supports the expression
<property-name> IS NULL
. Python GQL does not. -
Python GQL literal strings are quoted with
’
. Datastore GQL literal strings are quoted with either’
or”
. -
Python GQL names are quoted with
”
. Datastore GQL names are quoted with`
. - Datastore GQL literal strings and quoted names can contain spaces, return characters, backslashed characters, and the enclosing quote character (doubled). Python GQL literal strings and quoted names cannot have these.
-
Python GQL names may contain
.
without quoting. Datastore GQL reserves.
for future use. -
Datastore GQL names may contain
$
andU+0080
toU+FFFF
without quoting. Python GQL names may not. - Python GQL has different keywords and operators than Datastore GQL.
Keywords and predefined names
Keywords and predefined names are case-insensitive.
The following keywords are recognized, although not all of these are currently used in GQL. The unused keywords are marked with an asterisk.
Keyword | ||
---|---|---|
ALL* | DIV* | NULL |
ANCESTOR | EXISTS* | OFFSET |
AND | FALSE | OR* |
ANY* | FROM | ORDER |
AS* | GROUP | REGEXP* |
ASC | HAS | RLIKE* |
BETWEEN* | HAVING* | SELECT |
BINARY* | IN. | SUBSET* |
BY | IS | SUPERSET |
CONTAINS | JOIN* | TRUE |
CURSOR* | LIKE* | WHERE |
DESC | LIMIT | XOR* |
DESCENDANT | MOD* | |
DISTINCT | NOT* |
The following predefined names are recognized:
Predefined Name |
---|
BLOB |
BLOBKEY |
DATETIME |
FIRST |
KEY |