Storing data in a scalable web application can be tricky. A user could be interacting with any of dozens of web servers at a given time, and the user's next request could go to a different web server than the one that handled the previous request. A web server may depend on data that is spread out across dozens of machines, possibly in different locations around the world.
Thanks to Google App Engine, you don't have to worry about any of that. App Engine's infrastructure takes care of all of the distribution, replication, and load balancing of data behind a simple API, and you get a powerful query engine as well.
App Engine's data repository, the High Replication Datastore (HRD), uses the Paxos algorithm to replicate data across multiple datacenters. Data is written to the Datastore in objects known as entities. Each entity has a key that uniquely identifies it. An entity can optionally designate another entity as its parent; the first entity is a child of the parent entity. The entities in the Datastore thus form a hierarchically structured space similar to the directory structure of a file system. An entity's parent, parent's parent, and so on recursively, are its ancestors; its children, children's children, and so on, are its descendants. An entity without a parent is a root entity.
The Datastore is extremely resilient in the face of catastrophic failure, but its consistency guarantees may differ from what you're familiar with. Entities descended from a common ancestor are said to belong to the same entity group; the common ancestor's key is the group's parent key, which serves to identify the entire group. Queries over a single entity group, called ancestor queries, refer to the parent key instead of a specific entity's key. Entity groups are a unit of both consistency and transactionality: whereas queries over multiple entity groups may return stale, eventually consistent results, those limited to a single entity group always return up-to-date, strongly consistent results.
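The parent/child structure and the notion of an entity group can be modeled in a few lines of plain Go. This is only a sketch for intuition: the Key type and its Root, Ancestors, Path, and SameGroup helpers are hypothetical names invented here and are not part of the App Engine datastore package.

```go
package main

import "fmt"

// Key is a hypothetical, simplified stand-in for a Datastore key:
// a kind, a name, and an optional parent. It is NOT the App Engine type.
type Key struct {
	Kind   string
	Name   string
	Parent *Key // nil for a root entity
}

// Root walks up the parent chain and returns the root ancestor.
// In this model the root identifies the entity group.
func (k *Key) Root() *Key {
	for k.Parent != nil {
		k = k.Parent
	}
	return k
}

// Ancestors returns the parent, the parent's parent, and so on, nearest first.
func (k *Key) Ancestors() []*Key {
	var out []*Key
	for p := k.Parent; p != nil; p = p.Parent {
		out = append(out, p)
	}
	return out
}

// Path renders the key like a file-system path, root first.
func (k *Key) Path() string {
	if k.Parent == nil {
		return "/" + k.Kind + ":" + k.Name
	}
	return k.Parent.Path() + "/" + k.Kind + ":" + k.Name
}

// SameGroup reports whether two keys share a root. It compares pointers,
// so both keys must be built from the same root Key value.
func SameGroup(a, b *Key) bool {
	return a.Root() == b.Root()
}

func main() {
	book := &Key{Kind: "Guestbook", Name: "default"}
	g1 := &Key{Kind: "Greeting", Name: "g1", Parent: book}
	g2 := &Key{Kind: "Greeting", Name: "g2", Parent: book}

	fmt.Println(g1.Path())         // /Guestbook:default/Greeting:g1
	fmt.Println(SameGroup(g1, g2)) // true
}
```

In this model, an ancestor query corresponds to selecting every entity whose Root is a given key.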
The code samples in this guide organize related entities into entity groups, and use ancestor queries on those entity groups to return strongly consistent results. In the example code comments, we highlight some ways this might affect the design of your application. For more detailed information, see Structuring Data for Strong Consistency.
Storing the Submitted Greetings
For the guestbook application, we want to store greetings posted by users. Each greeting includes the author's name, the message content, and the date and time the message was posted so we can display messages in chronological order.
To represent this data we create a Go struct named Greeting:
Now that we have a data type for greetings, the application can create new Greeting values and put them into the datastore. The new version of the sign handler does just that:
This creates a new Greeting value, setting its Author field to the current user, its Content field to the data posted by the user, and its Date field to the current time.
Finally, datastore.Put saves our new value to the datastore. We pass it a new, incomplete key so that the datastore will create a new key for this record automatically.
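The idea of an incomplete key can be illustrated with a small in-memory stand-in. This sketch does not use the App Engine API: memStore, its Put method, and the simplified Key type are hypothetical, and a zero ID is assumed here to mean "incomplete".

```go
package main

import (
	"fmt"
	"time"
)

type Greeting struct {
	Author  string
	Content string
	Date    time.Time
}

// Key is a hypothetical simplified key. ID == 0 marks it as incomplete.
type Key struct {
	Kind   string
	ID     int64
	Parent *Key
}

// memStore is an in-memory stand-in for the datastore, for illustration only.
type memStore struct {
	nextID int64
	rows   map[int64]Greeting
}

func newMemStore() *memStore {
	return &memStore{rows: make(map[int64]Greeting)}
}

// Put saves g under k. If k is incomplete, the store allocates the next
// free ID and returns the completed key.
func (s *memStore) Put(k Key, g Greeting) Key {
	if k.ID == 0 {
		s.nextID++
		k.ID = s.nextID
	} else if k.ID > s.nextID {
		// Keep the allocator ahead of any caller-supplied ID.
		s.nextID = k.ID
	}
	s.rows[k.ID] = g
	return k
}

func main() {
	store := newMemStore()
	book := &Key{Kind: "Guestbook", ID: 1}

	g := Greeting{Author: "alice", Content: "Hello!", Date: time.Now()}
	key := store.Put(Key{Kind: "Greeting", Parent: book}, g)
	fmt.Println("allocated ID:", key.ID)
}
```

Every greeting is given the same parent key, which is what places them all in one entity group.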
Because querying in the High Replication Datastore is strongly consistent only within entity groups, we assign all of one book's greetings to the same entity group in this example by setting the same parent for each greeting. This means a user will always see a greeting immediately after it was written. However, the rate at which you can write to the same entity group is limited to one write per second. When you design a real application you'll need to keep this fact in mind. By using services such as Memcache, you can mitigate the chance that a user won't see fresh results when querying across entity groups immediately after a write.
Retrieving the Stored Greetings With datastore.Query
The datastore package provides a Query type for querying the datastore and iterating over the results. The new version of the root handler queries the datastore for greetings:
First the function constructs a Query value that requests Greeting objects that are descendants of the root guestbook key, in Date-descending order, with a limit of 10 objects. Then it calls q.GetAll(c, &greetings), which runs the query and appends the query results to the greetings slice.
Finally, the guestbookTemplate.Execute function renders an HTML page containing these greetings and writes it out to the http.ResponseWriter. For more details on the templating language, see the text/template package documentation. Note that here we use html/template, a package that wraps text/template and automatically escapes content in HTML templates, preventing a class of script injection attacks.
For a complete description of the Datastore API, see the Datastore reference.
Clearing the Development Server Datastore
The development web server uses a local version of the datastore for testing your application, using temporary files. The data persists as long as the temporary files exist, and the web server does not reset these files unless you ask it to do so.
If you want the development server to erase its datastore prior to starting up, see the Development Server reference, which explains the datastore configuration options for the development server.
A Complete Example Using the Datastore
Here is a new version of myapp/hello.go that stores greetings in the datastore. The other sections of this page discuss the new pieces.
Replace myapp/hello.go with this, then reload http://localhost:8080/ in your browser. Post a few messages to verify that messages get stored and displayed correctly.
Warning! Exercising the queries in your application locally causes App Engine to create or update index.yaml. If index.yaml is missing or incomplete, you will see index errors when your uploaded application executes queries for which the necessary indexes have not been specified. To avoid missing index errors in production, always test new queries at least once locally before uploading your application. See Go Datastore Index Configuration for more information.
A Word About Datastore Indexes
Every query in the App Engine Datastore is computed from one or more indexes: tables that map ordered property values to entity keys. This is how App Engine is able to serve results quickly regardless of the size of your application's Datastore. Many queries can be computed from the built-in indexes, but for queries that are more complex the Datastore requires a custom index. Without a custom index, the Datastore can't execute these queries efficiently.
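To see why an ordered index makes this fast, consider a toy model in which an index is a slice of (value, key) rows kept in sorted order. Everything here (indexRow, buildIndex, firstKeys) is hypothetical and greatly simplified; it shows only the idea of reading results straight off a pre-sorted table.

```go
package main

import (
	"fmt"
	"sort"
)

// indexRow is one hypothetical index entry: a property value and the key
// of the entity that holds it.
type indexRow struct {
	Value int64  // e.g. a timestamp
	Key   string // entity key
}

// buildIndex returns one row per entity, sorted by Value descending
// (ties broken by Key so the order is deterministic).
func buildIndex(values map[string]int64) []indexRow {
	rows := make([]indexRow, 0, len(values))
	for k, v := range values {
		rows = append(rows, indexRow{Value: v, Key: k})
	}
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].Value != rows[j].Value {
			return rows[i].Value > rows[j].Value
		}
		return rows[i].Key < rows[j].Key
	})
	return rows
}

// firstKeys reads the first n keys off the index. The work done depends
// on n, not on how many rows the index holds.
func firstKeys(index []indexRow, n int) []string {
	if n < 0 {
		n = 0
	}
	if n > len(index) {
		n = len(index)
	}
	keys := make([]string, 0, n)
	for _, r := range index[:n] {
		keys = append(keys, r.Key)
	}
	return keys
}

func main() {
	index := buildIndex(map[string]int64{"a": 1, "b": 3, "c": 2})
	fmt.Println(firstKeys(index, 2)) // [b c]
}
```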
For example, our guestbook application above filters by guestbook entries and orders by date, using an ancestor query and a sort order. This requires a custom index to be specified in your application's index.yaml file. You can edit this file manually or, as noted in the warning box earlier on this page, you can take care of it automatically by running the queries in your application locally. Once the index is defined in index.yaml, uploading your application will also upload your custom index information.
The definition for the query in your index.yaml file looks like this:
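As a sketch, and assuming the kind is named Greeting and the sorted property is Date as described on this page, an index entry for an ancestor query with a descending sort would look roughly like the following. Treat the file that the development server generates as authoritative.

```yaml
indexes:

# Assumed kind and property names, taken from the Greeting struct described above.
- kind: Greeting
  ancestor: yes
  properties:
  - name: Date
    direction: desc
```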
You can read all about Datastore indexes in the Datastore Indexes page. You can read about the proper specification for your index.yaml file in Go Datastore Index Configuration.
Next…
We now have a working guest book application that authenticates users using Google accounts, lets them submit messages, and displays messages other users have left. Because App Engine handles scaling automatically, we will not need to restructure our application as it gets popular.