Jeff Scudder
December 2009
This is one of a series of in-depth articles discussing App Engine's datastore. To see the other articles in the series, see Related links.
The datastore API provides a simple interface for interacting with a complex distributed data storage system based on Bigtable. Inserting a new entity in the datastore can be as simple as executing
MyEntity().put()
for Python, or
pm.makePersistent(new MyEntity())
for Java. But what happens behind the scenes when you make that call?
This article looks in more detail at what data is written to the datastore during write operations such as inserts, deletions, updates, and transactions. The focus is on the backend work that is common to all of the runtimes. Our hope is to give you a better understanding of the costs and tradeoffs, and to better equip you to get maximum performance from your code.
An Insert
Let's begin with an example data model item, a ToDo item.
Python
from google.appengine.ext import db

class ToDo(db.Model):
    owner = db.UserProperty(auto_current_user_add=True)
    created = db.DateTimeProperty(auto_now_add=True)
    priority = db.IntegerProperty()
    description = db.TextProperty()
    due = db.DateProperty()
Java
// Java JDO
@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class ToDo {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Persistent
    private User owner;

    @Persistent
    private Date created;

    @Persistent
    private Long priority;

    @Persistent
    private Text description;

    @Persistent
    private Date due;

    public ToDo(Long priority, Text description, Date due) {
        // ...
    }

    // ...
}
We create a new ToDo item and put it in the datastore.
Python
import datetime

my_todo = ToDo(priority=5,
               description='Fix that leaky faucet',
               due=datetime.date(2009, 8, 7))
my_todo.put()
Java
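// Java JDO
PersistenceManager pm = PMF.get().getPersistenceManager();
// ...
// Calendar months are zero-based, so use the AUGUST constant for August 7, 2009.
Date due = new GregorianCalendar(2009, Calendar.AUGUST, 7).getTime();
ToDo myTodo = new ToDo(Long.valueOf(5L), new Text("Fix that leaky faucet"), due);
pm.makePersistent(myTodo);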
When we call put or makePersistent, several things happen behind the scenes before the call returns and sets the entity's key:
- The my_todo object is converted into a protocol buffer.
- The appserver makes an RPC call to the datastore server, sending the entity data in a protocol buffer.
- If a key name is not provided, a unique ID is determined for this entity's key. The entity key is composed of app ID | ancestor keys | kind name | key name or ID.
- The datastore server processes the request in two phases that are executed in order: commit, then apply. In each phase, the datastore server identifies the Bigtable tablet servers that should receive the data.
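The key composition described above can be pictured with a minimal sketch. The `build_key` function and the local counter are hypothetical stand-ins for illustration only; the datastore server assigns real IDs and uses its own encoding for keys.

```python
import itertools

# Stand-in for the datastore's ID allocator (hypothetical; real IDs are
# assigned by the datastore server, not by a local counter).
_id_allocator = itertools.count(1)


def build_key(app_id, kind, key_name=None, ancestors=()):
    """Return key components in order: app ID, ancestor keys, kind, name or ID."""
    # A numeric ID is allocated only when the caller supplies no key name.
    id_or_name = key_name if key_name is not None else next(_id_allocator)
    return (app_id, tuple(ancestors), kind, id_or_name)


# A key with an explicit key name and no ancestors.
named_key = build_key("my-app", "ToDo", key_name="leaky-faucet")

# A key with one ancestor and an automatically allocated numeric ID.
auto_key = build_key("my-app", "ToDo", ancestors=[("User", "alice")])
```

Because the ancestor path comes before the kind and ID, entities that share an ancestor also share a key prefix.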
The Commit Phase
The datastore performs two actions in the commit phase:
- It writes the data for the entities to the entity group's log.
- It marks the new log entry as committed.
The Apply Phase
In the apply phase, the entity's data and the index rows are written to disk in parallel. Depending on the properties that are set in the entity, a large number of index entries might need to be added. The datastore checks the entity against the composite indexes that are defined for your application in a config file, as well as the indexes that are automatically present for all apps, which do not need to appear in a config file. Before stepping through the apply phase, we will look at an example of an app-specific composite index.
Sample Composite Index
When we want to list ToDo items ordered by soonest due date, with higher-priority items first among those due on the same date, we use a query like this:
select * from ToDo order by due asc, priority desc
This requires a composite index in our index config file:
Python
index.yaml
indexes:
- kind: ToDo
  ancestor: no
  properties:
  - name: due
  - name: priority
    direction: desc
Java
datastore-indexes.xml
<?xml version="1.0" encoding="utf-8"?>
<datastore-indexes autoGenerate="true">
  <datastore-index kind="ToDo" ancestor="false">
    <property name="due" direction="asc" />
    <property name="priority" direction="desc" />
  </datastore-index>
</datastore-indexes>
The config file shown above defines a composite index, which is updated whenever a ToDo entity is inserted.
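To make the ordering concrete, here is a small in-memory sketch of the sort order that this composite index maintains: due date ascending, then priority descending. The sample items are made up for illustration, and sorting a Python list only mimics the order of the index rows, not how the datastore stores them.

```python
import datetime

# (description, due, priority) tuples; the values are made up for illustration.
todos = [
    ("Fix that leaky faucet", datetime.date(2009, 8, 7), 5),
    ("Pay rent", datetime.date(2009, 8, 7), 9),
    ("Water plants", datetime.date(2009, 8, 1), 1),
]


def composite_order(todo):
    """Sort key matching 'order by due asc, priority desc'."""
    _, due, priority = todo
    # Negating the priority turns an ascending sort into a descending one.
    return (due, -priority)


ordered = sorted(todos, key=composite_order)
for description, due, priority in ordered:
    print(due, priority, description)
```

Items due on the same date appear with the higher priority first, which is the order the query asks for.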
Apply Steps
When the datastore server applies a request, it goes through the following steps:
- Write the new entity itself. The key for the entity serves as the row name, and the entity's data is placed in a single column as a byte-encoded protocol buffer.
- Write a new index row for each of the properties that is indexed. Since each index can live in a separate location in Bigtable, these writes can be fanned out in parallel to multiple tablet servers. In our example above, we'd see index rows that look like the following. (The key refers to the row name of the row containing the entity's data.)
  - ToDo created ▲: (now) = key
  - ToDo created ▼: (now) = key
  - ToDo owner ▲: (current user) = key
  - ToDo owner ▼: (current user) = key
  - ToDo priority ▲: 5 = key
  - ToDo priority ▼: 5 = key
  - ToDo due ▲: 2009/8/7 = key
  - ToDo due ▼: 2009/8/7 = key
  - ToDo due ▲ priority ▼: 2009/8/7, 5 = key (This is the composite index we defined above.)
- Add an entry to the index of all entities of a kind: ToDo = key
Note that description is not included in the list above, because Text properties are not indexed. You can also explicitly mark properties as unindexed. Also, if an entity does not contain a value for an indexed property, an index row is not created for that property. Index rows are added only if the entity has the required properties.
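The apply steps above can be sketched as a function that lists the index rows one entity produces. This is a simplified model, not the datastore's implementation: `index_rows` is a hypothetical helper, rows are plain tuples, and a property without a value is modeled as a key that is absent from the dictionary.

```python
def index_rows(kind, key, properties, unindexed=(), composite_indexes=()):
    """Return (index name, indexed values, entity key) tuples for one entity.

    properties: dict of property name -> value; unset properties are absent.
    composite_indexes: list of indexes, each a list of (name, direction).
    """
    rows = []
    # One ascending and one descending row per indexed property that is set.
    for name, value in properties.items():
        if name in unindexed:
            continue
        for direction in ("asc", "desc"):
            rows.append((f"{kind} {name} {direction}", (value,), key))
    # One row per composite index, only if every required property is present.
    for index in composite_indexes:
        names = [name for name, _ in index]
        if any(n not in properties or n in unindexed for n in names):
            continue
        label = " ".join(f"{n} {d}" for n, d in index)
        rows.append((f"{kind} {label}", tuple(properties[n] for n in names), key))
    # One row in the index of all entities of this kind.
    rows.append((kind, (), key))
    return rows


todo = {
    "owner": "alice@example.com",      # made-up value
    "created": "2009-08-01T12:00:00",  # made-up value
    "priority": 5,
    "description": "Fix that leaky faucet",
    "due": "2009-08-07",
}
rows = index_rows(
    "ToDo", "key", todo,
    unindexed={"description"},  # Text properties are not indexed
    composite_indexes=[[("due", "asc"), ("priority", "desc")]],
)
print(len(rows))
```

Counting the rows this way gives a rough feel for how many separate writes a single put can fan out to.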
The Return Value
We've seen the steps executed by the datastore behind the scenes, but how does this relate to the code that you've written for the app? More specifically, at what point in the list of items does the datastore server respond telling your app that the put succeeded or failed?
If you are using the (now standard) High Replication Datastore (HRD), the Datastore returns after the commit and does the apply asynchronously. In the (deprecated) Master/Slave Datastore, the Datastore returns only after applying all changes, so if you receive a success response, all data and index rows should already have been written to disk. If there is a failure in either the commit phase or the apply phase, the backend automatically retries several times; if failures continue, the datastore returns an error message that is converted to an exception.
When a write operation fails, the entity data and index data are either partially written (some indexes and entities not updated) or not updated at all. If the entity data fails to update in the commit phase, no index changes are made. For more details, see the article on transaction isolation.
If the commit phase has succeeded but the apply phase failed, the datastore will roll forward to apply the changes to indexes under two circumstances:
- The next time you execute a read or write or start a transaction on this entity group, the datastore will first roll forward and fully apply this committed but unapplied write, based on the data in the log.
- The datastore continuously sweeps for partially applied jobs and rolls forward writes to indexes and entities that have not yet received the changes to the entity.
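The first roll-forward case can be illustrated with a toy model of an entity group. The `EntityGroup` class below is hypothetical and leaves out indexes, failures, and the background sweeper; it only shows how a committed but unapplied log entry becomes visible when the next read triggers a roll forward.

```python
class EntityGroup:
    """Toy model of an entity group with a commit log (illustration only)."""

    def __init__(self):
        self.log = []       # committed writes, in commit order
        self.applied = 0    # number of log entries already applied
        self.entities = {}  # stand-in for the entity rows in Bigtable

    def commit(self, key, data):
        """Commit phase: record the write in the log without applying it."""
        self.log.append((key, data))

    def roll_forward(self):
        """Apply every committed log entry that has not been applied yet."""
        while self.applied < len(self.log):
            key, data = self.log[self.applied]
            self.entities[key] = data
            self.applied += 1

    def read(self, key):
        """Reads roll forward first, so they see all committed writes."""
        self.roll_forward()
        return self.entities.get(key)


group = EntityGroup()
group.commit("todo-1", {"priority": 5})
pending = len(group.log) - group.applied  # committed but not yet applied
value = group.read("todo-1")              # triggers the roll forward
```

In this model the write is durable once it is in the log, and applying it is work that can be finished later.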
There is an expected failure rate on writes, because Bigtable tablets are sometimes unavailable, for example when they are being moved or split. The presence of more indexes increases the probability of hitting an unavailable tablet, because an exception is raised if a write fails for any of the indexes.
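A rough way to see the effect of extra indexes is to assume, purely for illustration, that each row write independently hits an unavailable tablet with the same small probability. Neither the independence assumption nor the sample probability comes from measurements of the datastore.

```python
def write_failure_probability(p_unavailable, num_rows):
    """Probability that at least one of num_rows independent writes fails."""
    # The whole write succeeds only if every individual row write succeeds.
    return 1 - (1 - p_unavailable) ** num_rows


# Hypothetical per-row probability of hitting an unavailable tablet.
p = 0.001
for num_rows in (1, 10, 50):
    print(num_rows, write_failure_probability(p, num_rows))
```

Under these assumptions the failure probability grows with the number of rows written, which is why unneeded indexes are worth removing.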
In those situations, your application will need to decide how to handle the exception. One option is to add a task to the task queue to retry the write at a later point in time. Another idea would be to respond with an error from the app and have the client retry. This tends to work with things like AJAX requests where there is client side logic that can handle an error message from the server.
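The first option might look like the following sketch. It uses an in-memory deque as a stand-in for the task queue and a hypothetical `WriteFailed` exception in place of the runtime's real datastore exceptions, so it shows the control flow only, not the App Engine API.

```python
from collections import deque


class WriteFailed(Exception):
    """Hypothetical stand-in for a datastore write exception."""


# Stand-in for a task queue; a real app would enqueue a task instead.
retry_queue = deque()


def put_or_defer(put, entity):
    """Try the write now; on failure, queue the entity for a later retry."""
    try:
        put(entity)
        return True
    except WriteFailed:
        retry_queue.append(entity)
        return False


def drain_retries(put):
    """Retry each deferred entity once; failures are queued again."""
    for _ in range(len(retry_queue)):
        entity = retry_queue.popleft()
        put_or_defer(put, entity)
```

Since a failed write may still have been committed, the retried operation should be safe to run more than once.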
About transactions
A datastore transaction (for Python or Java) is a set of operations. Each transaction is guaranteed to be atomic, which means that transactions are never partially applied.
Note: If your app receives an exception when submitting a transaction, it does not always mean that the transaction failed.
You can receive the following exceptions in cases where transactions have been committed and eventually will be applied successfully:
- In Python, Timeout, TransactionFailedError, or InternalError.
- In Java, DatastoreTimeoutException, ConcurrentModificationException, or DatastoreFailureException.
Whenever possible, make your datastore transactions idempotent so that if you repeat a transaction, the end result will be the same.
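As a minimal sketch of the difference, the example below uses plain dictionaries and sets as stand-ins for the datastore. The function names and the request ID scheme are assumptions made for illustration; in a real app the record of applied requests would be stored in the datastore and updated in the same transaction as the data.

```python
store = {}                # stand-in for stored entity data
applied_requests = set()  # stand-in for a stored record of applied requests


def add_points(user, points):
    """Not idempotent: repeating the call adds the points again."""
    store[user] = store.get(user, 0) + points


def add_points_once(request_id, user, points):
    """Idempotent: a repeated request ID leaves the result unchanged."""
    if request_id in applied_requests:
        return
    store[user] = store.get(user, 0) + points
    applied_requests.add(request_id)


# Retrying after an ambiguous exception is safe with the idempotent version.
add_points_once("req-1", "alice", 10)
add_points_once("req-1", "alice", 10)  # retry of the same request
print(store["alice"])
```

With the idempotent version, an app that receives one of the exceptions listed above can repeat the transaction without risking a double update.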
Conclusions
The App Engine datastore is a distributed system that abstracts many of the complexities of reliably storing and retrieving data with fast read times across a multitude of physical machines. The simplicity of the API intentionally hides the complexity behind the scenes, but understanding the algorithms being carried out can give you insights into how to optimize your application. Our hope is that this article has illustrated the backend operations that occur during a datastore write and shown where the costs and potential points of failure lie.