August 2008
Introduction
The App Engine DB library provides you with most of what you need to store any kind of data you would like in the datastore through its range of property types. Occasionally, these types are not adequate for representing a new piece of information and it would be nice if you could write your own types. It turns out you can! This article will tell you what you need to know to get started making your own properties.
When exactly was my grandpa born?
In a previous article on modeling entity relationships I demonstrated how to build a contacts model for a simple address book feature of an application. This address book allowed the user to capture information about people they knew, including their name, date of birth, phone numbers and addresses. For a while, this model works pretty well, and is good at capturing the structure of contacts and their relationship to things. If I were using this application, however, I would run into problems when I was ready to enter information for my grandpa into my address book.
In the old days (and in some countries still today), people were less precise about things like dates and times. My grandpa was born in such a time and place. The only thing we know about his date of birth is that it was some time during November 1904. One easy thing we could do is just enter his birth date as 11/1/1904. However, if you're like me (and suffer from a mild form of OCD), you would prefer to store the date in a way that shows it is imprecise rather than placing the day arbitrarily on November 1st. Since the DateProperty type cannot represent imprecise dates, it would be nice to implement a date type that correctly models the missing information and can be put in the datastore.
Let's take a moment to revisit this class definition. Here is the basic Contact class containing only the contact's personal information.
from google.appengine.ext import db

class Contact(db.Model):
    # Basic info.
    name = db.StringProperty()
    birth_day = db.DateProperty()
We need to create a class that we can use to define the birth_day property in place of DateProperty.
A fuzzy date
A really simple way to represent a missing part of a date is to set one or more of its fields to zero to indicate that the information is missing. So, in the case of my grandfather's date of birth, we could represent it as 11/0/1904. Similarly, let's say his brother was born two years afterward, but we don't even know in what month. We could simply indicate the whole year as 0/0/1906. First, let's create a class to keep dates like this in memory. Mind you, this is just an ordinary Python object that we can use in our application.
class FuzzyDate(object):
    """A date whose day, month or year may be unknown (stored as 0)."""

    def __init__(self, year=0, month=0, day=0):
        self.year = year
        self.month = month
        self.day = day

    def has_day(self):
        return self.day > 0

    def has_month(self):
        return self.month > 0

    def has_year(self):
        return self.year > 0

    def __str__(self):
        if self.has_day():
            return '%02d/%02d/%04d' % (self.month, self.day, self.year)
        if self.has_month():
            return '%02d/%04d' % (self.month, self.year)
        if self.has_year():
            return '%04d' % self.year
        return 'Unknown'

    def __nonzero__(self):
        # Python consults this hook when evaluating "not date": a date
        # with no known fields is falsy.
        return self.has_year() or self.has_month() or self.has_day()

    __bool__ = __nonzero__  # Python 3 name for the same hook.
Putting fuzzy dates in the datastore
Representing fuzzy dates by replacing unknown fields with zero has an additional benefit: the date can be stored as a single packed integer of the form YYYYMMDD. For example, the number 19041100 represents November of 1904.
A date converted to a number this way has the added advantage of automatically being sorted correctly!
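The packing arithmetic can be sketched in plain Python, without any datastore involved. The helper names `pack_fuzzy_date` and `unpack_fuzzy_date` are hypothetical and exist only for this illustration; they are not part of App Engine.

```python
def pack_fuzzy_date(year=0, month=0, day=0):
    """Encode a date as YYYYMMDD, with 0 marking an unknown field."""
    return year * 10000 + month * 100 + day


def unpack_fuzzy_date(value):
    """Decode a YYYYMMDD integer back into a (year, month, day) tuple."""
    return (value // 10000, (value // 100) % 100, value % 100)


grandpa = pack_fuzzy_date(1904, 11)      # November 1904, day unknown
brother = pack_fuzzy_date(1906)          # some time in 1906
exact = pack_fuzzy_date(1904, 11, 15)    # a fully known date

# Integer order matches chronological order, and a date with an unknown
# field sorts just before the more precise dates that fall inside it.
ordered = sorted([brother, exact, grandpa])
```

Because the year occupies the most significant digits, followed by the month and then the day, comparing two packed integers compares the dates field by field in the right order.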
You could add a method to the FuzzyDate class that would tell it how to convert itself to and from this kind of encoded integer and then just use an IntegerProperty on your model. The problem with that, however, is that you would have to convert the FuzzyDate to an integer whenever you assigned it to a property on a Contact object and then back again whenever you wanted to use it!
It would be better if App Engine DB could be taught how to automatically convert a FuzzyDate to and from an integer when it saves and loads your model. To do this, you need to create a subclass of Property that will mediate between your FuzzyDate class and the datastore.
To create a new property type, simply extend Property from the App Engine DB library and override these four methods (the last two are optional):
get_value_for_datastore | Extract the value from a model instance and convert it to a type that can be stored in the datastore. |
make_value_from_datastore | Convert a value as found in the datastore to your new user type. |
validate (optional) | Called when an assignment is made to a property to make sure that it is compatible with your assigned attributes. |
empty (optional) | Used to indicate to the datastore whether a given user type value is 'empty' and should be stored as None. |
You also need to indicate to the datastore what your user type will be by specifying a data_type class attribute.
We can define a FuzzyDateProperty like so:
class FuzzyDateProperty(db.Property):
    # Tell what the user type is.
    data_type = FuzzyDate

    # For writing to datastore.
    def get_value_for_datastore(self, model_instance):
        date = super(FuzzyDateProperty, self).get_value_for_datastore(model_instance)
        if date is None:
            # Unset property: store nothing rather than fail on date.year.
            return None
        return (date.year * 10000) + (date.month * 100) + date.day

    # For reading from datastore.
    def make_value_from_datastore(self, value):
        if value is None:
            return None
        # Floor division keeps the fields integral on Python 2 and 3.
        return FuzzyDate(year=value // 10000,
                         month=(value // 100) % 100,
                         day=value % 100)

    def validate(self, value):
        if value is not None and not isinstance(value, FuzzyDate):
            raise db.BadValueError('Property %s must be convertible '
                                   'to a FuzzyDate instance (%s)' %
                                   (self.name, value))
        return super(FuzzyDateProperty, self).validate(value)

    def empty(self, value):
        return not value
Using FuzzyDateProperty in a model
Now that we have defined a FuzzyDate and a FuzzyDateProperty that can work with the datastore, we are ready to use FuzzyDateProperty in our model. Here is the Contact model using FuzzyDateProperty instead of the built-in DateProperty:
class Contact(db.Model):
    # Basic info.
    name = db.StringProperty()
    birth_day = FuzzyDateProperty()
Now when we create Contact objects, we assign them FuzzyDate birthdays rather than the normal precise ones. For example, I might create my grandpa like this:
grandpa = Contact(name='Milton', birth_day=FuzzyDate(1904, 11))
grandpa.put()
Searching for fuzzy dates using GQL
Right now, there is no way to extend GQL so that it is easy to look for custom types. In some cases, this might make it very difficult to write GQL to search for records using the new data-type. Luckily, searching fuzzy dates is not very difficult as long as you realize that you are searching over integers in the datastore.
So, for example, let's say I wanted to find everyone that was born before 1950. I could write a GQL query to do that like this:
Contact.gql('WHERE birth_day < 19500000')
Let's say I wanted to find everyone born in the same year as my grandpa. It's a little more complicated:
Contact.gql('WHERE birth_day < 19050000 AND birth_day >= 19040000')
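Working out these bounds by hand is error-prone, so one option is a small helper that computes them. The function `fuzzy_date_range` below is a hypothetical sketch, not part of App Engine; it assumes the YYYYMMDD packing described earlier and returns a half-open interval covering a whole year or a single month.

```python
def fuzzy_date_range(year, month=0):
    """Return (low, high) such that low <= birth_day < high covers the period.

    With month=0 the range spans the whole year, otherwise one month.
    """
    if month:
        low = year * 10000 + month * 100
        return low, low + 100
    low = year * 10000
    return low, low + 10000


# Bounds for "born in the same year as my grandpa".
low, high = fuzzy_date_range(1904)
where_clause = 'WHERE birth_day >= %d AND birth_day < %d' % (low, high)
```

The resulting string can be handed to Contact.gql in the same way as the queries above. Note that a month range such as `fuzzy_date_range(1904, 11)` matches contacts whose day is unknown, but not contacts whose month is unknown, since those are stored as 19040000.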
And that's when my grandpa was born, more or less
As demonstrated, it is possible to customize App Engine models to work more closely with your custom data types by extending the Property class. In this article we created a new custom type with a simple storage representation, made a new Property subclass to mediate between this type and the datastore, used it in a model and performed searches over it.