App Engine Modules (or just "Modules" hereafter) is a feature that lets developers factor large applications into logical components that can share stateful services and communicate in a secure fashion. An app that handles customer requests might include separate modules to handle other tasks, such as:
- API requests from mobile devices
- Internal, admin-like requests
- Backend processing such as billing pipelines and data analysis
Modules can be configured to use different runtimes and to operate with different performance settings.
- Application hierarchy
- Instance scaling and class
- Configuration
- Uploading modules
- Instance states
- Instance uptime
- Background threads
- Monitoring resource usage
- Logging
- Communication between modules
- Limits
Application hierarchy
At the highest level, an App Engine application is made up of one or more modules. Each module consists of source code and configuration files. The files used by a module represent a version of the module. When you deploy a module, you always deploy a specific version of the module. For this reason, whenever we speak of a module, it usually means a version of a module.
You can deploy multiple versions of the same module, to account for alternative implementations or progressive upgrades as time goes on.
Every module and version must have a name. A name can contain numbers, letters, and hyphens. It cannot be longer than 63 characters and cannot start or end with a hyphen.
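The naming rules above can be checked before deployment. The following is a minimal sketch, not part of any App Engine SDK; the function name `is_valid_name` is hypothetical, and it assumes "letters" means ASCII letters in either case.

```python
import re

# Letters, digits, and hyphens only; the first and last characters
# must not be hyphens. Accepting uppercase is an assumption.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
MAX_NAME_LENGTH = 63


def is_valid_name(name):
    """Return True if `name` satisfies the module/version naming rules."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_name("mobile-frontend")` passes, while `is_valid_name("-uno")` fails because the name starts with a hyphen.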
While running, a particular module/version will have one or more instances. Each instance runs its own separate executable. The number of instances running at any time depends on the module's scaling type and the amount of incoming requests.
Stateful services (such as Memcache, Datastore, and Task Queues) are shared by all the modules in an application. Every module, version, and instance has its own unique URI (for example, `v1.my-module.my-app.appspot.com`). Incoming user requests are routed to an instance of a particular module/version according to URL addressing conventions and an optional customized dispatch file.
Please note that in April of 2013, Google stopped issuing SSL certificates for double-wildcard domains hosted at `appspot.com` (i.e. `*.*.appspot.com`). If you rely on such URLs for HTTPS access to your application, please change any application logic to use "-dot-" instead of ".". For example, to access version "1" of application "myapp" use "https://1-dot-myapp.appspot.com" instead of "https://1.myapp.appspot.com". If you continue to use "https://1.myapp.appspot.com" the certificate will not match, which will result in an error for any User-Agent that expects the URL and certificate to match exactly.
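The "-dot-" rewrite described above can be applied mechanically to existing hostnames. This is a minimal sketch; the helper name `to_dot_host` is hypothetical, and applying the same substitution to hostnames with more than one leading label is an assumption that goes beyond the single example given in the text.

```python
APPSPOT_SUFFIX = ".appspot.com"


def to_dot_host(host):
    """Replace the dots in front of ".appspot.com" with "-dot-".

    Hosts that are not under appspot.com are returned unchanged.
    """
    if not host.endswith(APPSPOT_SUFFIX):
        return host
    prefix = host[: -len(APPSPOT_SUFFIX)]
    return prefix.replace(".", "-dot-") + APPSPOT_SUFFIX
```

Calling `to_dot_host("1.myapp.appspot.com")` yields `1-dot-myapp.appspot.com`, which matches the example in the paragraph above.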
Instance scaling and class
While an application is running, incoming requests are routed to an existing or new instance of the appropriate module/version. The scaling type of a module/version controls how instances are created. There are three scaling types: manual, basic, and automatic.
Manual Scaling
- A module with manual scaling runs continuously, allowing you to perform complex initialization and rely on the state of its memory over time.
Basic Scaling
- A module with basic scaling will create an instance when the application receives a request. The instance will be turned down when the app becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.
Automatic Scaling
- Automatic scaling is the scaling policy that App Engine has used since its inception. It is based on request rate, response latencies, and other application metrics. Previously users could use the Admin Console to configure the automatic scaling parameters (instance class, idle instances and pending latency) for an application's frontend versions only. These settings now apply to every version of every module that has automatic scaling.
Each scaling type offers a selection of instance classes, with different amounts of CPU and Memory. The following tables list the features of the three types of scaling, and the service levels and costs of the various instance classes:
Scaling Types
Feature | Automatic Scaling | Manual Scaling | Basic Scaling |
---|---|---|---|
Deadlines | 60-second deadline for HTTP requests, 10-minute deadline for tasks | Requests can run indefinitely. A manually-scaled instance can choose to handle `/_ah/start` and execute a program or script for many hours without returning an HTTP response code. | Same as manual scaling. |
CPU/Memory | Configurable by selecting an F1, F2, F4, or F4_1G instance class | Configurable by selecting a B1, B2, B4, B4_1G, or B8 instance class | Configurable by selecting a B1, B2, B4, B4_1G, or B8 instance class |
Residence | Instances are evicted from memory based on usage patterns. | Instances remain in memory, and state is preserved across requests. When instances are restarted, an `/_ah/stop` request appears in the logs. If there is a registered stop callback method, it has 30 seconds to complete before shutdown occurs. | Instances are evicted based on the `idle_timeout` parameter. If an instance has been idle, i.e. has not received a request, for more than `idle_timeout`, then the instance is evicted. |
Startup and Shutdown | Instances are created on demand to handle requests and automatically turned down when idle. | Instances are sent a start request automatically by App Engine in the form of an empty GET request to `/_ah/start`. An instance that is stopped with `appcfg stop` (or via the Admin Console UI) has 30 seconds to finish handling requests before it is forcibly terminated. | Instances are created on demand to handle requests and automatically turned down when idle, based on the `idle_timeout` configuration parameter. As with manual scaling, an instance that is stopped with `appcfg stop` (or via the Admin Console UI) has 30 seconds to finish handling requests before it is forcibly terminated. |
Instance Addressability | Instances are anonymous. | Instances are addressable at URLs with the form: `http://instance.version.module.app_id.appspot.com`. If you have set up a wildcard subdomain mapping for a custom domain, you can also address a module or any of its instances via a URL of the form `http://module.domain.com` or `http://instance.module.domain.com`. You can reliably cache state in each instance and retrieve it in subsequent requests. | Same as manual scaling. |
Scaling | App Engine scales the number of instances automatically in response to processing volume. This scaling factors in the `automatic_scaling` settings that are provided on a per-version basis in the configuration file uploaded with the module version. | You configure the number of instances of each module version in that module’s configuration file. The number of instances usually corresponds to the size of a dataset being held in memory or the desired throughput for offline work. You can adjust the number of instances of a manually-scaled version very quickly, without stopping instances that are currently running, using the Modules API `set_num_instances` function. | A basic scaling module version is configured with a maximum number of instances using the `basic_scaling` setting's `max_instances` parameter. The number of live instances scales with the processing volume. |
Free Daily Usage Quota | 28 instance-hours | 8 instance-hours | 8 instance-hours |
Instance classes
Instances are priced based on an hourly rate determined by the instance class.
Instance Class | Memory Limit | CPU Limit | Cost per Hour per Instance |
---|---|---|---|
B1 | 128 MB | 600 MHz | $0.05 |
B2 | 256 MB | 1.2 GHz | $0.10 |
B4 | 512 MB | 2.4 GHz | $0.20 |
B4_1G | 1024 MB | 2.4 GHz | $0.30 |
B8 | 1024 MB | 4.8 GHz | $0.40 |
F1 | 128 MB | 600 MHz | $0.05 |
F2 | 256 MB | 1.2 GHz | $0.10 |
F4 | 512 MB | 2.4 GHz | $0.20 |
F4_1G | 1024 MB | 2.4 GHz | $0.30 |
Manual and basic scaling instances are billed at hourly rates based on uptime. Billing begins when an instance starts and ends fifteen minutes after a manual instance shuts down or fifteen minutes after a basic instance has finished processing its last request. Runtime overhead is counted against the instance memory limit. This will be higher for Java than for other languages.
Important: When you are billed for instance hours, you will not see any instance classes in your billing line items. Instead, you will see the appropriate multiple of instance hours. For example, if you use an F4 instance for one hour, you do not see "F4" listed, but you will see billing for four instance hours at the F1 rate.
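The conversion to base-rate instance hours can be illustrated with a short calculation. This sketch assumes that the multiplier for each class is its hourly price divided by the B1/F1 price; the text confirms this only for F4 (four instance hours), so the other multipliers are inferred from the price table. The function name is hypothetical.

```python
# Hourly prices in cents, copied from the instance class table above.
PRICE_CENTS_PER_HOUR = {
    "B1": 5, "B2": 10, "B4": 20, "B4_1G": 30, "B8": 40,
    "F1": 5, "F2": 10, "F4": 20, "F4_1G": 30,
}
BASE_PRICE_CENTS = 5  # the B1/F1 rate


def billed_instance_hours(instance_class, hours):
    """Express `hours` of uptime as instance hours at the base rate.

    Assumption: the multiplier is the class price divided by the
    base price. Only the F4 case is stated in the text.
    """
    multiplier = PRICE_CENTS_PER_HOUR[instance_class] // BASE_PRICE_CENTS
    return multiplier * hours
```

With these figures, one hour on an F4 instance appears as `billed_instance_hours("F4", 1)`, which is four instance hours.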
Configuration
Each version of a module is defined in a `.yaml` file, which gives the name of the module and version. The yaml file usually takes the same name as the module it defines, but this is not required. If you are deploying several versions of a module, you can create multiple yaml files in the same directory, one for each version.
Typically, you create a directory for each module, which contains the module's yaml file(s) and associated source code. Optional application-level configuration files (dispatch.yaml, cron.yaml, index.yaml, and queue.yaml) are included in the top level app directory. The example below shows three modules. In module1 the source files are contained in a subdirectory, in module2 they are at the same level as the yaml file; module3 has yaml files for two versions:
For small, simple projects, all the app's files can live in one directory:
Every yaml file must include a version parameter. To define the default module, you can explicitly include the parameter "module: default" or leave the module parameter out of the file.
Each module's configuration file defines the scaling type and instance class for a specific module/version. Different scaling parameters are used depending on which type of scaling you specify. If you do not specify scaling, automatic scaling is the default.
For each module you can also specify settings that map URL requests to specific scripts and identify static files for better server efficiency. These settings are also included in the yaml file and are described in the App Config section. The following examples show how to configure modules for each scaling type.
Manual Scaling
application: simple-sample
module: my-module
version: uno
runtime: python27
instance_class: B8
manual_scaling:
  instances: 5
instance_class:
- The instance class size for this module. When using manual scaling, the B1, B2, B4, B4_1G, and B8 instance classes are available. If you do not specify a class, B2 is assigned by default.
manual_scaling:
- Required to enable manual scaling for a module.
instances:
- The number of instances to assign to the module at the start. This number can later be altered by using the Modules API `set_num_instances()` function.
Basic Scaling
application: simple-sample
module: my-module
version: uno
runtime: python27
instance_class: B8
basic_scaling:
  max_instances: 11
  idle_timeout: 10m
instance_class:
- The instance class size for this module. When using basic scaling, the B1, B2, B4, B4_1G, and B8 instance classes are available. If you do not specify a class, B2 is assigned by default.
basic_scaling:
- Required to enable basic scaling for a module.
max_instances:
- Required. The maximum number of instances for App Engine to create for this module version. This is useful to limit the costs of a module.
idle_timeout:
- Optional. The instance will be shut down this amount of time after receiving its last request.
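The `idle_timeout` rule amounts to a comparison between the time since the last request and the configured timeout. The sketch below only illustrates that rule and is not App Engine's scheduler; the function name and the use of seconds as the unit are assumptions.

```python
def is_idle_timed_out(last_request_time, now, idle_timeout_seconds):
    """Return True once an instance has been idle longer than the timeout.

    All three arguments are in seconds. An instance is treated as
    idle from the moment of its last request.
    """
    return (now - last_request_time) > idle_timeout_seconds
```

With `idle_timeout: 10m` (600 seconds), an instance whose last request arrived 601 seconds ago qualifies for shutdown, while one that served a request 600 seconds ago does not.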
Automatic Scaling
application: simple-sample
module: my-module
version: uno
runtime: python27
instance_class: F2
automatic_scaling:
  min_idle_instances: 5
  max_idle_instances: automatic  # default value
  min_pending_latency: automatic  # default value
  max_pending_latency: 30ms
  max_concurrent_requests: 50
instance_class:
- The instance class size for this module. When using automatic scaling, only the F1, F2, F4, and F4_1G instance classes are available. If you do not specify a class, F1 is assigned by default.
automatic_scaling:
- Optional. Automatic scaling is assumed by default.
min_idle_instances:
- The minimum number of idle instances that App Engine should maintain for this version. Only applies to the default version of a module, since other versions are not expected to receive significant traffic. Please keep in mind:
- A low minimum helps keep your running costs down during idle periods, but means that fewer instances may be immediately available to respond to a sudden load spike.
- A high minimum allows you to prime the application for rapid spikes in request load. App Engine keeps that number of instances in reserve at all times, so an instance is always available to serve an incoming request, but you pay for those instances. This functionality replaces the deprecated "Always On" feature, which ensured that a fixed number of instances were always available for your application. Once you've set the minimum number of idle instances, you can see these instances marked as "Resident" in the Instances tab of the Admin Console.
If you set a minimum number of idle instances, pending latency will have less effect on your application's performance. Because App Engine keeps idle instances in reserve, it is unlikely that requests will enter the pending queue except in exceptionally high load spikes. You will need to test your application and expected traffic volume to determine the ideal number of instances to keep in reserve.
max_idle_instances:
- The maximum number of idle instances that App Engine should maintain for this version. Please keep in mind:
- A high maximum reduces the number of idle instances more gradually when load levels return to normal after a spike. This helps your application maintain steady performance through fluctuations in request load, but also raises the number of idle instances (and consequent running costs) during such periods of heavy load.
- A low maximum keeps running costs lower, but can degrade performance in the face of volatile load levels.
Note: When settling back to normal levels after a load spike, the number of idle instances may temporarily exceed your specified maximum. However, you will not be charged for more instances than the maximum number you've specified.
min_pending_latency:
- The minimum amount of time that App Engine should allow a request to wait in the pending queue before starting a new instance to handle it.
- A low minimum means requests must spend less time in the pending queue when all existing instances are active. This improves performance but increases the cost of running your application.
- A high minimum means requests will remain pending longer if all existing instances are active. This lowers running costs but increases the time users must wait for their requests to be served.
max_pending_latency:
- The maximum amount of time that App Engine should allow a request to wait in the pending queue before starting a new instance to handle it.
- A low maximum means App Engine will start new instances sooner for pending requests, improving performance but raising running costs.
- A high maximum means users may wait longer for their requests to be served (if there are pending requests and no idle instances to serve them), but your application will cost less to run.
max_concurrent_requests:
- Optional. The number of concurrent requests an automatic scaling instance can accept before the scheduler spawns a new instance (Default: 8, Maximum: 80). Note that the scheduler may spawn a new instance before the actual maximum number of requests is reached.
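The instance class rules stated for the three scaling types (which classes are available, and which one is assigned when none is specified) can be summarized in a small lookup. This is an illustrative sketch with hypothetical names; it mirrors the rules in the text rather than any App Engine validation code.

```python
ALLOWED_CLASSES = {
    "automatic": ("F1", "F2", "F4", "F4_1G"),
    "manual": ("B1", "B2", "B4", "B4_1G", "B8"),
    "basic": ("B1", "B2", "B4", "B4_1G", "B8"),
}
DEFAULT_CLASS = {"automatic": "F1", "manual": "B2", "basic": "B2"}


def resolve_instance_class(scaling="automatic", instance_class=None):
    """Return the effective instance class for a scaling type.

    Raises ValueError for an unknown scaling type or for a class
    that is not available with that scaling type.
    """
    if scaling not in ALLOWED_CLASSES:
        raise ValueError("unknown scaling type: %r" % (scaling,))
    if instance_class is None:
        return DEFAULT_CLASS[scaling]
    if instance_class not in ALLOWED_CLASSES[scaling]:
        raise ValueError(
            "%s is not available with %s scaling" % (instance_class, scaling)
        )
    return instance_class
```

A configuration with no scaling section and no `instance_class` therefore resolves to automatic scaling on F1, and requesting B8 with automatic scaling is rejected.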
The default module
Every application must have a single default module. To define the default module, include the setting `module: default` in the module's yaml file, or leave the setting out.
An example
Here is an example of how you would configure yaml files for an application that has three modules: a default module that handles web requests, plus two more modules that handle mobile requests and backend processing.
Start by defining a configuration file named `app.yaml` that will handle all web-related requests:
application: simple-sample
version: uno
runtime: python27
api_version: 1
threadsafe: true
This configuration would create a default module with automatic scaling and a public address of `http://simple-sample.appspot.com`.
Next, assume that you want to create a module to handle mobile web requests. For the sake of the mobile users (in this example) the max pending latency will be just one second and we’ll always have at least two instances idle. To configure this you would create a `mobile-frontend.yaml` configuration file with the following contents:
application: simple-sample
module: mobile-frontend
version: uno
runtime: python27
api_version: 1
threadsafe: true
automatic_scaling:
  min_idle_instances: 2
  max_pending_latency: 1s
The module this file creates would then be reachable at `http://mobile-frontend.simple-sample.appspot.com`.
Finally, add a module called `my-module` for handling static backend work. This could be a continuous job that exports data from Datastore to BigQuery. The amount of work is relatively fixed, so you need only one resident instance at any given time. These jobs also need to handle a large amount of in-memory processing, so you’ll want an instance class with more memory. To configure this you would create a `my-module.yaml` configuration file with the following contents:
application: simple-sample
module: my-module
version: uno
runtime: python27
api_version: 1
threadsafe: true
instance_class: B8
manual_scaling:
  instances: 1
The module this file creates would then be reachable at `http://my-module.simple-sample.appspot.com`.
Notice the `manual_scaling:` setting. The `instances:` parameter tells App Engine how many instances to create for this module.
Uploading modules
To deploy the example above, use the `appcfg update` command. If you are uploading the app for the first time, the default module must be uploaded first. If you list multiple modules in one command, the default module must be the first in the file list:
cd simple-sample
appcfg update app.yaml mobile-frontend.yaml my-module.yaml
You will receive verification via the command line as each module is successfully deployed.
Once the application has been successfully deployed you can access it at `http://simple-sample.appspot.com`. You can also access each of the modules individually:
- `http://default.simple-sample.appspot.com`
- `http://mobile-frontend.simple-sample.appspot.com`
- `http://my-module.simple-sample.appspot.com`
To address a specific version, prepend the version name to the URI. For example, to target version uno of the mobile-frontend module use `http://uno.mobile-frontend.simple-sample.appspot.com`.
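The addressing scheme used in this section (optional instance, version, and module labels in front of the application ID) can be expressed as a small helper. This is a sketch with a hypothetical function name; it simply concatenates whichever labels are supplied and does not check that the combination is one App Engine will route.

```python
def module_url(app_id, module=None, version=None, instance=None):
    """Build an http URL of the form instance.version.module.app_id.appspot.com.

    Labels that are None are omitted, so module_url("simple-sample")
    addresses the application itself.
    """
    labels = [str(label) for label in (instance, version, module)
              if label is not None]
    labels.append(app_id)
    return "http://" + ".".join(labels) + ".appspot.com"
```

For the example application, `module_url("simple-sample", module="mobile-frontend", version="uno")` produces the version-specific address shown above. For HTTPS access, the "-dot-" form described under Application hierarchy applies instead of dots.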
Instance states
A manual or basic scaled instance can be in one of two states: Running or Stopped. All instances of a particular module/version share the same state. You can change the state of all the instances belonging to a module/version using the `appcfg` command or the Modules API.
Startup
Each module instance is created in response to a start request, which is an empty GET request to `/_ah/start`. App Engine sends this request to bring an instance into existence; users cannot send a request to `/_ah/start`. Manual and basic scaling instances must respond to the start request before they can handle another request. The start request can be used for two purposes:
- To start a program that runs indefinitely, without accepting further requests
- To initialize an instance before it receives additional traffic
Manual scaling instances and basic scaling instances start up differently. When you start a manual scaling instance, App Engine immediately sends a `/_ah/start` request to each instance. When you start an instance of a basic scaling module, App Engine allows it to accept traffic, but the `/_ah/start` request is not sent to an instance until it receives its first user request. Multiple basic scaling instances are only started as necessary, in order to handle increased traffic.
When an instance responds to the `/_ah/start` request with an HTTP status code of `200–299` or `404`, it is considered to have successfully started and can handle additional requests. Otherwise, App Engine terminates the instance. Manual scaling instances are restarted immediately, while basic scaling instances are restarted only when needed for serving traffic.
Shutdown
The shutdown process may be triggered by a variety of planned and unplanned events, such as:
- You manually stop an instance using the `appcfg stop` command or the Modules API `stop_version` function call.
- You manually stop an instance from the Admin Console Versions page.
- You update the module version using `appcfg update`.
- The instance exceeds the maximum memory for its configured `instance_class`.
- Your application runs out of Instance Hours quota.
- The machine running the instance is restarted, forcing your instance to move to a different machine.
- App Engine needs to move your instance to a different machine to improve load distribution.
An app can detect that an instance is about to shut down in two ways. First, the `is_shutting_down()` method from `google.appengine.api.runtime` begins returning `true`. Second, if you have registered a shutdown hook, it will be called. It's a good idea to register a shutdown hook in your start request. After the notification is issued, existing requests are given 30 seconds to complete, and new requests immediately return `404`.
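A long-running job can poll the shutdown signal between units of work and save its progress before the 30-second window closes. The sketch below is one possible pattern, not a prescribed one. The shutdown check is passed in as a callable so the example runs without the App Engine SDK; on App Engine you would presumably pass `is_shutting_down` from `google.appengine.api.runtime`. The names `drain`, `handle`, and `checkpoint` are hypothetical.

```python
def drain(work_items, handle, is_shutting_down, checkpoint):
    """Process items until none remain or shutdown is signalled.

    `handle` processes one item, `is_shutting_down` is polled before
    each item, and `checkpoint` receives the unprocessed items so they
    can be resumed later. Returns the unprocessed items.
    """
    remaining = list(work_items)
    while remaining and not is_shutting_down():
        handle(remaining.pop(0))
    checkpoint(remaining)
    return remaining
```

Each unit of work should be short relative to the 30-second limit, since the signal is only checked between items.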