Some Administration Console settings let you tune your application's performance. In some cases, you want to optimize to minimize cost. In other cases, you want to serve heavy request volume quickly. These controls allow you to:
- Set the Frontend Instance Class. Your application can use faster, "bigger," more expensive servers.
- Configure the Scheduler. The App Engine scheduler controls how your application responds to increased load. As more requests come in, the scheduler might start up more servers or queue incoming requests.
Note: The Admin Console controls described on this page will be removed in the near future. You should use the Modules feature and its configuration files, rather than frontends.
Setting the Frontend Instance Class
App Engine provides several classes of frontend instances, each with different memory and CPU limits. These classes allow you to configure your frontend instance with the processing capacity you need to perform your work. Each class has a specific hourly billing rate. Please see Billable Quota Unit Costs for pricing.
Important: Currently, when you are billed for instance hours, you will not see any instance classes in your billing line items. Instead, you will see the appropriate multiple of instance hours. For example, if you use an F4 instance for one hour, you do not see "F4" listed, but you will see billing for four instance hours at the F1 rate.
The default class for frontends is F1, which gives you 128MB of memory and 600MHz of CPU capacity. You can change the class of the frontend using the performance settings in the Admin Console.
To set the Frontend Instance Class, choose one of the values in the dropdown menu. Each value represents a memory size and processing power; larger values provide extra performance at an increased cost. The value you select applies to every instance of every version of your app.
You can change the current frontend instance class for your app at any time. Python and Go apps automatically get the new instance class that you choose. A Java app must be restarted to get the new instance class.
Frontend instances are priced based on an hourly rate determined by the frontend class. The following table describes the cost for each class:
Frontend class | Memory limit | CPU limit | Cost per hour per instance |
---|---|---|---|
F1 (default) | 128MB | 600MHz | $0.05 |
F2 | 256MB | 1.2GHz | $0.10 |
F4 | 512MB | 2.4GHz | $0.20 |
F4_1G | 1024MB | 2.4GHz | $0.30 |
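As a rough illustration of the billing rule and the rates listed above, the following Python sketch converts usage into F1-equivalent instance hours and into cost. The function names are hypothetical and the rates are copied from the table. Treating every class as a whole multiple of the F1 rate is an assumption extrapolated from the F4 example; the text confirms only the F4 case.

```python
# Hourly rates in US cents per instance, copied from the pricing table.
RATE_CENTS = {"F1": 5, "F2": 10, "F4": 20, "F4_1G": 30}


def f1_equivalent_hours(instance_class, hours):
    """Return the instance hours a bill might show, expressed at the F1 rate.

    Assumption: each class is billed as (its rate / F1 rate) instance hours.
    The documentation states this explicitly only for F4 (four hours).
    """
    multiple = RATE_CENTS[instance_class] // RATE_CENTS["F1"]
    return multiple * hours


def cost_cents(instance_class, hours, instances=1):
    """Return the cost in US cents for running `instances` for `hours`."""
    return RATE_CENTS[instance_class] * hours * instances


print(f1_equivalent_hours("F4", 1))      # 4
print(cost_cents("F2", 3, instances=2))  # 60
```

Working in integer cents avoids floating-point rounding when multiplying small dollar amounts.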
Configuring the Scheduler
App Engine's scheduler is responsible for routing incoming requests to be served by your app's instances. Sometimes the volume of incoming requests exceeds the capacity of the instances currently available to your app. When this happens, incoming requests may have to wait in the Pending Queue until busy instances become available, or until the scheduler starts new instances.
The scheduler is responsible for deciding how to serve your app's request load. Under regular conditions, it may spin up new idle instances to absorb traffic and minimize latency in the event of a sudden load spike. Because new instances take time to create, unusually heavy surges of traffic may consume all available idle instances faster than the scheduler can create new ones. This can cause your users to experience delays (latency) in the serving of requests.
The default settings enable App Engine's scheduling algorithm to scale the number of instances based on your recent request load and latency profile. If you use manual settings instead, you may need to adjust them continually as your request volume changes.
Setting the Number of Idle Instances
The Idle Instances sliders control the minimum and maximum number of idle instances available to your application at any given time.
Note: In order to specify the minimum number of idle instances, you must have a paid app.
The upper slider sets the minimum number of idle instances:
- A low minimum helps keep your running costs down during idle periods, but means that fewer instances may be immediately available to respond to a sudden load spike.
- A high minimum allows you to prime the application for rapid spikes in request load. App Engine keeps that number of instances in reserve at all times, so an instance is always available to serve an incoming request, but you pay for those instances. This functionality replaces the deprecated "Always On" feature, which ensured that a fixed number of instances were always available for your application. Once you've set the minimum number of idle instances, you can see these instances marked as "Resident" in the Instances tab of the Admin Console.
Note: If you set a minimum number of idle instances, the pending latency slider will have less effect on your application's performance. Because App Engine keeps idle instances in reserve, it is unlikely that requests will enter the pending queue except in exceptionally high load spikes. You will need to test your application and expected traffic volume to determine the ideal number of instances to keep in reserve.
The lower slider controls the maximum number of idle instances (up to 100):
- A high maximum reduces the number of idle instances more gradually when load levels return to normal after a spike. This helps your application maintain steady performance through fluctuations in request load, but also raises the number of idle instances (and consequent running costs) during such periods of heavy load.
- A low maximum keeps running costs lower, but can degrade performance in the face of volatile load levels.
Note: When settling back to normal levels after a load spike, the number of idle instances may temporarily exceed your specified maximum. However, you will not be charged for more instances than the maximum number you've specified.
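The billing cap in the note above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not App Engine's billing code; the function name and parameters are hypothetical.

```python
def billable_idle_instances(running_idle, max_idle):
    """Return how many idle instances would be charged.

    Models only the rule stated in the note: idle instances beyond the
    configured maximum may exist briefly after a spike but are not billed.
    """
    return min(running_idle, max_idle)


# After a spike, 7 idle instances are still running but the maximum is 5.
print(billable_idle_instances(7, 5))  # 5
```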
Setting the Pending Latency
Note: In order to specify the maximum pending latency, you must have a paid app.
Pending request latency arises when all of your application's available instances are too busy to serve new requests. When this happens, incoming requests go to a pending request queue. The scheduler automatically manages creation of new instances for pending requests, but you can adjust its behavior through minimum and maximum latency settings. These settings effectively control how long a request waits in the pending queue when there are no available instances: no less than the minimum, and no more than the maximum.
The App Engine scheduler always waits at least the specified minimum latency for an existing instance to become free. Once the minimum is reached, it applies heuristics to determine if and when to start a new instance. (Waiting for an existing instance to become free may be faster than starting a new one.) If the request is still pending when the specified maximum latency is reached, App Engine immediately starts a new instance to serve it.
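The three phases described above can be summarized with a small Python sketch. It is a simplified model of the documented behavior, not the scheduler's actual implementation; the names `Action` and `pending_action` are hypothetical, and the heuristics applied between the minimum and maximum are not specified in this page, so they are left as an opaque state.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait for a free instance"
    HEURISTICS = "apply heuristics"
    START_INSTANCE = "start a new instance"


def pending_action(waited_ms, min_latency_ms, max_latency_ms):
    """Return what the scheduler does for a request pending `waited_ms`."""
    if waited_ms < min_latency_ms:
        # Below the minimum: always wait for an existing instance.
        return Action.WAIT
    if waited_ms < max_latency_ms:
        # Between minimum and maximum: the scheduler decides if and when
        # to start an instance (details are not documented here).
        return Action.HEURISTICS
    # Maximum reached: a new instance is started immediately.
    return Action.START_INSTANCE


for waited in (5, 500, 15000):
    print(waited, pending_action(waited, 10, 15000).value)
```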
Note: If you set a minimum number of idle instances, the Pending Latency controls will have little or no effect on your app's performance. See Minimum Idle Instances for more information.
The first Pending Latency slider sets the minimum period of time (at least 10 milliseconds) that the scheduler will wait for a free instance to serve the request.
- A low minimum means requests must spend less time in the pending queue when all existing instances are active. This improves performance but increases the cost of running your application.
- A high minimum means requests will remain pending longer if all existing instances are active. This lowers running costs but increases the time users must wait for their requests to be served.
The second slider sets the maximum period of time (at most 15 seconds) that the scheduler will wait before resolving to create a new instance for the request.
- A low maximum means App Engine will start new instances sooner for pending requests, improving performance but raising running costs.
- A high maximum means users may wait longer for their requests to be served (if there are pending requests and no idle instances to serve them), but your application will cost less to run.
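The slider limits mentioned on this page (at most 100 idle instances, a minimum pending latency of at least 10 milliseconds, and a maximum pending latency of at most 15 seconds) can be checked before you apply a set of values. The following Python sketch is a hypothetical helper, not part of any App Engine API. The two ordering checks (minimum not above maximum) are assumptions, since the page does not state them explicitly.

```python
MAX_IDLE_LIMIT = 100            # upper bound of the maximum idle slider
MIN_PENDING_FLOOR_MS = 10       # lowest allowed minimum pending latency
MAX_PENDING_CEILING_MS = 15000  # highest allowed maximum pending latency


def validate_settings(min_idle, max_idle, min_pending_ms, max_pending_ms):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if max_idle > MAX_IDLE_LIMIT:
        problems.append("maximum idle instances exceeds 100")
    if min_pending_ms < MIN_PENDING_FLOOR_MS:
        problems.append("minimum pending latency is below 10 ms")
    if max_pending_ms > MAX_PENDING_CEILING_MS:
        problems.append("maximum pending latency exceeds 15 s")
    # Assumption: each minimum should not exceed its maximum.
    if min_idle > max_idle:
        problems.append("minimum idle instances exceeds the maximum")
    if min_pending_ms > max_pending_ms:
        problems.append("minimum pending latency exceeds the maximum")
    return problems


print(validate_settings(1, 5, 10, 15000))  # []
```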