


Compute Engine Autoscaler

Limited Preview

This is a Limited Preview release of Compute Engine Autoscaler. As a result, it might be changed in backward-incompatible ways and is not recommended for production use. It is not subject to any SLA or deprecation policy. Request to be whitelisted to use this feature.


Compute Engine Autoscaler lets you create autoscalers that watch your Google Compute Engine virtual machine instances and automatically add or remove resources to account for changes in traffic or other signals, without user interaction. An autoscaler watches the utilization levels of a replica pool and scales the number of virtual machines in the pool based on your desired parameters. You can choose to scale a replica pool based on the average CPU utilization of a group of virtual machines, the serving capacity of a group of load balanced virtual machines, or a number of custom Cloud Monitoring metrics.

Autoscaling is useful in two common scenarios. When your service experiences heavy traffic that strains existing virtual machine instances and degrades performance for your users, you want to add resources; in periods of low traffic, you want to remove virtual machines to save resources and costs. The autoscaler performs both functions for you without additional action on your part.

The Compute Engine Autoscaler service can be used as a standalone API or configured through Deployment Manager .

Overview

The Compute Engine Autoscaler service works by watching a group of virtual machines inside a replica pool and scaling up and down the number of virtual machines in the pool based on your selected parameters. You can scale the replica pool based on the serving capacity or CPU utilization of the virtual machines, or on a Cloud Monitoring metric.

The Compute Engine Autoscaler service depends on the Replica Pool API; you cannot use Compute Engine Autoscaler without Replica Pool API access.

Scaling based on CPU utilization

The autoscaler can scale a replica pool based on the average CPU utilization of the virtual machines in the pool. This option tells the autoscaler to watch the average CPU utilization of virtual machines inside a replica pool and scale up or down based on the CPU utilization.

For example, if you set up the Compute Engine Autoscaler service to maintain the average CPU utilization of a group of virtual machines at 75%, the autoscaler will scale the number of virtual machines up or down to maintain that 75% target level. The target is expressed as a fraction of used cores: if your replica pool has 8 cores in total, 75% utilization means 6 cores in use.
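
As a quick sanity check, the cores-in-use arithmetic from the paragraph above can be expressed directly. The helper name below is ours, not part of any API:

```python
def cores_in_use(total_cores, target_utilization):
    """Number of cores in use when the pool runs at the target utilization."""
    return total_cores * target_utilization

# An 8-core pool at a 75% target corresponds to 6 cores in use.
print(cores_in_use(8, 0.75))  # 6.0
```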

In detail, the Autoscaler service performs these actions:

  1. Watches and calculates the average CPU level of your replicas in the specified replica pool every minute.
  2. If the average CPU utilization of your replicas is significantly less than the target CPU utilization, the autoscaler scales down the number of replicas until it reaches the minimum number of replicas you specified, or until the average CPU of your replicas reaches the target utilization level. In this example, the target CPU utilization is 75%.
  3. If the average CPU utilization of your replicas is higher than the target utilization, the autoscaler automatically scales up to the maximum number of replicas or until the average CPU of your replicas comes down to your target utilization.
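
Google has not published the autoscaler's exact algorithm, but the behavior described in the steps above can be approximated by a simple proportional rule clamped to the configured bounds. Everything here is an illustrative sketch, not the real implementation:

```python
import math

def desired_replicas(current, avg_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Move average utilization toward the target by resizing the pool
    proportionally, never going below min_replicas or above max_replicas."""
    proposed = int(math.ceil(current * avg_utilization / target_utilization))
    return max(min_replicas, min(max_replicas, proposed))

# 4 replicas at 90% average CPU with a 75% target -> scale up to 5.
print(desired_replicas(4, 0.90, 0.75, 2, 20))  # 5
# 4 replicas at 30% average CPU -> scale down, clamped at the minimum of 2.
print(desired_replicas(4, 0.30, 0.75, 2, 20))  # 2
```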

Autoscaling a network load balanced replica pool

A network load balancer allows you to balance the load of your systems based on incoming protocol-based data, such as address, port, or protocol type. You can use autoscaling with a network load balancer to scale virtual machines based on CPU utilization.

A network load balancer uses forwarding rules that point to target pools containing virtual machines to handle load. To use autoscaler with a network load balancer:

  1. Specify the target pool that is part of your load balancer in your replica pool template when you create your replica pool.

    This adds the replicas in your pool to the target pool as part of the load balancing virtual machines.

  2. Set up autoscaling for the replica pool using the CPU utilization scaling method.

The autoscaler will then scale based on the CPU utilization of your now network load balanced replica pool.

Scaling based on serving capacity

Compute Engine also offers the ability to use HTTP load balancing for your virtual machine instances. HTTP load balancing lets you balance load based on patterns in the URL of your requests. It is only possible to load balance HTTP requests on port 80.

If your replica pool is part of an HTTP load balancer, you can use autoscaler to scale your virtual machines based on the virtual machines' serving capacity, defined in your load balancer setup. To use autoscaling with an HTTP load balancer, you must:

  1. Add your replica pool that manages a group of instances to your load balanced resource view. Provide the resource view URL in the replica pool template , or create a new resource view and specify your replica pool as part of the view. This lets the replica pool automatically register replicas so that they are part of the view.
  2. Create a global forwarding rule that points to a backend service , if it does not already exist.
  3. Point your backend service to the resource view that contains your replica pool, if it does not already do so.

To connect your load balancer with an autoscaler, provide the replica pool when creating your autoscaler object. The autoscaler will look up the attached load balancer through the replica pool. The following diagram describes this relationship:

Diagram describing how autoscaling works with load balancing

The serving capacity of a group of virtual machines is the maximum amount of traffic that the group can serve. Once the virtual machines exceed the maximum capacity, additional traffic is diverted to other groups of virtual machines. For example, consider a group of virtual machines with a serving capacity defined using the maxRatePerInstance property and set to 50 requests per second (RPS). Once the virtual machines exceed the serving capacity, in this case 50 RPS per machine, the load balancer starts to send traffic to virtual machines in different regions or zones. This can cause an increase in latency or other undesired effects for your users.

To alleviate this, you can set up an autoscaler to scale when a replica pool reaches a fraction of the serving capacity. In this case, if your serving capacity is 50 RPS per instance, you can set the autoscaler to 0.7 or 70% of the serving capacity so once the virtual machines reach 35 RPS, which is 70% of 50, the autoscaler starts adding virtual machines.

The autoscaler works similarly if the serving capacity of a group of virtual machines is set using the maxUtilization parameter.
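
The 35 RPS figure from the example works out as a simple product; the helper below is illustrative only:

```python
def scale_threshold_rps(max_rate_per_instance, target_fraction):
    """Per-instance request rate at which the autoscaler starts adding VMs."""
    return max_rate_per_instance * target_fraction

# A 50 RPS serving capacity with a 0.7 target triggers scaling at 35 RPS.
print(scale_threshold_rps(50, 0.7))  # 35.0
```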

Scaling based on a Cloud Monitoring metric

If you want to scale a group of virtual machines based on metrics other than CPU utilization or serving capacity, you can choose from a list of metrics provided by the Cloud Monitoring service.

From the list of metrics , you must choose a metric that does not have any negative values and is also a virtual machine utilization metric , indicated by the following prefix:

compute.googleapis.com/instance/*

For example, the following is a valid metric:

compute.googleapis.com/instance/network/sent_bytes_count

The following is an invalid metric because the value does not change based on utilization and the autoscaler cannot use the value to scale proportionally:

compute.googleapis.com/instance/cpu/reserved_cores
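
A minimal check for the prefix requirement might look like the following. Note that the prefix test alone is not sufficient: a metric such as reserved_cores matches the prefix but still fails the separate requirement of being a true utilization metric, because its value is static:

```python
INSTANCE_METRIC_PREFIX = "compute.googleapis.com/instance/"

def has_instance_prefix(metric_name):
    """True if the metric is in the per-instance Compute Engine namespace."""
    return metric_name.startswith(INSTANCE_METRIC_PREFIX)

print(has_instance_prefix("compute.googleapis.com/instance/network/sent_bytes_count"))  # True
print(has_instance_prefix("custom.cloudmonitoring.googleapis.com/my/metric"))           # False
```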

Sign up for Compute Engine Autoscaler

Compute Engine Autoscaler is currently in Limited Preview and you must request access to the service. To sign up for Compute Engine Autoscaler, follow the instructions below.

Enable the Compute Engine API

The Compute Engine Autoscaler service requires access to Compute Engine resources. You must first enable the Compute Engine API before enabling Replica Pools.

To sign up for and enable Compute Engine, see the Compute Engine signup instructions.

Enable the Replica Pool API

The Compute Engine Autoscaler product uses the Replica Pool service to manage virtual machines. Sign up and enable the Replica Pool API.

Request access to the service

To request access to the Compute Engine Autoscaler service, click the button below and fill out the Limited Preview form.

Request access

Enable the Autoscaler API

After you have been given access to the API, enable it for your Google Developers Console project.

Enable the API

Select your Google Compute Engine project. Then click Continue to enable the API for that project.

Concepts

The following are key concepts that make up the Compute Engine Autoscaler service.

Compute Engine replica pools

The Compute Engine Autoscaler service does not manage virtual machines directly but through Compute Engine replica pools . Replica pools are pools of homogeneous virtual machines. When you create your autoscaling configuration, you must provide the name of a replica pool that the autoscaler should scale.

The replica pool you want to use for autoscaling must exist prior to defining your autoscaler. To use the autoscaler together with replica pools, you must enable the Replica Pool API and the Compute Engine Autoscaler in the same Developers Console project.

Zones

Zones are the geographical locations where virtual machines live. In order to scale your desired virtual machines, an autoscaler must also live in the same zone as the replica pool. For example, if a replica pool lives in the us-central1-a zone, your autoscaler must also live in the us-central1-a zone.

Utilization metric

When you define your autoscaler, you must select what utilization metric the autoscaler should measure. A utilization metric describes how busy a virtual machine is and is defined as a metric whose value increases or decreases proportionally to the number of available virtual machines. For example, consider the following as valid utilization metrics:

  1. The CPU usage of a group of virtual machines. The CPU utilization will decrease or increase proportionally as the autoscaler adds and removes additional virtual machines.
  2. The output of your virtual machines. The number of bytes a virtual machine outputs can increase or decrease based on the number of virtual machines that the autoscaler adds or removes.

The following are examples of bad utilization metrics:

  1. The disk usage of an application that does lots of processing. Although this can be measured, it doesn't provide much information about the actual processing work being done by the virtual machine, and would be an inaccurate way to measure a virtual machine's utilization.
  2. The reserved cores of a virtual machine. Since this is a static number, it doesn't qualify as a utilization metric.

The autoscaler collects utilization information about your metric and uses it to decide when to scale based on the target utilization level. With Compute Engine Autoscaler, you can measure CPU utilization, load balancer serving capacity, or a custom Cloud Monitoring metric.

Target utilization level

Along with the utilization metric, you must also define the target utilization level of the metric. This is the target utilization that the autoscaler works to maintain by scaling up and down the number of virtual machines in a replica pool.

Based on the metric type, the target utilization level will apply differently. For the average CPU utilization of a group of virtual machines, the target utilization level represents the target average CPU utilization level you would like the autoscaler to maintain. If you are scaling based on the serving capacity of a group of virtual machines, the target utilization level applies as a fraction of the maximum serving capacity. For example, if the maximum requests per second per instance is 100 and you set an autoscaler to maintain 70% of the serving capacity utilization, the autoscaler will scale once the RPS exceeds 70% of the maximum rate per instance, which is 70 RPS.

For any custom Cloud Monitoring metric, the autoscaler will scale up and down to maintain your target utilization level for that metric.

Creating an autoscaler

You can create an autoscaler through the command-line tool or the API. This section provides instructions on how to use the service through both methods.

Set up your environment

In order to use Compute Engine Autoscaler in the API or the command-line tool, you must set up your environment. These examples use the command-line tool and the Google APIs Python client library .

Set up the command-line tool


To prepare the command-line tool to use the Compute Engine Autoscaler service, follow these steps.

  1. Install the gcloud tool.
  2. Once you have installed the tool, enable preview features in the gcloud tool by running the following command. If you have already enabled preview features for another service, you do not need to run this command again. If you are not sure, you can rerun this command.

    $ gcloud components update preview
    
  3. Invoke the command-line tool for Compute Engine Autoscaler:

    $ gcloud [--project PROJECT] preview autoscaler COMMAND
    

    See the sections below that describe how to work with Compute Engine Autoscaler. You will replace COMMAND in the example above with the subcommand that you need to use.

Set up the Python client library


To use the Python client library, you must authorize and build the service.

Authorize access

The Compute Engine Autoscaler API uses OAuth 2.0 authorization. You will need to create a client ID and client secret, and use both with the oauth2client library. By default, the oauth2 library is included in the Python client library.

To find your project's client ID and client secret, do the following:

  1. Go to the Google Developers Console .
  2. Select a project, or create a new one.
  3. In the sidebar on the left, expand APIs & auth . Next, click APIs . In the list of APIs, make sure the status is ON for the Google Compute Engine API.
  4. In the sidebar on the left, select Credentials .
  5. If you haven't done so already, create your project's OAuth 2.0 credentials by clicking Create new Client ID , and providing the information needed to create the credentials.
  6. Look for the Client ID and Client secret in the table associated with each of your credentials.

Note that not all credential types use both a client ID and a client secret; values that a credential does not use are not listed in the table.

For this sample, select Installed Application when creating your client ID, and Other under Installed Application Type .

Once you have created your client ID and secret, save them locally by clicking Download JSON and saving the file as client_secrets.json in the same directory as your code.

In your code, you can authorize to the Compute Engine Autoscaler service using your client_secrets.json file:

#!/usr/bin/env python

import logging
import sys
import argparse
import httplib2
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client import tools
from oauth2client.tools import run_flow

from apiclient.discovery import build

CLIENT_SECRETS = "client_secrets.json"
OAUTH2_STORAGE = "oauth2.dat"
OAUTH2_SCOPE = "https://www.googleapis.com/auth/compute"

def main(argv):
  logging.basicConfig(level=logging.INFO)
  parser = argparse.ArgumentParser(
      description=__doc__,
      formatter_class=argparse.RawDescriptionHelpFormatter,
      parents=[tools.argparser])
  # Parse the command-line flags.
  flags = parser.parse_args(argv[1:])

  # Perform OAuth 2.0 authorization.
  flow = flow_from_clientsecrets(CLIENT_SECRETS, scope=OAUTH2_SCOPE)
  storage = Storage(OAUTH2_STORAGE)
  credentials = storage.get()
  if credentials is None or credentials.invalid:
    credentials = run_flow(flow, storage, flags)
  http = httplib2.Http()
  auth_http = credentials.authorize(http)

if __name__ == '__main__':
  main(sys.argv)

Build the service

Next, you must build the service.

To build the autoscaling service, use the build method, and provide the autoscaling API name and the desired version. Add the following lines to your hello-world.py file:

#!/usr/bin/env python

import logging
import sys
import argparse
import httplib2
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client import tools
from oauth2client.tools import run_flow

from apiclient.discovery import build

CLIENT_SECRETS = "client_secrets.json"
OAUTH2_STORAGE = "oauth2.dat"
OAUTH2_SCOPE = "https://www.googleapis.com/auth/compute"

API_VERSION = "v1beta2"

def main(argv):
  logging.basicConfig(level=logging.INFO)
  parser = argparse.ArgumentParser(
      description=__doc__,
      formatter_class=argparse.RawDescriptionHelpFormatter,
      parents=[tools.argparser])
  # Parse the command-line flags.
  flags = parser.parse_args(argv[1:])

  # Perform OAuth 2.0 authorization.
  flow = flow_from_clientsecrets(CLIENT_SECRETS, scope=OAUTH2_SCOPE)
  storage = Storage(OAUTH2_STORAGE)
  credentials = storage.get()
  if credentials is None or credentials.invalid:
    credentials = run_flow(flow, storage, flags)
  http = httplib2.Http()
  auth_http = credentials.authorize(http)

  # Build the service
  autoscaler_service = build('autoscaler', API_VERSION, http=auth_http)

if __name__ == '__main__':
  main(sys.argv)

Create your replica pool

The Autoscaler service scales a group of virtual machines within a single replica pool. If you have an existing replica pool, you can use it in your autoscaler. If you haven't already created a replica pool, you must create one before you can use an autoscaler. The easiest way to create a replica pool is through the command-line tool.

  1. Make sure you have downloaded and installed the command-line tool.
  2. Follow the instructions for creating a replica pool . If you intend to use load balancing alongside autoscaler for this replica pool, make sure to add the replica pool to the appropriate target pool for network load balancing, or resource view for HTTP load balancing, in your replica pool template .
  3. Get your replica pool selfLink URL. For example, run the following command to get information about your replica pools in zone us-central1-a.

    $ gcloud preview replica-pools --zone us-central1-a list
    

    Look for your replica pool selfLink , which appears in a format similar to the following:

    {
        "autoRestart": true,
        "currentNumReplicas": 1,
        "name": "gcm-8217d4fa-1ec7-4827-94a0-40a2ceb618c1",
        "numReplicas": 5,
        "selfLink": "https://www.googleapis.com/replicapools/v1beta1/project/<project-id>/zones/<zone>/pools/sample-pool",
        ... }
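
If you capture that JSON output, the selfLink can be extracted with the standard json module. The sample values below are placeholders, not real resources:

```python
import json

pool_json = '''
{
    "autoRestart": true,
    "currentNumReplicas": 1,
    "name": "sample-pool",
    "numReplicas": 5,
    "selfLink": "https://www.googleapis.com/replicapools/v1beta1/project/my-project/zones/us-central1-a/pools/sample-pool"
}
'''

pool = json.loads(pool_json)
print(pool["selfLink"])
```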
    

Create an autoscaler

Command-line tool


In the command-line tool, you can create a new autoscaler by running the gcloud preview autoscaler create command. For example, the following creates an autoscaler that scales based on CPU utilization:

$ gcloud preview autoscaler --zone us-central1-a create example-autoscaler \
         --max-num-replicas 20 \
         [--min-num-replicas 5] \
         --target-cpu-utilization 0.6 \
         --target https://www.googleapis.com/replicapools/v1beta1/project/my-project/zones/us-central1-a/pools/example-pool

See the gcloud compute reference for more information. Here are some important flags and parameters that can be used with this command:

Important flags and parameters
--max-num-replicas MAX_NUM
[Required] The maximum number of replicas the autoscaler must maintain. The autoscaler will never resize above this number.
--min-num-replicas MIN_NUM
[Optional] The minimum number of replicas the autoscaler must maintain. The autoscaler will never resize down below this number. If not specified, the default is 2.
--target-cpu-utilization TARGET
[Optional] The target CPU utilization that the autoscaler should maintain. It is represented as a fraction of used cores. For example: In 8-core VMs, 6 cores being used means 75% utilization and is represented here as 0.75. Must be a float between 0.0 and 1.0. If not specified, the default is 0.8. You must specify only one of target-cpu-utilization , target-load-balancer-utilization , or target-custom-utilization . If none of these are specified, the default is a target-cpu-utilization at 0.8. For more information, see Scaling based on CPU utilization .
--target URL_REPLICA_POOL
[Required] The fully-qualified URL to the replica pool. You can only provide one replica pool per autoscaler. For example:
  https://www.googleapis.com/replicapool/v1beta1/projects/<project>/zones/<zone-name>/pools/<pool>
--cool-down-period PERIOD
[Optional] The number of seconds the autoscaler should wait between resizing the replica pool. The default is 60 seconds.
--custom-metric METRIC
[Optional] The custom Cloud Monitoring metric to use for this autoscaler. The metric must not have negative values and must provide utilization details of the virtual machine instances. If you specify this, you must also specify the target-custom-utilization flag.
--target-custom-utilization TARGET
[Optional] This is required if you use custom-metric . Defines the target utilization value of the custom Cloud Metric you selected. You must specify only one of target-cpu-utilization , target-load-balancer-utilization , or target-custom-utilization . If none of these are specified, the default is a target-cpu-utilization at 0.8. For more information, see Scaling based on custom metrics .
--target-load-balancer-utilization TARGET
[Optional] Defines the target utilization value of the load balancer attached to the replica pool. This defines the fraction of the load balancer's maximum serving capacity that the autoscaler should maintain. You must specify only one of target-cpu-utilization , target-load-balancer-utilization , or target-custom-utilization . If none of these are specified, the default is a target-cpu-utilization at 0.8. For more information, see Scaling based on serving capacity .
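
The mutual-exclusion rule for the three target flags can be summarized in a few lines. This function is illustrative only, not part of gcloud:

```python
def resolve_target(cpu=None, load_balancer=None, custom=None):
    """Return the (mode, value) pair an autoscaler would use.

    Only one of the three targets may be set; with none set, the
    default is CPU utilization at 0.8.
    """
    chosen = [(name, value) for name, value in
              [("cpu", cpu), ("load-balancer", load_balancer), ("custom", custom)]
              if value is not None]
    if len(chosen) > 1:
        raise ValueError("specify only one target utilization flag")
    return chosen[0] if chosen else ("cpu", 0.8)

print(resolve_target())                   # ('cpu', 0.8)
print(resolve_target(load_balancer=0.7))  # ('load-balancer', 0.7)
```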

Python client library


To create an autoscaler in the API, you must first construct your request body. The required parameters for creating an autoscaler are the autoscaler's name, the target replica pool URL, and the maximum number of replicas.

All other parameters are optional and, if not set, have default values. To see a full list of parameters, see the reference documentation .

Once you have the above information, you can construct the body of your request. Add the following lines to your hello-world.py file:

#!/usr/bin/env python

import logging
import sys
import argparse
import httplib2
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client import tools
from oauth2client.tools import run_flow

from apiclient.discovery import build

CLIENT_SECRETS = "client_secrets.json"
OAUTH2_STORAGE = "oauth2.dat"
OAUTH2_SCOPE = "https://www.googleapis.com/auth/compute"

API_VERSION = "v1beta2"

PROJECT_ID = "<your-project-id>"
ZONE = "<desired-zone>"  # For example, us-central1-a
REPLICA_POOL = "<url-to-replica-pool>"
AVG_CPU = "<desired-target-cpu-utilization>"

def main(argv):
  logging.basicConfig(level=logging.INFO)
  parser = argparse.ArgumentParser(
      description=__doc__,
      formatter_class=argparse.RawDescriptionHelpFormatter,
      parents=[tools.argparser])
  # Parse the command-line flags.
  flags = parser.parse_args(argv[1:])

  # Perform OAuth 2.0 authorization.
  flow = flow_from_clientsecrets(CLIENT_SECRETS, scope=OAUTH2_SCOPE)
  storage = Storage(OAUTH2_STORAGE)
  credentials = storage.get()
  if credentials is None or credentials.invalid:
    credentials = run_flow(flow, storage, flags)
  http = httplib2.Http()
  auth_http = credentials.authorize(http)

  # Build the service
  autoscaler_service = build('autoscaler', API_VERSION, http=auth_http)

  insertAutoscaler(autoscaler_service)

def insertAutoscaler(autoscaler_service):
  '''Creates a new autoscaler that scales based on cpuUtilization.
  It is possible to create an autoscaler that scales based on cpuUtilization,
  customMetricUtilizations, or loadBalancingUtilization.'''
  body = {
      "name": "new-autoscaler",
      "target": REPLICA_POOL,
      "autoscalingPolicy": {
          "cpuUtilization": {
              "utilizationTarget": AVG_CPU
          },
          "maxNumReplicas": 10
      }
  }
  request = autoscaler_service.autoscalers().insert(
      project=PROJECT_ID, zone=ZONE, body=body)
  response = request.execute()
  print response

if __name__ == '__main__':
  main(sys.argv)

In the code snippet above, replace the PROJECT_ID, ZONE, REPLICA_POOL, and AVG_CPU placeholders with values for your own project, zone, replica pool URL, and target CPU utilization.

If you want to create an autoscaler based on load balancing or a custom metric, use the loadBalancingUtilization or customMetricUtilizations property instead.

For example, the following creates an autoscaler that uses load balancing utilization:

def insertAutoscalerLoadBalancing(autoscaler_service):
  '''Creates a new autoscaler that scales based on load balancing utilization.
  It is possible to create an autoscaler that scales based on cpuUtilization,
  customMetricUtilizations, or loadBalancingUtilization.'''

  body = {
    "name": "load-balancing-autoscaler",
    "target": REPLICA_POOL,
    "autoscalingPolicy": {
       "loadBalancingUtilization": {
          "utilizationTarget": 0.6
        },
       "maxNumReplicas": 10
    }
  }

  request = autoscaler_service.autoscalers().insert(project=PROJECT_ID, zone=ZONE, body=body)
  response = request.execute()

  print response

Get information about your autoscaler

To confirm that the autoscaler was successfully created, you can make a request to get information about the autoscaler you just created.

Command-line tool


In the command-line tool, run the get command:

$ gcloud preview autoscaler --zone ZONE get AUTOSCALER

Python client library


Add the following lines to your script:

#!/usr/bin/env python

import logging
import sys
import argparse
import httplib2
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client import tools
from oauth2client.tools import run_flow

from apiclient.discovery import build

CLIENT_SECRETS = "client_secrets.json"
OAUTH2_STORAGE = "oauth2.dat"
OAUTH2_SCOPE = "https://www.googleapis.com/auth/compute"

API_VERSION = "v1beta2"

PROJECT_ID = "<your-project-id>"
ZONE = "<desired-zone>"  # For example, us-central1-a
REPLICA_POOL = "<url-to-replica-pool>"
AVG_CPU = "<desired-target-cpu-utilization>"

def main(argv):
  logging.basicConfig(level=logging.INFO)
  parser = argparse.ArgumentParser(
      description=__doc__,
      formatter_class=argparse.RawDescriptionHelpFormatter,
      parents=[tools.argparser])
  # Parse the command-line flags.
  flags = parser.parse_args(argv[1:])

  # Perform OAuth 2.0 authorization.
  flow = flow_from_clientsecrets(CLIENT_SECRETS, scope=OAUTH2_SCOPE)
  storage = Storage(OAUTH2_STORAGE)
  credentials = storage.get()
  if credentials is None or credentials.invalid:
    credentials = run_flow(flow, storage, flags)
  http = httplib2.Http()
  auth_http = credentials.authorize(http)

  # Build the service
  autoscaler_service = build('autoscaler', API_VERSION, http=auth_http)

  # insertAutoscaler(autoscaler_service)
  getAutoscaler(autoscaler_service)

def getAutoscaler(autoscaler_service):
  '''Gets information about a specific autoscaler.'''
  request = autoscaler_service.autoscalers().get(
      project=PROJECT_ID, zone=ZONE, autoscaler="new-autoscaler")
  response = request.execute()
  print response

def insertAutoscaler(autoscaler_service):
  '''Creates a new autoscaler that scales based on cpuUtilization.
  It is possible to create an autoscaler that scales based on cpuUtilization,
  customMetricUtilizations, or loadBalancingUtilization.'''
  body = {
      "name": "new-autoscaler",
      "target": REPLICA_POOL,
      "autoscalingPolicy": {
          "cpuUtilization": {
              "utilizationTarget": AVG_CPU
          },
          "maxNumReplicas": 10
      }
  }
  request = autoscaler_service.autoscalers().insert(
      project=PROJECT_ID, zone=ZONE, body=body)
  response = request.execute()
  print response

if __name__ == '__main__':
  main(sys.argv)

The actual request is made using the get() method, passing in the project ID, the zone where the autoscaler lives, and the name of the autoscaler.

In response, you should receive a JSON representation of the operation object responsible for this request, similar to the following:

{u'status': u'DONE', u'kind': u'autoscaler#operation',
u'name': u'new-autoscaler',
u'targetLink': u'https://www.googleapis.com/autoscaler/v1beta2/projects/<project>/zones/<zone>/autoscalers/new-autoscaler',
u'operationType': u'insert', u'progress': 100}

Updating an autoscaler

You can update an autoscaler using the update command in gcloud, or using the update() method in the API.

Updating lets you modify an existing autoscaler, but you must set all configuration settings for your autoscaler in every update request if you want to keep the current values. Otherwise, any settings not explicitly defined in your request will reset to the default values.

For example, if you make an update request with a new value for the maximum number of replicas and minimum number of replicas but do not define any other configuration parameters, such as the target CPU utilization, the cool down period, and so on, the undefined parameters will all be reset to the default values.
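
One way to avoid accidentally resetting fields is a read-modify-write pattern: fetch the current configuration, overlay only your changes, and send the merged body with update(). The helper below sketches such a merge under our own naming; it is not part of the client library:

```python
def merged_update_body(current_config, changes):
    """Overlay changed fields onto the full current configuration so that an
    update request never omits (and therefore resets) existing settings."""
    body = dict(current_config)
    policy = dict(current_config.get("autoscalingPolicy", {}))
    policy.update(changes.get("autoscalingPolicy", {}))
    body.update({k: v for k, v in changes.items() if k != "autoscalingPolicy"})
    body["autoscalingPolicy"] = policy
    return body

current = {
    "name": "new-autoscaler",
    "target": "<url-to-replica-pool>",
    "autoscalingPolicy": {"maxNumReplicas": 10, "coolDownPeriodSec": 120},
}
# Raise the replica ceiling without losing the cool-down setting.
print(merged_update_body(current, {"autoscalingPolicy": {"maxNumReplicas": 50}}))
```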

When you update an autoscaler, it may take some time for the changes to propagate, and it may be a couple of minutes before your new autoscaler settings are reflected.

Command-line tool


Provide the same fields as you would to create a new autoscaler . The same required fields for creating an autoscaler are also required for all update requests:

$ gcloud preview autoscaler --zone ZONE update AUTOSCALER \
--target URL --max-num-replicas MAX_NUM ...

Python client library

To update your autoscaler, provide an update request body using the update() method:

def updateAutoscaler(autoscaler_service):
  '''Updates an autoscaler. The request body must contain the full desired
  configuration; any settings not specified are reset to their defaults.'''

  newConfig = {
    "target": REPLICA_POOL,
    "autoscalingPolicy": {
       "cpuUtilization": {
          "utilizationTarget": AVG_CPU
        },
       "maxNumReplicas": 50,
       "coolDownPeriodSec": 120
    }
  }

  request = autoscaler_service.autoscalers().update(project=PROJECT_ID, zone=ZONE, autoscaler="new-autoscaler", body=newConfig)
  response = request.execute()

  print response

When you perform any requests that modify data, a Zone Operation resource is returned, and you can query the operation to check the status of your change.
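
A generic way to wait on such an operation is to poll until its status reaches DONE. The sketch below assumes only that you can supply a zero-argument callable returning the operation as a dict; the function and its parameters are illustrative, not a real client method:

```python
import time

def wait_for_done(fetch_operation, poll_interval=1.0, timeout=60.0):
    """Poll an operation-returning callable until its status is DONE."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        op = fetch_operation()
        if op.get("status") == "DONE":
            return op
        time.sleep(poll_interval)
    raise TimeoutError("operation did not complete in time")

# Simulated operation that completes on the third poll.
states = iter(["PENDING", "RUNNING", "DONE"])
op = wait_for_done(lambda: {"status": next(states)}, poll_interval=0.01)
print(op["status"])  # DONE
```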

Deleting an autoscaler

Command-line tool


Use the delete command to delete an autoscaler.

$ gcloud preview autoscaler --zone ZONE delete AUTOSCALER

Python client library


Similarly, to delete an autoscaler, make a request using the delete() method:

def deleteAutoscaler(autoscaler_service):
  request = autoscaler_service.autoscalers().delete(project=PROJECT_ID, zone=ZONE, autoscaler="new-autoscaler")
  response = request.execute()

  print response

When you perform any requests that modify data, a Zone Operation resource is returned, and you can query the operation to check the status of your change.

Getting help and providing feedback

For feedback and questions, use the Limited Preview discussion group .

You should have been added to the discussion group when you joined the Limited Preview group, but if not, you can join the discussion group by replying to the initial e-mail you received after joining the program.
