Load Balancing

Google Compute Engine offers server-side load balancing so you can distribute incoming network traffic across multiple virtual machine instances. Depending on whether you use network load balancing or HTTP load balancing, it provides the following benefits:

Scale your application
Support heavy traffic
Detect unhealthy virtual machine instances
Balance loads across regions
Route traffic to the closest virtual machine
Support content-based routing

Google Compute Engine load balancing uses forwarding rule objects that match certain types of traffic and direct it to a load balancer. For example, a forwarding rule can match TCP traffic to port 80 on the public IP address 192.0.2.1. Any traffic destined for that IP address, protocol, and port is matched by this forwarding rule and directed to healthy virtual machine instances, as determined by health-check rules that you define.
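The matching behavior described above can be pictured with a small Python model. This is only an illustrative sketch, not how Compute Engine implements forwarding rules: the `ForwardingRule` class, the `find_target` helper, and the target name `www-pool` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForwardingRule:
    """Hypothetical model of a forwarding rule: an IP, a protocol, and a port."""
    ip_address: str
    protocol: str
    port: int
    target: str  # illustrative name of the load balancer target

    def matches(self, dest_ip: str, protocol: str, dest_port: int) -> bool:
        # A packet matches only if all three fields agree with the rule.
        return (dest_ip == self.ip_address
                and protocol.upper() == self.protocol.upper()
                and dest_port == self.port)


def find_target(rules, dest_ip, protocol, dest_port) -> Optional[str]:
    """Return the target of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.matches(dest_ip, protocol, dest_port):
            return rule.target
    return None


rules = [ForwardingRule("192.0.2.1", "TCP", 80, "www-pool")]
print(find_target(rules, "192.0.2.1", "TCP", 80))   # www-pool
print(find_target(rules, "192.0.2.1", "TCP", 443))  # None
```

Traffic to port 443 is not matched because the only rule in this sketch covers port 80.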

Google offers two types of load balancing, network and HTTP, which differ in capabilities, usage scenarios, and how you configure them. The scenarios below can help you decide which type best meets your needs.

Prerequisites

The `gcloud` command line interface

For both network and HTTP load balancing, you will use the gcloud command to configure services:

Install the Cloud SDK, which installs the gcloud command and other Cloud tools.
Authenticate with the gcloud command. You can also specify a default project to use for gcloud commands.
```
$ gcloud auth login
$ gcloud config set project PROJECT
```
Install the gcloud preview component:
```
$ gcloud components update preview
```

Load balancing with Windows instances

You can use load balancing with Windows virtual machine instances, but only with Windows instances that are in zones without the -windows suffix. For example, you cannot use load balancing with Windows instances in the europe-west1-windows, us-central2-windows, or us-west1-windows zones.

Scenarios

The following scenarios illustrate common load balancing needs and the type of load balancing configuration that each one requires.

Network load balancing

Figure: Network load balancing

Assume that you are running a website on Apache and traffic has grown to the point where you need more Apache instances to handle the load. You can add Google Compute Engine instances and configure load balancing to spread the load across them. In this situation, you would serve the same content from each of the instances. As your site becomes more popular, you would continue increasing the number of instances that are available to respond to requests.

In this situation, you would choose network load balancing and use forwarding rules to map incoming TCP requests on port 80 to a target pool of virtual machines in the same region. You would configure a forwarding rule object, a target pool that lists the instances to receive traffic, and health check rules. In this situation, you do not require the capabilities or more complex configuration of HTTP load balancing.
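To make the roles of the target pool and the health checks concrete, here is a minimal Python sketch. It assumes a hash of the client address and port is used to pick among healthy instances; that selection policy, the `TargetPool` class, and the instance names are assumptions made for illustration and are not taken from the Compute Engine implementation.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class TargetPool:
    """Hypothetical target pool: instances plus their latest health status."""
    instances: list
    healthy: dict = field(default_factory=dict)

    def record_health_check(self, instance, passed):
        # Store the result of the most recent health check for an instance.
        self.healthy[instance] = passed

    def pick(self, client_ip, client_port):
        # Only instances whose last health check passed may receive traffic.
        candidates = [i for i in self.instances if self.healthy.get(i, False)]
        if not candidates:
            raise RuntimeError("no healthy instances in pool")
        # Assumed policy: hash the client address so a connection maps
        # consistently to one instance while the healthy set is unchanged.
        key = f"{client_ip}:{client_port}".encode()
        return candidates[zlib.crc32(key) % len(candidates)]


pool = TargetPool(["apache-1", "apache-2", "apache-3"])
for name in pool.instances:
    pool.record_health_check(name, passed=True)
pool.record_health_check("apache-2", passed=False)

# Prints apache-1 or apache-3, never the unhealthy apache-2.
print(pool.pick("198.51.100.7", 50000))
```

Adding capacity in this model means appending an instance to the pool and letting it pass a health check; the forwarding rule itself does not change.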

Get started with network load balancing

HTTP load balancing

Cross-region load balancing

Figure: Representation of cross-region load balancing

The network load balancing scenario above scales well within a single region, but extending the service across regions would require unwieldy and sometimes problematic workarounds. With HTTP load balancing, you can instead use a global IP address, a special IP address that routes users to instances based on proximity. By defining a simple topology, you can increase performance and system reliability for a global user base.

In this situation, you define global forwarding rules that map to a target HTTP proxy, which routes requests to the closest instances within a back-end service. The back-end service object defines the groups of instances that are able to handle the requests.
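The idea of routing to the closest instances can be sketched in Python as follows. The `InstanceGroup` class, the `DISTANCE` table, and the client locations are hypothetical; the real service determines proximity internally, and the fallback to the next-closest healthy region shown here is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class InstanceGroup:
    """Hypothetical group of instances in one region of a back-end service."""
    region: str
    healthy_instances: list


# Hypothetical proximity table: (client location, region) -> relative distance.
DISTANCE = {
    ("paris", "europe-west1"): 1,
    ("paris", "us-central1"): 8,
    ("chicago", "europe-west1"): 7,
    ("chicago", "us-central1"): 1,
}


def closest_group(client_location, groups):
    """Pick the nearest group that still has at least one healthy instance."""
    usable = [g for g in groups if g.healthy_instances]
    if not usable:
        raise RuntimeError("no healthy instances in any region")
    return min(usable, key=lambda g: DISTANCE[(client_location, g.region)])


backend_service = [
    InstanceGroup("europe-west1", ["eu-1", "eu-2"]),
    InstanceGroup("us-central1", ["us-1"]),
]
print(closest_group("paris", backend_service).region)    # europe-west1
print(closest_group("chicago", backend_service).region)  # us-central1
```

If every instance in `europe-west1` became unhealthy in this model, a client in Paris would be sent to `us-central1` instead, which is the reliability benefit of spanning regions.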

Get started with cross-region load balancing

Content-based load balancing

Figure: Representation of content-based load balancing

Content-based, or content-aware, load balancing uses HTTP load balancing to distribute traffic to different instances based on the incoming HTTP URI. For example, suppose you have a site that serves static content (CSS, images), dynamic content, and video uploads. You can architect your load balancing configuration to serve each type of traffic from a set of instances that is optimized for that type of content.

In this situation, your global forwarding rules map to a target HTTP proxy, which checks requests against a URL map to determine which back-end service is appropriate for the request. The back-end service then distributes the request to one of its resource groups, each of which contains one or more virtual machine instances.
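A URL map lookup of this kind can be sketched in Python as below. The path prefixes, the service names, and the longest-prefix-wins policy are assumptions for illustration; they are not the actual URL map format or matching rules of the service.

```python
def select_backend_service(path, url_map, default_service):
    """Return the service for the longest matching path prefix, else the default."""
    best_prefix = None
    for prefix in url_map:
        if path.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return url_map[best_prefix] if best_prefix is not None else default_service


# Hypothetical URL map for the static, dynamic, and video upload example.
url_map = {
    "/static/": "static-service",
    "/upload/": "video-service",
}

print(select_backend_service("/static/site.css", url_map, "dynamic-service"))  # static-service
print(select_backend_service("/upload/clip.mp4", url_map, "dynamic-service"))  # video-service
print(select_backend_service("/cart", url_map, "dynamic-service"))             # dynamic-service
```

Requests that match no prefix fall through to the default service, which in this example handles the dynamic content.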

Get started with content-based load balancing

Content-based and cross-region load balancing can work together by using multiple back-end services and multiple regions. You can build on the scenarios above to create a load balancing configuration that meets your application's needs.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the Apache 2.0 License. For details, see our Site Policies.

Last updated August 6, 2014.
