Google Compute Engine offers server-side load balancing so you can distribute incoming network traffic across multiple virtual machine instances. Load balancing provides the following benefits with either network load balancing or HTTP load balancing:
- Scale your application
- Support heavy traffic
- Detect unhealthy virtual machine instances
- Balance loads across regions
- Route traffic to the closest virtual machine
- Support content-based routing
Google Compute Engine load balancing uses forwarding rule objects that match and direct certain types of traffic to a load balancer. For example, a forwarding rule can match TCP traffic to port 80 on a public IP address of 192.0.2.1. Any traffic destined to that IP, protocol, and port is matched by this forwarding rule and is directed to healthy virtual machine instances based on health-check rules that you define.
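The matching behavior described above can be modeled in a few lines of Python. This is only a toy sketch, not how Compute Engine implements forwarding rules: the `Packet` and `ForwardingRule` classes and their field names are hypothetical, chosen to mirror the IP, protocol, and port triple from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for the destination fields of incoming traffic."""
    dest_ip: str
    protocol: str
    dest_port: int


@dataclass(frozen=True)
class ForwardingRule:
    """Hypothetical model of a forwarding rule: an IP, a protocol, and a port."""
    ip: str
    protocol: str
    port: int

    def matches(self, packet: Packet) -> bool:
        # Traffic matches only when all three fields agree with the rule.
        return (
            packet.dest_ip == self.ip
            and packet.protocol.upper() == self.protocol.upper()
            and packet.dest_port == self.port
        )


rule = ForwardingRule(ip="192.0.2.1", protocol="TCP", port=80)
web_request = Packet(dest_ip="192.0.2.1", protocol="TCP", dest_port=80)
ssh_request = Packet(dest_ip="192.0.2.1", protocol="TCP", dest_port=22)

print(rule.matches(web_request))  # True
print(rule.matches(ssh_request))  # False
```

Traffic that matches would then be handed to the load balancer; traffic to any other port or protocol on the same address is not covered by this rule.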
Google offers two types of load balancing that differ in capabilities, usage scenarios, and how you configure them. The scenarios below can help you decide whether network or HTTP load balancing best meets your needs.
Prerequisites
The gcloud command-line interface
For both network and HTTP load balancing, you will use the gcloud command to configure services:
- Install the Cloud SDK, which installs the gcloud command and other Cloud tools.
- Authenticate with the gcloud command. You can also specify a default project to use for gcloud commands.
    $ gcloud auth login
    $ gcloud config set project PROJECT
- Install the gcloud preview component:
    $ gcloud components update preview
Load balancing with Windows instances
It is possible to use load balancing with Windows virtual machine instances, but you can only use load balancing with Windows instances that are in zones without the -windows suffix. For example, you cannot use load balancing with Windows instances in the europe-west1-windows, us-central2-windows, or us-west1-windows zones.
Scenarios
The following situations are examples of load balancing and the type of load balancing configuration that you would need in each scenario.
Network load balancing
Assume that you are running a website on Apache and traffic has grown to the point where you need more Apache instances to handle the load. You can add Google Compute Engine instances and configure load balancing to spread the load between them. In this situation, you would serve the same content from each of the instances. As your site becomes more popular, you would continue increasing the number of instances that are available to respond to requests.
In this situation, you would choose network load balancing, which uses forwarding rules to map incoming TCP/IP requests on port 80 to your target pool of virtual machines in the same region. You would configure a forwarding rule object, a target pool that lists the instances to receive traffic, and health check rules. You do not require the capabilities or the more complex configuration of HTTP load balancing.
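The relationship between a target pool and its health checks can be sketched as follows. This is a simplified illustration under stated assumptions, not the Compute Engine implementation: the `TargetPool` class is hypothetical, instances without a recorded health check result are treated as unhealthy, and hashing the client address to pick an instance is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TargetPool:
    """Hypothetical model of a target pool: instances plus health check results."""
    instances: List[str]
    healthy: Dict[str, bool] = field(default_factory=dict)

    def record_health_check(self, instance: str, passed: bool) -> None:
        self.healthy[instance] = passed

    def healthy_instances(self) -> List[str]:
        # Assumption: an instance with no recorded result is not yet eligible.
        return [name for name in self.instances if self.healthy.get(name, False)]

    def pick(self, client_ip: str, client_port: int) -> Optional[str]:
        candidates = self.healthy_instances()
        if not candidates:
            return None  # no healthy instance can receive the traffic
        # Assumption: hash the client address so one connection maps to one instance.
        digest = hashlib.sha256(f"{client_ip}:{client_port}".encode()).digest()
        return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]


pool = TargetPool(instances=["apache-1", "apache-2", "apache-3"])
pool.record_health_check("apache-1", True)
pool.record_health_check("apache-2", False)  # failing instance is skipped
pool.record_health_check("apache-3", True)

chosen = pool.pick("203.0.113.7", 51514)
print(chosen in ("apache-1", "apache-3"))  # True
```

Adding capacity in this model means appending an instance name to the pool and letting it pass a health check, which mirrors the scenario of adding Apache instances as traffic grows.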
Get started with network load balancing
HTTP load balancing
Cross-region load balancing
The network load balancing scenario above scales well for a single region, but extending the service across regions would require unwieldy and sometimes problematic solutions. With HTTP load balancing, you can instead use a global IP address, a special IP address that intelligently routes users based on proximity. By defining a simple topology, you can increase performance and system reliability for a global user base.
In this situation, you define global forwarding rules that map to a target HTTP proxy, which routes requests to the closest instances within a back-end service. The back-end service object defines the groups of instances that are able to handle the requests.
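Proximity-based routing within a back-end service can be illustrated with a small model. Everything here is a hypothetical sketch: the `InstanceGroup` and `BackendService` classes, the region names, and the distance figures are made up for the example, and falling back to the next-closest group when the nearest one has no healthy instances is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InstanceGroup:
    """Hypothetical group of instances located in one region."""
    region: str
    healthy_instances: List[str]


@dataclass
class BackendService:
    """Hypothetical back-end service: the groups able to handle requests."""
    groups: List[InstanceGroup]

    def closest_group(self, distances_km: Dict[str, float]) -> Optional[InstanceGroup]:
        # Assumption: only groups with a healthy instance are considered.
        usable = [group for group in self.groups if group.healthy_instances]
        if not usable:
            return None
        # Regions missing from the distance table sort last.
        return min(usable, key=lambda g: distances_km.get(g.region, float("inf")))


us_group = InstanceGroup("us-central1", ["us-vm-1", "us-vm-2"])
eu_group = InstanceGroup("europe-west1", ["eu-vm-1"])
service = BackendService(groups=[us_group, eu_group])

# Made-up distances from one user to each region.
user_distances = {"us-central1": 7000.0, "europe-west1": 300.0}
print(service.closest_group(user_distances).region)  # europe-west1

eu_group.healthy_instances.clear()  # simulate the nearby group going unhealthy
print(service.closest_group(user_distances).region)  # us-central1
```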
Get started with cross-region load balancing
Content-based load balancing
Content-based or content-aware load balancing uses HTTP load balancing to distribute traffic to different instances based on the incoming HTTP URI. For example, suppose you have a site that serves static content (CSS, images), dynamic content, and video uploads. You can architect your load balancing configuration to serve each type of traffic from a set of instances that is optimized for that type of content.
In this situation, your global forwarding rules map to a target HTTP proxy, which checks requests against a URL map to determine which back-end service is appropriate for the request. The back-end service then distributes the request to one of its resource groups that has one or more virtual machine instances.
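The URL map lookup can be sketched as a prefix match from request path to back-end service. This is a deliberately simplified model: the `UrlMap` class, the path prefixes, and the service names are hypothetical, and choosing the longest matching prefix is an assumption made for the example rather than a description of the actual matching rules.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UrlMap:
    """Hypothetical URL map: path prefixes mapped to back-end service names."""
    default_service: str
    path_rules: Dict[str, str]

    def route(self, path: str) -> str:
        # Collect every prefix that matches the request path.
        matches = [prefix for prefix in self.path_rules if path.startswith(prefix)]
        if not matches:
            return self.default_service
        # Assumption: the most specific (longest) prefix wins.
        return self.path_rules[max(matches, key=len)]


url_map = UrlMap(
    default_service="dynamic-backend",
    path_rules={
        "/static/": "static-backend",
        "/video/": "video-backend",
    },
)

print(url_map.route("/static/css/site.css"))  # static-backend
print(url_map.route("/video/upload"))         # video-backend
print(url_map.route("/account"))              # dynamic-backend
```

Each returned name stands for a back-end service, which would then distribute the request to one of its own groups of instances.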
Get started with content-based load balancing
Content-based and cross-region load balancing can work together by using multiple back-end services and multiple regions. You can build on the scenarios above to create a load balancing configuration that meets your application's needs.