This is a Limited Preview release of HTTP load balancing. As a result, it might be changed in backward-incompatible ways and is not recommended for production use. It is not subject to any SLA or deprecation policy. Request to be whitelisted to use this feature.
HTTP load balancing provides global load balancing for HTTP requests. It also
allows requests to be sent to different sets of backends, based on patterns in
the URL. You can load balance HTTP requests on port 80 only.
HTTP load balancing is configured by using different API objects from those used for network load balancing. For HTTP load balancing, you use global forwarding rules that point to a target HTTP proxy, which checks requests against a URL map to determine the appropriate backend service for each request. The backend service object defines the groups of instances, contained in resource views, that are available to handle these requests, and specifies which health check to use against those instances.
The following examples use HTTP load balancing: cross-region load balancing and content-based load balancing.
Load distribution algorithm
HTTP load balancing provides two methods of determining instance load. Within
the backend service object, the balancingMode property selects between the
requests per second (RPS) and CPU utilization modes. Both modes allow a maximum
value to be specified; the HTTP load balancer will try to ensure that load
remains under the limit, but short bursts above the limit can occur during
failover or load spike events.
Incoming requests are sent to the region that is closest to the user and has remaining capacity. If more than one zone is configured with backends in a region, the traffic is distributed across the resource groups in each zone according to each group's capacity. Within the zone, the requests are spread evenly over the instances within each resource view.
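The selection described above can be sketched in a few lines of Python. This is an illustrative model only, not the load balancer's actual implementation: the Region class, its distance field (a stand-in for "closest to the user"), and the RPS-based capacity figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    distance: float           # hypothetical proximity metric; lower means closer to the user
    zone_capacity: dict       # zone name -> maximum RPS of the resource views in that zone
    current_load: float = 0   # RPS the region is currently serving

    @property
    def capacity(self) -> float:
        # A region's capacity is the sum of the capacities of its zones.
        return sum(self.zone_capacity.values())

    def has_capacity(self) -> bool:
        return self.current_load < self.capacity


def pick_region(regions):
    """Return the closest region that still has remaining capacity, or None."""
    candidates = [r for r in regions if r.has_capacity()]
    return min(candidates, key=lambda r: r.distance, default=None)


def split_across_zones(region, requests):
    """Divide a number of requests across zones in proportion to zone capacity."""
    total = region.capacity
    return {zone: requests * cap / total
            for zone, cap in region.zone_capacity.items()}
```

In this model, a region whose load has reached its capacity is skipped, so traffic goes to the next-closest region, and a zone with three times the capacity of another receives three times the share.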
Global forwarding rules and addresses
Global forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy, URL map, and one or more backend services.
The global forwarding rule provides a single global IP address that can be used in DNS records for your application. No DNS-based load balancing is required. You can specify the IP address to be used, or let Google Compute Engine assign one for you.
Target proxies
Target HTTP proxies terminate HTTP/TCP connections from clients. Each proxy is referenced by one or more global forwarding rules and routes the incoming requests to a URL map.
The proxies set HTTP request/response headers as follows:
- Via: 1.1 google (requests and responses)
- X-Forwarded-For: <client IP>, <global forwarding rule external IP> (requests only)
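Because the proxy sits between the client and the backend, a backend that needs the original client address has to read it from the X-Forwarded-For header. The following Python sketch shows one way to do that. The function name is hypothetical, and the code assumes that the two values listed above are the last two entries in the header, which is why it reads the second-to-last entry rather than the first.

```python
def client_ip_from_xff(header_value: str) -> str:
    """Return the client IP that the proxy recorded in X-Forwarded-For.

    Assumption: the header ends with
    "<client IP>, <global forwarding rule external IP>",
    so the client IP is the second-to-last entry.
    """
    parts = [p.strip() for p in header_value.split(",") if p.strip()]
    if len(parts) < 2:
        raise ValueError(
            "expected at least '<client IP>, <forwarding rule IP>'")
    return parts[-2]
```

For example, client_ip_from_xff("203.0.113.7, 198.51.100.1") returns "203.0.113.7".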
URL maps
URL maps define matching patterns for URL-based routing of requests to the appropriate backend services. A default service is defined to handle any requests that do not match a specified host rule or path matching rule. In some situations, such as the cross-region load balancing example, you might not define any URL rules and rely only on the default service. For content-based routing of traffic, the URL map allows you to divide your traffic by examining the URL components to send requests to different sets of backends.
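The matching behavior can be illustrated with a small Python model. This is a sketch under stated assumptions, not the service's matching algorithm: the UrlMap class, its field names, the use of simple path prefixes, and the longest-prefix-wins tie-break are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UrlMap:
    default_service: str
    # host -> {path prefix -> backend service name}; hypothetical layout
    host_rules: dict = field(default_factory=dict)

    def route(self, host: str, path: str) -> str:
        """Return the backend service name for a request."""
        path_rules = self.host_rules.get(host)
        if path_rules is None:
            # No host rule matches: fall back to the default service.
            return self.default_service
        # Assumption: when several prefixes match, the longest one wins.
        matches = [p for p in path_rules if path.startswith(p)]
        if not matches:
            # No path rule matches: fall back to the default service.
            return self.default_service
        return path_rules[max(matches, key=len)]
```

A URL map with no host rules, such as UrlMap("web"), sends every request to the default service, which corresponds to the cross-region case mentioned above.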
Backend services
Backend services define groups of backend instances and their serving capacity, which can be based on CPU or requests per second (RPS). Each backend service lists groups of instances within resource views. The backend service also specifies which health checks will be performed against the available instances.
Resource views
Resource views are a grouping mechanism to enable operations against a set of Cloud resources. In the case of load balancing, resource views group the virtual machine instances that are available as backends for a backend service. You can add instances to or remove instances from these resource views as needed. Your resource views might also be referenced by other Cloud services.
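To summarize how the objects described in the preceding sections reference one another, here is a minimal Python model of the chain from a global forwarding rule down to the instances. It is only a conceptual sketch: the class and field names are hypothetical and do not reproduce the API's resource schema, and the URL map is reduced to its default service.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceView:
    zone: str
    instances: list


@dataclass
class BackendService:
    name: str
    health_check: str
    balancing_mode: str   # "RPS" or "CPU" in this sketch; labels are assumptions
    resource_views: list = field(default_factory=list)


@dataclass
class UrlMap:
    default_service: BackendService


@dataclass
class TargetHttpProxy:
    url_map: UrlMap


@dataclass
class GlobalForwardingRule:
    ip_address: str
    target: TargetHttpProxy
    port: int = 80        # HTTP load balancing handles port 80 only


def instances_behind(rule: GlobalForwardingRule) -> list:
    """Follow rule -> proxy -> URL map -> backend service -> resource views."""
    service = rule.target.url_map.default_service
    return [inst for view in service.resource_views for inst in view.instances]
```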
Get started
The following guides demonstrate two different scenarios using the HTTP load balancing service. These scenarios provide the building blocks for you to understand the HTTP load balancing concepts and demonstrate how you might set up load balancing for your specific needs.
Cross-region load balancing
Start deploying a load balancing solution that serves traffic to users from the nearest region and balances the load across the instances within that region. If the load in a region exceeds the capacity for the instances in that region, traffic will be routed to other regions.
Get started
Content-based load balancing
Deploy a content-aware load balancing solution that routes HTTP requests to specific instances that are optimized for the load. The content-based example uses path matching within the URL maps to pick the appropriate backend for a given request.
Get started