Backend services overview | Load Balancing

A backend service defines how Cloud Load Balancing distributes traffic.The backend service configuration contains a set of values, such as theprotocol used to connect to backends, various distribution and sessionsettings, health checks, and timeouts. These settings provide fine-grainedcontrol over how your load balancer behaves. To get you started,most of the settings have default values that allow for fastconfiguration. A backend service is either global orregional in scope.

Load balancers, Envoy proxies, and proxyless gRPC clients use the configurationinformation in the backend service resource to do the following:

Direct traffic to the correct backends, which are instance groups ornetwork endpoint groups (NEGs).
Distribute traffic according to a balancing mode, which is a setting foreach backend.
Determine which health check is monitoringthe health of the backends.
Specify session affinity.
Determine whether other services are enabled, including the followingservices that are only available for certain loadbalancers:
- Cloud CDN
- Google Cloud Armor security policies
- Identity-Aware Proxy
Designate regional backend services as a service inApp Hub, which isin preview.

You set these values when you create a backend service or add a backend to thebackend service.

Note: If you're using either the global external Application Load Balancer or theclassic Application Load Balancer, and your backends serve static content, considerusing backend bucketsinstead of backend services.

The following table summarizes which load balancers use backend services. Theproduct that you are using also determines the maximum number of backendservices, the scope of a backend service, the type of backends supported, and thebackend service's load balancing scheme. The load balancing scheme is anidentifier that Google uses to classify forwarding rules and backend services. Eachload balancing product uses one load balancing scheme for its forwarding rulesand backend services. Some schemes are shared among products.

**Table:** Backend services and supported backend types
Product	Maximum number of backend services	Scope of backend service	Supported backend types	Load balancing scheme
Global external Application Load Balancer	Multiple	Global	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends ‡ All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs‡ All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs † All serverless NEGs: One or more App Engine, Cloud Run, or Cloud Functions services Private Service Connect NEGs: If more than one NEG is specified, the NEGs must be in different regions One global internet NEG for an external backend	EXTERNAL_MANAGED
Classic Application Load Balancer	Multiple	Global*	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs † All serverless NEGs: One or more App Engine, Cloud Run, or Cloud Functions services, or One global internet NEG for an external backend	EXTERNAL
Regional external Application Load Balancer	Multiple	Regional	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs † A single serverless NEG (Cloud Run only) A single Private Service Connect NEG All regional internet NEGs for an external backend	EXTERNAL_MANAGED
Cross-region internal Application Load Balancer	Multiple	Global	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs † A single serverless NEG (Cloud Run only) A single Private Service Connect NEG	INTERNAL_MANAGED
Regional internal Application Load Balancer	Multiple	Regional	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs † A single serverless NEG (Cloud Run only) A single Private Service Connect NEG All regional internet NEGs for an external backend	INTERNAL_MANAGED
Global external proxy Network Load Balancer	1	Global	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends ‡ All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs‡ All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs †	EXTERNAL_MANAGED
Classic proxy Network Load Balancer	1	Global*	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs †	EXTERNAL
Regional external proxy Network Load Balancer	1	Regional	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs All regional internet NEGs for an external backend	EXTERNAL_MANAGED
Regional internal proxy Network Load Balancer	1	Regional	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs All regional internet NEGs for an external backend	INTERNAL_MANAGED
Cross-region internal proxy Network Load Balancer	Multiple	Global	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` type zonal NEGs All hybrid connectivity NEGs: One or more `NON_GCP_PRIVATE_IP_PORT` type NEGs A combination of zonal and hybrid NEGs: `GCE_VM_IP_PORT` and `NON_GCP_PRIVATE_IP_PORT` type NEGs	INTERNAL_MANAGED
External passthrough Network Load Balancer	1	Regional	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP` type zonal NEGs	EXTERNAL
Internal passthrough Network Load Balancer	1	Regional, but configurable to be globally accessible	The backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP` type zonal NEGs	INTERNAL
Traffic Director	Multiple	Global	Each backend service supports one of the following backend combinations: All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends All zonal NEGs: One or more `GCE_VM_IP_PORT` or `NON_GCP_PRIVATE_IP_PORT` type zonal NEGs One internet NEG of type `INTERNET_FQDN_PORT` One or more service bindings	INTERNAL_SELF_MANAGED

^* Backendservices used by classic Application Load Balancers andclassic proxy Network Load Balancers are always global in scope, in either Standard orPremium Network Tier. However, in Standard Tier the following restrictionsapply:

The forwarding rule and its external IP address are regional.
All backends connected to the backend service must be located in the same region as the forwarding rule.

^† ForGKE deployments, mixed NEG backends are only supported withstandalone NEGs.

^‡ Support IPv4 andIPv6 (dual-stack) instance groups and zonal NEG backends. Zonal NEGs supportdual-stack on only GCE_VM_IP_PORT type endpoints.

Backends

A backend is one or more endpoints that receive traffic from a Google Cloudload balancer, a Traffic Director-configured Envoy proxy, or a proxyless gRPCclient. There are several types of backends:

Instance group containing virtual machine (VM) instances.An instance group can be a managed instancegroup (MIG),with or without autoscaling, or it can be anunmanaged instancegroup.More than one backend service can reference an instance group, but all backendservices that reference the instance group must use the same balancing mode.
Zonal NEG
Serverless NEG
Internet NEG
Hybrid connectivity NEG
Private Service Connect NEG
Service Directory service bindings

You cannot delete a backend instance group or NEG that is associated with abackend service. Before you delete an instance group or NEG, you must firstremove it as a backend from all backend services that reference it.

Instance groups

This section discusses how instance groups work with the backend service.

Backend VMs and external IP addresses

Backend VMs in backend services don't need external IP addresses:

For global external Application Load Balancers andexternal proxy Network Load Balancers: Clients communicate with a Google Front End (GFE) whichhosts your load balancer's external IP address. GFEs communicate with backendVMs or endpoints by sending packets to an internal address created by joiningan identifier for the backend's VPC network with the internalIPv4 address of the backend. Communication between GFEs and backend VMs orendpoints is facilitated through specialroutes.
- For instance group backends, the internal IPv4address is always the primary internal IPv4 address that corresponds to thenic0 interface of the VM.
- For GCE_VM_IP_PORT endpoints in a zonal NEG, you can specify theendpoint's IP address as either the primary IPv4 address associated with anynetwork interface of a VM or any IPv4 address from an alias IP address rangeassociated with any network interface of a VM.

For regional external Application Load Balancers: Clients communicate with an Envoy proxywhich hosts your load balancer's external IP address. Envoy proxiescommunicate with backend VMs or endpoints by sending packets to an internaladdress created by joining an identifier for the backend's VPCnetwork with the internal IPv4 address of the backend.
- For instance group backends, the internal IPv4 address is always the primaryinternal IPv4 address that corresponds to the nic0 interface of the VM,and nic0 must be in the same network as the load balancer.
- For GCE_VM_IP_PORT endpoints in a zonal NEG, you can specify theendpoint's IP address as either the primary IPv4 address associated with anynetwork interface of a VM or any IPv4 address from an alias IP address rangeassociated with any network interface of a VM, as long as the networkinterface is in the same network as the load balancer.
For external passthrough Network Load Balancers: Clients communicate directly with backends by wayof Google's Maglev pass-through load balancing infrastructure. Packets are routed and deliveredto backends with the original source anddestination IP addresses preserved. Backends respond to clients using directserver return.The methods used to select a backend and to track connections areconfigurable.
- For instance group backends, packets are always delivered to the nic0interface of the VM.
- For GCE_VM_IP endpoints in a zonal NEG, packets are delivered to the VM'snetwork interface that is in the subnetwork associated with the NEG.

Named ports

The backend service's named port attribute is only applicable to proxy loadbalancers using instance group backends. The named port defines the destinationport used for the TCP connection between the proxy (GFE or Envoy) and thebackend instance.

Named ports are configured as follows:

On each instance group backend, you must configure one or more named portsusing key-value pairs. The key represents a meaningful port name that youchoose, and the value represents the port number you assign to the name. Themapping of names to numbers is done individually for each instance groupbackend.
On the backend service, you specify a single named port using just the portname (--port-name).

On a per-instance group backend basis, the backend service translates the portname to a port number. When an instance group's named port matches the backendservice's --port-name, the backend service uses this port number forcommunication with the instance group's VMs.

For example, you might set the named port on an instance group with the namemy-service-name and the port 8888:

gcloud compute instance-groups unmanaged set-named-ports my-unmanaged-ig \ --named-ports=my-service-name:8888

Then you refer to the named port in the backend service configuration with the--port-name on the backend service set to my-service-name:

gcloud compute backend-services update my-backend-service \ --port-name=my-service-name

A backend service can use a different port number when communicating with VMsin different instance groups if each instance group specifies a different portnumber for the same port name.

The resolved port number used by the proxy load balancer's backend servicedoesn't need to match the port number used by the load balancer's forwardingrules. A proxy load balancer listens for TCP connections sent to the IP addressand destination port of its forwarding rules. Because the proxy opens a secondTCP connection to its backends, the second TCP connection's destination port canbe different.

Named ports are only applicable to instance group backends. Zonal NEGs withGCE_VM_IP_PORT endpoints, hybrid NEGs with NON_GCP_PRIVATE_IP_PORTendpoints, and internet NEGs define ports using a different mechanism, namely,on the endpoints themselves. Serverless NEGs reference Google services and PSCNEGs reference service attachments using abstractions that don't involvespecifying a destination port.

Internal passthrough Network Load Balancers and external passthrough Network Load Balancers don'tuse named ports. This is because they are pass-through load balancers that routeconnections directly to backends instead of creating new connections. Packetsare delivered to the backends preserving the destination IP address and port ofthe load balancer's forwarding rule.

To learn how to create named ports, see the following instructions:

Unmanaged instance groups: Working with namedports
Managed instance groups: Assigning named ports to managed instancegroups

Restrictions and guidance for instance groups

Keep the following restrictions and guidance in mind when you create instancegroups for your load balancers:

Don't put a VM in more than one load-balanced instance group. If a VM is amember of two or more unmanaged instance groups, or a member of one managedinstance group and one or more unmanaged instance groups, Google Cloudlimits you to only using one of those instance groups at a time as a backendfor a particular backend service.
If you need a VM to participate in multiple load balancers, you must use thesame instance group as a backend on each of the backend services.
For proxy load balancers, when you want to balance traffic to differentports, specify the required named ports on oneinstance group and have each backend service subscribe to a unique namedport .
You can use the same instance group as a backend for more than onebackend service. In this situation, the backends must use compatiblebalancing modes. Compatible means that the balancing modes must be thesame, or they must be a combination of CONNECTION and RATE.
Incompatible balancing mode combinations are as follows:
- CONNECTION with UTILIZATION
- RATE with UTILIZATION
Consider the following example:
- You have two backend services: external-https-backend-service for anexternal Application Load Balancer and internal-tcp-backend-service for aninternal passthrough Network Load Balancer.
- You're using an instance group called instance-group-a ininternal-tcp-backend-service.
- In internal-tcp-backend-service, you must apply the CONNECTIONbalancing mode because internal passthrough Network Load Balancers only support the CONNECTIONbalancing mode.
- You can also use instance-group-a in external-https-backend-service ifyou apply the RATE balancing mode in external-https-backend-service.
- You cannot also use instance-group-a inexternal-https-backend-service with the UTILIZATION balancing mode.
To change the balancing mode for an instance group serving as a backend formultiple backend services:
- Remove the instance group from all backend services except for one.
- Change the balancing mode for the backend on the one remaining backend service.
- Re-add the instance group as a backend to the remaining backend services,if they support the new balancing mode.
If your instance group is associated with several backend services,each backend service can reference the same named port or a differentnamed port on the instance group.
We recommend not adding an autoscaled managed instance group to more thanone backend service. Doing so might cause unpredictable and unnecessaryscaling of instances in the group, especially if you use the HTTP LoadBalancing Utilization autoscaling metric.
- While not recommended, this scenario might work if the autoscalingmetric is either CPU Utilization or a Cloud Monitoring Metric thatis unrelated to the load balancer's serving capacity. Using one of theseautoscaling metrics might prevent erratic scaling.

Zonal network endpoint groups

Network endpoints represent services by their IP address or an IP address andport combination, rather than referring to a VM in an instance group. A networkendpoint group (NEG) is a logical grouping of network endpoints.

Zonal network endpoint groups (NEGs) are zonalresources that represent collections of either IP addresses or IPaddress and port combinations for Google Cloud resources within a singlesubnet.

A backend service that uses zonal NEGs as its backendsdistributes traffic among applications or containers running within VMs.

There are two types of network endpoints available for zonal NEGs:

GCE_VM_IP endpoints (supported only with internal passthrough Network Load Balancers and backendservice-based external passthrough Network Load Balancers).
GCE_VM_IP_PORT endpoints.

To see which products support zonal NEG backends,see Table: Backend services and supported backendtypes.

For details, see Zonal NEGsoverview.

Internet network endpoint groups

Internet NEGs are resources that define external backends.An external backend is a backend that is hosted within on-premisesinfrastructure or on infrastructure provided by third parties.

An internet NEG is a combination of a hostname or an IP address, plus anoptional port. There are two types of network endpoints available for internetNEGs: INTERNET_FQDN_PORTand INTERNET_IP_PORT.

Internet NEGs are available in two scopes: global and regional. To see whichproducts support internet NEG backends in each scope, see Table: Backendservices and supported backend types.

Serverless network endpoint groups

A network endpoint group (NEG) specifies a group of backend endpoints for a loadbalancer. A serverless NEG is a backend that points to aCloud Run, App Engine,Cloud Functions, orAPI Gatewayservice.

A serverless NEG can represent one of the following:

A Cloud Run service or a group of services.
A Cloud Functions function or a group of functions.
An App Engine app (Standard or Flex), a specific service within an app,a specific version of an app, or a group of services.
An API Gateway that provides access to your services through aREST APIconsistent across all services, regardless of service implementation.This capability is in Preview.

To set up a serverless NEG for serverless applications that share a URLpattern, you use a URLmask. A URL maskis a template of your URL schema (for example, example.com/<service>). Theserverless NEG will use this template to extract the <service> name from theincoming request's URL and route the request to the matchingCloud Run, Cloud Functions, or App Engineservice with the same name.

To see which load balancers support serverless NEG backends,see Table: Backend services and supported backendtypes.

For more information about serverless NEGs, see the Serverless network endpointgroups overview.

Service bindings

A service binding is a backend that establishes a connection between abackend service in Traffic Director and a service registered inService Directory. A backend service can reference severalservice bindings. A backend service with a service binding cannot referenceany other type of backend.

Mixed backends

The following usage considerations apply when you add different types ofbackends to a single backend service:

A single backend service cannot simultaneously use both instancegroups and zonal NEGs.
You can use a combination of different types of instance groups on the samebackend service. For example, a single backend service can reference acombination of both managed and unmanaged instance groups. For completeinformation about which backends are compatible with which backend services,see the table in the previous section.
With certain proxy load balancers, you can use a combination of zonal NEGs(with GCE_VM_IP_PORT endpoints)and hybrid connectivity NEGs (with NON_GCP_PRIVATE_IP_PORTendpoints) to configure hybrid load balancing.To see which load balancers have this capability, refer Table: Backendservices and supported backend types.

Protocol to the backends

When you create a backend service, you must specify the protocol used tocommunicate with the backends. You can specify only one protocol per backendservice — you cannot specify a secondary protocol to use as a fallback.

Which protocols are valid depends on the type of load balancer or whether youare using Traffic Director.

**Table:** Protocol to the backends
Product	Backend service protocol options
Application Load Balancer	HTTP, HTTPS, HTTP/2
Proxy Network Load Balancer	TCP or SSL The regional proxy Network Load Balancers support only TCP.
Passthrough Network Load Balancer	TCP, UDP, or UNSPECIFIED
Traffic Director	HTTP, HTTPS, HTTP/2, gRPC, TCP

Changing a backend service's protocol makes the backends inaccessible throughload balancers for a few minutes.

IP address selection policy

This field is applicable to global external Application Load Balancersand global external proxy Network Load Balancers with load balancing scheme EXTERNAL_MANAGED.You must use the IP address selection policy to specify the traffic type that issent from the GFE to your backends.

When you select the IP address selection policy, ensure that your backendssupport the selected traffic type. For more information,see Table: Backend services and supported backendtypes.

IP address selection policy is used when you want to migrate your load balancerbackend service to support a different traffic type.For more information, see Choose a workflow for IPv4 to IPv6 migration.

You can specify the following values for the IP address selection policy:

IP address selection policy Description

Only IPv4 Only send IPv4 traffic to the backends of the backend service, regardless of traffic from the client to the GFE. Only IPv4 health checks are used to check the health of the backends.

IP address selection policy	Description
Only IPv4	Only send IPv4 traffic to the backends of the backend service, regardless of traffic from the client to the GFE. Only IPv4 health checks are used to check the health of the backends.
Prefer IPv6	Prioritize the backend's IPv6 connection over the IPv4 connection (provided there is a healthy backend with IPv6 addresses). The health checks periodically monitor the backends' IPv6 and IPv4 connections. The GFE first attempts the IPv6 connection; if the IPv6 connection is broken or slow, the GFE uses happy eyeballs to fall back and connect to IPv4. Even if one of the IPv6 or IPv4 connections is unhealthy, the backend is still treated as healthy, and both connections can be tried by the GFE, with happy eyeballs ultimately selecting which one to use.
Only IPv6	Only send IPv6 traffic to the backends of the backend service, regardless of traffic from the client to the proxy. Only IPv6 health checks are used to check the health of the backends. There is no validation to check if the backend traffic type matches the IP address selection policy. For example, if you have IPV4 backends and select `Only IPv6` as the IP address selection policy, you won't observe configuration errors but the traffic won't flow to your backends.

Prefer IPv6

Prioritize the backend's IPv6 connection over the IPv4 connection (provided there is a healthy backend with IPv6 addresses).

The health checks periodically monitor the backends' IPv6 and IPv4 connections. The GFE first attempts the IPv6 connection; if the IPv6 connection is broken or slow, the GFE uses happy eyeballs to fall back and connect to IPv4.

Even if one of the IPv6 or IPv4 connections is unhealthy, the backend is still treated as healthy, and both connections can be tried by the GFE, with happy eyeballs ultimately selecting which one to use.

Only IPv6

Only send IPv6 traffic to the backends of the backend service, regardless of traffic from the client to the proxy. Only IPv6 health checks are used to check the health of the backends.

There is no validation to check if the backend traffic type matches the IP address selection policy. For example, if you have IPV4 backends and select Only IPv6 as the IP address selection policy, you won't observe configuration errors but the traffic won't flow to your backends.

Encryption between the load balancer and backends

For information about encryption between the load balancer and backends, seeEncryption to thebackends.

Traffic distribution

The values of the following fields in the backend services resource determinesome aspects of the backend's behavior:

A balancing mode defines how the load balancer measures backend readinessfor new requests or connections.
A target capacity defines a target maximum number of connections,a target maximum rate, or target maximum CPU utilization.
A capacity scaler adjusts overall available capacitywithout modifying the target capacity.

Balancing mode

The balancing mode determines whether the backends of a load balancer orTraffic Director can handle additional traffic or are fully loaded.Google Cloud has three balancing modes:

CONNECTION: Determines how the load is spread based on the total number ofconnections that the backend can handle.
RATE: The target maximum number of requests (queries) per second (RPS,QPS). The target maximum RPS/QPS can be exceeded if all backends are at orabove capacity.
UTILIZATION: Determines how the load is spread based on the utilization ofinstances in an instance group.

Balancing modes available for each load balancer

You set the balancing mode when you add a backend to the backend service. Thebalancing modes available to a load balancer depend on the type of load balancerand the type of backends.

The passthrough Network Load Balancers require the CONNECTION balancing mode but don'tsupport setting any target capacity.

The Application Load Balancers support either RATE or UTILIZATIONbalancing modes for instance group backends, RATE balancing mode for zonalNEGs with GCE_VM_IP_PORT endpoints, and RATE balancing mode for hybrid NEGs(NON_GCP_PRIVATE_IP_PORT endpoints). For any other type of supported backend,balancing mode must be omitted.

For classic Application Load Balancers, a region is selected based on the location ofthe client and whether the region has available capacity, based on the loadbalancing mode's target capacity. Then, within a region, the balancing mode'starget capacity is used to compute proportions for how many requests should goto each backend in the region. Requests or connections are then distributedin a round robin fashion among instances or endpoints within the backend.
For global external Application Load Balancers, a region is selected based on the location ofthe client and whether the region has available capacity, based on the loadbalancing mode's target capacity. Within a region, the balancing mode's targetcapacity is used to compute proportions for how many requests should go toeach backend (instance group or NEG) in the region. You can use the serviceload balancing policy(serviceLbPolicy) and thepreferred backend setting to influence the selection of any specificbackends within a region. Furthermore, within each instance group or NEG,the load balancing policy (LocalityLbPolicy) determines how traffic isdistributed to instances or endpoints within the group.

For cross-region internal Application Load Balancers, regional external Application Load Balancers, and regional internal Application Load Balancers, the balancing mode'starget capacity is used to compute proportions for how many requests shouldgo to each backend (instance group or NEG) in the region. Within eachinstance group or NEG, the load balancing policy (LocalityLbPolicy)determines how traffic is distributed to instances or endpoints within thegroup. Only thecross-region internal Application Load Balancer support the use of the service load balancingpolicy (serviceLbPolicy) and thepreferred backend settings to influence the selection of any specificbackends within a region.

Proxy Network Load Balancers support either CONNECTION orUTILIZATION balancing modes for VM instance group backends, CONNECTIONbalancing mode for zonal NEGs with GCE_VM_IP_PORT endpoints, and CONNECTIONbalancing mode for hybrid NEGs (NON_GCP_PRIVATE_IP_PORT endpoints). For anyother type of supported backend, balancing mode must be omitted.

For global external proxy Network Load Balancers, a region is selected based on the location ofthe client and whether the region has available capacity, based on the loadbalancing mode's target capacity. Within a region, the balancing mode's targetcapacity is used to compute proportions for how many requests should go toeach backend (instance group or NEG) in the region. You can use the serviceload balancing policy(serviceLbPolicy) and thepreferred backend setting to influence the selection of any specificbackends within a region. Furthermore, within each instance group or NEG, theload balancing policy (LocalityLbPolicy) determines how traffic isdistributed to instances or endpoints within the group.
For cross-region internal proxy Network Load Balancers, the configured region is selected first.Within a region, the balancing mode's target capacity is used to computeproportions for how many requests should go toeach backend (instance group or NEG) in the region. You can use the serviceload balancing policy(serviceLbPolicy) and thepreferred backend setting to influence the selection of any specificbackends within a region. Furthermore, within each instance group or NEG, theload balancing policy (LocalityLbPolicy) determines how traffic isdistributed to instances or endpoints within the group.
For classic proxy Network Load Balancers, a region is selected based onthe location of the client and whether the region has available capacitybased on the load balancing mode's target capacity. Then, within a region, theload balancing mode's target capacity is used to compute proportions for howmany requests or connections should go to each backend (instance group or NEG)in the region. After the load balancer has selected a backend, requests orconnections are then distributed in a round robin fashion among VM instancesor network endpoints within each individual backend.

For regional external proxy Network Load Balancers and regional internal proxy Network Load Balancers, theload balancing mode's target capacity is used to compute proportions for howmany requests should go to each backend (instance group or NEG). Within eachinstance group or NEG, the load balancing policy (localityLbPolicy)determines how traffic is distributed to instances or endpoints within thegroup.

The following table summarizes the load balancing modes available for eachload balancer and backend combination.

**Table:** Balancing modes available for each load balancer
Load balancer	Backends	Balancing modes available
Application Load Balancer	Instance groups	`RATE` or `UTILIZATION`
	Zonal NEGs (`GCE_VM_IP_PORT` endpoints)	`RATE`
	Hybrid NEGs (`NON_GCP_PRIVATE_IP_PORT` endpoints)	`RATE`
Global external proxy Network Load Balancer Classic proxy Network Load Balancer Regional external proxy Network Load Balancer Regional internal proxy Network Load Balancer Cross-region internal proxy Network Load Balancer	Instance groups	`CONNECTION` or `UTILIZATION`
	Zonal NEGs (`GCE_VM_IP_PORT` endpoints)	`CONNECTION`
	Hybrid NEGs (`NON_GCP_PRIVATE_IP_PORT` endpoints)	`CONNECTION`
Passthrough Network Load Balancer	Instance groups	`CONNECTION`
Passthrough Network Load Balancer	Zonal NEGs (`GCE_VM_IP` endpoints)	`CONNECTION`

If the average utilization of all VMs that are associated with a backend serviceis less than 10%, Google Cloud might prefer specific zones. This canhappen when you use regional managed instance groups, zonal managed instancegroups in different zones, and zonal unmanaged instance groups. This zonalimbalance automatically resolves as more traffic is sent to the load balancer.

For more information, see gcloud compute backend-servicesadd-backend.

Target capacity

Each balancing mode has a corresponding target capacity, which defines one ofthe following target maximums:

Number of connections
Rate
CPU utilization

For every balancing mode, the target capacity is not a circuitbreaker. A load balancer can exceed the maximum under certain conditions, forexample, if all backend VMs or endpoints have reached the maximum.

Connection balancing mode

For CONNECTION balancing mode, the target capacity defines a targetmaximum number of open connections. Except for internal passthrough Network Load Balancersand external passthrough Network Load Balancers, you must use one of the following settings to specify atarget maximum number of connections:

max-connections-per-instance (per VM): Target averagenumber of connections for a single VM.
max-connections-per-endpoint (per endpoint in a zonal NEG): Target averagenumber of connections for a single endpoint.
max-connections (per zonal NEGs and for zonal instance groups):Target average number of connections for the whole NEG orinstance group. For regional managed instance groups, usemax-connections-per-instance instead.

The following table shows how the target capacity parameter defines thefollowing:

The target capacity for the whole backend
The expected target capacity for each instance or endpoint

**Table:** Target capacity for backends using the CONNECTION balancing mode
Backend type	Target capacity
	If you specify	Whole backend capacity	Expected per instance or per endpoint capacity
Instance group `N` instances, `H` healthy	`max-connections-per-instance=X`	`X × N`	`(X × N)/H`
Zonal NEG `N` endpoints, `H` healthy	`max-connections-per-endpoint=X`	`X × N`	`(X × N)/H`
Instance groups (except regional managed instance groups) `H` healthy instances	`max-connections=Y`	`Y`	`Y/H`

As illustrated, the max-connections-per-instance andmax-connections-per-endpoint settings are proxies for calculating atarget maximum number of connections for the whole VM instance group or wholezonal NEG:

In a VM instance group with N instances, settingmax-connections-per-instance=X has the same meaning as settingmax-connections=X × N.
In a zonal NEG with N endpoints, settingmax-connections-per-endpoint=X has the same meaning as settingmax-connections=X × N.

Rate balancing mode

For the RATE balancing mode, you must define the target capacity usingone of the following parameters:

max-rate-per-instance (per VM): Provide a target average HTTPrequest rate for a single VM.
max-rate-per-endpoint (per endpoint in a zonal NEG): Provide a targetaverage HTTP request rate for a single endpoint.
max-rate (per zonal NEGs and for zonal instance groups): Provide atarget average HTTP request rate for the whole NEG or instance group. Forregional managed instance groups, use max-rate-per-instance instead.

The following table shows how the target capacity parameter defines thefollowing:

The target capacity for the whole backend
The expected target capacity for each instance or endpoint

**Table:** Target capacity for backends using the RATE balancing mode
Backend type	Target capacity
	If you specify	Whole backend capacity	Expected per instance or per endpoint capacity
Instance group `N` instances, `H` healthy	`max-rate-per-instance=X`	`X × N`	`(X × N)/H`
zonal NEG `N` endpoints, `H` healthy	`max-rate-per-endpoint=X`	`X × N`	`(X × N)/H`
Instance groups (except regional managed instance groups) `H` healthy instances	`max-rate=Y`	`Y`	`Y/H`

As illustrated, the max-rate-per-instance and max-rate-per-endpoint settingsare proxies for calculating a target maximum rate of HTTP requests for the wholeinstance group or whole zonal NEG:

In an instance group with N instances, setting max-rate-per-instance=Xhas the same meaning as setting max-rate=X × N.
In a zonal NEG with N endpoints, setting max-rate-per-endpoint=X hasthe same meaning as setting max-rate=X × N.

Utilization balancing mode

The UTILIZATION balancing mode has no mandatory target capacity. You have anumber of options that depend on the type of backend, as summarized inthe table in the following section.

The max-utilization target capacity can only be specified per instancegroup and cannot be applied to a particular VM in the group.

The UTILIZATION balancing mode has no mandatory target capacity. When you usethe Google Cloud console to add a backend instance group to a backend service, theGoogle Cloud console sets the value of max-utilization to 0.8 (80%) if theUTILIZATION balancing mode is selected. In addition to max-utilization, theUTILIZATION balancing mode supports more complex target capacities, assummarized in the table in the following section.

Changing the balancing mode of a load balancer

For some load balancers or load balancer configurations, you cannot change thebalancing mode because the backend service has only one possible balancing mode.For others, depending on the backend used, you can change the balancing modebecause more than one mode is available to those backend services.

To see which balancing modes are supported for each load balancer, refer theTable: Balancing modes available for each load balancer

Balancing modes and target capacity settings

This table summarizes all possible balancing modes for a given load balancer andtype of backend. It also shows the available or required capacity settings thatyou must specify with the balancing mode.

**Table:** Target capacity specification for balancing modes
Load balancer	Type of backend	Balancing mode	Target capacity
Application Load Balancer Traffic Director	Instance group	`RATE`	You must specify one of the following: `max-rate` per zonal instance group `max-rate-per-instance` (zonal or regional instance groups)
	Instance group	`UTILIZATION`	You can optionally specify one of the following: (1) `max-utilization` (2) `max-rate` per zonal instance group (3) `max-rate-per-instance` (zonal or regional instance groups) (1) and (2) together (1) and (3) together
	Zonal NEG (`GCP_VM_IP_PORT`)	`RATE`	You must specify one of the following: `max-rate` per zonal NEG `max-rate-per-endpoint`
	Hybrid NEG (`NON_GCP_PRIVATE_IP_PORT`)	`RATE`	You must specify one of the following: `max-rate` per hybrid NEG `max-rate-per-endpoint`
Global external proxy Network Load Balancer Classic proxy Network Load Balancer regional external proxy Network Load Balancer Regional internal proxy Network Load Balancer Cross-region internal proxy Network Load Balancer	Instance group	`CONNECTION`	You must specify one of the following: `max-connections` per zonal instance group `max-connections-per-instance` (zonal or regional instance groups)
	Instance group	`UTILIZATION`	You can optionally specify one of the following: (1) `max-utilization` (2) `max-connections` per zonal instance group (3) `max-connections-per-instance` (zonal or regional instance groups) (1) and (2) together (1) and (3) together
	Zonal NEG (`GCP_VM_IP_PORT`)	`CONNECTION`	You must specify one of the following: `max-connections` per zonal NEG `max-connections-per-endpoint`
	Hybrid NEG (`NON_GCP_PRIVATE_IP_PORT`)	`CONNECTION`	You must specify one of the following: `max-connections` per hybrid NEG `max-connections-per-endpoint`
Passthrough Network Load Balancer	Instance group	`CONNECTION`	You cannot specify a target maximum number of connections.
Passthrough Network Load Balancer	Zonal NEGs (`GCP_VM_IP`)	`CONNECTION`	You cannot specify a target maximum number of connections.

Capacity scaler

Use capacity scaler to scale the target capacity (max utilization,max rate, or max connections) without changing the target capacity.

For the Google Cloud reference documentation, see the following:

Service load balancing policy

A service load balancing policy (serviceLbPolicy) is a resource associatedwith the load balancer's backendservice. It lets you customize theparameters that influence how traffic is distributed within the backendsassociated with a backend service:

Customize the load balancing algorithm used to determine how traffic isdistributed among regions or zones.
Enable auto-capacity draining so that the load balancer can quickly draintraffic from unhealthy backends.

Additionally, you can designate specific backends as preferred backends. Thesebackends must be used to capacity (that is, the target capacity specified bythe backend's balancing mode) before requests are sent to the remainingbackends.

To learn more, see Advanced load balancing optimizations with a service loadbalancing policy.

Traffic Director and traffic distribution

Traffic Director also uses backend service resources. Specifically,Traffic Director uses backend services whose load balancing scheme isINTERNAL_SELF_MANAGED. For an internal self-managed backend service, trafficdistribution is based on the combination of a load balancing mode and aload balancing policy. The backend service directs traffic to a backendaccording to the backend's balancing mode. Then Traffic Director distributestraffic according to a load balancing policy.

Internal self-managed backend services support the following balancing modes:

UTILIZATION, if all the backends are instance groups
RATE, if all the backends are either instance groups or zonal NEGs

If you choose RATE balancing mode, you must specify a maximum rate, maximumrate per instance, or maximum rate per endpoint.

For more information about Traffic Director, seeTraffic Director concepts.

Backend subsetting

Backend subsetting is an optional feature that improves performance andscalability by assigning a subset of backends to each of the proxy instances.

Backend subsetting is supported for the following:

Regional internal Application Load Balancer
Internal passthrough Network Load Balancer

Backend subsetting for regional internal Application Load Balancers

The cross-region internal Application Load Balancer doesn't support backend subsetting.

For regional internal Application Load Balancers, backend subsetting automatically assigns only asubset of the backends within the regional backend service to each proxyinstance. By default, each proxy instance opens connections to allthe backends within a backend service. When the number of proxy instances andthe backends are both large, opening connections to all the backends can lead toperformance issues.

By enabling subsetting, each proxy only opens connections to a subsetof the backends, reducing the number of connections which are kept open to eachbackend. Reducing the number of simultaneously open connections to each backendcan improve performance for both the backends and the proxies.

The following diagram shows a load balancer with two proxies. Without backendsubsetting, traffic from both proxies is distributed to all the backends in thebackend service 1. With backend subsetting enabled, traffic from each proxy isdistributed to a subset of the backends. Traffic from proxy 1 is distributed tobackends 1 and 2, and traffic from proxy 2 is distributed to backends 3 and 4.

You can additionally refine the load balancing traffic to the backends by setting thelocalityLbPolicy policy.For more information, see Traffic policies.

To read about setting up backend subsetting for internal Application Load Balancers, seeConfigure backend subsetting.

Caveats related to backend subsetting for internal Application Load Balancer

Although backend subsetting is designed to ensure that all backend instancesremain well utilized, it can introduce some bias in the amount of traffic thateach backend receives. Setting the localityLbPolicy to LEAST_REQUEST isrecommended for backend services that are sensitive to the balance of backendload.
Enabling and then disabling subsetting breaks existing connections.
Backend subsetting requires that the session affinity is NONE (a 5-tuple hash).Other session affinity options can only be used if backend subsetting isdisabled. The default values of the --subsetting-policy and--session-affinity flags are both NONE, and only one of them at a timecan be set to a different value.

Backend subsetting for internal passthrough Network Load Balancer

Backend subsetting for internal passthrough Network Load Balancers lets you scale your internal passthrough Network Load Balancerto support a larger number of backend VM instances per internal backendservice.

For information about how subsetting affects this limit, see the "Backendservices" section of Load balancing resource quotas andlimits.

By default, subsetting is disabled, which limits the backend service todistributing to up to 250 backend instances or endpoints. If your backendservice needs to support more than 250 backends, you can enable subsetting. Whensubsetting is enabled, a subset of backend instances is selected for each clientconnection.

The following diagram shows a scaled-down model of the difference between thesetwo modes of operation.

Without subsetting, the complete set of healthy backends is better utilized, andnew client connections are distributed among all healthy backends accordingto traffic distribution. Subsettingimposes load balancing restrictions but allows the load balancer to support morethan 250 backends.

For configuration instructions, seeSubsetting.

Caveats related to backend subsetting for internal passthrough Network Load Balancer

When subsetting is enabled, not all backends will receive traffic from a givensender even when the number of backends is small.
For the maximum number of backend instances when subsetting is enabled, seethe quotas page .
Only 5-tuple session affinityis supported with subsetting.
Packet Mirroring is not supported with subsetting.
Enabling and then disabling subsetting breaks existing connections.
If on-premises clients need for to access an internal passthrough Network Load Balancer, subsetting cansubstantially reduce the number of backends that receive connections from youron-premises clients. This is because the region of the Cloud VPNtunnel or Cloud Interconnect VLAN attachment determines the subset ofthe load balancer's backends. All Cloud VPN andCloud Interconnect endpoints in a specific region use the samesubset. Different subsets are used in different regions.

Backend subsetting pricing

There is no charge for using backend subsetting.For more information, see All networking pricing.

Session affinity

Session affinity lets you control how the load balancer selects backendsfor new connections in a predictable way as long as the number of healthybackends remains constant. This is useful for applications that need multiplerequests from a given user to be directed to the same backend or endpoint. Suchapplications usually include stateful servers used by ads serving, games, orservices with heavy internal caching.

Google Cloud load balancers provide session affinity on a best-effortbasis. Factors such as changing backend health check states, adding or removingbackends, or changes to backend fullness, as measured by the balancing mode, canbreak session affinity.

Load balance with session affinity works well when there is a reasonably largedistribution of unique connections. Reasonably large means at least severaltimes the number of backends. Testing a load balancer with a small number ofconnections won't result in an accurate representation of the distribution ofclient connections among backends.

By default, all Google Cloud load balancers select backends by using afive-tuple hash (--session-affinity=NONE), as follows:

Packet's source IP address
Packet's source port (if present in the packet's header)
Packet's destination IP address
Packet's destination port (if present in the packet's header)
Packet's protocol

For pass-through load balancers, new connections are distributed to healthy backend instances or endpoints (in the active pool, if a failover policy is configured). You can control the following:

Whether established connections persist on unhealthy backends. For details,see Connection persistence on unhealthy backends in the internal passthrough Network Load Balancerdocumentation andConnection persistence on unhealthy backends in the backend service-basedexternal external passthrough Network Load Balancer documentation.
Whether established connections persist during failover and failback, if afailover policy is configured. For details, see Connection draining onfailover and failback in the internal passthrough Network Load Balancerdocumentationand Connection draining on failover and failback in the backendservice-based external external passthrough Network Load Balancerdocumentation.
How long established connections can persist when removing a backend fromthe load balancer. For details, see Enabling connectiondraining.

For proxy-based load balancers, as long as the number of healthy backendinstances or endpoints remains constant, and as long as the previously-selectedbackend instance or endpoint is not at capacity, subsequent requests orconnections go to the same backend VM or endpoint. The target capacity of thebalancing mode determines when the backend is at capacity.

The following table shows the session affinity options supported for eachproduct:

**Table:** Supported session affinity settings
Product	Session affinity options
Global external Application Load Balancer Regional external Application Load Balancer	None (`NONE`) Client IP (`CLIENT_IP`) Generated cookie (`GENERATED_COOKIE`) Header field (`HEADER_FIELD`) HTTP cookie (`HTTP_COOKIE`) Also note: The effective default value of the load balancing locality policy(`localityLbPolicy`) changes according to your sessionaffinity settings. If session affinity is not configured—that is, ifsession affinity remains at the default value of `NONE`—thenthe default value for `localityLbPolicy` is `ROUND_ROBIN`.If session affinity is set to a value other than `NONE`, then thedefault value for `localityLbPolicy` is `MAGLEV`. For the global external Application Load Balancer, don't configure session affinity if you're using weighted traffic splitting. If you do, the weighted traffic splitting configuration takes precedence.
Classic Application Load Balancer	None (`NONE`) Client IP (`CLIENT_IP`) Generated cookie (`GENERATED_COOKIE`)
Cross-region internal Application Load Balancer Regional internal Application Load Balancer	None (`NONE`) Client IP (`CLIENT_IP`) Generated cookie (`GENERATED_COOKIE`) Header field (`HEADER_FIELD`) HTTP cookie (`HTTP_COOKIE`) Also note: The effective default value of the load balancing locality policy(`localityLbPolicy`) changes according to your sessionaffinity settings. If session affinity is not configured—that is, ifsession affinity remains at the default value of `NONE`—thenthe default value for `localityLbPolicy` is `ROUND_ROBIN`.If session affinity is set to a value other than `NONE`, then thedefault value for `localityLbPolicy` is `MAGLEV`. For the internal Application Load Balancer, don't configure session affinity if you're using weighted traffic splitting. If you do, the weighted traffic splitting configuration takes precedence.
Internal passthrough Network Load Balancer	None (`NONE`) CLIENT IP, no destination (`CLIENT_IP_NO_DESTINATION`) Client IP, Destination IP (`CLIENT_IP`) Client IP, Destination IP, Protocol (`CLIENT_IP_PROTO`) Client IP, Client Port, Destination IP, Destination Port, Protocol (`CLIENT_IP_PORT_PROTO`) For specific information about the internal passthrough Network Load Balancer and session affinity, see the Internal passthrough Network Load Balancer overview.
External passthrough Network Load Balancer^*	None (`NONE`) Client IP, Destination IP (`CLIENT_IP`) Client IP, Destination IP, Protocol (`CLIENT_IP_PROTO`) Client IP, Client Port, Destination IP, Destination Port, Protocol (`CLIENT_IP_PORT_PROTO`) For specific information about the external passthrough Network Load Balancer and session affinity, see the External TCP/UDP External passthrough Network Load Balancer overview.
Global external proxy Network Load Balancer Classic proxy Network Load Balancer regional external proxy Network Load Balancer Regional internal proxy Network Load Balancer Cross-region internal proxy Network Load Balancer	None (`NONE`) Client IP (`CLIENT_IP`)
Traffic Director	None (`NONE`) Client IP (`CLIENT_IP`) Generated cookie (`GENERATED_COOKIE`) (HTTP protocols only) Header field (`HEADER_FIELD`) (HTTP protocols only) HTTP cookie (`HTTP_COOKIE`) (HTTP protocols only)

^* This table documents session affinities supported by backendservice-basedexternal passthrough Network Load Balancers.Target pool-based external passthrough Network Load Balancersdon't use backend services. Instead, you set session affinity forexternal passthrough Network Load Balancers through the sessionAffinity parameter inTarget Pools.

Keep the following in mind when configuring session affinity:

Don't rely on session affinity for authentication or security purposes.Session affinity is designed to break whenever the number of serving andhealthy backends changes. Activities that result in breaking session affinityinclude:
- Adding backend instance groups or NEGs to the backend service
- Removing backend instance groups or NEGs from the backend service
- Adding instances to an existing backend instance group (which happensautomatically when you enable autoscaling with managed instance groups)
- Removing instances from an existing backend instance group (which happensautomatically when you enable autoscaling with managed instance groups)
- Adding endpoints to an existing backend NEG
- Removing endpoints from an existing backend NEG
- When a healthy backend fails its health check and becomes unhealthy
- When an unhealthy backend passes its health check and becomes healthy
- For pass-through load balancers: during failover and failback, if a failoverpolicy is configured
- For proxy load balancers: when a backend is at or above capacity
Using a session affinity other than None with the UTILIZATIONbalancing mode is not recommended. This is because changes in the instanceutilization can cause the load balancing service to direct new requests orconnections to backend VMs that are less full. This breaks session affinity.Instead, use either the RATE or CONNECTION balancing mode to reduce thechance of breaking session affinity. For more details, see Losingsession affinity.
For external and internal HTTP(S) load balancers, session affinity might bebroken when the intended endpoint or instance exceeds its balancing mode'starget maximum. Consider the following example:
- A load balancer has one NEG and three endpoints.
- Each endpoint has a target capacity of 1 RPS.
- The balancing mode is RATE.
- At the moment, each endpoint is processing 1.1, 0.8, and 1.6 RPS,respectively.
- When an HTTP request with affinity for the last endpoint arrives on the loadbalancer, session affinity claims the endpoint that is processing at 1.6RPS.
- The new request might go to the middle endpoint with 0.8 RPS.
The default values of the --session-affinity and --subsetting-policyflags are both NONE, and only one of them at a time can be set to adifferent value.

The following sections discuss the different types of session affinity.

Client IP, no destination affinity

Client IP, no destination affinity (CLIENT_IP_NO_DESTINATION) directs requestsfrom the same client source IP address to the same backend instance.

When you use client IP, no destination affinity, keep the following in mind:

Client IP, no destination affinity is a one-tuple hash consisting of theclient's source IP address.
If a client moves from one network to another, its IP address changes,resulting in broken affinity.

Client IP, no destination affinity is only an option for internal passthrough Network Load Balancers.

Client IP affinity

Client IP affinity (CLIENT_IP) directs requests from the same client IPaddress to the same backend instance. Client IP affinity is an option for everyGoogle Cloud load balancer that uses backend services.

When you use client IP affinity, keep the following in mind:

Client IP affinity is a two-tuple hash consisting of the client's IP addressand the IP address of the load balancer's forwarding rule that the clientcontacts.
The client IP address as seen by the load balancer might not be theoriginating client if it is behind NAT or makes requests througha proxy. Requests made through NAT or a proxy use the IP address of theNAT router or proxy as the client IP address. This can cause incoming trafficto clump unnecessarily onto the same backend instances.
If a client moves from one network to another, its IP address changes,resulting in broken affinity.

To learn which products support client IP affinity, refer the Table: Supportedsession affinity settings.

Generated cookie affinity

When you set generated cookie affinity, the load balancer issues a cookie on thefirst request. For each subsequent request with the same cookie, the loadbalancer directs the request to the same backend VM or endpoint.

For global external Application Load Balancers, the cookie is namedGCLB.

For Traffic Director, regional external Application Load Balancers, and internal Application Load Balancers, thecookie is named GCILB.

Cookie-based affinity can more accurately identify a client to a load balancer,compared to client IP-based affinity. For example:

With cookie-based affinity, the load balancer can uniquely identify two ormore client systems that share the same source IP address. Using clientIP-based affinity, the load balancer treats all connections from the samesource IP address as if they were from the same client system.
If a client changes its IP address, cookie-based affinity lets the loadbalancer recognize subsequent connections from that client instead oftreating the connection as new. An example of when a client changes its IPaddress is when a mobile device moves from one network another.

When a load balancer creates a cookie for generated cookie-based affinity, itsets the path attribute of the cookie to /. If the URL map's path matcherhas multiple backend service for a hostname, all backend services share thesame session cookie.

The lifetime of the HTTP cookie generated by the load balancer isconfigurable. You can set it to 0 (default), which means the cookie is onlya session cookie. Or you can set the lifetime of the cookie to a value from1 to 86400 seconds (24 hours) inclusive.

To learn which products support generated cookie affinity, refer the Table:Supported session affinity settings.

Header field affinity

The following load balancers use header field affinity:

Traffic Director
Cross-region internal Application Load Balancer
Global external Application Load Balancer
Regional external Application Load Balancer
Regional internal Application Load Balancer

Header field affinity is supported when the followingconditions are true:

The load balancing locality policy is RING_HASH or MAGLEV.
The backend service's consistentHash specifies the name of the HTTP header(httpHeaderName).

To learn which products support header field affinity, refer the Table:Supported session affinity settings.

HTTP cookie affinity

Traffic Director, global external Application Load Balancer, regional external Application Load Balancer,global external proxy Network Load Balancer, regional external proxy Network Load Balancer,cross-region internal Application Load Balancer, and regional internal Application Load Balancer can use HTTP cookieaffinity when both of the following are true:

The load balancing locality policy is RING_HASH or MAGLEV.
The backend service's consistent hash specifies the name of the HTTP cookie.

HTTP cookie affinity routes requests to backend VMs or endpoints in a NEG basedon the HTTP cookie named in the HTTP_COOKIE flag. If the client does not providethe cookie, the proxy generates the cookie and returns it to the client in aSet-Cookie header.

To learn which products support HTTP cookie IP affinity, refer the Table:Supported session affinity settings.

Losing session affinity

Regardless of the type of affinity chosen, a client can lose affinity with abackend in the following situations:

If the backend instance group or zonal NEG runs out of capacity, as defined bythe balancing mode's target capacity. In this situation, Google Clouddirects traffic to a different backend instance group or zonal NEG, which mightbe in a different zone. You can mitigate this by ensuring that you specify thecorrect target capacity for each backend based on your own testing.
Autoscaling adds instances to, or removes instances from, a managed instancegroup. When this happens, the number of instances in the instance groupchanges, so the backend service recomputes hashes for session affinity. Youcan mitigate this by ensuring that the minimum size of the managed instancegroup can handle a typical load. Autoscaling is then only performed duringunexpected increases in load.
If a backend VM or endpoint in a NEG fails health checks, the load balancerdirects traffic to a different healthy backend. Refer to the documentation foreach Google Cloud load balancer for details about how the load balancerbehaves when all of its backends fail health checks.
When the UTILIZATION balancing mode is in effect for backend instance groups,session affinity breaks because of changes in backend utilization. You canmitigate this by using the RATE or CONNECTION balancing mode, whichever issupported by the load balancer's type.

When you use external Application Load Balancers orexternal proxy Network Load Balancers, keep the following additional points in mind:

If the routing path from a client on the internet to Google changes betweenrequests or connections, a different Google Front End (GFE) might be selectedas the proxy. This can break session affinity.
When you use the UTILIZATION balancing mode — especially without adefined target maximum target capacity — session affinity is likely tobreak when traffic to the load balancer is low. Switch to using RATE orCONNECTION balancing mode, as supported by your chosen load balancer.

Backend service timeout

Most Google Cloud load balancers have a backend service timeout. Thedefault value is 30 seconds. The full range of timeout values allowed is1 - 2,147,483,647 seconds.

For external Application Load Balancers and internal Application Load Balancers using the HTTP, HTTPS, orHTTP/2 protocol, the backend service timeout is a request and response timeoutfor HTTP(S) traffic.
For more details about the backend service timeout for each load balancer, seethe following:
- For global external Application Load Balancers and regional external Application Load Balancers, seeTimeouts and retries.
- For internal Application Load Balancers, seeTimeouts and retries.
For external proxy Network Load Balancers, the timeout is an idle timeout.To allow more or less time before the connection is deleted, change thetimeout value. This idle timeout is also used for WebSocket connections.
For internal passthrough Network Load Balancers and external passthrough Network Load Balancers, you can set the value ofthe backend service timeout using gcloud or the API, but the value isignored. Backend service timeout has no meaning for these pass-throughload balancers.

For Traffic Director, the backend service timeout field (specified usingtimeoutSec) is not supported with proxyless gRPC services.For such services, configure the backend service timeout using themaxStreamDuration field. This is because gRPC does not support thesemantics of timeoutSec that specifies the amount of time to wait for abackend to return a full response after the request is sent. gRPC's timeoutspecifies the amount of time to wait from the beginning of the stream untilthe response has been completely processed, including all retries.

Health checks

Each backend service whose backends are instance groups or zonal NEGs musthave an associated health check. Backendservices using a serverless NEG or a global internet NEG as a backend must notreference a health check.

When you create a load balancer using the Google Cloud console, you can create thehealth check, if it is required, when you create the load balancer, or you canreference an existing health check.

When you create a backend service using either instance group or zonal NEGbackends using the Google Cloud CLI or the API, you must reference anexisting health check. Refer to the load balancerguide in the HealthChecks Overview for details about the type and scope of health check required.

For more information, read the following documents:

Health checks overview
Creating health checks
gcloud health check page
REST API health check page

Additional features enabled on the backend service resource

The following optional features are supported by some backend services.

Cloud CDN

Cloud CDN uses Google's global edge network to serve content closer tousers, which accelerates your websites and applications. Cloud CDN isenabled on backend services used byglobal external Application Load Balancers. The load balancerprovides the frontend IP addresses and ports that receive requests, and thebackends that respond to the requests.

For more details, see the Cloud CDN documentation.

Cloud CDN is incompatible with IAP. They can't beenabled on the same backend service.

Google Cloud Armor

If you use one of the following load balancers, you can add additionalprotection to your applications by enabling Google Cloud Armor on the backendservice during load balancer creation:

Global external Application Load Balancer
Classic Application Load Balancer
Global external proxy Network Load Balancer
Classic proxy Network Load Balancer

If you use the Google Cloud console, you can do one of the following:

Select an existingGoogle Cloud Armor security policy.
Accept the configuration of a default Google Cloud Armor rate-limitingsecurity policy with a customizable name, request count, interval, key, andrate limiting parameters. If you use Google Cloud Armor with an upstreamproxy service, such as a CDN provider, Enforce_on_key should be set as anXFF IP address.
Choose to opt out of Google Cloud Armor protection by selecting None.

IAP

IAP lets you establish a centralauthorization layer for applications accessed by HTTPS, so you can use anapplication-level access control model instead of relying on network-levelfirewalls. IAP is supported by certainApplication Load Balancers.

IAP is incompatible with Cloud CDN. They can't beenabled on the same backend service.

Traffic management features

The following features are supported only for some products:

Load balancing policies(localityLbPolicy)except for ROUND_ROBIN
Circuit breaking except for maxRequests field
Outlier detection

These features are supported by the following load balancers:

Global external Application Load Balancer (circuit breaking is not supported)

Regional external Application Load Balancer

Cross-region internal Application Load Balancer

Regional internal Application Load Balancer

Traffic Director (but not supported with proxyless gRPC services)

API and `gcloud` reference

For more information about the properties of the backend service resource,see the following references:

* Global backend service APIresource* Regional backend service APIresource

gcloud compute backend-servicespage, for both global and regional backend services

What's next

For related documentation and information about how backend services are used inload balancing, review the following:

Create custom headers
Create an external Application Load Balancer
External Application Load Balancer overview
Enable connection draining
Encryption in transit in Google Cloud

For related videos:

How to configure backend services for global external Application Load Balancers

Backend services overview | Load Balancing | Google Cloud (2024)