Google Cloud Platform Autoscaling and Load Balancing for Performance (2024)

By: Sergiu Onet | Updated: 2024-02-14 | Related: Google Cloud


Problem

We built and hosted our services in Google Cloud Platform (GCP) and want to make sure we can handle increased traffic, scale the apps, and rest assured if one instance goes down. This tip provides a few examples of how to achieve availability and performance targets in GCP.

Solution

GCP has load balancing and autoscaling options that can be used to protect from unexpected instance failures, handle heavy traffic, scale our apps, and distribute traffic closer to our users.

Managed Instance Groups

Let's talk about autoscaling in the context of managed instance groups. A managed instance group is a group of identical VM instances provisioned from a template and controlled as a single unit. You can place it in a single zone or across a region, resize the group, add more instances if you need more compute power, and reduce the number of instances if the load decreases. The autoscaler can scale on CPU utilization, Cloud Monitoring metrics, load balancing serving capacity, or a queue-based workload.

A load balancer can be added in front of a managed instance group to distribute traffic across all the instances in the group. If an instance goes down, it is recreated with the same configuration from the template.
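As a minimal sketch, assuming the gcloud CLI is authenticated and configured for your project, the steps above might look like this. The names (web-template, web-mig), machine type, image, zone, and thresholds are all placeholders to adapt to your environment:

```shell
# Create an instance template (the blueprint for the identical VMs).
gcloud compute instance-templates create web-template \
    --machine-type=e2-medium \
    --image-family=debian-12 \
    --image-project=debian-cloud

# Create a managed instance group of 2 VMs in a single zone.
gcloud compute instance-groups managed create web-mig \
    --template=web-template \
    --size=2 \
    --zone=us-central1-a

# Autoscale between 2 and 10 instances, targeting 60% CPU utilization.
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.60
```

If an instance in the group fails its health check, the group recreates it from web-template automatically.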

Load Balancer

We can choose from several load balancer options based on traffic type, global/regional scope, and internal/external exposure.


HTTP(S) Load Balancer (Application Load Balancer)

It's a Layer 7 load balancer that routes traffic to backend services based on message content, such as the URL path or HTTP(S) headers. It's a global resource with a single anycast IP address; it routes traffic to the instance group closest to the user and can also route traffic to specific instances only. A global forwarding rule sends incoming requests to a target HTTP proxy, which uses a URL map to decide which backend should serve each request. This means you can send specific requests to dedicated backends: requests that look like www.yoursite.com/video can go to a backend configured for that purpose. The backend service is smart enough to direct the request to a specific backend based on the backends' instance health, capacity, and zone.
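The forwarding rule → target proxy → URL map → backend service chain described above can be sketched with gcloud as follows. This assumes two hypothetical backend services, one of them dedicated to /video traffic; all resource names are placeholders:

```shell
# A health check and two backend services (one general, one for /video).
gcloud compute health-checks create http basic-check --port=80
gcloud compute backend-services create web-backend \
    --protocol=HTTP --health-checks=basic-check --global
gcloud compute backend-services create video-backend \
    --protocol=HTTP --health-checks=basic-check --global

# URL map: default traffic to web-backend, /video/* to video-backend.
gcloud compute url-maps create web-map --default-service=web-backend
gcloud compute url-maps add-path-matcher web-map \
    --path-matcher-name=video-paths \
    --default-service=web-backend \
    --path-rules="/video/*=video-backend"

# The target proxy and a global forwarding rule tie it all together.
gcloud compute target-http-proxies create web-proxy --url-map=web-map
gcloud compute forwarding-rules create web-rule \
    --global --target-http-proxy=web-proxy --ports=80
```

After this, requests to /video/* land on video-backend while everything else goes to web-backend.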

Health checks ensure that requests are sent only to healthy instances, where they are distributed in a round-robin fashion.

Session affinity can be configured to send requests from the same client to the same VM instance. There's also a timeout setting, 30 seconds by default, after which the backend service stops waiting for the backend and marks the request as a failure.
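Both settings live on the backend service. As a sketch, assuming a hypothetical global backend service named web-backend:

```shell
# Route requests from the same client IP to the same backend VM,
# and wait up to 30 seconds for a backend response (the default).
gcloud compute backend-services update web-backend \
    --global \
    --session-affinity=CLIENT_IP \
    --timeout=30s
```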

The backends have a balancing mode setting based on CPU utilization or requests per second. This tells the balancer when a backend is at full load; if it is, the balancer directs the request to the closest region that can still handle requests.

The capacity setting lets you limit your backend instances to a fraction of their balancing-mode target, whether that is CPU usage, number of connections, or a maximum request rate.
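Balancing mode and capacity are set when a backend (such as a managed instance group) is attached to the backend service. A sketch, with hypothetical names web-backend and web-mig:

```shell
# Attach the instance group as a backend, balancing on request rate:
# at most 100 requests/second per instance, and cap the backend at
# 80% of that capacity via the capacity scaler.
gcloud compute backend-services add-backend web-backend \
    --global \
    --instance-group=web-mig \
    --instance-group-zone=us-central1-a \
    --balancing-mode=RATE \
    --max-rate-per-instance=100 \
    --capacity-scaler=0.8
```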

For HTTPS, you need to create SSL certificates for the load balancer's target proxy. Each certificate is referenced by an SSL certificate resource that contains the certificate information.
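For example, assuming you already have a certificate and private key on disk and a hypothetical URL map named web-map, the HTTPS side might be wired up like this:

```shell
# Upload the certificate and key as an SSL certificate resource.
gcloud compute ssl-certificates create www-cert \
    --certificate=www.crt --private-key=www.key --global

# Attach it to a target HTTPS proxy in front of the URL map,
# and expose it through a global forwarding rule on port 443.
gcloud compute target-https-proxies create web-https-proxy \
    --url-map=web-map --ssl-certificates=www-cert
gcloud compute forwarding-rules create web-https-rule \
    --global --target-https-proxy=web-https-proxy --ports=443
```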

This load balancer can be deployed as External or Internal:

  • External Load Balancer distributes traffic originating from the internet to our VPC. It's globally distributed, has advanced traffic management features like traffic splitting and mirroring, protects from DDoS attacks with the help of Cloud Armor, and can use Cloud CDN for cached responses.


source - https://cloud.google.com/static/load-balancing/images/external-application-load-balancer.svg

  • Internal Load Balancer handles traffic originating from clients using an internal IP address inside the VPC. It can be a regional or cross-region resource. In a regional configuration, the load balancer supports regional backend resources; in a cross-region setup, you can use global backend resources, and clients from any GCP region can send traffic to the balancer.


source - https://cloud.google.com/static/load-balancing/images/internal-application-load-balancer.svg

Proxy Network Load Balancer (TCP/SSL Proxy)

This is a Layer 4 load balancer used for TCP traffic. It can terminate SSL and distributes traffic to the closest backend, which can be located in a VPC or in another cloud. It supports port remapping, where the port specified in the load balancer's forwarding rule can differ from the one used to connect to the backends, and it can relay the client's source IP and port to the backends.
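A rough sketch of a global external TCP proxy setup; resource names are placeholders, and note that global TCP proxy forwarding rules only accept a specific set of ports (110 is one of them):

```shell
# Backend service for TCP traffic with a TCP health check.
gcloud compute health-checks create tcp tcp-check --port=700
gcloud compute backend-services create tcp-backend \
    --protocol=TCP --health-checks=tcp-check --global

# Target TCP proxy; PROXY_V1 relays the client's IP/port to backends.
gcloud compute target-tcp-proxies create tcp-lb-proxy \
    --backend-service=tcp-backend \
    --proxy-header=PROXY_V1

# Port remapping: clients connect on 110, backends can listen elsewhere.
gcloud compute forwarding-rules create tcp-lb-rule \
    --global --target-tcp-proxy=tcp-lb-proxy --ports=110
```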

This load balancer can be deployed as External or Internal:

  • External Load Balancer can be configured as global, regional, or classic. The global configuration is in Preview, is implemented on Google's front-end infrastructure (GFE), and is available in the Premium tier. The regional configuration is based on the open-source Envoy proxy and is available in the Standard tier. The classic configuration is similar to the global one but can also be configured as a regional resource in the Standard tier.

    This load balancer forwards traffic from the internet to backend resources in a VPC, on-premises, or in another cloud. It supports IPv4 and IPv6, integrates with Cloud Armor for additional protection, and can offload TLS at the load balancer layer using the SSL proxy. You can also set SSL policies to better control the negotiation with clients.

  • Internal Load Balancer is based on the open-source Envoy proxy and forwards traffic to backends on GCP, on-premises, or in other clouds. The balancer is available in the Premium tier and is accessible through an internal IP in the region where your VPC is; only clients from that VPC have access. If global access is enabled, the balancer is available to clients from any region. The balancer can also be reached from other networks if you use Cloud VPN, Interconnect, or Network Peering to connect the networks. There is also a cross-region setup that distributes traffic to backends located in different regions; this configuration protects against a region failure, gives higher availability, and can forward traffic to the closest backend.

Find out more about the architecture of an internal proxy balancer: Internal proxy Network Load Balancer overview.

Passthrough Network Load Balancers

This is also a Layer 4 balancer. It's a regional resource, so traffic is distributed to backends in the same region as the balancer. It does not act like a proxy; instead, the backends receive packets from the balancer that retain the original source and destination IP, protocol, and ports. Another difference is that responses from the backend resources go straight to the clients, not back through the balancer. If you need to preserve the client source IP, or to balance TCP, UDP, and/or SSL traffic on ports not supported by other balancers, this could be a good choice.

This balancer can be configured as External or Internal:

  • Using an External setup, clients from the internet or GCP VMs with an external IP can connect to the balancer, and Cloud Armor can be used for extra protection. This external balancer can be configured as backend service based or target pool based.

    Target pool is the legacy configuration and can be used with forwarding rules for TCP and UDP traffic only. This setup creates a group of instances that must be in the same region. You can have only one health check per pool, and each project can have a maximum of 50 target pools.

    In a backend service setup, traffic is distributed to the backend service's instances, health checks can use SSL and HTTPS, autoscaling managed instance groups are supported, traffic can be sent to specific backends, and you can load balance to GKE.

  • Internal passthrough balancer distributes traffic to VM instances in the same VPC region. Regional backends are supported, so you can scale inside a region if needed. You can also enable global access to accept connections from any region, load balance to GKE, and set the balancer as the next hop (gateway) so it forwards traffic along its final route.
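The next-hop use case can be sketched with a custom route. This assumes a hypothetical VPC named my-vpc and an existing internal passthrough load balancer with a forwarding rule named my-ilb-forwarding-rule (for example, one fronting a pool of NAT or firewall VMs):

```shell
# Send all egress traffic for 0.0.0.0/0 through the internal passthrough
# load balancer's forwarding rule, making the balancer the next hop.
gcloud compute routes create ilb-next-hop-route \
    --network=my-vpc \
    --destination-range=0.0.0.0/0 \
    --next-hop-ilb=my-ilb-forwarding-rule \
    --next-hop-ilb-region=us-central1
```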


source - https://cloud.google.com/static/load-balancing/images/passthrough-network-load-balancer.svg

Next Steps
  • There are many choices for load balancing. They share some similarities and features but also have some limitations. Carefully choose the right option for your needs based on traffic type, external/internal, global/regional, etc.
  • Google has a good resource that provides an overview of all these aspects to help you make the right choice: Choose a load balancer




About the author

Sergiu Onet has been a SQL Server Database Administrator for the past 10 years and counting, focusing on automation and making SQL Server run faster.

This author pledges the content of this article is based on professional experience and not AI generated.
