Load Balancing

How Google Cloud Load Balancing distributes traffic across backends using global anycast IPs, regional proxies, and passthrough load balancers for TCP, UDP, and HTTP/S workloads.

Note: This page is current as of May 2026. Load balancer features, pricing, and supported protocols can change. Verify time-sensitive details in the official Google Cloud documentation before making production decisions.

What Is Google Cloud Load Balancing?

Google Cloud Load Balancing is a fully distributed, software-defined managed service that spreads traffic across backends. Unlike traditional load balancers that run on specific VMs or appliances you manage, Google’s load balancers run on Google’s own infrastructure, so there is no pre-warming, no fixed-capacity bottleneck, and no instances to manage.

The key architectural difference is the frontend: a global external load balancer exposes a single anycast IP address and routes traffic from that one IP to the nearest healthy backend region over Google’s private backbone. You get global reachability without managing DNS-based failover or multiple regional IPs. Regional and internal load balancers use regional IP addresses instead.

Underlying technologies:

Technology	Used By	Purpose
Google Front Ends (GFEs)	Classic Application LB, classic proxy Network LB, and the edge layer for global external LBs	Distributed proxies at Google’s edge PoPs worldwide; TLS termination, request routing
Envoy-based GFEs	Global external Application LB, global external proxy Network LB	Global edge proxying with newer traffic-management capabilities
Envoy proxies	Regional and cross-region Application LBs, regional and cross-region proxy Network LBs	Managed proxy data plane for non-classic regional and internal LBs
Maglev + Andromeda	External passthrough Network LB	Distributed passthrough L4 load balancing
Andromeda	Internal passthrough Network LB	Google’s SDN virtualization stack for internal L4 load balancing

In practice: You do not choose these technologies directly. You choose a load balancer type, and Google runs it on the corresponding data plane. What matters is understanding which load balancer type fits your traffic pattern.

Load Balancer Types at a Glance

Google Cloud organizes load balancers into three families:

Family	Layer	Protocols	Connection Handling
Application Load Balancers	L7 (HTTP/HTTPS)	HTTP, HTTPS, HTTP/2, gRPC	Proxy: terminates connections at LB, opens new ones to backends
Proxy Network Load Balancers	L4 (TCP/SSL)	TCP, with optional SSL offload	Proxy: terminates connections at LB, opens new ones to backends
Passthrough Network Load Balancers	L4 (TCP/UDP/other IP protocols)	External: TCP, UDP, ESP, GRE, ICMP, ICMPv6; internal also supports SCTP and AH	Passthrough: packets pass through with original IPs preserved; responses use Direct Server Return

Each family has external (internet-facing) and internal variants, and global or regional scope depending on the specific load balancer.

All Load Balancer Types

Application Load Balancers (Layer 7)

Load Balancer	Scope	Traffic	Network Tier	Load-Balancing Scheme
Global external Application LB	Global	HTTP/HTTPS	Premium	`EXTERNAL_MANAGED`
Regional external Application LB	Regional	HTTP/HTTPS	Premium or Standard	`EXTERNAL_MANAGED`
Regional internal Application LB	Regional	HTTP/HTTPS	Premium	`INTERNAL_MANAGED`
Cross-region internal Application LB	Cross-region	HTTP/HTTPS	Premium	`INTERNAL_MANAGED`

What they do: Route HTTP/S requests based on URL paths, host headers, query parameters, and other HTTP attributes. Support URL maps, traffic splitting, header-based routing, URL rewrites/redirects, Cloud CDN, and Google Cloud Armor.

When to use: Web applications, REST/gRPC APIs, microservices, any workload where you need content-based routing.

Proxy Network Load Balancers (Layer 4)

Load Balancer	Scope	Traffic	Network Tier	Load-Balancing Scheme
Global external proxy Network LB	Global	TCP, optional SSL offload	Premium	`EXTERNAL_MANAGED`
Regional external proxy Network LB	Regional	TCP	Premium or Standard	`EXTERNAL_MANAGED`
Regional internal proxy Network LB	Regional	TCP	Premium	`INTERNAL_MANAGED`
Cross-region internal proxy Network LB	Cross-region	TCP	Premium	`INTERNAL_MANAGED`

What they do: Terminate TCP connections at the load balancer and open new connections to backends. Among the variants listed above, only the global external proxy Network LB supports SSL offload (TLS termination at the LB); the regional external and internal variants handle TCP only.

When to use: Non-HTTP TCP workloads (SMTP, database connections, custom protocols), or when you want SSL offload without managing TLS at the backend.

Passthrough Network Load Balancers (Layer 4)

Load Balancer	Scope	Traffic	Network Tier	Load-Balancing Scheme
External passthrough Network LB	Regional	TCP, UDP, ESP, GRE, ICMP, ICMPv6	Premium or Standard	`EXTERNAL`
Internal passthrough Network LB	Regional	TCP, UDP, ICMP, ICMPv6, SCTP, ESP, AH, GRE	Premium	`INTERNAL`

What they do: Pass traffic through directly to backends without terminating connections. The original source and destination IP addresses are preserved. Responses go directly from the backend to the client (Direct Server Return), bypassing the load balancer.

When to use: When you need UDP load balancing, when backends must see the real client IP, for legacy protocols that do not work through a proxy, or when you need the highest throughput with minimal latency overhead.

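Key Insight: Passthrough Network LBs are the only option for general UDP load balancing. Application LBs and proxy Network LBs handle TCP-based protocols. The one exception is HTTP/3 (QUIC) on external Application LBs that support it: it runs over UDP but is still HTTP traffic.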

Classic vs New Generation

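Google has been migrating from the classic load balancers (load-balancing scheme: EXTERNAL) to the new generation (scheme: EXTERNAL_MANAGED). The classic Application Load Balancer and classic proxy Network Load Balancer are the previous generation. External passthrough Network LBs also use the EXTERNAL scheme, but they are a current product and are not part of this migration.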

Aspect	Classic (`EXTERNAL`)	New Generation (`EXTERNAL_MANAGED`)
Data plane	GFE	Envoy-based GFE (Application) or Envoy (Network)
Advanced traffic management	Limited	Full support (traffic splitting, URL rewrites, header-based routing, request mirroring)
GKE integration	GKE Ingress controller	GKE Gateway controller
URL map size limit	64 KB	Up to 1 MB (with new quota system)
Header handling	Case preserved	All header keys lowercase
All backends unhealthy	Returns HTTP 502	Returns HTTP 503
Standard Network Tier	Supported	Global variant does not support Standard; regional variant does
New features	No longer receiving advanced features	Actively developed

Migration: Google provides a guided migration from classic to the global external Application LB (GA since May 2025). You can test with a percentage of traffic before fully committing, and rollback is available within 90 days.

Tip: Use the new generation (Global external Application LB or Regional external Application LB) for all new deployments. The classic LB still works but does not receive new features.

Internal vs External (Internet-Facing)

Aspect	External	Internal
Traffic source	Internet	VPC network, peered networks, Cloud Interconnect/VPN
IP address	Externally routable (public)	Internal (private)
Use case	Public-facing web apps, APIs, CDN origins	Internal microservices, database tiers, private API endpoints
Available for	All LB families	Application LBs, proxy Network LBs, passthrough Network LBs
Global access	Always globally reachable	Regional by default; enable global access on the forwarding rule to accept traffic from any region

External load balancers sit at the edge and accept traffic from the internet. They use Google’s global anycast IP (global scope) or a regional external IP (regional scope).

Internal load balancers are reachable only from within your VPC or connected networks. They use internal IP addresses from your subnet. Use them for multi-tier architectures where middle tiers (application servers connecting to databases, internal APIs between services) need load balancing without exposing endpoints to the internet.
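
As a minimal sketch of the global access setting mentioned in the table above, the function below wraps the `--allow-global-access` flag on a regional internal forwarding rule. The rule name and region are hypothetical placeholders, and the `GCLOUD` override is only an illustrative convenience for previewing the command. Whether the flag can be changed after creation may depend on the load balancer type, so treat this as an assumption to verify.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the command instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

# Let clients in any region reach a regional internal forwarding rule.
# Arguments: forwarding rule name, region (both hypothetical placeholders).
enable_global_access() {
  rule="$1"
  region="$2"
  "$GCLOUD" compute forwarding-rules update "$rule" \
    --region="$region" \
    --allow-global-access
}
```

Example call: `enable_global_access my-internal-rule us-central1`.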

Global vs Regional

Aspect	Global	Regional
Backend scope	Multiple regions	Single region
Frontend IP	Single anycast IP worldwide	Regional IP (still globally reachable)
Routing	Routes to nearest healthy region via Google’s backbone	Routes to backends within one region
Network Tier	Premium only	Premium or Standard
Failover	Automatic cross-region failover	No cross-region failover
Available for	Global external Application LB, Global external proxy Network LB, cross-region internal LBs	Regional external Application LB, Regional internal Application LB, all regional proxy Network LBs, all passthrough Network LBs

How Global Routing Works

flowchart LR
    Client["Client"] -->|request to anycast IP| PoP["Google Edge PoP<br/>(nearest to client)"]
    PoP -->|Google private backbone| Region1["Region A<br/>(primary)"]
    PoP -->|failover path| Region2["Region B<br/>(secondary)"]
    Region1 --> Backend1["Backends"]
    Region2 --> Backend2["Backends"]

1. The client sends a request to the anycast IP.
2. BGP routing directs the request to the nearest Google edge Point of Presence (PoP).
3. At the PoP, the load balancer picks the best backend region based on health, capacity, and proximity.
4. Traffic travels over Google’s private backbone (Premium Tier) to the selected region.
5. If the nearest region is unhealthy or at capacity, traffic fails over to the next best region.
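
The region choice in the steps above can be illustrated with a toy model. This is not Google's algorithm: it simply walks a list of regions ordered nearest first and returns the first one that is healthy and below capacity. The `region:healthy:utilization` input format and the region names are assumptions made for the example.

```shell
#!/bin/sh
# Toy model of the region choice made at the edge PoP.
# Each argument is "region:healthy:utilization_percent", listed nearest first.
pick_region() {
  for entry in "$@"; do
    region="${entry%%:*}"
    rest="${entry#*:}"
    healthy="${rest%%:*}"
    util="${rest#*:}"
    # Skip regions that are unhealthy or already at capacity.
    if [ "$healthy" = "yes" ] && [ "$util" -lt 100 ]; then
      echo "$region"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Nearest region is unhealthy, second is full, so the third is chosen.
pick_region "europe-west1:no:40" "us-east1:yes:100" "us-central1:yes:65"
```

Here the call prints `us-central1`, mirroring the failover path in the diagram.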

Tip: Use global load balancers for internet-facing workloads that need cross-region redundancy. Use regional load balancers when all your backends are in one region, or when you need Standard Tier networking.

Choosing the Right Load Balancer

flowchart TD
    START["What protocol?"] --> HTTP{HTTP or HTTPS?}
    HTTP -->|Yes| SOURCE{"Traffic source?"}
    HTTP -->|No| TCPUDP{TCP, UDP, or other?}
    SOURCE -->|Internet| SCOPE{"Multi-region<br/>backends?"}
    SOURCE -->|Internal/VPC| INT_HTTP{"Multi-region<br/>internal backends?"}
    SCOPE -->|Yes| GAPP["Global external<br/>Application LB"]
    SCOPE -->|No| RAPP["Regional external<br/>Application LB"]
    INT_HTTP -->|Yes| XRINT["Cross-region internal<br/>Application LB"]
    INT_HTTP -->|No| RINT["Regional internal<br/>Application LB"]
    TCPUDP -->|UDP needed| PASSTHROUGH["Passthrough Network LB"]
    TCPUDP -->|TCP only| NEEDPROXY{"Need SSL offload<br/>or proxy features?"}
    NEEDPROXY -->|Yes| PNLB_SOURCE{"Traffic source?"}
    NEEDPROXY -->|No| PASSTHROUGH
    PNLB_SOURCE -->|Internet| PNLB_SCOPE{"Multi-region<br/>backends?"}
    PNLB_SOURCE -->|Internal/VPC| PNLB_INT{"Multi-region<br/>internal backends?"}
    PNLB_SCOPE -->|Yes| GPNL["Global external<br/>proxy Network LB"]
    PNLB_SCOPE -->|No| RPNL["Regional external<br/>proxy Network LB"]
    PNLB_INT -->|Yes| XRPNL["Cross-region internal<br/>proxy Network LB"]
    PNLB_INT -->|No| RPNL_INT["Regional internal<br/>proxy Network LB"]

Quick Decision Table

I Need…	Use This
HTTP/S load balancing with URL-based routing	Global external Application LB (multi-region) or Regional external Application LB (single-region)
Internal HTTP/S load balancing for microservices	Regional internal Application LB
Load balancing for a public TCP service (not HTTP)	Global external proxy Network LB (multi-region) or Regional external proxy Network LB (single-region)
UDP load balancing	External passthrough Network LB (internet-facing) or Internal passthrough Network LB (VPC traffic)
Backends must see real client IP	Passthrough Network LB
Internal TCP load balancing (e.g., database tier)	Regional internal proxy Network LB or Internal passthrough Network LB
Multi-region internal load balancing	Cross-region internal Application LB (HTTP) or Cross-region internal proxy Network LB (TCP)
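
The flowchart and table above can be condensed into a small helper. This is a sketch of the same decision logic, not an official tool; the argument names and accepted values are assumptions chosen for the example.

```shell
#!/bin/sh
# choose_lb PROTOCOL ORIGIN MULTI_REGION [NEED_PROXY]
#   PROTOCOL:     http | tcp | udp
#   ORIGIN:       internet | internal
#   MULTI_REGION: yes | no
#   NEED_PROXY:   yes | no  (TCP only: SSL offload or other proxy features)
choose_lb() {
  protocol="$1"; origin="$2"; multi_region="$3"; need_proxy="${4:-no}"
  case "$protocol" in
    http) family="Application LB" ;;
    tcp)
      if [ "$need_proxy" = "yes" ]; then
        family="proxy Network LB"
      else
        echo "Passthrough Network LB"; return 0
      fi ;;
    *) echo "Passthrough Network LB"; return 0 ;;   # UDP and other IP protocols
  esac
  case "$origin:$multi_region" in
    internet:yes) echo "Global external $family" ;;
    internet:no)  echo "Regional external $family" ;;
    internal:yes) echo "Cross-region internal $family" ;;
    *)            echo "Regional internal $family" ;;
  esac
}

choose_lb http internet yes      # Global external Application LB
choose_lb tcp internal no yes    # Regional internal proxy Network LB
choose_lb udp internet no        # Passthrough Network LB
```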

Backend Types

Load balancers distribute traffic to backends. Google Cloud supports several backend types:

Backend Type	Description	Available For
Instance groups	MIGs (recommended) or unmanaged instance groups	All load balancers
Network Endpoint Groups (NEGs)	Granular endpoints by IP/port	All load balancers
Zonal NEGs (`GCE_VM_IP_PORT`)	Individual VM endpoints or GKE Pods	All proxy-based LBs
Serverless NEGs	Cloud Run, Cloud Run functions, App Engine, API Gateway	Application LBs
Internet NEGs	Public external endpoints outside Google Cloud	Global/classic external Application LBs; regional internet NEGs also work with regional Application LBs and regional proxy Network LBs
Hybrid NEGs (`NON_GCP_PRIVATE_IP_PORT`)	On-premises or other-cloud endpoints via Interconnect/VPN	Application LBs and proxy Network LBs
Cloud Storage buckets	Static content serving	External Application LBs (global and classic)

Tip: Use MIGs for VM-based workloads. Use zonal NEGs (GCE_VM_IP_PORT) for GKE pod-level load balancing. Use serverless NEGs for Cloud Run and Cloud Run functions.

Health Checks

Health checks determine which backends receive new connections. Google Cloud sends probes from multiple systems (typically 5-10 probers simultaneously) for reliability.

Protocols

Protocol	Success Criteria
HTTP, HTTPS, HTTP/2	HTTP 200 response
TCP	Successful TCP connection
SSL	TCP connection + TLS handshake
gRPC	OK status and SERVING state

Key Parameters

Parameter	Default	Purpose
Check interval	5 seconds	Time between probes from a single prober
Timeout	5 seconds	Time to wait for a response
Healthy threshold	2	Sequential successful probes to mark a backend healthy
Unhealthy threshold	2	Sequential failed probes to mark a backend unhealthy
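
To see how these parameters interact, the sketch below estimates how long a single prober could take to mark a backend unhealthy. It assumes the backend fails just after a successful probe and that every failed probe runs to its full timeout, which gives roughly `unhealthy_threshold * check_interval + timeout`. This is a back-of-the-envelope model, not a documented guarantee.

```shell
#!/bin/sh
# Rough worst-case seconds until one prober marks a backend unhealthy,
# assuming each failed probe waits for the full timeout.
# Arguments: check interval (s), timeout (s), unhealthy threshold.
detection_seconds() {
  interval="$1"; timeout="$2"; threshold="$3"
  echo $(( threshold * interval + timeout ))
}

detection_seconds 5 5 2    # defaults from the table: prints 15
detection_seconds 10 5 3   # prints 35
```

Lowering the interval or the threshold shortens detection time at the cost of more probe traffic and more sensitivity to brief hiccups.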

LB Health Checks vs Autohealing Health Checks

Aspect	LB Health Check	Autohealing Health Check
Purpose	Stop routing traffic to unhealthy backends	Delete and recreate unhealthy VMs
Aggressiveness	Aggressive (quick detection, redirect traffic)	Conservative (avoid unnecessary VM recreation)
Recommended check interval	5-10 seconds	30-60 seconds
Recommended unhealthy threshold	2-3 consecutive failures	5-10 consecutive failures

Key Insight: Use separate health checks for load balancing and autohealing. LB checks should catch struggling instances fast and stop sending traffic. Autohealing checks should be more patient because recreating a VM is disruptive. See Instance Groups for autohealing configuration.
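
A minimal sketch of that separation, using values inside the recommended ranges from the table above. The health check names, port 80, and the `/healthz` path are assumptions for illustration, and `GCLOUD` can be overridden to preview the commands.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the commands instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

create_health_checks() {
  # Aggressive check for the load balancer: detect trouble quickly.
  "$GCLOUD" compute health-checks create http lb-health-check \
    --port=80 --request-path=/healthz \
    --check-interval=5s --timeout=5s \
    --healthy-threshold=2 --unhealthy-threshold=2

  # Conservative check for MIG autohealing: avoid needless VM recreation.
  "$GCLOUD" compute health-checks create http autoheal-health-check \
    --port=80 --request-path=/healthz \
    --check-interval=30s --timeout=10s \
    --healthy-threshold=2 --unhealthy-threshold=5
}
```

Attach `lb-health-check` to the backend service and `autoheal-health-check` to the MIG's autohealing policy.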

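Firewall requirement: You must create ingress allow rules for the health check probe source IP ranges: 35.191.0.0/16 and 130.211.0.0/22 for most load balancers, plus IPv6 ranges depending on LB type. External passthrough Network LBs probe from a different set of ranges, so check the health check documentation for the exact list. Without these rules, health checks fail and the LB marks all backends unhealthy.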
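
A sketch of such a rule for the two IPv4 ranges named above. The rule name prefix, network, target tag, and port are hypothetical placeholders, and `GCLOUD` can be overridden to preview the command.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the command instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

# Allow health check probes to reach tagged backends on one TCP port.
# Arguments: VPC network name, backend network tag, backend port.
allow_health_check_probes() {
  network="$1"; tag="$2"; port="$3"
  "$GCLOUD" compute firewall-rules create "fw-allow-health-check-$network" \
    --network="$network" \
    --direction=INGRESS \
    --action=ALLOW \
    --rules="tcp:$port" \
    --source-ranges=35.191.0.0/16,130.211.0.0/22 \
    --target-tags="$tag"
}
```

Example call: `allow_health_check_probes default lb-backend 80`. Scoping the rule with a target tag keeps the probe ranges from reaching VMs that are not load balancer backends.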

SSL/TLS Termination

For proxy-based load balancers, TLS can be terminated at the load balancer: Application LBs do this with an HTTPS frontend, and proxy Network LBs do it where SSL offload is supported. The connection to backends can be encrypted (HTTPS/SSL) or plaintext (HTTP/TCP). For passthrough Network LBs, TLS is never terminated at the LB; connections pass through to backends unchanged.

Certificate Options

Type	Management	Validation Level	Cost
Google-managed	Google obtains, provisions, and renews automatically	DV only	No charge
Self-managed	You provide and renew certificates	DV, OV, or EV	No charge for the cert; $0.45 per 1M connections for RSA-3072 and RSA-4096 keys

Google-managed certificates support wildcards via Certificate Manager with DNS authorization. Google handles renewal automatically.

Certificate selection: When multiple certificates are configured, the LB uses SNI (Server Name Indication) from the client’s TLS ClientHello to pick the best-matching certificate based on longest suffix match, preferring ECDSA over RSA.

Configuration Methods

Method	Capacity	Best For
Compute Engine SSL certificates on target proxy	Up to 15 certificates	Simple multi-domain setups
Certificate Manager certificate map on target proxy	Thousands to millions of entries	Large multi-domain deployments
Certificate Manager certificates directly on target proxy	Up to 100 certificates	Medium-scale setups

URL Maps and Routing

URL maps configure how Application Load Balancers route HTTP/S requests. They define the mapping between incoming requests and backend services.

Components

URL Map
├── Default backend service (required — catches everything unmatched)
├── Host rule: "api.example.com" → Path Matcher A
├── Host rule: "*.example.net" → Path Matcher B
│
├── Path Matcher A
│   ├── /v1/users/* → user-service backend
│   ├── /v1/orders/* → order-service backend
│   └── /* → api-default backend
│
└── Path Matcher B
    ├── /images/* → image-backend bucket
    └── /* → web-default backend
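
As a rough illustration, Path Matcher A from the tree above could be added to an existing URL map along these lines. The matcher name `path-matcher-a` is an assumption, the backend services are assumed to exist already, and `GCLOUD` can be overridden to preview the command.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the command instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

# Add host rule api.example.com plus its path rules to a global URL map.
# Argument: URL map name.
add_api_path_matcher() {
  url_map="$1"
  "$GCLOUD" compute url-maps add-path-matcher "$url_map" \
    --global \
    --path-matcher-name=path-matcher-a \
    --new-hosts=api.example.com \
    --default-service=api-default \
    --backend-service-path-rules="/v1/users/*=user-service,/v1/orders/*=order-service"
}
```

The `--default-service` flag plays the role of the `/*` catch-all in the tree: any path on `api.example.com` that matches no rule goes to `api-default`.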

Routing Types

Type	Based On	Example
Path-based	URL path	`/api/` → API backend, `/static/` → bucket
Host-based	Hostname	`api.example.com` → API service, `www.example.com` → web frontend
Header-based	HTTP request headers	Route based on custom headers like `X-Version: v2`
Query parameter	URL query string	`?env=canary` → canary backend

Advanced Traffic Management (new generation LBs only)

Feature	What It Does
Traffic splitting	Distribute a percentage of traffic to different backends (e.g., 90/10 canary)
URL rewrites	Change the path or host before sending to the backend
URL redirects	Return a redirect response without hitting backends
Header manipulation	Add, remove, or modify request/response headers
Request mirroring	Send a copy of traffic to a secondary backend for testing
Fault injection	Inject errors or delays for resilience testing
Retry policies	Configure automatic retries on backend failures
Timeouts	Set custom connect and request timeouts
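
Traffic splitting is configured in the URL map itself. The sketch below writes a URL map definition with a weighted canary split to a file; the backend service names `web-stable` and `web-canary`, the URL map name, and the project ID are hypothetical, and the exact resource path format should be checked against the URL map reference.

```shell
#!/bin/sh
# Write a URL map definition that splits traffic between two backend services.
# Arguments: output file, project ID, percentage for the stable backend.
write_canary_url_map() {
  out_file="$1"; project="$2"; stable_weight="$3"
  canary_weight=$(( 100 - stable_weight ))
  base="projects/$project/global/backendServices"
  cat > "$out_file" <<EOF
name: my-url-map
defaultService: $base/web-stable
hostRules:
- hosts:
  - '*'
  pathMatcher: canary-matcher
pathMatchers:
- name: canary-matcher
  defaultRouteAction:
    weightedBackendServices:
    - backendService: $base/web-stable
      weight: $stable_weight
    - backendService: $base/web-canary
      weight: $canary_weight
EOF
}

# Example: 90/10 split, then apply it with
#   gcloud compute url-maps import my-url-map --source=canary.yaml --global
write_canary_url_map canary.yaml my-project 90
```

Shifting more traffic to the canary is then a matter of regenerating the file with a lower stable weight and importing it again.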

Session Affinity

Session affinity sends requests from the same client to the same backend. This is useful for stateful applications that maintain local session data.

Type	How It Works	Works With
None (default)	Requests distributed by load balancing algorithm	All LBs
Client IP	Same client IP → same backend	All LBs that support affinity
Generated cookie	LB sets `Set-Cookie` on first request; cookie routes subsequent requests	Application LBs (HTTP/HTTPS only)
Header field	Uses a configurable HTTP header value for affinity	Application LBs

Note: Session affinity is best-effort. It can break when backends change health status, when autoscaling adds or removes instances, or when capacity limits are reached. Design your application to handle affinity loss gracefully.
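
For example, generated-cookie affinity on a global backend service might be enabled as sketched here. The backend service name and TTL are placeholders, and `GCLOUD` can be overridden to preview the command.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the command instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

# Turn on generated-cookie session affinity for a global backend service.
# Arguments: backend service name, cookie lifetime in seconds.
enable_cookie_affinity() {
  backend_service="$1"; ttl_seconds="$2"
  "$GCLOUD" compute backend-services update "$backend_service" \
    --global \
    --session-affinity=GENERATED_COOKIE \
    --affinity-cookie-ttl="$ttl_seconds"
}
```

Example call: `enable_cookie_affinity my-backend-service 1800`. Setting `--session-affinity=NONE` reverts to the default distribution.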

Pricing

Because the service is fully managed, there are no load balancer appliances or VMs for you to run or pay for directly. Billing is instead based on load balancing resources and traffic, such as forwarding rules, data processing, and proxy instances for some load balancer types.

External Load Balancers

Item	Price (USD)
Forwarding rules (first 5 per project)	$0.025/hour total (flat rate covering up to 5 rules)
Additional forwarding rules (6th onward)	$0.01/hour each
Data processed by LB (inbound + outbound)	Commonly $0.008/GiB; check the regional pricing table

Internal Application Load Balancers

Item	Price (USD)
Per Envoy proxy instance	$0.025/hour
Minimum proxy instances per forwarding rule	3 ($0.075/hour minimum)
Data processed by LB	Commonly $0.008/GiB; check the regional pricing table

Each internal Application LB proxy handles up to 18 MB/s bandwidth, 600 HTTP (or 150 HTTPS) new connections/sec, 3,000 active connections, and 1,400 requests/sec. Google adds proxy instances automatically as traffic grows.
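
The per-proxy figures above allow a rough sizing estimate. The sketch below takes the largest of the per-dimension requirements, applies the three-proxy minimum, and multiplies by the $0.025/hour rate from the table. It ignores the new-connection rate and assumes Google scales exactly on these limits, which is a simplification; actual proxy counts are decided by the service.

```shell
#!/bin/sh
# ceil_div A B: integer ceiling of A / B.
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

# Estimate proxy instances from peak requests/sec, MB/s, and active connections.
estimate_proxies() {
  rps="$1"; mb_per_s="$2"; active_conns="$3"
  proxies=3   # minimum per forwarding rule
  for need in "$(ceil_div "$rps" 1400)" \
              "$(ceil_div "$mb_per_s" 18)" \
              "$(ceil_div "$active_conns" 3000)"; do
    if [ "$need" -gt "$proxies" ]; then proxies="$need"; fi
  done
  echo "$proxies"
}

# Hourly proxy charge, computed in thousandths of a dollar to stay in integers.
hourly_proxy_cost() {
  millis=$(( $1 * 25 ))
  printf '$%d.%03d/hour\n' $(( millis / 1000 )) $(( millis % 1000 ))
}

n=$(estimate_proxies 7000 50 5000)   # 7,000 req/s dominates: 5 proxies
hourly_proxy_cost "$n"               # prints $0.125/hour
```

Data processing charges come on top of this figure.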

Cost Optimization

Strategy	How It Helps
Use Cloud CDN for static content	Cached content bypasses LB data processing charges
Use Google Cloud Armor	Blocked requests do not incur data processing charges
Use Regional external Application LB with Standard Tier	Lower egress charges for single-region deployments
Reduce Cloud Logging sampling	More proxy capacity for actual traffic

Common Operations

Create a Global External Application LB

# 1. Create a health check
gcloud compute health-checks create http my-health-check \
  --port=80 \
  --check-interval=10 \
  --timeout=5
 
# 2. Create a backend service
gcloud compute backend-services create my-backend-service \
  --load-balancing-scheme=EXTERNAL_MANAGED \
  --global \
  --health-checks=my-health-check
 
# 3. Add a MIG backend
gcloud compute backend-services add-backend my-backend-service \
  --instance-group=my-mig \
  --instance-group-zone=us-central1-a \
  --global
 
# 4. Create a URL map
gcloud compute url-maps create my-url-map \
  --default-service=my-backend-service
 
# 5. Create a Google-managed certificate, then a target HTTPS proxy that uses it
gcloud compute ssl-certificates create my-cert \
  --domains=example.com \
  --global
 
gcloud compute target-https-proxies create my-https-proxy \
  --url-map=my-url-map \
  --ssl-certificates=my-cert \
  --global
 
# 6. Create a global forwarding rule
gcloud compute forwarding-rules create my-forwarding-rule \
  --load-balancing-scheme=EXTERNAL_MANAGED \
  --target-https-proxy=my-https-proxy \
  --ports=443 \
  --global

Create an External Passthrough Network LB (TCP/UDP)

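# 1. Create a regional health check (backend service-based external
#    passthrough Network LBs use regional health checks)
gcloud compute health-checks create tcp my-tcp-check \
  --region=us-central1 \
  --port=8080

# 2. Create a backend service that references the regional health check
gcloud compute backend-services create my-network-lb-backend \
  --load-balancing-scheme=EXTERNAL \
  --region=us-central1 \
  --health-checks=my-tcp-check \
  --health-checks-region=us-central1 \
  --protocol=TCP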
 
# 3. Add backend
gcloud compute backend-services add-backend my-network-lb-backend \
  --instance-group=my-mig \
  --instance-group-zone=us-central1-a \
  --region=us-central1
 
# 4. Create forwarding rule (TCP)
gcloud compute forwarding-rules create my-tcp-rule \
  --load-balancing-scheme=EXTERNAL \
  --backend-service=my-network-lb-backend \
  --ports=8080 \
  --region=us-central1

Note: For UDP traffic, use --protocol=UDP on the backend service, and on the forwarding rule set --ip-protocol=UDP along with --ports= and the appropriate port. Passthrough Network LBs are the only option for UDP load balancing.
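
A sketch of the UDP variant follows. It assumes a regional TCP health check named `my-tcp-check` already exists in the same region (health checks have no UDP protocol, so a TCP or HTTP check against the same backends is used). The resource names are placeholders, and `GCLOUD` can be overridden to preview the commands.

```shell
#!/bin/sh
# Assumption: set GCLOUD=echo to preview the commands instead of calling the API.
GCLOUD="${GCLOUD:-gcloud}"

# Create a UDP backend service and forwarding rule for an external
# passthrough Network LB. Arguments: region, UDP port.
create_udp_passthrough() {
  region="$1"; port="$2"
  "$GCLOUD" compute backend-services create my-udp-backend \
    --load-balancing-scheme=EXTERNAL \
    --protocol=UDP \
    --region="$region" \
    --health-checks=my-tcp-check \
    --health-checks-region="$region"

  "$GCLOUD" compute forwarding-rules create my-udp-rule \
    --load-balancing-scheme=EXTERNAL \
    --ip-protocol=UDP \
    --ports="$port" \
    --backend-service=my-udp-backend \
    --region="$region"
}
```

Example call: `create_udp_passthrough us-central1 5000`, followed by the same `add-backend` step shown earlier for the new backend service.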

Best Practices

Practice	Why
Use new generation LBs (`EXTERNAL_MANAGED`) for all new deployments	Classic LBs no longer receive advanced features
Use global external LBs for internet-facing, multi-region workloads	Single anycast IP, automatic cross-region failover
Use regional LBs for single-region workloads	Simpler setup; supports Standard Tier for lower egress costs
Use internal LBs for inter-service communication	No public IP exposure, lower cost, lower latency
Use separate health checks for LB and autohealing	LB checks should be aggressive; autohealing checks should be conservative
Use Google-managed certificates	Automatic renewal, no charge, no operational overhead
Configure session affinity only when needed	Affinity is best-effort and can break during scaling events
Use MIGs as backends for autoscaling	MIG autoscaling can scale on load balancing serving capacity, so instance count follows LB traffic
Set up Cloud CDN for static content	Reduces LB data processing charges and backend load
Create firewall rules for health check probes	Without them, all backends appear unhealthy and the LB returns errors

TL;DR

Google Cloud Load Balancing is fully managed and software-defined — no instances to manage, no pre-warming, scales automatically.
Three families: Application LBs (L7, HTTP/S), Proxy Network LBs (L4, TCP), and Passthrough Network LBs (L4, TCP/UDP with direct client IP).
Use new generation LBs (EXTERNAL_MANAGED) for all new deployments. Classic LBs still work but receive no new features.
Global LBs use a single anycast IP and route to the nearest healthy region. Regional LBs keep traffic in one region and support Standard Tier.
External LBs accept internet traffic. Internal LBs accept traffic only from your VPC or connected networks.
Passthrough Network LBs are the only option for UDP load balancing. They preserve client IP and use Direct Server Return.
Application LBs support URL maps for path-based, host-based, and header-based routing, plus traffic splitting and request mirroring.
Use separate health checks for load balancing (aggressive) and autohealing (conservative).
Google-managed SSL certificates are free and auto-renew. Use them unless you need OV/EV validation.
Pricing commonly includes forwarding rules, data processed, and proxy instances for some load balancers. Always confirm current regional pricing before production sizing.

Resources

Cloud Load Balancing Overview: Official documentation covering all load balancer types, architecture, and feature comparison.

Choose a Load Balancer: Decision guide for selecting the right load balancer type.

External Application Load Balancer: Setup and configuration for HTTP/S load balancers.

Passthrough Network Load Balancers: TCP/UDP load balancer setup for both external and internal variants.

Cloud Load Balancing Pricing: Current pricing for forwarding rules, data processing, and proxy instances.

Health Checks Overview: Health check protocols, parameters, and firewall requirements.

SSL Certificates: Google-managed and self-managed certificate configuration.

IP Addressing: Internal vs external IP addresses, static reservations, and how LBs use forwarding rules.

Instance Groups: MIGs and unmanaged groups as load balancer backends.

Google Cloud Networking: Overview of VPC, subnets, routing, and connectivity options.

Lalit's Cloud & DevOps notes