How Google Cloud Load Balancing distributes traffic across backends using global anycast IPs, regional proxies, and passthrough load balancers for TCP, UDP, and HTTP/S workloads.
Note: This page is current as of May 2026. Load balancer features, pricing, and supported protocols can change. Verify time-sensitive details in the official Google Cloud documentation before making production decisions.
What Is Google Cloud Load Balancing?
Google Cloud Load Balancing is a fully distributed, software-defined service that distributes traffic across backends. Unlike traditional load balancers that run on specific VMs or appliances you manage, Google’s load balancers run on Google’s own infrastructure — no pre-warming, no scaling bottlenecks, no instance management.
The key architectural difference: a global load balancer exposes a single anycast IP address as its frontend and routes traffic from that one IP to the nearest healthy backend region over Google’s private backbone. You get global reachability without managing DNS-based failover or multiple regional IPs.
Underlying technologies:
| Technology | Used By | Purpose |
|---|---|---|
| Google Front Ends (GFEs) | Classic Application LB, classic proxy Network LB, and the edge layer for global external LBs | Distributed proxies at Google’s edge PoPs worldwide; TLS termination, request routing |
| Envoy-based GFEs | Global external Application LB, global external proxy Network LB | Global edge proxying with newer traffic-management capabilities |
| Envoy proxies | Regional and cross-region Application LBs, regional and cross-region proxy Network LBs | Managed proxy data plane for non-classic regional and internal LBs |
| Maglev + Andromeda | External passthrough Network LB | Distributed passthrough L4 load balancing |
| Andromeda | Internal passthrough Network LB | Google’s SDN virtualization stack for internal L4 load balancing |
In practice: You do not choose these technologies directly. You choose a load balancer type, and Google runs it on the corresponding technology from the table above. What matters is understanding which load balancer type fits your traffic pattern.
Load Balancer Types at a Glance
Google Cloud organizes load balancers into three families:
| Family | Layer | Protocols | Connection Handling |
|---|---|---|---|
| Application Load Balancers | L7 (HTTP/HTTPS) | HTTP, HTTPS, HTTP/2, gRPC | Proxy: terminates connections at LB, opens new ones to backends |
| Proxy Network Load Balancers | L4 (TCP/SSL) | TCP, with optional SSL offload | Proxy: terminates connections at LB, opens new ones to backends |
| Passthrough Network Load Balancers | L4 (TCP/UDP/other IP protocols) | External: TCP, UDP, ESP, GRE, ICMP, ICMPv6; internal also supports SCTP and AH | Passthrough: packets pass through with original IPs preserved; responses use Direct Server Return |
Each family has external (internet-facing) and internal variants, and global or regional scope depending on the specific load balancer.
All Load Balancer Types
Application Load Balancers (Layer 7)
| Load Balancer | Scope | Traffic | Network Tier | Load-Balancing Scheme |
|---|---|---|---|---|
| Global external Application LB | Global | HTTP/HTTPS | Premium | EXTERNAL_MANAGED |
| Regional external Application LB | Regional | HTTP/HTTPS | Premium or Standard | EXTERNAL_MANAGED |
| Regional internal Application LB | Regional | HTTP/HTTPS | Premium | INTERNAL_MANAGED |
| Cross-region internal Application LB | Cross-region | HTTP/HTTPS | Premium | INTERNAL_MANAGED |
What they do: Route HTTP/S requests based on URL paths, host headers, query parameters, and other HTTP attributes. Support URL maps, traffic splitting, header-based routing, URL rewrites/redirects, Cloud CDN, and Google Cloud Armor.
When to use: Web applications, REST/gRPC APIs, microservices, any workload where you need content-based routing.
Proxy Network Load Balancers (Layer 4)
| Load Balancer | Scope | Traffic | Network Tier | Load-Balancing Scheme |
|---|---|---|---|---|
| Global external proxy Network LB | Global | TCP, optional SSL offload | Premium | EXTERNAL_MANAGED |
| Regional external proxy Network LB | Regional | TCP | Premium or Standard | EXTERNAL_MANAGED |
| Regional internal proxy Network LB | Regional | TCP | Premium | INTERNAL_MANAGED |
| Cross-region internal proxy Network LB | Cross-region | TCP | Premium | INTERNAL_MANAGED |
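What they do: Terminate TCP connections at the load balancer and open new connections to backends. The global external variant supports SSL offload (TLS termination at the LB). The regional external, regional internal, and cross-region internal variants carry TCP only, as the table above shows.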
When to use: Non-HTTP TCP workloads (SMTP, database connections, custom protocols), when you want SSL offload without managing TLS at the backend.
Passthrough Network Load Balancers (Layer 4)
| Load Balancer | Scope | Traffic | Network Tier | Load-Balancing Scheme |
|---|---|---|---|---|
| External passthrough Network LB | Regional | TCP, UDP, ESP, GRE, ICMP, ICMPv6 | Premium or Standard | EXTERNAL |
| Internal passthrough Network LB | Regional | TCP, UDP, ICMP, ICMPv6, SCTP, ESP, AH, GRE | Premium | INTERNAL |
What they do: Pass traffic through directly to backends without terminating connections. The original source and destination IP addresses are preserved. Responses go directly from the backend to the client (Direct Server Return), bypassing the load balancer.
When to use: When you need UDP load balancing, when backends must see the real client IP, for legacy protocols that do not work through a proxy, or when you need the highest throughput with minimal latency overhead.
Key Insight: Passthrough Network LBs are the only option for UDP load balancing. Application LBs and proxy Network LBs only handle TCP-based protocols.
Classic vs New Generation
Google has been migrating from the classic load balancers (load-balancing scheme: EXTERNAL) to the new generation (scheme: EXTERNAL_MANAGED). The classic Application Load Balancer and classic proxy Network Load Balancer are the previous generation.
| Aspect | Classic (EXTERNAL) | New Generation (EXTERNAL_MANAGED) |
|---|---|---|
| Data plane | GFE | Envoy-based GFE (Application) or Envoy (Network) |
| Advanced traffic management | Limited | Full support (traffic splitting, URL rewrites, header-based routing, request mirroring) |
| GKE integration | GKE Ingress controller | GKE Gateway controller |
| URL map size limit | 64 KB | Up to 1 MB (with new quota system) |
| Header handling | Case preserved | All header keys lowercase |
| All backends unhealthy | Returns HTTP 502 | Returns HTTP 503 |
| Standard Network Tier | Supported | Global variant does not support Standard; regional variant does |
| New features | No longer receiving advanced features | Actively developed |
Migration: Google provides a guided migration from classic to the global external Application LB (GA since May 2025). You can test with a percentage of traffic before fully committing, and rollback is available within 90 days.
Tip: Use the new generation (Global external Application LB or Regional external Application LB) for all new deployments. The classic LB still works but does not receive new features.
Internal vs External (Internet-Facing)
| Aspect | External | Internal |
|---|---|---|
| Traffic source | Internet | VPC network, peered networks, Cloud Interconnect/VPN |
| IP address | Externally routable (public) | Internal (private) |
| Use case | Public-facing web apps, APIs, CDN origins | Internal microservices, database tiers, private API endpoints |
| Available for | All LB families | Application LBs, proxy Network LBs, passthrough Network LBs |
| Global access | Always globally reachable | Regional by default; enable global access on the forwarding rule to accept traffic from any region |
External load balancers sit at the edge and accept traffic from the internet. They use Google’s global anycast IP (global scope) or a regional external IP (regional scope).
Internal load balancers are reachable only from within your VPC or connected networks. They use internal IP addresses from your subnet. Use them for multi-tier architectures where middle tiers (application servers connecting to databases, internal APIs between services) need load balancing without exposing endpoints to the internet.
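The "Global access" row above can be sketched as a single command. This is a minimal example, assuming an existing regional internal forwarding rule; the name `my-internal-rule` and the region are hypothetical placeholders.

```shell
# Hypothetical names: my-internal-rule, us-central1.
# Allows clients in any region of the VPC (or connected networks)
# to reach this regional internal load balancer.
gcloud compute forwarding-rules update my-internal-rule \
  --region=us-central1 \
  --allow-global-access
```

Backends still live in one region; global access only widens where clients may connect from.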
Global vs Regional
| Aspect | Global | Regional |
|---|---|---|
| Backend scope | Multiple regions | Single region |
| Frontend IP | Single anycast IP worldwide | Regional IP (still globally reachable) |
| Routing | Routes to nearest healthy region via Google’s backbone | Routes to backends within one region |
| Network Tier | Premium only | Premium or Standard |
| Failover | Automatic cross-region failover | No cross-region failover |
| Available for | Global external Application LB, Global external proxy Network LB, cross-region internal LBs | Regional external Application LB, Regional internal Application LB, all regional proxy Network LBs, all passthrough Network LBs |
How Global Routing Works
flowchart LR
    Client["Client"] -->|request to anycast IP| PoP["Google Edge PoP<br/>(nearest to client)"]
    PoP -->|Google private backbone| Region1["Region A<br/>(primary)"]
    PoP -->|failover path| Region2["Region B<br/>(secondary)"]
    Region1 --> Backend1["Backends"]
    Region2 --> Backend2["Backends"]
- Client sends a request to the anycast IP
- BGP routing directs the request to the nearest Google edge Point of Presence (PoP)
- At the PoP, the load balancer picks the best backend region based on health, capacity, and proximity
- Traffic travels over Google’s private backbone (Premium Tier) to the selected region
- If the nearest region is unhealthy or at capacity, traffic fails over to the next best region
Tip: Use global load balancers for internet-facing workloads that need cross-region redundancy. Use regional load balancers when all your backends are in one region, or when you need Standard Tier networking.
Choosing the Right Load Balancer
flowchart TD
    START["What protocol?"] --> HTTP{HTTP or HTTPS?}
    HTTP -->|Yes| SOURCE{"Traffic source?"}
    HTTP -->|No| TCPUDP{TCP, UDP, or other?}
    SOURCE -->|Internet| SCOPE{"Multi-region<br/>backends?"}
    SOURCE -->|Internal/VPC| INT_HTTP{"Multi-region<br/>internal backends?"}
    SCOPE -->|Yes| GAPP["Global external<br/>Application LB"]
    SCOPE -->|No| RAPP["Regional external<br/>Application LB"]
    INT_HTTP -->|Yes| XRINT["Cross-region internal<br/>Application LB"]
    INT_HTTP -->|No| RINT["Regional internal<br/>Application LB"]
    TCPUDP -->|UDP needed| PASSTHROUGH["Passthrough Network LB"]
    TCPUDP -->|TCP only| NEEDPROXY{"Need SSL offload<br/>or proxy features?"}
    NEEDPROXY -->|Yes| PNLB_SOURCE{"Traffic source?"}
    NEEDPROXY -->|No| PASSTHROUGH
    PNLB_SOURCE -->|Internet| PNLB_SCOPE{"Multi-region<br/>backends?"}
    PNLB_SOURCE -->|Internal/VPC| PNLB_INT{"Multi-region<br/>internal backends?"}
    PNLB_SCOPE -->|Yes| GPNL["Global external<br/>proxy Network LB"]
    PNLB_SCOPE -->|No| RPNL["Regional external<br/>proxy Network LB"]
    PNLB_INT -->|Yes| XRPNL["Cross-region internal<br/>proxy Network LB"]
    PNLB_INT -->|No| RPNL_INT["Regional internal<br/>proxy Network LB"]
Quick Decision Table
| I Need… | Use This |
|---|---|
| HTTP/S load balancing with URL-based routing | Global external Application LB (multi-region) or Regional external Application LB (single-region) |
| Internal HTTP/S load balancing for microservices | Regional internal Application LB |
| Load balancing for a public TCP service (not HTTP) | Global external proxy Network LB (multi-region) or Regional external proxy Network LB (single-region) |
| UDP load balancing | External passthrough Network LB (internet traffic) or Internal passthrough Network LB (VPC traffic) |
| Backends must see real client IP | Passthrough Network LB |
| Internal TCP load balancing (e.g., database tier) | Regional internal proxy Network LB or Internal passthrough Network LB |
| Multi-region internal load balancing | Cross-region internal Application LB (HTTP) or Cross-region internal proxy Network LB (TCP) |
Backend Types
Load balancers distribute traffic to backends. Google Cloud supports several backend types:
| Backend Type | Description | Available For |
|---|---|---|
| Instance groups | MIGs (recommended) or unmanaged instance groups | All load balancers |
| Network Endpoint Groups (NEGs) | Granular endpoints by IP/port | All load balancers |
| Zonal NEGs (GCE_VM_IP_PORT) | Individual VM endpoints or GKE Pods | All proxy-based LBs |
| Serverless NEGs | Cloud Run, Cloud Run functions, App Engine, API Gateway | Application LBs |
| Internet NEGs | Public external endpoints outside Google Cloud | Global/classic external Application LBs; regional internet NEGs also work with regional Application LBs and regional proxy Network LBs |
| Hybrid NEGs (NON_GCP_PRIVATE_IP_PORT) | On-premises or other-cloud endpoints via Interconnect/VPN | Application LBs and proxy Network LBs |
| Cloud Storage buckets | Static content serving | External Application LBs (global and classic) |
Tip: Use MIGs for VM-based workloads. Use zonal NEGs (GCE_VM_IP_PORT) for GKE pod-level load balancing. Use serverless NEGs for Cloud Run and Cloud Run functions.
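As an illustration of the serverless NEG row, the sketch below fronts a Cloud Run service with a global external Application LB backend service. All resource names (`my-serverless-neg`, `my-run-service`, `my-serverless-backend`) are hypothetical, and it assumes the Cloud Run service already exists in `us-central1`.

```shell
# Hypothetical names: my-serverless-neg, my-run-service, my-serverless-backend.
# 1. Create a serverless NEG that points at an existing Cloud Run service.
gcloud compute network-endpoint-groups create my-serverless-neg \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=my-run-service

# 2. Create a backend service. No health check is attached here because
#    serverless NEG backends do not use load balancer health checks.
gcloud compute backend-services create my-serverless-backend \
  --load-balancing-scheme=EXTERNAL_MANAGED \
  --global

# 3. Attach the NEG to the backend service.
gcloud compute backend-services add-backend my-serverless-backend \
  --global \
  --network-endpoint-group=my-serverless-neg \
  --network-endpoint-group-region=us-central1
```

The backend service can then be referenced from a URL map in the same way as a MIG-backed one.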
Health Checks
Health checks determine which backends receive new connections. Google Cloud sends probes from multiple systems (typically 5-10 probers simultaneously) for reliability.
Protocols
| Protocol | Success Criteria |
|---|---|
| HTTP/HTTPS/HTTP/2 | HTTP 200 response |
| TCP | Successful TCP connection |
| SSL | TCP connection + TLS handshake |
| gRPC | OK status and SERVING state |
Key Parameters
| Parameter | Default | Purpose |
|---|---|---|
| Check interval | 5 seconds | Time between probes from a single prober |
| Timeout | 5 seconds | Time to wait for a response |
| Healthy threshold | 2 | Sequential successful probes to mark a backend healthy |
| Unhealthy threshold | 2 | Sequential failed probes to mark a backend unhealthy |
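To make the interaction between interval and threshold concrete, here is a rough back-of-the-envelope helper. It is only an approximation: the function name `detection_seconds` is made up for this sketch, and it ignores the probe timeout and the fact that several probers run independently.

```shell
# Rough estimate of how long a backend must keep failing before one prober
# marks it unhealthy: interval * unhealthy threshold (assumption: timeout
# and prober jitter are ignored).
detection_seconds() {
  local interval="$1" threshold="$2"
  echo $(( interval * threshold ))
}

detection_seconds 5 2    # defaults from the table: prints 10
detection_seconds 30 5   # a more patient configuration: prints 150
```

Lowering either value shortens detection time at the cost of more probe traffic and a higher chance of reacting to brief blips.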
LB Health Checks vs Autohealing Health Checks
| Aspect | LB Health Check | Autohealing Health Check |
|---|---|---|
| Purpose | Stop routing traffic to unhealthy backends | Delete and recreate unhealthy VMs |
| Aggressiveness | Aggressive (quick detection, redirect traffic) | Conservative (avoid unnecessary VM recreation) |
| Recommended check interval | 5-10 seconds | 30-60 seconds |
| Recommended unhealthy threshold | 2-3 consecutive failures | 5-10 consecutive failures |
Key Insight: Use separate health checks for load balancing and autohealing. LB checks should catch struggling instances fast and stop sending traffic. Autohealing checks should be more patient because recreating a VM is disruptive. See Instance Groups for autohealing configuration.
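One way to apply this split is sketched below, using values from the recommended ranges in the table. The names `lb-health-check`, `autohealing-health-check`, and `my-mig`, the port, and the 300-second initial delay are assumptions for illustration.

```shell
# Hypothetical names: lb-health-check, autohealing-health-check, my-mig.
# Aggressive check for the load balancer: detect failures quickly.
gcloud compute health-checks create http lb-health-check \
  --port=80 \
  --check-interval=5 \
  --timeout=5 \
  --unhealthy-threshold=2

# Patient check for MIG autohealing: tolerate short blips before recreating a VM.
gcloud compute health-checks create http autohealing-health-check \
  --port=80 \
  --check-interval=30 \
  --timeout=10 \
  --unhealthy-threshold=5

# Attach the patient check to the MIG. --initial-delay gives new VMs
# time to boot before autohealing starts judging them.
gcloud compute instance-groups managed update my-mig \
  --zone=us-central1-a \
  --health-check=autohealing-health-check \
  --initial-delay=300
```

The aggressive check is the one you reference from the backend service; the patient one is referenced only by the MIG.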
Firewall requirement: You must create ingress allow rules for health check probe source IP ranges (35.191.0.0/16, 130.211.0.0/22, and IPv6 ranges depending on LB type). Without these rules, health checks fail and the LB marks all backends unhealthy.
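A minimal sketch of such a rule follows, using the IPv4 ranges quoted above. The rule name, the `default` network, the `lb-backend` target tag, and port 80 are assumptions; match the port to whatever your health check probes.

```shell
# Hypothetical names: allow-lb-health-checks, network "default", tag "lb-backend".
# Source ranges are the IPv4 health check probe ranges listed above.
gcloud compute firewall-rules create allow-lb-health-checks \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:80 \
  --source-ranges=35.191.0.0/16,130.211.0.0/22 \
  --target-tags=lb-backend
```

Scoping the rule with a target tag keeps the probe ranges from reaching VMs that are not load balancer backends.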
SSL/TLS Termination
For proxy-based load balancers (Application LBs, proxy Network LBs), TLS is terminated at the load balancer. The connection to backends can be encrypted (HTTPS/SSL) or plaintext (HTTP/TCP). For passthrough Network LBs, TLS is not terminated at the LB — connections pass through to backends unchanged.
Certificate Options
| Type | Management | Validation Level | Cost |
|---|---|---|---|
| Google-managed | Google obtains, provisions, and renews automatically | DV only | No charge |
| Self-managed | You provide and renew certificates | DV, OV, or EV | No charge for the cert; $0.45 per 1M connections for RSA-3072 and RSA-4096 keys |
Google-managed certificates support wildcards via Certificate Manager with DNS authorization. Google handles renewal automatically.
Certificate selection: When multiple certificates are configured, the LB uses SNI (Server Name Indication) from the client’s TLS ClientHello to pick the best-matching certificate based on longest suffix match, preferring ECDSA over RSA.
Configuration Methods
| Method | Capacity | Best For |
|---|---|---|
| Compute Engine SSL certificates on target proxy | Up to 15 certificates | Simple multi-domain setups |
| Certificate Manager certificate map on target proxy | Thousands to millions of entries | Large multi-domain deployments |
| Certificate Manager certificates directly on target proxy | Up to 100 certificates | Medium-scale setups |
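The certificate map method can be sketched as follows for a wildcard Google-managed certificate. Every resource name here is hypothetical, and the example assumes you control DNS for `example.com` and already have a global target HTTPS proxy named `my-https-proxy`.

```shell
# Hypothetical names: my-dns-auth, my-wildcard-cert, my-cert-map, my-wildcard-entry.
# 1. Create a DNS authorization, then publish the CNAME record it reports
#    (visible via "gcloud certificate-manager dns-authorizations describe").
gcloud certificate-manager dns-authorizations create my-dns-auth \
  --domain="example.com"

# 2. Request a Google-managed certificate for the apex and wildcard names.
gcloud certificate-manager certificates create my-wildcard-cert \
  --domains="example.com,*.example.com" \
  --dns-authorizations=my-dns-auth

# 3. Create a certificate map and an entry for the wildcard hostname.
gcloud certificate-manager maps create my-cert-map
gcloud certificate-manager maps entries create my-wildcard-entry \
  --map=my-cert-map \
  --hostname="*.example.com" \
  --certificates=my-wildcard-cert

# 4. Attach the map to the target HTTPS proxy.
gcloud compute target-https-proxies update my-https-proxy \
  --global \
  --certificate-map=my-cert-map
```

Provisioning completes only after the DNS record is visible, so expect a delay between step 2 and a usable certificate.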
URL Maps and Routing
URL maps configure how Application Load Balancers route HTTP/S requests. They define the mapping between incoming requests and backend services.
Components
URL Map
├── Default backend service (required — catches everything unmatched)
├── Host rule: "api.example.com" → Path Matcher A
├── Host rule: "*.example.net" → Path Matcher B
│
├── Path Matcher A
│ ├── /v1/users/* → user-service backend
│ ├── /v1/orders/* → order-service backend
│ └── /* → api-default backend
│
└── Path Matcher B
├── /images/* → image-backend bucket
└── /* → web-default backend
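As a rough sketch, Path Matcher A from the tree above could be added to an existing URL map like this. It assumes a global URL map named `my-url-map` and that backend services named `user-service`, `order-service`, and `api-default` already exist; those names are taken from the tree for illustration only.

```shell
# Assumes my-url-map and the three backend services already exist.
# Adds a host rule for api.example.com and the path rules beneath it.
gcloud compute url-maps add-path-matcher my-url-map \
  --global \
  --path-matcher-name=path-matcher-a \
  --new-hosts=api.example.com \
  --default-service=api-default \
  --path-rules="/v1/users/*=user-service,/v1/orders/*=order-service"
```

Requests for `api.example.com` that match neither path rule fall through to `api-default`, mirroring the `/*` entry in the tree.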
Routing Types
| Type | Based On | Example |
|---|---|---|
| Path-based | URL path | /api/* → API backend, /static/* → bucket |
| Host-based | Hostname | api.example.com → API service, www.example.com → web frontend |
| Header-based | HTTP request headers | Route based on custom headers like X-Version: v2 |
| Query parameter | URL query string | ?env=canary → canary backend |
Advanced Traffic Management (new generation LBs only)
| Feature | What It Does |
|---|---|
| Traffic splitting | Distribute a percentage of traffic to different backends (e.g., 90/10 canary) |
| URL rewrites | Change the path or host before sending to the backend |
| URL redirects | Return a redirect response without hitting backends |
| Header manipulation | Add, remove, or modify request/response headers |
| Request mirroring | Send a copy of traffic to a secondary backend for testing |
| Fault injection | Inject errors or delays for resilience testing |
| Retry policies | Configure automatic retries on backend failures |
| Timeouts | Set custom connect and request timeouts |
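Traffic splitting is configured in the URL map itself. The sketch below writes a 90/10 canary definition to a file and imports it; `PROJECT_ID`, `stable-service`, `canary-service`, and the file name are placeholders, and importing replaces the existing definition of `my-url-map`.

```shell
# Hypothetical names: PROJECT_ID, stable-service, canary-service, my-url-map.
cat > url-map-canary.yaml <<'EOF'
name: my-url-map
defaultService: projects/PROJECT_ID/global/backendServices/stable-service
hostRules:
- hosts:
  - '*'
  pathMatcher: canary-matcher
pathMatchers:
- name: canary-matcher
  defaultService: projects/PROJECT_ID/global/backendServices/stable-service
  routeRules:
  - priority: 1
    matchRules:
    - prefixMatch: /
    routeAction:
      weightedBackendServices:
      - backendService: projects/PROJECT_ID/global/backendServices/stable-service
        weight: 90
      - backendService: projects/PROJECT_ID/global/backendServices/canary-service
        weight: 10
EOF

# Replace the URL map definition with the contents of the file.
gcloud compute url-maps import my-url-map \
  --global \
  --source=url-map-canary.yaml
```

Shifting more traffic to the canary is then a matter of editing the two weights and importing again.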
Session Affinity
Session affinity sends requests from the same client to the same backend. This is useful for stateful applications that maintain local session data.
| Type | How It Works | Works With |
|---|---|---|
| None (default) | Requests distributed by load balancing algorithm | All LBs |
| Client IP | Same client IP → same backend | All LBs that support affinity |
| Generated cookie | LB sets Set-Cookie on first request; cookie routes subsequent requests | Application LBs (HTTP/HTTPS only) |
| Header field | Uses a configurable HTTP header value for affinity | Application LBs |
Note: Session affinity is best-effort. It can break when backends change health status, when autoscaling adds or removes instances, or when capacity limits are reached. Design your application to handle affinity loss gracefully.
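For example, generated-cookie affinity can be switched on for an existing backend service as sketched below. The backend service name and the one-hour cookie lifetime are assumptions.

```shell
# Hypothetical name: my-backend-service. TTL is in seconds (3600 = 1 hour).
gcloud compute backend-services update my-backend-service \
  --global \
  --session-affinity=GENERATED_COOKIE \
  --affinity-cookie-ttl=3600
```

Setting `--session-affinity=NONE` returns the backend service to the default behavior.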
Pricing
Because the service is fully managed, there are no load balancer appliances or VMs for you to run or pay for. You are billed for load balancing resources and traffic: forwarding rules, data processing, and, for some load balancer types, proxy instances.
External Load Balancers
| Item | Price (USD) |
|---|---|
| Forwarding rules (first 5 per project) | $0.025/hour total, covering up to 5 rules |
| Additional forwarding rules | $0.01/hour each |
| Data processed by LB (inbound + outbound) | Commonly $0.008/GiB; check the regional pricing table |
Internal Application Load Balancers
| Item | Price (USD) |
|---|---|
| Per Envoy proxy instance | $0.025/hour |
| Minimum proxy instances per forwarding rule | 3 ($0.075/hour minimum) |
| Data processed by LB | Commonly $0.008/GiB; check the regional pricing table |
Each internal Application LB proxy handles up to 18 MB/s bandwidth, 600 HTTP (or 150 HTTPS) new connections/sec, 3,000 active connections, and 1,400 requests/sec. Google adds proxy instances automatically as traffic grows.
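The figures above lend themselves to a quick sizing estimate. The helper below is a sketch only: it considers the requests-per-second limit alone (not bandwidth or connection limits), assumes a 730-hour month, and uses the prices quoted in this section, which should be confirmed against current pricing. The function names are invented for this example.

```shell
# Figures quoted in this section; confirm current pricing before relying on them.
PROXY_HOURLY_USD=0.025
MIN_PROXIES=3
REQS_PER_PROXY=1400
HOURS_PER_MONTH=730   # assumption: average month length

# Proxies needed for a given request rate, never below the minimum.
proxies_for_rps() {
  awk -v rps="$1" -v cap="$REQS_PER_PROXY" -v min="$MIN_PROXIES" 'BEGIN {
    n = int((rps + cap - 1) / cap)   # ceiling division
    if (n < min) n = min
    print n
  }'
}

# Monthly proxy charge in USD for a proxy count (excludes data processing).
monthly_proxy_cost() {
  awk -v n="$1" -v rate="$PROXY_HOURLY_USD" -v h="$HOURS_PER_MONTH" \
    'BEGIN { printf "%.2f\n", n * rate * h }'
}

proxies_for_rps 1000                            # prints 3 (the minimum applies)
proxies_for_rps 10000                           # prints 8
monthly_proxy_cost "$(proxies_for_rps 10000)"   # prints 146.00
```

Under these assumptions, an idle internal Application LB still costs about $54.75 per month for its three minimum proxies.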
Cost Optimization
| Strategy | How It Helps |
|---|---|
| Use Cloud CDN for static content | Cached content bypasses LB data processing charges |
| Use Google Cloud Armor | Blocked requests do not incur data processing charges |
| Use Regional external Application LB with Standard Tier | Lower egress charges for single-region deployments |
| Reduce Cloud Logging sampling | More proxy capacity for actual traffic |
Common Operations
Create a Global External Application LB
# 1. Create a health check
gcloud compute health-checks create http my-health-check \
--port=80 \
--check-interval=10 \
--timeout=5
# 2. Create a backend service
gcloud compute backend-services create my-backend-service \
--load-balancing-scheme=EXTERNAL_MANAGED \
--global \
--health-checks=my-health-check
# 3. Add a MIG backend
gcloud compute backend-services add-backend my-backend-service \
--instance-group=my-mig \
--instance-group-zone=us-central1-a \
--global
# 4. Create a URL map
gcloud compute url-maps create my-url-map \
--default-service=my-backend-service
# 5. Create a target HTTPS proxy (with Google-managed certificate)
gcloud compute ssl-certificates create my-cert \
--domains=example.com \
--global
gcloud compute target-https-proxies create my-https-proxy \
--url-map=my-url-map \
--ssl-certificates=my-cert \
--global
# 6. Create a global forwarding rule
gcloud compute forwarding-rules create my-forwarding-rule \
--load-balancing-scheme=EXTERNAL_MANAGED \
--target-https-proxy=my-https-proxy \
--ports=443 \
--global
Create an External Passthrough Network LB (TCP/UDP)
# 1. Create a health check
gcloud compute health-checks create tcp my-tcp-check \
--region=us-central1 \
--port=8080
# 2. Create a backend service (it references the regional health check above)
gcloud compute backend-services create my-network-lb-backend \
--load-balancing-scheme=EXTERNAL \
--region=us-central1 \
--health-checks=my-tcp-check \
--health-checks-region=us-central1 \
--protocol=TCP
# 3. Add backend
gcloud compute backend-services add-backend my-network-lb-backend \
--instance-group=my-mig \
--instance-group-zone=us-central1-a \
--region=us-central1
# 4. Create forwarding rule (TCP)
gcloud compute forwarding-rules create my-tcp-rule \
--load-balancing-scheme=EXTERNAL \
--backend-service=my-network-lb-backend \
--ports=8080 \
--region=us-central1
Note: For UDP traffic, use --protocol=UDP on the backend service, and --ip-protocol=UDP with the appropriate --ports= value on the forwarding rule. Passthrough Network LBs are the only option for UDP load balancing.
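A UDP variant of the steps above might look like the sketch below. The resource names and port 53 are examples, it reuses the `my-mig` instance group from the earlier steps, and it assumes the backends also listen on TCP 53, because health checks probe over TCP-based protocols rather than UDP.

```shell
# Hypothetical names: my-udp-hc, my-udp-backend, my-udp-rule. Port 53 is an example.
# Health checks have no UDP probe type, so probe a TCP port the backends
# also serve (assumption: they listen on TCP 53 as well).
gcloud compute health-checks create tcp my-udp-hc \
  --region=us-central1 \
  --port=53

gcloud compute backend-services create my-udp-backend \
  --load-balancing-scheme=EXTERNAL \
  --region=us-central1 \
  --protocol=UDP \
  --health-checks=my-udp-hc \
  --health-checks-region=us-central1

gcloud compute backend-services add-backend my-udp-backend \
  --region=us-central1 \
  --instance-group=my-mig \
  --instance-group-zone=us-central1-a

gcloud compute forwarding-rules create my-udp-rule \
  --load-balancing-scheme=EXTERNAL \
  --region=us-central1 \
  --ip-protocol=UDP \
  --ports=53 \
  --backend-service=my-udp-backend
```

The firewall must allow both the UDP client traffic and the TCP health check probes to reach the backends.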
Best Practices
| Practice | Why |
|---|---|
| Use new generation LBs (EXTERNAL_MANAGED) for all new deployments | Classic LBs no longer receive advanced features |
| Use global external LBs for internet-facing, multi-region workloads | Single anycast IP, automatic cross-region failover |
| Use regional LBs for single-region workloads | Simpler setup; supports Standard Tier for lower egress costs |
| Use internal LBs for inter-service communication | No public IP exposure, lower cost, lower latency |
| Use separate health checks for LB and autohealing | LB checks should be aggressive; autohealing checks should be conservative |
| Use Google-managed certificates | Automatic renewal, no charge, no operational overhead |
| Configure session affinity only when needed | Affinity is best-effort and can break during scaling events |
| Use MIGs as backends for autoscaling | MIGs auto-scale to match LB traffic |
| Set up Cloud CDN for static content | Reduces LB data processing charges and backend load |
| Create firewall rules for health check probes | Without them, all backends appear unhealthy and the LB returns errors |
TL;DR
- Google Cloud Load Balancing is fully managed and software-defined — no instances to manage, no pre-warming, scales automatically.
- Three families: Application LBs (L7, HTTP/S), Proxy Network LBs (L4, TCP), and Passthrough Network LBs (L4, TCP/UDP with direct client IP).
- Use new generation LBs (EXTERNAL_MANAGED) for all new deployments. Classic LBs still work but receive no new features.
- Global LBs use a single anycast IP and route to the nearest healthy region. Regional LBs keep traffic in one region and support Standard Tier.
- External LBs accept internet traffic. Internal LBs accept traffic only from your VPC or connected networks.
- Passthrough Network LBs are the only option for UDP load balancing. They preserve client IP and use Direct Server Return.
- Application LBs support URL maps for path-based, host-based, and header-based routing, plus traffic splitting and request mirroring.
- Use separate health checks for load balancing (aggressive) and autohealing (conservative).
- Google-managed SSL certificates are free and auto-renew. Use them unless you need OV/EV validation.
- Pricing commonly includes forwarding rules, data processed, and proxy instances for some load balancers. Always confirm current regional pricing before production sizing.
Resources
- Cloud Load Balancing Overview: Official documentation covering all load balancer types, architecture, and feature comparison.
- Choose a Load Balancer: Decision guide for selecting the right load balancer type.
- External Application Load Balancer: Setup and configuration for HTTP/S load balancers.
- Passthrough Network Load Balancers: TCP/UDP load balancer setup for both external and internal variants.
- Cloud Load Balancing Pricing: Current pricing for forwarding rules, data processing, and proxy instances.
- Health Checks Overview: Health check protocols, parameters, and firewall requirements.
- SSL Certificates: Google-managed and self-managed certificate configuration.
- IP Addressing: Internal vs external IP addresses, static reservations, and how LBs use forwarding rules.
- Instance Groups: MIGs and unmanaged groups as load balancer backends.
- Google Cloud Networking: Overview of VPC, subnets, routing, and connectivity options.