Services and Load Balancing

Kubernetes Services provide stable networking for ephemeral pods. In GKE, Services and Ingress resources integrate directly with Google Cloud Load Balancing: Ingress provisions global HTTP(S) load balancers, and LoadBalancer Services provision external or internal network load balancers, all without manual setup.

Why Services Exist

Pods are ephemeral: they are created, destroyed, and replaced, and each replacement gets a new IP address. A Service provides a stable IP and DNS name that automatically routes traffic to whichever matching pods are currently ready, regardless of which pods happen to be running.

flowchart LR
    Client["Client"] --> Svc["Service\n(ClusterIP: 10.0.0.5)"]
    Svc --> P1["Pod 1\n(10.4.0.12)"]
    Svc --> P2["Pod 2\n(10.4.0.23)"]
    Svc --> P3["Pod 3\n(10.4.0.34)"]

    P1x["Pod 1 dies"] -.->|replaced by| P4["Pod 4\n(10.4.0.45)"]
    Svc -.-> P4

    style P1x fill:#f44,color:#fff
    style P4 fill:#4CAF50,color:#fff
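
The behaviour in the diagram can be modelled in a few lines. This is a toy Python sketch, not how kube-proxy or GKE's dataplane is implemented: the `Service` class, its `replace` method, and the round-robin choice are all illustrative assumptions. It shows only that clients keep one stable address while the set of pod IPs behind it changes.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Service:
    """Toy model: a stable virtual IP in front of a changing set of pod IPs."""
    cluster_ip: str
    endpoints: list = field(default_factory=list)  # IPs of ready pods
    _counter: count = field(default_factory=count, repr=False)

    def replace(self, old_ip: str, new_ip: str) -> None:
        """Simulate a pod dying and its replacement becoming ready."""
        self.endpoints = [new_ip if ip == old_ip else ip for ip in self.endpoints]

    def route(self) -> str:
        """Pick a backend round-robin (an assumption; real balancing differs)."""
        if not self.endpoints:
            raise RuntimeError("no ready endpoints")
        return self.endpoints[next(self._counter) % len(self.endpoints)]


svc = Service("10.0.0.5", ["10.4.0.12", "10.4.0.23", "10.4.0.34"])
first_three = [svc.route() for _ in range(3)]

svc.replace("10.4.0.12", "10.4.0.45")  # Pod 1 dies, Pod 4 takes over
next_three = [svc.route() for _ in range(3)]

print(svc.cluster_ip, first_three, next_three)
```

The client-facing address (`10.0.0.5`) never changes; only the endpoint list does.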

Service Types

Type	Scope	GKE Integration	Use Case
ClusterIP	Cluster-internal only	None	Inter-service communication
NodePort	External via `<NodeIP>:<Port>`	None	Development, custom load balancers
LoadBalancer	External via Cloud Load Balancer	Google Cloud External TCP/UDP LB	Exposing a service to the internet
Headless (`clusterIP: None`)	Cluster-internal; DNS returns pod IPs directly	None	Stateful apps (databases, Kafka)
ExternalName	DNS CNAME to external service	None	Referencing external services

ClusterIP (Internal)

Default service type. Only accessible from within the cluster.

apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
    - port: 80          # Service port
      targetPort: 8080  # Container port

# Access from within the cluster
curl http://backend:80
curl http://backend.default.svc.cluster.local:80

NodePort (External via Node IP)

Exposes the service on a static port on every node. Accessible via <NodeIP>:<NodePort>.

apiVersion: v1
kind: Service
metadata:
  name: backend-np
spec:
  type: NodePort
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # Optional: 30000-32767

Note: NodePort is rarely used directly in production. It’s primarily used when you need a custom load balancer or for development testing.
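
If you set `nodePort` explicitly, it has to fall inside the cluster's node port range. The check below is a minimal Python sketch that assumes the default range of 30000-32767 shown in the manifest comment; a cluster administrator can configure a different range, and the function name is hypothetical.

```python
# Default Kubernetes node port range (assumed; clusters may override it).
NODE_PORT_MIN, NODE_PORT_MAX = 30000, 32767


def is_valid_node_port(port: int) -> bool:
    """Return True if the port lies inside the assumed default range."""
    return NODE_PORT_MIN <= port <= NODE_PORT_MAX


print(is_valid_node_port(30080))  # the value used in the manifest above
print(is_valid_node_port(8080))   # a typical container port, outside the range
```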

LoadBalancer (External via Cloud LB)

Creates a Google Cloud external TCP/UDP (passthrough) load balancer with a public IP address.

apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: LoadBalancer
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 8080

# Check the external IP
kubectl get svc frontend
# NAME       TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)
# frontend   LoadBalancer   10.0.0.123      34.120.45.67    80:30456/TCP

Warning: Each LoadBalancer service creates a separate Cloud Load Balancer, which costs money. For HTTP(S) workloads, use an Ingress to share a single load balancer across multiple services.

Headless Service

Returns the IP addresses of individual pods instead of a single virtual IP. Used for StatefulSets and service discovery patterns.

apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  clusterIP: None        # This makes it headless
  selector:
    app: database
  ports:
    - port: 5432
      targetPort: 5432

# DNS returns all pod IPs
dig +short database.default.svc.cluster.local
# 10.4.0.12
# 10.4.0.23
# 10.4.0.34
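
An application can do the same lookup in code. The following Python sketch collects every IPv4 address behind a hostname; it only returns pod IPs when run inside the cluster against a headless Service. The function name and the injectable `getaddrinfo` parameter are assumptions added here so the logic can be exercised offline.

```python
import socket


def resolve_pod_ips(hostname, port=5432, getaddrinfo=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses for a hostname.

    For a headless Service, cluster DNS returns one A record per ready pod.
    """
    infos = getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})


# Inside the cluster (not runnable elsewhere):
#   resolve_pod_ips("database.default.svc.cluster.local")
```

A client that needs to talk to a specific replica, for example a database primary, can then connect to individual addresses instead of a load-balanced virtual IP.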

Deployments and Services Together

A Deployment manages your pods, and a Service exposes them. They connect via labels and selectors:

flowchart TB
    subgraph Deployment
        selector["selector:\n  matchLabels:\n    app: backend"]
        template["template:\n  labels:\n    app: backend"]
        selector --> template
        template --> P1["Pod: app=backend"]
        template --> P2["Pod: app=backend"]
        template --> P3["Pod: app=backend"]
    end

    subgraph Service
        svcSelector["selector:\n  app: backend"]
        svcSelector --> P1
        svcSelector --> P2
        svcSelector --> P3
    end
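
The matching rule itself is simple: a pod is selected when its labels contain every key/value pair in the Service selector. Below is a minimal Python sketch of that equality-based matching. The function names and the pod dictionaries are illustrative assumptions, not a Kubernetes API.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """True if every selector key/value pair is present in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: dict, pods: list) -> list:
    """Return the IPs of pods the Service would select."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return []
    return [pod["ip"] for pod in pods if selector_matches(selector, pod["labels"])]


pods = [
    {"ip": "10.4.0.12", "labels": {"app": "backend", "tier": "api"}},
    {"ip": "10.4.0.23", "labels": {"app": "backend"}},
    {"ip": "10.4.0.34", "labels": {"app": "frontend"}},
]

print(endpoints_for({"app": "backend"}, pods))   # extra pod labels are fine
print(endpoints_for({"app": "back-end"}, pods))  # a typo selects nothing
```

Extra labels on a pod do not prevent a match, but a single misspelled value leaves the Service with no endpoints.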

Complete Example: Deployment + Service

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app          # <-- This label is the link
    spec:
      containers:
        - name: web
          image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "64Mi"
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: web-app-svc
spec:
  type: LoadBalancer
  selector:
    app: web-app              # <-- Must match the pod label
  ports:
    - port: 80
      targetPort: 8080

# Apply both resources
kubectl apply -f deployment.yaml
 
# Verify
kubectl get deployments
kubectl get pods -l app=web-app
kubectl get svc web-app-svc
 
# Test
curl http://$(kubectl get svc web-app-svc -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
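
The external IP is usually not assigned immediately; while the load balancer is being provisioned, `.status.loadBalancer.ingress` is absent and the `jsonpath` expression above yields an empty string. The Python sketch below parses the output of `kubectl get svc web-app-svc -o json` and returns `None` in that case. The function name is hypothetical and the sample documents are trimmed to the fields used.

```python
import json


def external_ip(service_json: str):
    """Return the first load balancer IP, or None while it is still pending."""
    svc = json.loads(service_json)
    ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return ingress[0].get("ip") if ingress else None


pending = '{"status": {"loadBalancer": {}}}'
ready = '{"status": {"loadBalancer": {"ingress": [{"ip": "34.120.45.67"}]}}}'

print(external_ip(pending))
print(external_ip(ready))
```

A script can poll with this check and run the `curl` test only once an address is returned.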

Ingress

Ingress provides HTTP(S) routing to multiple services behind a single load balancer. In GKE, the built-in Ingress controller creates a Google Cloud HTTP(S) Load Balancer.

flowchart LR
    Client["Internet"] --> LB["Google Cloud HTTP(S) LB"]
    LB -->|"/api"| S1["Service: api"]
    LB -->|"/"| S2["Service: frontend"]
    LB -->|"/admin"| S3["Service: admin"]
    S1 --> P1["API Pods"]
    S2 --> P2["Frontend Pods"]
    S3 --> P3["Admin Pods"]

Ingress YAML

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    kubernetes.io/ingress.class: "gce"            # GKE Ingress controller
    kubernetes.io/ingress.global-static-ip-name: "my-static-ip"
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-svc
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: tls-secret

Key Insight: In GKE, the Ingress controller automatically provisions a Google Cloud HTTP(S) Load Balancer, including its health checks, backend services, and URL maps, so you don’t need to configure these manually.
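
To see how the two `Prefix` rules in the manifest interact, here is a small Python sketch of path matching as the Kubernetes Ingress specification describes it: paths are compared element by element, and the longest matching rule wins. It is a simplified model with hypothetical function names; host matching is omitted, and the URL map GKE generates may differ in edge cases.

```python
def prefix_matches(rule_path: str, request_path: str) -> bool:
    """Element-wise prefix match: /api matches /api/users but not /apiv2."""
    rule = [part for part in rule_path.split("/") if part]
    request = [part for part in request_path.split("/") if part]
    return request[:len(rule)] == rule


def pick_backend(rules, request_path):
    """Return the Service for the longest matching rule, or None."""
    matches = [(path, svc) for path, svc in rules if prefix_matches(path, request_path)]
    if not matches:
        return None
    return max(matches, key=lambda match: len(match[0]))[1]


rules = [("/api", "api-svc"), ("/", "frontend-svc")]

print(pick_backend(rules, "/api/users"))  # routed by the /api rule
print(pick_backend(rules, "/apiv2"))      # falls through to the / rule
```

Because `/` matches every request, it acts as the catch-all; more specific rules take precedence regardless of their order in the manifest.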

GKE Ingress Features

Feature	Description
Auto TLS	Google-managed SSL certificates via the `networking.gke.io/managed-certificates` annotation
Static IP	Reserve a static IP with `kubernetes.io/ingress.global-static-ip-name` annotation
Health checks	Auto-configured; parameters are inferred from the pod's readiness probe where possible, otherwise defaults apply
Path-based routing	Route to different services based on URL path
Host-based routing	Route to different services based on hostname
BackendConfig	Fine-tune LB settings (timeouts, CDN, IAP) via custom resource

Ingress vs LoadBalancer Service

Aspect	Ingress	LoadBalancer Service
Layer	L7 (HTTP/HTTPS)	L4 (TCP/UDP)
Routing	Path and host-based	None — forwards all traffic
TLS termination	Yes (with auto certificates)	No (L4 passthrough; the application terminates TLS and manages certificates)
Multiple services	Yes (behind one LB)	One service per LB
Cost	One LB for many services	One LB per service
Recommended for	HTTP/HTTPS workloads	Non-HTTP protocols (raw TCP, UDP)

Internal Load Balancer

For services that should only be accessible from within your VPC (not the internet):

apiVersion: v1
kind: Service
metadata:
  name: internal-backend
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080

Use Case	Service Type
Frontend to internet	Ingress or `type: LoadBalancer`
Backend accessible from VPC	Internal LoadBalancer
Service-to-service within cluster	`type: ClusterIP`
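
The table above reduces to a small decision rule. The Python sketch below encodes it; the function name and the `audience` values (`"cluster"`, `"vpc"`, `"internet"`) are hypothetical labels chosen for this example, and real decisions may involve more factors.

```python
def choose_exposure(audience: str, http: bool = True) -> str:
    """Map who needs access to the exposure option recommended in these notes."""
    if audience == "cluster":
        return "ClusterIP"
    if audience == "vpc":
        return "Internal LoadBalancer"
    if audience == "internet":
        # HTTP(S) traffic can share one load balancer through an Ingress.
        return "Ingress" if http else "LoadBalancer"
    raise ValueError(f"unknown audience: {audience!r}")


print(choose_exposure("internet"))
print(choose_exposure("internet", http=False))
print(choose_exposure("vpc"))
```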

Service Discovery

Kubernetes provides built-in DNS-based service discovery:

# Full DNS format
SERVICE_NAME.NAMESPACE.svc.cluster.local
 
# Examples (Service "backend" in the "default" namespace)
curl http://backend:80
curl http://backend.default:80
curl http://backend.default.svc.cluster.local:80

DNS Format	When to Use
`backend`	Same namespace (short form)
`backend.staging`	Different namespace
`backend.staging.svc.cluster.local`	Full FQDN (always works)
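
The three forms can be generated from a Service name and namespace. This Python sketch assumes the default cluster domain `cluster.local`, which a cluster can override; the function name is hypothetical.

```python
def service_dns_names(name: str, namespace: str, cluster_domain: str = "cluster.local"):
    """Return the short, namespaced, and fully qualified DNS names of a Service."""
    return [
        name,                                          # same namespace only
        f"{name}.{namespace}",                         # from another namespace
        f"{name}.{namespace}.svc.{cluster_domain}",    # fully qualified
    ]


for dns_name in service_dns_names("backend", "staging"):
    print(dns_name)
```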

See Namespaces and Service Discovery for more detail.

Common Commands

Command	Purpose
`kubectl get svc`	List all services
`kubectl describe svc NAME`	Service details and endpoints
`kubectl get endpoints NAME`	Show pod IPs behind a service
`kubectl get ingress`	List all Ingress resources
`kubectl describe ingress NAME`	Ingress details and routing rules
`kubectl apply -f manifest.yaml`	Create/update Deployment + Service
`kubectl expose deployment NAME --port=80 --target-port=8080 --type=LoadBalancer`	Imperatively create a Service
`kubectl delete svc NAME`	Delete a service
`kubectl delete ingress NAME`	Delete an Ingress

Common Pitfalls

Pitfall	Consequence	Fix
Using LoadBalancer per service	High cost (one Cloud LB per service)	Use Ingress for HTTP(S) workloads
Missing readiness probe	Traffic sent to pods that aren’t ready	Define `readinessProbe` on containers
Selector mismatch	Service has zero endpoints	Verify pod labels match service selector
No TLS on external services	Traffic in plaintext	Use Ingress with managed certificates
Using ClusterIP for external access	Service unreachable from outside	Use LoadBalancer or Ingress
Hardcoding pod IPs	Breaks when pods are replaced	Always use Service DNS names

TL;DR

ClusterIP for internal communication (default)
LoadBalancer for exposing non-HTTP services externally
Ingress for HTTP(S) workloads — shares a single Cloud Load Balancer across services
Deployments and Services connect via label selectors — they must match
Always define readiness probes so Services only route to pods that are ready to serve traffic
GKE auto-configures health checks, backend services, and URL maps for Ingress

Lalit's Cloud & DevOps notes