Google Kubernetes Engine (GKE)

Google Kubernetes Engine (GKE) is Google Cloud’s managed Kubernetes service. It runs containerized applications without requiring you to install and operate your own Kubernetes control plane. GKE automates cluster management, scaling, and security, letting you focus on application development.

How GKE Fits In

flowchart LR
    A[Your Application] --> B[Container Image]
    B --> C[GKE Cluster]
    C --> D[Google Infrastructure]
    D --> E[Users]

    subgraph "GKE manages"
        C
    end

GKE eliminates the operational overhead of running Kubernetes yourself. Google manages the control plane (API server, scheduler, etcd, controller manager), and depending on your cluster mode, can also manage the worker nodes.

Responsibility	Self-managed K8s	GKE Standard	GKE Autopilot
Control plane	You	Google	Google
Node OS patches	You	You	Google
Node scaling	You	Configurable	Automatic
Security hardening	You	Partial	Default
Billing model	N/A	Per VM + cluster management fee	Pod resource requests + cluster management fee

Key Capabilities

Auto-upgrade — Control plane and nodes can be upgraded automatically on a schedule you control
Auto-repair — Unhealthy nodes are automatically detected and replaced
Built-in logging & monitoring — Integration with Cloud Logging and Cloud Monitoring, including Managed Prometheus for metrics
Workload Identity Federation for GKE — Securely associate Kubernetes service accounts with IAM service accounts
Binary Authorization — Enforce deploy-time policies on container images (signed, verified)
Shielded GKE nodes — Secure boot, integrity monitoring, encrypted boot disk
GKE Enterprise (formerly Anthos) — Multi-cluster management, fleet-level policy, hybrid cloud
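
To make the Workload Identity Federation for GKE item more concrete, here is a minimal sketch of linking a Kubernetes service account to an IAM service account. The function name `link_service_accounts` and its four arguments are hypothetical, and the sketch assumes the cluster already has a workload pool enabled and that both service accounts exist.

```shell
# link_service_accounts PROJECT_ID NAMESPACE KSA_NAME GSA_NAME
# Hypothetical helper: all four arguments are placeholders you supply.
link_service_accounts() {
  project="$1"
  namespace="$2"
  ksa="$3"
  gsa="$4"
  gsa_email="${gsa}@${project}.iam.gserviceaccount.com"

  # Allow the Kubernetes service account to act as the IAM service account.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:${project}.svc.id.goog[${namespace}/${ksa}]"

  # Annotate the Kubernetes service account with the IAM identity it maps to.
  kubectl annotate serviceaccount "$ksa" \
    --namespace="$namespace" \
    "iam.gke.io/gcp-service-account=${gsa_email}"
}
```

Calling `link_service_accounts my-project default app-ksa app-gsa` would run both commands; pods that use `app-ksa` could then call Google Cloud APIs without a mounted key file.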

GKE Architecture Overview

flowchart TB
    subgraph Control["Control Plane (Google-managed)"]
        API["API Server"]
        ETCD["etcd"]
        SCHED["Scheduler"]
        CTRL["Controller Manager"]
    end

    subgraph NodePool1["Node Pool 1"]
        N1["Node 1"]
        N2["Node 2"]
    end

    subgraph NodePool2["Node Pool 2 (GPU)"]
        N3["Node 3"]
    end

    API --> N1
    API --> N2
    API --> N3

    N1 --> P1["Pods"]
    N2 --> P2["Pods"]
    N3 --> P3["GPU Pods"]

Control Plane

The control plane runs the Kubernetes core components. In GKE, this is fully managed by Google:

API Server — The entry point for all Kubernetes API calls (kubectl, gcloud, client libraries)
etcd — Consistent, highly-available key-value store for all cluster state
Scheduler — Assigns pods to nodes based on resource requirements and constraints
Controller Manager — Runs core controllers (deployment, replica set, node lifecycle)

Nodes and Node Pools

Nodes are Compute Engine VMs that run your workloads. Nodes are organized into node pools — groups of nodes with identical configuration (machine type, labels, taints).

Key Insight: You can have multiple node pools in a cluster to handle different workload types — e.g., one pool for general workloads and another for GPU-accelerated workloads.
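
As an illustration of the multi-pool idea, the sketch below adds a GPU node pool to an existing Standard cluster. The pool name `gpu-pool`, the machine type, the accelerator type, and the label and taint values are all assumptions chosen for the example; check which accelerators are available in your region before using them.

```shell
# create_gpu_pool CLUSTER_NAME REGION
# Hypothetical helper that adds a second, GPU-backed node pool to a cluster.
create_gpu_pool() {
  cluster="$1"
  region="$2"

  # The label lets workloads select this pool; the taint keeps other pods off it.
  gcloud container node-pools create gpu-pool \
    --cluster="$cluster" \
    --region="$region" \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --num-nodes=1 \
    --node-labels=workload=gpu \
    --node-taints=dedicated=gpu:NoSchedule
}
```

GPU pods would then need a matching toleration for `dedicated=gpu:NoSchedule` and a node selector for `workload: gpu`, while general workloads stay on the default pool.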

GKE Cluster Modes

Feature	Standard	Autopilot
Node management	You manage node pools	Google manages nodes
Billing	Per-VM + $0.10/hr cluster fee	Pod resource requests + $0.10/hr cluster fee
Configuration	Full control over node config	Google-optimized defaults
Idle node behavior	You configure node pool minimums	Can scale down to zero nodes when no workloads are running
Best for	Workloads needing custom node control or unsupported hardware	Most workloads — hands-off

Tip: Start with Autopilot unless you need custom node configuration, unsupported hardware, OS image choice, or kernel modules.

See Autopilot vs Standard for a detailed comparison.
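
The difference between the two modes shows up at creation time. The following is a rough sketch, not a production setup: the wrapper `create_cluster` is hypothetical, and the Standard branch assumes a small autoscaled default pool (1 to 3 nodes per zone).

```shell
# create_cluster MODE CLUSTER_NAME REGION
# MODE is "autopilot" or "standard"; the function name is illustrative only.
create_cluster() {
  mode="$1"
  name="$2"
  region="$3"

  case "$mode" in
    autopilot)
      # Autopilot: Google provisions and manages the nodes.
      gcloud container clusters create-auto "$name" --region="$region"
      ;;
    standard)
      # Standard: you choose the node pool size and autoscaling bounds.
      gcloud container clusters create "$name" \
        --region="$region" \
        --num-nodes=1 \
        --enable-autoscaling --min-nodes=1 --max-nodes=3
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}
```

For example, `create_cluster autopilot demo us-central1` creates a regional Autopilot cluster named `demo`.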

Core Concepts at a Glance

Concept	What It Does	GKE-Specific Notes
Pod	Smallest deployable unit, one or more containers	Gets an internal IP from the VPC-native alias IP range
Deployment	Manages replica sets and rolling updates	The default RollingUpdate strategy replaces pods gradually, which allows zero-downtime deploys
Service	Stable network endpoint for a set of pods	Integrates with Google Cloud Load Balancing
Namespace	Logical cluster partition	Useful for multi-tenant environments
ConfigMap	Non-sensitive configuration data	Can be mounted as env vars or volumes
Secret	Sensitive data (passwords, keys)	Encrypted at rest by default in GKE
Ingress	HTTP(S) routing to services	GKE provides a built-in Ingress controller
HPA	Horizontal Pod Autoscaler	Scales pods based on CPU, memory, or custom metrics
VPA	Vertical Pod Autoscaler	Adjusts pod resource requests based on usage

See Core Kubernetes Concepts for deeper coverage.
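
To show how a few of these concepts fit together, here is a minimal sketch that writes a Deployment and a Service to a manifest file. The names (`web`, `web-app.yaml`), the `nginx:1.27` image, and the resource values are assumptions for illustration only.

```shell
# Write a Deployment (2 replicas) and a ClusterIP Service to web-app.yaml.
cat > web-app.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27        # pinned tag rather than "latest"
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web                       # routes to pods created by the Deployment
  ports:
    - port: 80
      targetPort: 80
EOF
```

Applying it with `kubectl apply -f web-app.yaml` creates both objects. An HPA could then be attached with `kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=10`.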

Common GKE Operations

# List clusters
gcloud container clusters list
 
# Get credentials for kubectl
gcloud container clusters get-credentials CLUSTER_NAME --region REGION
 
# View cluster details
gcloud container clusters describe CLUSTER_NAME --region REGION
 
# View node pools
gcloud container node-pools list --cluster=CLUSTER_NAME --region REGION
 
# Check cluster status
kubectl get nodes
kubectl get pods -A

Pricing Summary

Prices below are for us-central1 as of May 2026:

Component	Cost
Cluster management fee	$0.10/hour per cluster for Standard and Autopilot. The GKE free tier provides $74.40/month in credits for Autopilot or zonal Standard clusters; it does not apply to regional Standard cluster fees.
Standard nodes	Compute Engine VM pricing for each node
Autopilot pods	Per running pod resource request: $0.0445/vCPU-hour + $0.0049225/GiB-hour + ephemeral storage for default general-purpose workloads
Autopilot Spot pods	Spot prices vary and provide 60-91% discounts for interruptible workloads

Note: Autopilot still accrues the GKE cluster management fee. For general-purpose Autopilot workloads, pod compute billing is based on the resource requests in running or creating pods.
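
As a worked example of the Autopilot rates above, the sketch below estimates the monthly compute cost of a single pod from its resource requests. It assumes a 744-hour (31-day) month, which is the same basis as the $74.40 free-tier credit (744 × $0.10), and it leaves out ephemeral storage and the cluster management fee. The function name `pod_monthly_cost` is hypothetical.

```shell
# Rates copied from the pricing table (us-central1, general-purpose Autopilot).
VCPU_RATE=0.0445       # USD per vCPU-hour
MEM_RATE=0.0049225     # USD per GiB-hour

# pod_monthly_cost VCPU GIB [HOURS]
# Prints the estimated compute cost in USD for one pod.
# HOURS defaults to 744 (a 31-day month).
pod_monthly_cost() {
  awk -v cpu="$1" -v mem="$2" -v hours="${3:-744}" \
      -v cpu_rate="$VCPU_RATE" -v mem_rate="$MEM_RATE" \
      'BEGIN { printf "%.2f\n", (cpu * cpu_rate + mem * mem_rate) * hours }'
}

# One pod requesting 0.5 vCPU and 2 GiB of memory:
# (0.5 * 0.0445 + 2 * 0.0049225) * 744 = 23.87868
pod_monthly_cost 0.5 2   # prints 23.88
```

With a single Autopilot cluster, the free-tier credit offsets the management fee, so this pod's requests would account for roughly $23.88 of the month's bill before storage.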

Best Practices

Practice	Why	How
Use GKE Autopilot for new clusters	Reduces operational overhead and surprise costs	Default cluster mode in the Google Cloud console
Enable Workload Identity Federation for GKE	Avoids storing service account keys in pods	`gcloud container clusters update CLUSTER_NAME --region REGION --workload-pool=PROJECT_ID.svc.id.goog`
Use Regional clusters	Survives single-zone failures	`--region REGION` instead of `--zone ZONE`
Set resource requests and limits	Ensures fair scheduling and prevents resource starvation	Define `resources.requests` and `resources.limits` in pod specs
Use namespaces for isolation	Separates environments and enforces resource quotas	`kubectl create namespace NAME`
Enable Binary Authorization	Prevents unverified images from running	Policy-based deployment verification
Use Managed Prometheus	Observability without self-managed Prometheus stack	Built into GKE with Cloud Monitoring integration
Keep Kubernetes version current	Security patches and feature updates	Enable auto-upgrade on a maintenance window
Use PodDisruptionBudgets	Prevents too many pods from being evicted during updates	Define `minAvailable` or `maxUnavailable`
Store secrets in Secret Manager	More secure than Kubernetes Secrets	Use the Secret Manager CSI driver
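
The PodDisruptionBudget practice can be sketched as a small manifest. The file name `web-pdb.yaml`, the `app: web` label, and the `minAvailable: 1` value are assumptions; the selector must match the labels of whatever workload you want to protect.

```shell
# Write a PodDisruptionBudget that keeps at least one matching pod running
# during voluntary disruptions such as node upgrades or drains.
cat > web-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web        # hypothetical label; match your own pods
EOF
```

After `kubectl apply -f web-pdb.yaml`, `kubectl get pdb web-pdb` shows how many disruptions are currently allowed. A budget only helps when the workload runs more replicas than `minAvailable`; otherwise it can block node drains.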

Common Pitfalls and Gotchas

Warning: GKE cluster deletion is irreversible. All workloads, persistent data, and configurations are permanently lost. Always verify before deleting.

Pitfall	What Happens	How to Avoid
No resource requests	Pods get scheduled but may be evicted or cause OOM kills	Always set `resources.requests` and `resources.limits`
Ignoring upgrade notifications	Clusters on deprecated versions may be force-upgraded	Enable auto-upgrade; plan maintenance windows
Using `latest` image tags	Unpredictable deployments, hard to roll back	Use specific version tags or SHA digests
Exposing every service as type LoadBalancer	Each Service provisions its own Cloud Load Balancer, which adds cost	Use Ingress for HTTP(S); share load balancers
Not using namespaces	Hard to manage resources in multi-team clusters	Create namespaces per team or environment
Over-provisioning node pools	Paying for idle compute	Use cluster autoscaler or switch to Autopilot
Storing secrets in ConfigMaps	ConfigMaps are meant for non-sensitive data and are often readable by more workloads and users than Secrets	Use Kubernetes Secrets or Secret Manager
Ignoring network policies	All pods can communicate by default	Enable Network Policy and define restrictive rules
Single-zone clusters	Entire cluster is down if the zone fails	Use regional clusters for production
Not setting up PodDisruptionBudgets	Updates can take down too many pods at once	Define PDBs for critical workloads
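
For the network policy pitfall, a common starting point is a default-deny rule plus a narrow allow rule. The sketch below assumes a hypothetical namespace `team-a` and that network policy enforcement is enabled on the cluster; without enforcement the objects are accepted but have no effect.

```shell
# Write two NetworkPolicy objects for the hypothetical namespace "team-a".
cat > default-deny.yaml <<'EOF'
# 1) Deny all ingress traffic to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# 2) Re-allow ingress only from pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}
EOF
```

Applying the file with `kubectl apply -f default-deny.yaml` blocks traffic from other namespaces while pods inside `team-a` can still reach each other. Further allow rules can then be added per workload.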

Guide Overview

This guide covers GKE from fundamentals to production best practices:

Topic	Description
Creating a GKE Cluster	Step-by-step cluster creation with gcloud and console
Autopilot vs Standard	Cluster mode comparison and decision guide
Core Kubernetes Concepts	Pods, Deployments, ReplicaSets, and their relationships
Nodes and Node Pools	Node architecture, management, and configuration
Services and Load Balancing	Service types, Ingress, and traffic management
Scaling	Manual scaling, autoscaling, HPA, and VPA
ConfigMaps and Secrets	Configuration and sensitive data management
Namespaces and Service Discovery	Resource isolation and DNS-based service communication

TL;DR

GKE is a managed Kubernetes service — Google handles the control plane and can manage nodes (Autopilot)
Start with Autopilot unless you need custom node configuration
Always set resource requests and limits on pods
Use regional clusters for production workloads
Enable Workload Identity Federation for GKE instead of storing service account keys
GKE includes auto-upgrade, auto-repair, integrated logging/monitoring, and security features out of the box

Lalit's Cloud & DevOps notes
