Google Kubernetes Engine (GKE) is Google Cloud’s managed Kubernetes service. It runs containerized applications without requiring you to install and operate your own Kubernetes control plane. GKE automates cluster management, scaling, and security, so you can focus on application development.
How GKE Fits In
flowchart LR
    A[Your Application] --> B[Container Image]
    B --> C[GKE Cluster]
    C --> D[Google Infrastructure]
    D --> E[Users]
    subgraph "GKE manages"
        C
    end
GKE eliminates the operational overhead of running Kubernetes yourself. Google manages the control plane (API server, scheduler, etcd, controller manager), and depending on your cluster mode, can also manage the worker nodes.
| Responsibility | Self-managed K8s | GKE Standard | GKE Autopilot |
|---|---|---|---|
| Control plane | You | Google | Google |
| Node OS patches | You | You | Google |
| Node scaling | You | Configurable | Automatic |
| Security hardening | You | Partial | Default |
| Billing model | N/A | Per VM + cluster management fee | Pod resource requests + cluster management fee |
Key Capabilities
- Auto-upgrade — Control plane and nodes can be upgraded automatically on a schedule you control
- Auto-repair — Unhealthy nodes are automatically detected and replaced
- Built-in logging & monitoring — Integration with Cloud Logging and Cloud Monitoring via Managed Prometheus
- Workload Identity Federation for GKE — Securely associate Kubernetes service accounts with IAM service accounts
- Binary Authorization — Enforce deploy-time policies on container images (signed, verified)
- Shielded GKE nodes — Secure boot, integrity monitoring, encrypted boot disk
- GKE Enterprise (formerly Anthos) — Multi-cluster management, fleet-level policy, hybrid cloud
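The Workload Identity Federation item in the list above links a Kubernetes service account (KSA) to an IAM service account (GSA). The following is a minimal sketch of that link, assuming the cluster already has a workload pool enabled. The function name and every argument value are hypothetical placeholders, and nothing runs until the function is called.

```shell
# Sketch: let a Kubernetes service account (KSA) act as an IAM service account (GSA).
# Assumes a workload pool is already enabled on the cluster.
# Arguments: PROJECT NAMESPACE KSA_NAME GSA_NAME (all placeholders).
bind_workload_identity() {
  project="$1"; namespace="$2"; ksa="$3"; gsa="$4"
  gsa_email="${gsa}@${project}.iam.gserviceaccount.com"

  # Allow the KSA to impersonate the GSA.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:${project}.svc.id.goog[${namespace}/${ksa}]"

  # Record on the KSA which GSA it maps to.
  kubectl annotate serviceaccount "$ksa" \
    --namespace "$namespace" \
    "iam.gke.io/gcp-service-account=${gsa_email}"
}
```

A call might look like `bind_workload_identity my-project default app-ksa app-gsa`, after which pods running as `app-ksa` can call Google Cloud APIs without a stored key.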
GKE Architecture Overview
flowchart TB
    subgraph Control["Control Plane (Google-managed)"]
        API["API Server"]
        ETCD["etcd"]
        SCHED["Scheduler"]
        CTRL["Controller Manager"]
    end
    subgraph NodePool1["Node Pool 1"]
        N1["Node 1"]
        N2["Node 2"]
    end
    subgraph NodePool2["Node Pool 2 (GPU)"]
        N3["Node 3"]
    end
    API --> N1
    API --> N2
    API --> N3
    N1 --> P1["Pods"]
    N2 --> P2["Pods"]
    N3 --> P3["GPU Pods"]
Control Plane
The control plane runs the Kubernetes core components. In GKE, this is fully managed by Google:
- API Server — The entry point for all Kubernetes API calls (kubectl, gcloud, client libraries)
- etcd — Consistent, highly-available key-value store for all cluster state
- Scheduler — Assigns pods to nodes based on resource requirements and constraints
- Controller Manager — Runs core controllers (deployment, replica set, node lifecycle)
Nodes and Node Pools
Nodes are Compute Engine VMs that run your workloads. Nodes are organized into node pools — groups of nodes with identical configuration (machine type, labels, taints).
Key Insight: You can have multiple node pools in a cluster to handle different workload types — e.g., one pool for general workloads and another for GPU-accelerated workloads.
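As a rough illustration of the multi-pool setup described above, this sketch adds a GPU pool to an existing Standard cluster. The pool name, machine type, accelerator type, and label are illustrative assumptions, and GPU availability differs by zone, so the values may need adjusting for your region.

```shell
# Sketch: add a GPU node pool alongside an existing general-purpose pool.
# Pool name, machine type, accelerator, and label are assumptions for illustration.
# Arguments: CLUSTER_NAME REGION
create_gpu_pool() {
  cluster="$1"; region="$2"
  # With --region, --num-nodes is counted per zone, not per cluster.
  gcloud container node-pools create gpu-pool \
    --cluster "$cluster" \
    --region "$region" \
    --machine-type n1-standard-4 \
    --accelerator type=nvidia-tesla-t4,count=1 \
    --num-nodes 1 \
    --node-labels workload=gpu
}
```

GPU workloads could then target the pool with a `nodeSelector` on the `workload=gpu` label, while everything else stays on the general-purpose pool.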
GKE Cluster Modes
| Feature | Standard | Autopilot |
|---|---|---|
| Node management | You manage node pools | Google manages nodes |
| Billing | Per-VM + $0.10/hr cluster fee | Pod resource requests + $0.10/hr cluster fee |
| Configuration | Full control over node config | Google-optimized defaults |
| Idle node behavior | You configure node pool minimums | Can scale down to zero nodes when no workloads are running |
| Best for | Workloads needing custom node control or unsupported hardware | Most workloads — hands-off |
Tip: Start with Autopilot unless you need custom node configuration, unsupported hardware, OS image choice, or kernel modules.
See Autopilot vs Standard for a detailed comparison.
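To make the difference between the two modes concrete, here is a hedged sketch of creating one cluster of each kind. The function names, machine type, and autoscaling bounds are assumptions chosen for illustration. Note how the Autopilot call carries no node settings at all.

```shell
# Sketch: the two cluster modes side by side. Nothing runs until a function is called.
# Arguments for both: CLUSTER_NAME REGION

# Autopilot: no node configuration; Google provisions nodes to fit pod requests.
create_autopilot_cluster() {
  gcloud container clusters create-auto "$1" --region "$2"
}

# Standard: you choose the machine type and node counts (values here are assumptions).
create_standard_cluster() {
  gcloud container clusters create "$1" \
    --region "$2" \
    --machine-type e2-standard-4 \
    --num-nodes 1 \
    --enable-autoscaling --min-nodes 1 --max-nodes 3
}
```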
Core Concepts at a Glance
| Concept | What It Does | GKE-Specific Notes |
|---|---|---|
| Pod | Smallest deployable unit, one or more containers | Gets an internal IP from the VPC-native alias IP range |
| Deployment | Manages replica sets and rolling updates | Uses the RollingUpdate strategy by default, which allows zero-downtime deploys |
| Service | Stable network endpoint for a set of pods | Integrates with Google Cloud Load Balancing |
| Namespace | Logical cluster partition | Useful for multi-tenant environments |
| ConfigMap | Non-sensitive configuration data | Can be mounted as env vars or volumes |
| Secret | Sensitive data (passwords, keys) | Encrypted at rest by default in GKE |
| Ingress | HTTP(S) routing to services | GKE provides a built-in Ingress controller |
| HPA | Horizontal Pod Autoscaler | Scales pods based on CPU, memory, or custom metrics |
| VPA | Vertical Pod Autoscaler | Adjusts pod resource requests based on usage |
See Core Kubernetes Concepts for deeper coverage.
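Several of the concepts in the table come together in a single Deployment manifest. The sketch below writes one to a temporary directory; the `web` name, the `nginx:1.27` image, and the resource figures are assumptions picked only to make the example complete.

```shell
# Sketch: write a Deployment with explicit resource requests and limits.
# The name, image, and resource values are illustrative assumptions.
workdir="$(mktemp -d)"

cat > "$workdir/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
EOF

echo "Manifest written to $workdir/deployment.yaml"
# Next steps against a real cluster (not run here):
#   kubectl apply -f "$workdir/deployment.yaml"
#   kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=10
```

The commented `kubectl autoscale` line would attach an HPA to the same Deployment. CPU-based scaling relies on the container having a CPU request, which is one reason to set requests explicitly.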
Common GKE Operations
# List clusters
gcloud container clusters list
# Get credentials for kubectl
gcloud container clusters get-credentials CLUSTER_NAME --region REGION
# View cluster details
gcloud container clusters describe CLUSTER_NAME --region REGION
# View node pools
gcloud container node-pools list --cluster=CLUSTER_NAME --region REGION
# Check cluster status
kubectl get nodes
kubectl get pods -A
Pricing Summary
Pricing for us-central1, as of May 2026:
| Component | Cost |
|---|---|
| Cluster management fee | $0.10/hour per cluster. The GKE free tier provides $74.40/month in credits for Autopilot or zonal Standard clusters; it does not apply to regional Standard cluster fees. |
| Standard nodes | Compute Engine VM pricing for each node |
| Autopilot pods | Per running pod resource request: a per-vCPU-hour rate + $0.0049225/GiB-hour of memory + ephemeral storage, for default general-purpose workloads |
| Autopilot Spot pods | Spot prices vary and provide 60-91% discounts for interruptible workloads |
Note: Autopilot still accrues the GKE cluster management fee. For general-purpose Autopilot workloads, pod compute billing is based on the resource requests of pods that are running or being created.
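The figures above can be sanity-checked with a little arithmetic. This sketch assumes a 31-day month (744 hours) and a made-up pod that requests 4 GiB of memory. It covers only the cluster fee and the memory portion of an Autopilot bill, not vCPU or ephemeral storage.

```shell
# Sketch: back-of-the-envelope monthly costs from the rates quoted in this guide.
# Assumptions: a 31-day month (744 hours) and a hypothetical 4 GiB memory request.
hours=744

# Cluster management fee at $0.10/hr.
cluster_fee="$(awk -v h="$hours" 'BEGIN { printf "%.2f", 0.10 * h }')"

# Memory portion of one Autopilot pod at 0.0049225 per GiB-hour.
memory_cost="$(awk -v h="$hours" 'BEGIN { printf "%.2f", 4 * 0.0049225 * h }')"

printf 'Cluster management fee: $%s/month\n' "$cluster_fee"   # prints $74.40/month
printf 'Memory for a 4 GiB pod: $%s/month\n' "$memory_cost"   # prints $14.65/month
```

The first result lines up with the 74.40/month credit figure in the table, which is why a single Autopilot or zonal Standard cluster can have its management fee fully offset.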
Best Practices
| Practice | Why | How |
|---|---|---|
| Use GKE Autopilot for new clusters | Reduces operational overhead and surprise costs | Default cluster mode in the Google Cloud console |
| Enable Workload Identity Federation for GKE | Avoids storing service account keys in pods | gcloud container clusters update CLUSTER_NAME --workload-pool=PROJECT.svc.id.goog |
| Use Regional clusters | Survives single-zone failures | --region REGION instead of --zone ZONE |
| Set resource requests and limits | Ensures fair scheduling and prevents resource starvation | Define resources.requests and resources.limits in pod specs |
| Use namespaces for isolation | Separates environments and enforces resource quotas | kubectl create namespace NAME |
| Enable Binary Authorization | Prevents unverified images from running | Policy-based deployment verification |
| Use Managed Prometheus | Observability without self-managed Prometheus stack | Built into GKE with Cloud Monitoring integration |
| Keep Kubernetes version current | Security patches and feature updates | Enable auto-upgrade on a maintenance window |
| Use PodDisruptionBudgets | Prevents too many pods from being evicted during updates | Define minAvailable or maxUnavailable |
| Store secrets in Secret Manager | More secure than Kubernetes Secrets | Use the Secret Manager CSI driver |
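The PodDisruptionBudget row is easiest to grasp from a concrete manifest. The sketch below writes one to a temporary directory, assuming a workload whose pods carry the hypothetical label `app: web`; the `web-pdb` name and the `minAvailable` value are likewise illustrative.

```shell
# Sketch: a PodDisruptionBudget that keeps at least one matching pod running
# during voluntary disruptions such as node upgrades.
# The app=web label and the web-pdb name are assumptions.
workdir="$(mktemp -d)"

cat > "$workdir/pdb.yaml" <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
  namespace: default
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web
EOF

echo "Manifest written to $workdir/pdb.yaml"
# Apply against a real cluster (not run here):
#   kubectl apply -f "$workdir/pdb.yaml"
```

`minAvailable` and `maxUnavailable` are alternatives, so a PDB sets one or the other. With only a single replica, `minAvailable: 1` would block node drains, so this setting makes sense for workloads running two or more replicas.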
Common Pitfalls and Gotchas
Warning: GKE cluster deletion is irreversible. All workloads, persistent data, and configurations are permanently lost. Always verify before deleting.
| Pitfall | What Happens | How to Avoid |
|---|---|---|
| No resource requests | Pods get scheduled but may be evicted or cause OOM kills | Always set resources.requests and resources.limits |
| Ignoring upgrade notifications | Clusters on deprecated versions may be force-upgraded | Enable auto-upgrade; plan maintenance windows |
| Using latest image tags | Unpredictable deployments, hard to roll back | Use specific version tags or SHA digests |
| Exposing services with LoadBalancer | Each service creates a Cloud Load Balancer (costly) | Use Ingress for HTTP(S); share load balancers |
| Not using namespaces | Hard to manage resources in multi-team clusters | Create namespaces per team or environment |
| Over-provisioning node pools | Paying for idle compute | Use cluster autoscaler or switch to Autopilot |
| Storing secrets in ConfigMaps | ConfigMaps are meant for non-sensitive data and are usually readable by more workloads and users | Use Kubernetes Secrets or Secret Manager |
| Ignoring network policies | All pods can communicate by default | Enable Network Policy and define restrictive rules |
| Single-zone clusters | Entire cluster is down if the zone fails | Use regional clusters for production |
| Not setting up PodDisruptionBudgets | Updates can take down too many pods at once | Define PDBs for critical workloads |
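For the network policy pitfall, a common starting point is a default-deny rule for one namespace, with explicit allow rules added afterwards. The sketch below writes such a policy; the policy name and the choice of the `default` namespace are assumptions, and it only takes effect on clusters where network policy enforcement is enabled.

```shell
# Sketch: deny all ingress traffic to every pod in one namespace.
# Only enforced on clusters with network policy enforcement enabled.
# The policy name and namespace are illustrative assumptions.
workdir="$(mktemp -d)"

cat > "$workdir/default-deny-ingress.yaml" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
EOF

echo "Manifest written to $workdir/default-deny-ingress.yaml"
# Apply against a real cluster (not run here):
#   kubectl apply -f "$workdir/default-deny-ingress.yaml"
```

The empty `podSelector` matches every pod in the namespace, and listing `Ingress` with no ingress rules means no inbound traffic is allowed until another policy permits it.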
Guide Overview
This guide covers GKE from fundamentals to production best practices:
| Topic | Description |
|---|---|
| Creating a GKE Cluster | Step-by-step cluster creation with gcloud and console |
| Autopilot vs Standard | Cluster mode comparison and decision guide |
| Core Kubernetes Concepts | Pods, Deployments, ReplicaSets, and their relationships |
| Nodes and Node Pools | Node architecture, management, and configuration |
| Services and Load Balancing | Service types, Ingress, and traffic management |
| Scaling | Manual scaling, autoscaling, HPA, and VPA |
| ConfigMaps and Secrets | Configuration and sensitive data management |
| Namespaces and Service Discovery | Resource isolation and DNS-based service communication |
TL;DR
- GKE is a managed Kubernetes service — Google handles the control plane and can manage nodes (Autopilot)
- Start with Autopilot unless you need custom node configuration
- Always set resource requests and limits on pods
- Use regional clusters for production workloads
- Enable Workload Identity Federation for GKE instead of storing service account keys
- GKE includes auto-upgrade, auto-repair, integrated logging/monitoring, and security features out of the box