Google Kubernetes Engine (GKE) is Google Cloud’s managed Kubernetes service. It runs containerized applications without requiring you to install and operate your own Kubernetes control plane. GKE automates cluster management, scaling, and security, letting you focus on application development.

How GKE Fits In

flowchart LR
    A[Your Application] --> B[Container Image]
    B --> C[GKE Cluster]
    C --> D[Google Infrastructure]
    D --> E[Users]

    subgraph "GKE manages"
        C
    end

GKE eliminates the operational overhead of running Kubernetes yourself. Google manages the control plane (API server, scheduler, etcd, controller manager), and depending on your cluster mode, can also manage the worker nodes.

| Responsibility | Self-managed K8s | GKE Standard | GKE Autopilot |
| --- | --- | --- | --- |
| Control plane | You | Google | Google |
| Node OS patches | You | You | Google |
| Node scaling | You | Configurable | Automatic |
| Security hardening | You | Partial | Default |
| Billing model | N/A | Per VM + cluster management fee | Pod resource request + cluster management fee |

Key Capabilities

  • Auto-upgrade — Control plane and nodes can be upgraded automatically on a schedule you control
  • Auto-repair — Unhealthy nodes are automatically detected and replaced
  • Built-in logging & monitoring — Integration with Cloud Logging and Cloud Monitoring via Managed Prometheus
  • Workload Identity Federation for GKE — Securely associate Kubernetes service accounts with IAM service accounts
  • Binary Authorization — Enforce deploy-time policies on container images (signed, verified)
  • Shielded GKE nodes — Secure boot, integrity monitoring, encrypted boot disk
  • GKE Enterprise (formerly Anthos) — Multi-cluster management, fleet-level policy, hybrid cloud
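
As a minimal sketch of the Workload Identity Federation for GKE capability listed above, the snippet below writes a Kubernetes ServiceAccount manifest carrying the `iam.gke.io/gcp-service-account` annotation. The names `app-ksa`, `app-gsa`, `PROJECT_ID`, and the file name `ksa.yaml` are placeholders assumed for illustration. The IAM service account also needs a `roles/iam.workloadIdentityUser` binding for the Kubernetes service account, which is not shown here.

```shell
# Placeholder values, assumed for illustration; replace with your own.
KSA_NAME="app-ksa"
GSA_EMAIL="app-gsa@PROJECT_ID.iam.gserviceaccount.com"

# Write a ServiceAccount manifest annotated with the IAM service account
# that pods using this Kubernetes service account should act as.
cat > ksa.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${KSA_NAME}
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: ${GSA_EMAIL}
EOF

# Apply it once kubectl points at the cluster:
#   kubectl apply -f ksa.yaml
```

Pods opt in by setting `serviceAccountName` to the annotated service account, so no key file has to be mounted into the container.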

GKE Architecture Overview

flowchart TB
    subgraph Control["Control Plane (Google-managed)"]
        API["API Server"]
        ETCD["etcd"]
        SCHED["Scheduler"]
        CTRL["Controller Manager"]
    end

    subgraph NodePool1["Node Pool 1"]
        N1["Node 1"]
        N2["Node 2"]
    end

    subgraph NodePool2["Node Pool 2 (GPU)"]
        N3["Node 3"]
    end

    API --> N1
    API --> N2
    API --> N3

    N1 --> P1["Pods"]
    N2 --> P2["Pods"]
    N3 --> P3["GPU Pods"]

Control Plane

The control plane runs the Kubernetes core components. In GKE, this is fully managed by Google:

  • API Server — The entry point for all Kubernetes API calls (kubectl, gcloud, client libraries)
  • etcd — Consistent, highly-available key-value store for all cluster state
  • Scheduler — Assigns pods to nodes based on resource requirements and constraints
  • Controller Manager — Runs core controllers (deployment, replica set, node lifecycle)

Nodes and Node Pools

Nodes are Compute Engine VMs that run your workloads. Nodes are organized into node pools — groups of nodes with identical configuration (machine type, labels, taints).

Key Insight: You can have multiple node pools in a cluster to handle different workload types — e.g., one pool for general workloads and another for GPU-accelerated workloads.
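
To illustrate how a workload can be steered to a particular pool, the sketch below writes a Pod manifest that selects nodes by the `cloud.google.com/gke-nodepool` label, which GKE sets on each node. The pool name `gpu-pool`, the Pod name, and the image path are hypothetical, and a Standard-mode cluster with a GPU node pool is assumed.

```shell
# Assumed names for illustration: a node pool called "gpu-pool" and a
# placeholder image path; neither comes from this guide.
POOL_NAME="gpu-pool"
IMAGE="us-docker.pkg.dev/PROJECT_ID/REPO/trainer:1.0.0"

# Write a Pod manifest that is only schedulable on nodes in the chosen pool
# and that requests one GPU.
cat > gpu-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: ${POOL_NAME}
  containers:
  - name: trainer
    image: ${IMAGE}
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# Apply with: kubectl apply -f gpu-pod.yaml
```

If the pool carries taints, the Pod also needs matching tolerations before the scheduler will place it there.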

GKE Cluster Modes

| Feature | Standard | Autopilot |
| --- | --- | --- |
| Node management | You manage node pools | Google manages nodes |
| Billing | Per-VM + $0.10/hr cluster fee | Pod resource requests + $0.10/hr cluster fee |
| Configuration | Full control over node config | Google-optimized defaults |
| Idle node behavior | You configure node pool minimums | Can scale down to zero nodes when no workloads are running |
| Best for | Workloads needing custom node control or unsupported hardware | Most workloads (hands-off) |

Tip: Start with Autopilot unless you need custom node configuration, hardware that Autopilot does not support, a specific OS image, or custom kernel modules.

See Autopilot vs Standard for a detailed comparison.

Core Concepts at a Glance

| Concept | What It Does | GKE-Specific Notes |
| --- | --- | --- |
| Pod | Smallest deployable unit, one or more containers | Gets an internal IP from the VPC-native alias IP range |
| Deployment | Manages replica sets and rolling updates | Rolling update is the default strategy, which supports zero-downtime deploys |
| Service | Stable network endpoint for a set of pods | Integrates with Google Cloud Load Balancing |
| Namespace | Logical cluster partition | Useful for multi-tenant environments |
| ConfigMap | Non-sensitive configuration data | Can be exposed as env vars or mounted as volumes |
| Secret | Sensitive data (passwords, keys) | Encrypted at rest by default in GKE |
| Ingress | HTTP(S) routing to services | GKE provides a built-in Ingress controller |
| HPA | Horizontal Pod Autoscaler | Scales pods based on CPU, memory, or custom metrics |
| VPA | Vertical Pod Autoscaler | Adjusts pod resource requests based on usage |
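
The Deployment and HPA rows above can be sketched together as follows. This is an illustrative example rather than a recommended configuration: the name `web`, the `nginx:1.27` image, the replica bounds, and the 70% CPU target are all assumptions. The HPA computes utilization against the CPU request, which is why the container sets one.

```shell
# Write a Deployment with explicit requests/limits and an HPA that targets it.
# All names and numbers here are illustrative assumptions.
cat > web.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 256Mi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
EOF

# Apply with: kubectl apply -f web.yaml
```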

See Core Kubernetes Concepts for deeper coverage.

Common GKE Operations

# List clusters
gcloud container clusters list
 
# Get credentials for kubectl
gcloud container clusters get-credentials CLUSTER_NAME --region REGION
 
# View cluster details
gcloud container clusters describe CLUSTER_NAME --region REGION
 
# View node pools
gcloud container node-pools list --cluster=CLUSTER_NAME --region REGION
 
# Check cluster status
kubectl get nodes
kubectl get pods -A

Pricing Summary

Prices below are for us-central1 as of May 2026:

| Component | Cost |
| --- | --- |
| Cluster management fee | $0.10/hour per cluster, offset by $74.40/month in credits for Autopilot or zonal Standard clusters; the credits do not apply to regional Standard cluster fees. |
| Standard nodes | Compute Engine VM pricing for each node |
| Autopilot pods | Billed per running pod resource request: a per-vCPU-hour rate + $0.0049225/GiB-hour of memory + ephemeral storage, for default general-purpose workloads |
| Autopilot Spot pods | Spot prices vary and provide 60-91% discounts for interruptible workloads |

Note: Autopilot still accrues the GKE cluster management fee. For general-purpose Autopilot workloads, pod compute billing is based on the resource requests in running or creating pods.
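
As a rough worked example of the arithmetic, the sketch below uses the $0.10/hour cluster fee and the 0.0049225 per GiB-hour rate quoted in this guide. The hour counts (744 for a 31-day month, 730 as an average month) and the 2 GiB request are assumptions, and the GiB-hour rate is treated as the memory rate. vCPU and ephemeral storage charges are left out.

```shell
# Rates quoted in this guide for us-central1.
CLUSTER_FEE_PER_HOUR=0.10
MEMORY_PER_GIB_HOUR=0.0049225

# Monthly cluster management fee; defaults to a 31-day month (744 hours).
cluster_fee_month() {
  awk -v rate="$CLUSTER_FEE_PER_HOUR" -v hours="${1:-744}" \
    'BEGIN { printf "%.2f\n", rate * hours }'
}

# Memory portion of Autopilot pod billing: requested GiB x hours x rate.
autopilot_memory_cost() {
  awk -v gib="$1" -v hours="$2" -v rate="$MEMORY_PER_GIB_HOUR" \
    'BEGIN { printf "%.2f\n", gib * hours * rate }'
}

cluster_fee_month            # prints 74.40
autopilot_memory_cost 2 730  # prints 7.19
```

The first figure matches the $74.40 monthly credit: one cluster running for a full 31-day month accrues exactly that much in management fees.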

Best Practices

| Practice | Why | How |
| --- | --- | --- |
| Use GKE Autopilot for new clusters | Reduces operational overhead and surprise costs | Default cluster mode in the Google Cloud console |
| Enable Workload Identity Federation for GKE | Avoids storing service account keys in pods | gcloud container clusters update CLUSTER_NAME --workload-pool=PROJECT_ID.svc.id.goog |
| Use regional clusters | Survives single-zone failures | --region REGION instead of --zone ZONE |
| Set resource requests and limits | Ensures fair scheduling and prevents resource starvation | Define resources.requests and resources.limits in pod specs |
| Use namespaces for isolation | Separates environments and enforces resource quotas | kubectl create namespace NAME |
| Enable Binary Authorization | Prevents unverified images from running | Policy-based deployment verification |
| Use Managed Prometheus | Observability without a self-managed Prometheus stack | Built into GKE with Cloud Monitoring integration |
| Keep Kubernetes version current | Security patches and feature updates | Enable auto-upgrade with a maintenance window |
| Use PodDisruptionBudgets | Prevents too many pods from being evicted during updates | Define minAvailable or maxUnavailable |
| Store secrets in Secret Manager | More secure than Kubernetes Secrets | Use the Secret Manager CSI driver |
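
The PodDisruptionBudget practice above might look like the following minimal sketch. The name `web-pdb`, the `app: web` label, and the `minAvailable: 1` value are assumptions for illustration. The selector has to match the labels on the pods you want to protect.

```shell
# Write a PodDisruptionBudget that keeps at least one matching pod running
# during voluntary disruptions such as node upgrades.
cat > pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web
EOF

# Apply with: kubectl apply -f pdb.yaml
```

Use either `minAvailable` or `maxUnavailable` in a single budget, not both.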

Common Pitfalls and Gotchas

Warning: GKE cluster deletion is irreversible. All workloads, persistent data, and configurations are permanently lost. Always verify before deleting.

| Pitfall | What Happens | How to Avoid |
| --- | --- | --- |
| No resource requests | Pods get scheduled but may be evicted or cause OOM kills | Always set resources.requests and resources.limits |
| Ignoring upgrade notifications | Clusters on deprecated versions may be force-upgraded | Enable auto-upgrade; plan maintenance windows |
| Using latest image tags | Unpredictable deployments, hard to roll back | Use specific version tags or SHA digests |
| Exposing every service with type LoadBalancer | Each Service creates its own Cloud Load Balancer (costly) | Use Ingress for HTTP(S); share load balancers |
| Not using namespaces | Hard to manage resources in multi-team clusters | Create namespaces per team or environment |
| Over-provisioning node pools | Paying for idle compute | Use cluster autoscaler or switch to Autopilot |
| Storing secrets in ConfigMaps | ConfigMaps are meant for non-sensitive data and are usually readable by more workloads and users | Use Kubernetes Secrets or Secret Manager |
| Ignoring network policies | All pods can communicate by default | Enable Network Policy and define restrictive rules |
| Single-zone clusters | Entire cluster is down if the zone fails | Use regional clusters for production |
| Not setting up PodDisruptionBudgets | Updates can take down too many pods at once | Define PDBs for critical workloads |
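
For the network policy pitfall, a common starting point is a default-deny rule for incoming traffic, sketched below. The policy name and the `default` namespace are assumptions, and the policy only takes effect when network policy enforcement is enabled on the cluster.

```shell
# Write a NetworkPolicy that selects every pod in the namespace and allows
# no ingress, so traffic must be opened up by additional, narrower policies.
cat > default-deny.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF

# Apply with: kubectl apply -f default-deny.yaml
```

The empty `podSelector` matches all pods in the namespace; listing `Ingress` under `policyTypes` without any ingress rules is what blocks the traffic.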

Guide Overview

This guide covers GKE from fundamentals to production best practices:

| Topic | Description |
| --- | --- |
| Creating a GKE Cluster | Step-by-step cluster creation with gcloud and console |
| Autopilot vs Standard | Cluster mode comparison and decision guide |
| Core Kubernetes Concepts | Pods, Deployments, ReplicaSets, and their relationships |
| Nodes and Node Pools | Node architecture, management, and configuration |
| Services and Load Balancing | Service types, Ingress, and traffic management |
| Scaling | Manual scaling, autoscaling, HPA, and VPA |
| ConfigMaps and Secrets | Configuration and sensitive data management |
| Namespaces and Service Discovery | Resource isolation and DNS-based service communication |

TL;DR

  • GKE is a managed Kubernetes service — Google handles the control plane and can manage nodes (Autopilot)
  • Start with Autopilot unless you need custom node configuration
  • Always set resource requests and limits on pods
  • Use regional clusters for production workloads
  • Enable Workload Identity Federation for GKE instead of storing service account keys
  • GKE includes auto-upgrade, auto-repair, integrated logging/monitoring, and security features out of the box

Resources