Nodes are the Compute Engine VMs that run your Kubernetes workloads. In GKE, nodes are organized into node pools — groups of nodes with identical configuration. Understanding node pools is essential for optimizing cost, performance, and reliability.
Node Architecture
flowchart TB
  subgraph Node["GKE Node (Compute Engine VM)"]
    KUBELET["kubelet"]
    KUBEPROXY["kube-proxy"]
    CRI["Container Runtime (containerd)"]
    OS["Node OS (COS / Ubuntu)"]
    subgraph Pods["Running Pods"]
      P1["Pod 1"]
      P2["Pod 2"]
      P3["Pod 3"]
    end
  end
  CP["Control Plane"] --> KUBELET
  KUBELET --> CRI
  CRI --> Pods
  KUBEPROXY --> Pods
Each node runs several critical components:
| Component | Role |
|---|---|
| kubelet | Agent that manages pod lifecycle on the node, reports status to control plane |
| kube-proxy | Maintains network rules for Service routing on the node |
| Container runtime | Runs containers (containerd in GKE) |
| Node OS | Container-Optimized OS (COS) or Ubuntu — managed by GKE auto-upgrade |
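To check these components on a live cluster, each node's reported `nodeInfo` can be listed with kubectl. This is a minimal sketch that assumes kubectl is already configured for the cluster; the function name `show_node_components` is made up for illustration.

```shell
#!/usr/bin/env bash
# show_node_components: hypothetical helper that prints, per node, the kubelet
# version, container runtime, and OS image reported in .status.nodeInfo.
show_node_components() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion,RUNTIME:.status.nodeInfo.containerRuntimeVersion,OS:.status.nodeInfo.osImage'
}

# Usage (requires cluster access):
#   show_node_components
```

On GKE the RUNTIME column is expected to start with `containerd://`, matching the table above.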
Node Pools
A node pool is a group of nodes within a cluster that share the same configuration:
- Machine type (CPU, memory)
- OS image
- Labels and taints
- Disk type and size
- GPU attachments
- Network tags
flowchart TB
  subgraph Cluster["GKE Cluster"]
    subgraph NP1["Default Pool (e2-medium)"]
      N1["Node 1"]
      N2["Node 2"]
      N3["Node 3"]
    end
    subgraph NP2["GPU Pool (n1-standard-4 + T4)"]
      N4["Node 4"]
      N5["Node 5"]
    end
    subgraph NP3["Spot Pool (e2-medium, Spot)"]
      N6["Node 6"]
      N7["Node 7"]
    end
  end
Key Insight: Standard clusters create a first node pool, commonly called the default node pool, unless you explicitly remove or skip it. You can add more pools for different workload types. In Autopilot, Google manages nodes for you — you define workload resource requests instead of managing node pools.
Default Node Pool vs Additional Pools
| Aspect | Default Pool | Additional Pools |
|---|---|---|
| Created with cluster | Yes | No — added separately |
| Can be deleted | Yes (but cluster needs at least one pool) | Yes |
| Configuration | Set during cluster creation | Independent configuration |
| Workload targeting | Generic workloads | Use taints/tolerations or node selectors |
Managing Node Pools
Creating Node Pools
# Add a node pool to an existing Standard cluster
gcloud container node-pools create gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --num-nodes=2 \
  --spot \
  --enable-autoupgrade \
  --enable-autorepair \
  --node-labels=workload=gpu,gpu-type=t4 \
  --node-taints=nvidia.com/gpu=present:NoSchedule
Key Node Pool Flags
| Flag | Purpose | Example |
|---|---|---|
| --machine-type | VM type for nodes | e2-medium, n1-standard-4 |
| --num-nodes | Initial node count per zone | 3 |
| --disk-type | Boot disk type | pd-ssd, pd-balanced, pd-standard |
| --disk-size | Boot disk size | 100GB |
| --image-type | Node OS image | COS_CONTAINERD, UBUNTU_CONTAINERD |
| --spot | Use Spot VMs (cheaper, can be reclaimed at any time) | — |
| --preemptible | Use Preemptible VMs (24-hour maximum lifetime) | — |
| --accelerator | Attach GPUs | type=nvidia-tesla-t4,count=1 |
| --node-labels | Labels for scheduling | workload=gpu |
| --node-taints | Taints to repel non-matching pods | nvidia.com/gpu=present:NoSchedule |
| --enable-autoupgrade | Auto-upgrade node OS and Kubernetes version | — |
| --enable-autorepair | Auto-replace unhealthy nodes | — |
| --max-pods-per-node | Limit pods per node | 110 (default) |
| --tags | Network tags for firewall rules | backend,ssh-allowed |
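Because `--num-nodes` is counted per zone, the total size of a pool depends on how many zones it spans. The arithmetic can be sketched as follows; the function name is illustrative, and the three-zone case assumes a pool that spans three zones.

```shell
#!/usr/bin/env bash
# total_pool_nodes PER_ZONE ZONES: nodes created = per-zone count x number of zones.
total_pool_nodes() {
  local per_zone="$1" zones="$2"
  echo $(( per_zone * zones ))
}

total_pool_nodes 3 1   # single-zone pool with --num-nodes=3, prints 3
total_pool_nodes 3 3   # pool spanning three zones with --num-nodes=3, prints 9
```

Keeping this multiplication in mind helps avoid creating three times as many nodes as intended.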
Listing and Inspecting Node Pools
# List all node pools in a cluster
gcloud container node-pools list --cluster=my-cluster --zone=us-central1-a
# Describe a specific node pool
gcloud container node-pools describe gpu-pool --cluster=my-cluster --zone=us-central1-a
# View nodes with their pool membership
kubectl get nodes -o wide
# View nodes in a specific pool
kubectl get nodes -l cloud.google.com/gke-nodepool=gpu-pool
Resizing and Deleting Node Pools
# Resize a node pool
gcloud container clusters resize my-cluster \
  --node-pool=gpu-pool \
  --zone=us-central1-a \
  --num-nodes=5
# Delete a node pool (pods will be evicted)
gcloud container node-pools delete gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a
Warning: Deleting a node pool evicts all pods running on those nodes. Ensure pods can be rescheduled elsewhere before deleting a pool.
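One way to move pods off a pool in a controlled way before deleting it is to cordon and then drain its nodes. The helper below is a sketch, not a prescribed procedure: the function name `drain_pool` and the 300-second timeout are assumptions, and it presumes another pool has room for the evicted pods.

```shell
#!/usr/bin/env bash
# drain_pool POOL: cordon every node in the pool, then drain them one at a time.
drain_pool() {
  local pool="$1" selector nodes node
  selector="cloud.google.com/gke-nodepool=${pool}"
  # Collect node names up front, in the form "node/<name>".
  nodes="$(kubectl get nodes -l "$selector" -o name)" || return 1
  # Cordon first so evicted pods are not rescheduled onto the same pool.
  for node in $nodes; do
    kubectl cordon "$node" || return 1
  done
  # Drain respects PodDisruptionBudgets and leaves DaemonSet pods in place.
  for node in $nodes; do
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s || return 1
  done
}

# Usage (requires cluster access):
#   drain_pool gpu-pool
```

After the drain completes, the pool can be deleted with the `gcloud container node-pools delete` command shown above.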
Machine Type Selection
Common machine types for GKE nodes:
| Machine Type | vCPUs | Memory | Use Case |
|---|---|---|---|
| e2-medium | 2 (shared-core) | 4 GB | Development, small workloads |
| e2-standard-4 | 4 | 16 GB | General-purpose production |
| e2-standard-8 | 8 | 32 GB | Medium production workloads |
| e2-highmem-4 | 4 | 32 GB | Memory-intensive applications |
| e2-highcpu-8 | 8 | 8 GB | CPU-intensive batch processing |
| n1-standard-4 | 4 | 15 GB | GPU-attached workloads |
| n2d-standard-16 | 16 | 64 GB | AMD-based compute workloads |
| t2a-standard-4 | 4 | 16 GB | Arm-based workloads (cost-effective) |
Tip: Use E2 machine types for most general-purpose workloads. For accelerator workloads, choose a machine series and zone that support the GPU or TPU you need. Use T2A (Arm) for further cost savings if your containers support Arm.
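Machine series and GPU availability vary by zone, so it can help to query them before creating a pool. The two helpers below are a sketch with made-up function names; they assume the gcloud CLI is installed and authenticated, and they use gcloud's generic `--filter` and `--format` options.

```shell
#!/usr/bin/env bash
# list_machine_types ZONE SERIES: show vCPU count and memory (MB) for one
# machine series in a zone, e.g. list_machine_types us-central1-a e2
list_machine_types() {
  local zone="$1" series="$2"
  gcloud compute machine-types list \
    --filter="zone:( ${zone} ) AND name~'^${series}-'" \
    --format="table(name,guestCpus,memoryMb)"
}

# list_accelerators ZONE: show which accelerator types the zone offers.
list_accelerators() {
  local zone="$1"
  gcloud compute accelerator-types list --filter="zone:( ${zone} )"
}

# Usage (requires gcloud access to a project):
#   list_machine_types us-central1-a e2
#   list_accelerators us-central1-a
```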
Scheduling Workloads to Specific Node Pools
Node Selectors
The simplest way to target pods to specific nodes:
spec:
  nodeSelector:
    workload: gpu # matches --node-labels=workload=gpu
Taints and Tolerations
Taints repel pods that don’t tolerate them. This is how you reserve a GPU pool for GPU workloads:
# Node pool created with: --node-taints=nvidia.com/gpu=present:NoSchedule
# Pod that tolerates the taint (can be scheduled on GPU nodes)
spec:
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "present"
      effect: NoSchedule
Taint Effects
| Effect | Behavior |
|---|---|
| NoSchedule | New pods are not scheduled on the node unless they have a matching toleration |
| PreferNoSchedule | Scheduler tries to avoid the node, but will use it if needed |
| NoExecute | New pods are not scheduled, and running pods without a matching toleration are evicted |
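Key Insight: Taints + tolerations keep other pods off a pool (GPU, Spot), but a toleration alone does not pull a pod onto that pool. Node selectors do the opposite: they require a pod to land on matching nodes but do not keep other pods away. Use both together for strict targeting.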
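Combining the two mechanisms in one manifest might look like the sketch below. It reuses the label and taint from the `gpu-pool` example; the pod name, file name, and image are placeholders rather than values from a real project.

```shell
#!/usr/bin/env bash
# Write a Pod manifest that both tolerates the GPU taint and selects the GPU label.
cat > gpu-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                # hypothetical name
spec:
  nodeSelector:
    workload: gpu              # requires nodes labeled workload=gpu
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "present"
      effect: NoSchedule       # lets the pod past the pool's taint
  containers:
    - name: main
      image: example.com/my-team/gpu-job:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1    # request one GPU
EOF

# Apply it once kubectl points at the cluster:
#   kubectl apply -f gpu-pod.yaml
```

With the selector the pod can only run on the GPU pool, and with the taint only pods like this one can run there.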
Node Affinity (Advanced)
For more complex scheduling rules:
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - gpu-pool
                  - high-mem-pool
Spot and Preemptible VMs
Reduce costs by using short-lived VMs for fault-tolerant workloads:
| Feature | Spot VMs | Preemptible VMs |
|---|---|---|
| Max lifetime | No fixed limit (until reclaimed) | 24 hours max |
| Reclaim notice | 30-second warning via metadata | 30-second warning via metadata |
| Availability | Subject to capacity | Subject to capacity |
| Pricing discount | Up to 91% off on-demand | Up to 91% off on-demand |
| Recommended | Yes (newer, more flexible) | No (legacy, use Spot instead) |
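# Create a Spot node pool
# GKE labels Spot nodes with cloud.google.com/gke-spot=true automatically,
# so no --node-labels flag is needed for that label.
gcloud container node-pools create spot-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --num-nodes=3 \
  --spot
Warning: Spot nodes can be reclaimed at any time. Only run fault-tolerant, interruptible workloads on them (batch jobs, CI/CD, stateless workers). A PodDisruptionBudget limits voluntary disruptions such as node drains and upgrades, but it does not stop a Spot reclaim, so also run enough replicas or retries to absorb lost nodes.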
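A workload on Spot nodes should react to SIGTERM within the short reclaim notice. The worker below is a sketch of that pattern for a container whose entrypoint is a shell script; the checkpoint path, function names, and the `sleep` stand-in for real work are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical batch worker that stops cleanly when the node is reclaimed.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-/tmp/worker.checkpoint}"
stop_requested=0

# On SIGTERM, only set a flag so the current item can finish first.
on_term() { stop_requested=1; }

# run_worker [MAX_ITEMS]: process items until done or until SIGTERM arrives.
run_worker() {
  local item=0 max_items="${1:-1000000}"
  trap on_term TERM
  while [ "$stop_requested" -eq 0 ] && [ "$item" -lt "$max_items" ]; do
    item=$((item + 1))
    sleep 0.1                          # stand-in for real work on one item
    echo "$item" > "$CHECKPOINT_FILE"  # record progress so a retry can resume
  done
  echo "stopped after item ${item}"
}

# Usage as a container entrypoint:
#   run_worker
```

Keeping each unit of work much shorter than the reclaim notice gives the loop time to write its last checkpoint before the node goes away.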
Node Best Practices
| Practice | Why | How |
|---|---|---|
| Use E2 machine types | Best price-performance for most workloads | --machine-type=e2-standard-4 |
| Separate workloads by pool | Different hardware needs, cost optimization | Create dedicated pools with taints |
| Use Spot pools for batch workloads | Up to 91% cost savings | --spot flag + fault-tolerant pods |
| Enable auto-upgrade | Security patches without manual intervention | --enable-autoupgrade |
| Enable auto-repair | Unhealthy nodes replaced automatically | --enable-autorepair |
| Set resource requests/limits | Ensures fair scheduling and prevents noisy neighbors | Define in pod specs |
| Use PodDisruptionBudgets | Protect critical workloads during node drains | Define minAvailable or maxUnavailable |
| Monitor node resource usage | Detect under/over-provisioning | kubectl top nodes + Cloud Monitoring |
| Use Shielded GKE nodes | Secure boot and integrity monitoring | --enable-shielded-nodes (default) |
| Use containerd OS images | Required for modern GKE features | COS_CONTAINERD or UBUNTU_CONTAINERD |
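As an illustration of the PodDisruptionBudget practice in the table, the sketch below writes a minimal manifest. The name `web-pdb`, the `app: web` label, and the value `2` are hypothetical; pick values that match your own Deployment and replica count.

```shell
#!/usr/bin/env bash
# Write a PodDisruptionBudget that keeps at least two matching pods running
# while nodes are drained (for example during upgrades).
cat > web-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb          # hypothetical name
spec:
  minAvailable: 2        # alternatively, set maxUnavailable instead
  selector:
    matchLabels:
      app: web           # hypothetical label on the protected pods
EOF

# Apply and inspect once kubectl points at the cluster:
#   kubectl apply -f web-pdb.yaml
#   kubectl get pdb web-pdb
```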
Common Pitfalls
| Pitfall | Consequence | Fix |
|---|---|---|
| All workloads on default pool | Cannot scale or cost-optimize independently | Create separate pools per workload type |
| No taints on GPU pool | Non-GPU pods scheduled on expensive GPU nodes | Use taints + tolerations |
| Over-provisioned nodes | Paying for idle resources | Right-size based on kubectl top nodes |
| No Spot fallback | Batch jobs fail when Spot VMs are reclaimed | Design for interruption (retries, checkpoints) and keep a non-Spot pool for work that must finish |
| Ignoring auto-repair | Unhealthy nodes stay in cluster indefinitely | Enable auto-repair on all pools |
| Using Preemptible instead of Spot | Fixed 24-hour termination, less flexible | Use Spot VMs (--spot) for new pools |
TL;DR
- Nodes are Compute Engine VMs; node pools group nodes with identical configuration
- Use multiple node pools to separate workloads by hardware need (GPU, Spot, high-memory)
- Target workloads to pools with taints + tolerations (keep other pods out) and node selectors (pull pods in)
- Use Spot VMs for fault-tolerant workloads to save up to 91%
- Always enable auto-upgrade and auto-repair on node pools
- In Autopilot, Google manages nodes — you define pod resource requests instead of node pools