Nodes and Node Pools

Nodes are the Compute Engine VMs that run your Kubernetes workloads. In GKE, nodes are organized into node pools — groups of nodes with identical configuration. Understanding node pools is essential for optimizing cost, performance, and reliability.

Node Architecture

flowchart TB
    subgraph Node["GKE Node (Compute Engine VM)"]
        KUBELET["kubelet"]
        KUBEPROXY["kube-proxy"]
        CRI["Container Runtime (containerd)"]
        OS["Node OS (COS / Ubuntu)"]

        subgraph Pods["Running Pods"]
            P1["Pod 1"]
            P2["Pod 2"]
            P3["Pod 3"]
        end
    end

    CP["Control Plane"] --> KUBELET
    KUBELET --> CRI
    CRI --> Pods
    KUBEPROXY --> Pods

Each node runs several critical components:

Component	Role
kubelet	Agent that manages pod lifecycle on the node, reports status to control plane
kube-proxy	Maintains network rules for Service routing on the node
Container runtime	Runs containers (containerd in GKE)
Node OS	Container-Optimized OS (COS) or Ubuntu — managed by GKE auto-upgrade

Node Pools

A node pool is a group of nodes within a cluster that share the same configuration:

Machine type (CPU, memory)
OS image
Labels and taints
Disk type and size
GPU attachments
Network tags

flowchart TB
    subgraph Cluster["GKE Cluster"]
        subgraph NP1["Default Pool (e2-medium)"]
            N1["Node 1"]
            N2["Node 2"]
            N3["Node 3"]
        end

        subgraph NP2["GPU Pool (n1-standard-4 + T4)"]
            N4["Node 4"]
            N5["Node 5"]
        end

        subgraph NP3["Spot Pool (e2-medium, Spot)"]
            N6["Node 6"]
            N7["Node 7"]
        end
    end

Key Insight: Standard clusters create a first node pool, commonly called the default node pool, unless you explicitly remove or skip it. You can add more pools for different workload types. In Autopilot, Google manages nodes for you — you define workload resource requests instead of managing node pools.
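
As a rough illustration of the Autopilot model, the manifest below declares only what the container needs and never references a node pool. The Deployment name, labels, image, and request sizes are placeholders chosen for this sketch, not values from a real cluster.

```yaml
# Hypothetical Deployment for an Autopilot cluster: no node pool is referenced.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
          resources:
            requests:            # Autopilot sizes capacity from these requests
              cpu: 500m
              memory: 1Gi
```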

Default Node Pool vs Additional Pools

Aspect	Default Pool	Additional Pools
Created with cluster	Yes	No — added separately
Can be deleted	Yes (keep at least one pool so system pods have somewhere to run)	Yes
Configuration	Set during cluster creation	Independent configuration
Workload targeting	Generic workloads	Use taints/tolerations or node selectors

Managing Node Pools

Creating Node Pools

# Add a GPU node pool running on Spot VMs to an existing Standard cluster
gcloud container node-pools create gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --num-nodes=2 \
  --spot \
  --enable-autoupgrade \
  --enable-autorepair \
  --node-labels=workload=gpu,gpu-type=t4 \
  --node-taints=nvidia.com/gpu=present:NoSchedule

Key Node Pool Flags

Flag	Purpose	Example
`--machine-type`	VM type for nodes	`e2-medium`, `n1-standard-4`
`--num-nodes`	Number of nodes to create in each of the pool's zones	`3` (9 nodes if the pool spans three zones)
`--disk-type`	Boot disk type	`pd-ssd`, `pd-balanced`, `pd-standard`
`--disk-size`	Boot disk size	`100GB`
`--image-type`	Node OS image	`COS_CONTAINERD`, `UBUNTU_CONTAINERD`
`--spot`	Use Spot VMs (cheaper, can be preempted at any time)	—
`--preemptible`	Use Preemptible VMs (24hr max)	—
`--accelerator`	Attach GPUs	`type=nvidia-tesla-t4,count=1`
`--node-labels`	Labels for scheduling	`workload=gpu`
`--node-taints`	Taints to repel non-matching pods	`nvidia.com/gpu=present:NoSchedule`
`--enable-autoupgrade`	Auto-upgrade node OS and K8s	—
`--enable-autorepair`	Auto-replace unhealthy nodes	—
`--max-pods-per-node`	Limit pods per node	`110` (default)
`--tags`	Network tags for firewall rules	`backend,ssh-allowed`

Listing and Inspecting Node Pools

# List all node pools in a cluster
gcloud container node-pools list --cluster=my-cluster --zone=us-central1-a
 
# Describe a specific node pool
gcloud container node-pools describe gpu-pool --cluster=my-cluster --zone=us-central1-a
 
# View nodes with their pool membership (adds a column for the node pool label)
kubectl get nodes -L cloud.google.com/gke-nodepool
 
# View nodes in a specific pool
kubectl get nodes -l cloud.google.com/gke-nodepool=gpu-pool

Resizing and Deleting Node Pools

# Resize a node pool
gcloud container clusters resize my-cluster \
  --node-pool=gpu-pool \
  --zone=us-central1-a \
  --num-nodes=5
 
# Delete a node pool (pods will be evicted)
gcloud container node-pools delete gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a

Warning: Deleting a node pool evicts all pods running on those nodes. Ensure pods can be rescheduled elsewhere before deleting a pool.

Machine Type Selection

Common machine types for GKE nodes:

Machine Type	vCPUs	Memory	Use Case
`e2-medium`	2	4 GB	Development, small workloads
`e2-standard-4`	4	16 GB	General-purpose production
`e2-standard-8`	8	32 GB	Medium production workloads
`e2-highmem-4`	4	32 GB	Memory-intensive applications
`e2-highcpu-8`	8	8 GB	CPU-intensive batch processing
`n1-standard-4`	4	15 GB	GPU-attached workloads
`n2d-standard-16`	16	64 GB	AMD-based compute workloads
`t2a-standard-4`	4	16 GB	Arm-based workloads (cost-effective)

Tip: Use E2 machine types for most general-purpose workloads. For accelerator workloads, choose a machine series and zone that support the GPU or TPU you need. Use T2A (Arm) for further cost savings if your containers support Arm.

Scheduling Workloads to Specific Node Pools

Node Selectors

The simplest way to target pods to specific nodes:

spec:
  nodeSelector:
    workload: gpu        # matches --node-labels=workload=gpu

Taints and Tolerations

Taints repel pods that don’t tolerate them. This is how you reserve a GPU pool for GPU workloads:

# Node pool created with: --node-taints=nvidia.com/gpu=present:NoSchedule
 
# Pod that tolerates the taint (can be scheduled on GPU nodes)
spec:
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "present"
      effect: NoSchedule

Taint Effects

Effect	Behavior
`NoSchedule`	New pods are not scheduled on the node unless they have a matching toleration; pods already running are left alone
`PreferNoSchedule`	Scheduler tries to avoid the node, but will use it if no other node fits
`NoExecute`	New pods are not scheduled, and running pods without a matching toleration are evicted

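Key Insight: Taints and tolerations keep other pods off a pool (GPU, Spot), but a toleration only permits scheduling there; it does not attract the pod. Node selectors are hard requirements that steer a pod onto matching nodes. Use both together for strict targeting, and use preferred node affinity when you only want a soft preference.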
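
A minimal sketch of strict targeting, assuming a GPU pool created with `--node-labels=workload=gpu` and `--node-taints=nvidia.com/gpu=present:NoSchedule` as in the earlier `gcloud` example. The Pod name, container name, and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                  # placeholder name
spec:
  restartPolicy: Never
  nodeSelector:
    workload: gpu                # hard requirement: only nodes carrying this label
  tolerations:
    - key: nvidia.com/gpu        # permits scheduling onto the tainted GPU nodes
      operator: Equal
      value: "present"
      effect: NoSchedule
  containers:
    - name: trainer
      image: registry.example.com/trainer:1.0   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1      # ask for one GPU on the node
```

The selector pulls the pod onto the GPU pool, while the taint keeps pods without the toleration off it.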

Node Affinity (Advanced)

For more complex scheduling rules:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - gpu-pool
                  - high-mem-pool
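
The rule above is a hard requirement. For a soft preference, Kubernetes also offers `preferredDuringSchedulingIgnoredDuringExecution`. The sketch below reuses the `high-mem-pool` name from the example above and picks an arbitrary weight; both are illustrative.

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80                       # 1-100; higher weights count more
          preference:
            matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - high-mem-pool          # preferred, but other pools remain eligible
```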

Spot and Preemptible VMs

Reduce costs by running fault-tolerant workloads on discounted VMs that Compute Engine can reclaim when it needs the capacity:

Feature	Spot VMs	Preemptible VMs
Max lifetime	No fixed limit (until reclaimed)	24 hours max
Reclaim notice	30-second warning via metadata	30-second warning via metadata
Availability	Subject to capacity	Subject to capacity
Pricing discount	Up to 91% off on-demand	Up to 91% off on-demand
Recommended	Yes (newer, more flexible)	No (legacy, use Spot instead)

# Create a Spot node pool
# GKE adds the label cloud.google.com/gke-spot=true to Spot nodes automatically
gcloud container node-pools create spot-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --num-nodes=3 \
  --spot

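Warning: Spot nodes can be reclaimed at any time. Only run fault-tolerant, interruptible workloads on them (batch jobs, CI/CD, stateless workers). A PodDisruptionBudget covers voluntary disruptions such as node drains during upgrades; it does not prevent preemption, so also run multiple replicas and make pods shut down cleanly within the 30-second notice.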
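
As a sketch of a workload built for Spot nodes, the Deployment below selects nodes by the `cloud.google.com/gke-spot` label that GKE sets on Spot nodes, runs several replicas, and keeps its shutdown window short. The toleration only matters if you also taint the pool with `cloud.google.com/gke-spot=true:NoSchedule`, which is an assumption of this example. Names, image, replica count, and grace period are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker                 # placeholder name
spec:
  replicas: 4                        # several replicas so one preemption is survivable
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"     # run only on Spot nodes
      tolerations:
        - key: cloud.google.com/gke-spot      # needed only if the pool is tainted (assumed)
          operator: Equal
          value: "true"
          effect: NoSchedule
      terminationGracePeriodSeconds: 15       # stay well inside the 30-second notice
      containers:
        - name: worker
          image: registry.example.com/batch-worker:1.0   # placeholder image
```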

Node Best Practices

Practice	Why	How
Use E2 machine types	Best price-performance for most workloads	`--machine-type=e2-standard-4`
Separate workloads by pool	Different hardware needs, cost optimization	Create dedicated pools with taints
Use Spot pools for batch workloads	Up to 91% cost savings	`--spot` flag + fault-tolerant pods
Enable auto-upgrade	Security patches without manual intervention	`--enable-autoupgrade`
Enable auto-repair	Unhealthy nodes replaced automatically	`--enable-autorepair`
Set resource requests/limits	Ensures fair scheduling and prevents noisy neighbors	Define in pod specs
Use PodDisruptionBudgets	Protect critical workloads during node drains	Define `minAvailable` or `maxUnavailable`
Monitor node resource usage	Detect under/over-provisioning	`kubectl top nodes` + Cloud Monitoring
Use Shielded GKE nodes	Verifiable node identity and integrity monitoring; secure boot is opt-in	`--enable-shielded-nodes` on the cluster (default), plus `--shielded-secure-boot` per pool
Use containerd OS images	Required for modern GKE features	`COS_CONTAINERD` or `UBUNTU_CONTAINERD`
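
For the PodDisruptionBudget row, a minimal sketch might look like the following. It assumes a workload whose pods carry the hypothetical label `app: web`; adjust the selector and threshold to your own Deployment.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                # placeholder name
spec:
  minAvailable: 2              # keep at least two pods up during voluntary disruptions
  selector:
    matchLabels:
      app: web                 # hypothetical label on the protected pods
```

With this in place, a node drain during an upgrade evicts matching pods only while at least two remain available.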

Common Pitfalls

Pitfall	Consequence	Fix
All workloads on default pool	Cannot scale or cost-optimize independently	Create separate pools per workload type
No taints on GPU pool	Non-GPU pods scheduled on expensive GPU nodes	Use taints + tolerations
Over-provisioned nodes	Paying for idle resources	Right-size based on `kubectl top nodes`
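No Spot fallback	Batch jobs stall when Spot VMs are reclaimed or capacity is unavailable	Design for interruption (retries, checkpoints) and keep an on-demand pool as fallback
Disabling auto-repair	Unhealthy nodes stay in the cluster until someone replaces them manually	Enable auto-repair on all pools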
Using Preemptible instead of Spot	Fixed 24-hour termination, less flexible	Use Spot VMs (`--spot`) for new pools

TL;DR

Nodes are Compute Engine VMs; node pools group nodes with identical configuration
Use multiple node pools to separate workloads by hardware need (GPU, Spot, high-memory)
Target workloads to pools with node selectors or node affinity (to attract pods) plus taints + tolerations (to keep other pods out)
Use Spot VMs for fault-tolerant workloads to save up to 91%
Always enable auto-upgrade and auto-repair on node pools
In Autopilot, Google manages nodes — you define pod resource requests instead of node pools

Lalit's Cloud & DevOps notes