Overview of Google Cloud’s managed compute services across IaaS, PaaS, containers, and serverless — what each one does, and when to use it.
The Abstraction Spectrum
Google Cloud offers compute services at different levels of abstraction. Each step up the spectrum trades control for convenience: you manage less infrastructure, and the platform takes over more of the provisioning, patching, and scaling.
flowchart LR
    IaaS["<b>IaaS</b><br/>You manage OS,<br/>runtime, app"] --> PaaS["<b>PaaS</b><br/>You manage<br/>app code only"]
    PaaS --> Containers["<b>Containers</b><br/>You manage<br/>container images"]
    Containers --> Serverless["<b>Serverless</b><br/>You write code,<br/>platform handles rest"]
| Aspect | IaaS | PaaS | Containers | Serverless |
|---|---|---|---|---|
| What you manage | OS, runtime, app | App code | Container image | Function/app code |
| What Google manages | Physical hardware, hypervisor | OS, runtime, scaling | Orchestration, scaling | Everything |
| Scaling | Manual or configured autoscaling | Automatic | Configurable (Horizontal Pod Autoscaler, cluster autoscaler) | Automatic, scale to zero |
| Billing model | Per-resource (per-second) | Per-instance-class (per-second) | Per-node or per-pod (per-second) | Per-request and per-use |
| Cold starts | None | Possible (Standard) / minimal (Flexible) | None (running pods) | Possible on first request |
| Best for | Legacy apps, full OS control | Web apps with simple deployment | Microservices, portability | Event-driven, intermittent workloads |
No single option is universally better. Many production environments use services from multiple categories together.
IaaS — Infrastructure as a Service
You get virtual machines or physical hardware and manage everything above the hypervisor: OS, runtime, middleware, and application.
Key Services
| Service | What It Does | When to Use |
|---|---|---|
| Compute Engine | Virtual machines on Google’s infrastructure. Full OS control (Linux/Windows), per-second billing, live migration, custom machine types. | Legacy app migrations, workloads needing specific OS or kernel, databases, HPC, gaming servers. Start here when you need a traditional VM. |
| Bare Metal Solution | Dedicated physical servers in Google Cloud regions with low-latency access to GCP services. | Oracle databases, specialized licensed software, workloads requiring a custom hypervisor or direct hardware access. |
Key Insight: Compute Engine is the most flexible compute option on GCP because you control the entire software stack. It is also the most operationally demanding.
Note: Bare Metal Solution is not a self-service VM. Google provisions and maintains the physical hardware, and you connect it to your VPC. It is designed for specialized workloads, not general-purpose computing.
PaaS — Platform as a Service
You provide application code and the platform handles server provisioning, OS maintenance, runtime, scaling, and networking.
Key Services
| Service | What It Does | When to Use |
|---|---|---|
| App Engine Standard | Fully managed serverless platform for web apps. Supports Python, Java, Node.js, Go, PHP, Ruby. Automatic scaling, including scale to zero. | Simple web applications, mobile backends, APIs where you want to deploy source code without thinking about infrastructure. |
| App Engine Flexible | Runs Docker containers on Compute Engine VMs managed by App Engine. Supports custom runtimes and any language. | Apps that need custom runtimes, native libraries, or request timeouts longer than App Engine Standard allows under automatic scaling. |
Tip: For new stateless apps, compare Cloud Run before choosing App Engine. Cloud Run accepts any container, supports source-based deployments, and offers more deployment flexibility. App Engine remains a good fit if you want the App Engine service/version model.
| Aspect | App Engine Standard | App Engine Flexible |
|---|---|---|
| Deploy from | Source code (no Dockerfile) | Docker container |
| Scaling | Automatic, manual, or basic; can scale to zero with automatic scaling | Automatic or manual; does not scale to zero |
| Cold starts | Possible | Minimal because instances stay running |
| Request timeout | 10 minutes with automatic scaling; 24 hours with manual/basic scaling | Up to 60 minutes |
| Runtime support | Python, Java, Node.js, Go, PHP, Ruby | Any (via custom Docker image) |
| Pricing | Per-instance-hour | Per-instance-hour (underlying VMs) |
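
The comparison above can be read as a small decision rule. The sketch below is illustrative only: the function name `choose_app_engine_env`, its parameters, and the returned labels are hypothetical, and the limits are copied from the table rather than read from any Google Cloud API. It also assumes that Flexible is preferred over Standard with manual or basic scaling whenever both would fit.

```python
from typing import Optional

# Limits copied from the comparison table above (not queried from an API).
STANDARD_RUNTIMES = {"python", "java", "node.js", "go", "php", "ruby"}
STANDARD_AUTO_TIMEOUT_MIN = 10           # Standard, automatic scaling
STANDARD_MANUAL_TIMEOUT_MIN = 24 * 60    # Standard, manual or basic scaling
FLEXIBLE_TIMEOUT_MIN = 60                # Flexible


def choose_app_engine_env(runtime: str, max_request_min: float,
                          needs_custom_runtime: bool = False) -> Optional[str]:
    """Suggest an App Engine environment, or None if the table offers no fit."""
    # Anything outside the Standard runtime list needs a custom Docker image.
    if needs_custom_runtime or runtime.lower() not in STANDARD_RUNTIMES:
        return "flexible" if max_request_min <= FLEXIBLE_TIMEOUT_MIN else None
    if max_request_min <= STANDARD_AUTO_TIMEOUT_MIN:
        return "standard/automatic"
    if max_request_min <= FLEXIBLE_TIMEOUT_MIN:
        return "flexible"  # assumption: preferred over Standard manual/basic
    if max_request_min <= STANDARD_MANUAL_TIMEOUT_MIN:
        return "standard/manual-or-basic"
    return None
```

For example, a Python app with 2-minute requests maps to Standard with automatic scaling, while a runtime outside the Standard list with 30-minute requests maps to Flexible.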
Containers and Container Orchestration
You package your application as a container image. The platform handles deployment, scaling, networking, and orchestration.
Key Services
| Service | What It Does | When to Use |
|---|---|---|
| Google Kubernetes Engine (GKE) | Managed Kubernetes clusters. Two modes: Autopilot (fully managed, per-pod billing, Google manages nodes) and Standard (you manage node pools, more control). | Microservices architectures, teams already using Kubernetes, workloads needing advanced orchestration (canary deploys, service mesh, GPU scheduling). |
| Artifact Registry | Managed storage for container images, language packages (npm, Maven, Python), and Helm charts. Replaces the deprecated Container Registry. | Storing and versioning container images and packages used in your CI/CD pipeline. |
| Cloud Build | Serverless CI/CD platform. Builds container images from source code, runs tests, and deploys to GKE, Cloud Run, or other targets. | Automating build-test-deploy pipelines without managing build servers. |
Key Insight: GKE Autopilot is the recommended mode for most teams. You pay per pod resource request instead of per node, and Google manages the underlying infrastructure. Use Standard mode when you need node-level control, custom node pools, or GPU and machine choices beyond what Autopilot supports.
GKE Autopilot vs Standard
| Aspect | Autopilot | Standard |
|---|---|---|
| Node management | Google manages nodes | You manage node pools |
| Billing | Per pod resource request | Per VM (node) |
| Kubernetes control plane | Managed by Google | Managed by Google |
| Scaling | Automatic | Manual or cluster autoscaler |
| Security hardening | Enabled by default | Requires manual configuration |
| Best for | Most workloads, including many GPU workloads | Workloads needing node-level control, broader GPU or machine choices, or custom node config |
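
To make the billing row concrete, here is a back-of-the-envelope sketch of the two models. Every rate and node size in it is a made-up placeholder, not a Google Cloud price, and the names (`Pod`, `autopilot_hourly`, `standard_hourly`) are hypothetical. It also ignores bin-packing fragmentation and system overhead on nodes, so treat it as an illustration of the shape of the trade-off rather than a cost estimate.

```python
import math
from dataclasses import dataclass


@dataclass
class Pod:
    cpu: float         # requested vCPUs
    memory_gib: float  # requested memory in GiB


# HYPOTHETICAL rates and node shape, for illustration only.
AUTOPILOT_VCPU_HOUR = 0.05
AUTOPILOT_GIB_HOUR = 0.005
NODE_HOUR = 0.20
NODE_VCPUS = 4
NODE_MEMORY_GIB = 16


def autopilot_hourly(pods):
    """Autopilot-style billing: pay for what the pods request."""
    return sum(p.cpu * AUTOPILOT_VCPU_HOUR + p.memory_gib * AUTOPILOT_GIB_HOUR
               for p in pods)


def standard_hourly(pods):
    """Standard-style billing: pay for whole nodes, however full they are."""
    total_cpu = sum(p.cpu for p in pods)
    total_mem = sum(p.memory_gib for p in pods)
    # Assumption: at least one node stays up even when nothing is scheduled.
    nodes = max(math.ceil(total_cpu / NODE_VCPUS),
                math.ceil(total_mem / NODE_MEMORY_GIB),
                1)
    return nodes * NODE_HOUR
```

With these placeholder numbers, three small pods cost less under per-pod billing because a mostly empty node is still billed in full, while ten large pods that fill their nodes cost less under per-node billing. Real results depend entirely on actual prices and utilization.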
Serverless
You write code (functions or containerized apps) and the platform handles everything: provisioning, scaling, availability, and infrastructure. With default settings, you pay only while your code runs.
Key Services
| Service | What It Does | When to Use |
|---|---|---|
| Cloud Run | Runs stateless containers in a fully managed environment. Supports HTTP/S, gRPC, WebSockets, and event-driven triggers (via Eventarc). Scales from zero to thousands of instances. | Web APIs, websites, data processing apps, webhooks, any containerized workload that should scale to zero when idle. |
| Cloud Run functions (formerly Cloud Functions) | Event-driven, single-purpose functions. Triggers from HTTP, Pub/Sub, Cloud Storage, Eventarc, and more. Supports Node.js, Python, Go, Java, .NET, Ruby, PHP. | Lightweight automation: resize images on upload, process Pub/Sub messages, send notifications, run scheduled tasks. |
Note: In August 2024, Google renamed Cloud Functions to Cloud Run functions. The 2nd gen product, now simply called Cloud Run functions, runs on the same infrastructure as Cloud Run and is the one to use for new functions. The original Cloud Functions, now Cloud Run functions (1st gen), has limited feature support.
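
As a rough illustration of what "stateless container" means for Cloud Run in practice: the container only has to answer HTTP requests on the port supplied in the `PORT` environment variable. The sketch below uses only the Python standard library. The names `app` and `serve` and the response text are arbitrary choices for this example, and a production service would normally run behind a dedicated WSGI or ASGI server rather than `wsgiref`.

```python
import os
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Stateless WSGI handler: the response depends only on the request."""
    body = f"Hello from {environ.get('PATH_INFO', '/')}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve():
    """Listen on the port Cloud Run provides; 8080 is a local fallback."""
    port = int(os.environ.get("PORT", "8080"))
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()
```

Calling `serve()` from the container's entrypoint starts the server. Because the handler keeps nothing in memory between requests, instances can be added or removed freely, which is what allows scaling to zero.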
| Aspect | Cloud Run | Cloud Run functions |
|---|---|---|
| Deploy from | Container image or source code | Source code or container |
| Execution model | Request-based (HTTP/S, gRPC, WebSockets) | Event-driven (triggers from Pub/Sub, Storage, HTTP, etc.) |
| Max request duration | Up to 60 minutes | 60 minutes for HTTP functions; 30 minutes for scheduled/task queue functions; 9 minutes for event-driven functions |
| Scaling | Automatic, scale to zero | Automatic, scale to zero |
| Concurrency | Up to 1000 requests per instance | Configurable for 2nd gen; 1st gen handles one request per instance |
| Best for | Containerized apps, APIs, web services | Single-purpose functions reacting to events |
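
The concurrency row directly affects how many instances a service needs. As a rough sizing sketch, the number of in-flight requests is approximately the request rate multiplied by the average latency (Little's law), and the instance count is that figure divided by per-instance concurrency, rounded up. The function name and the traffic figures are hypothetical, and real autoscaling also reacts to CPU utilization and instance startup time.

```python
import math


def instances_needed(requests_per_second: float, avg_latency_s: float,
                     concurrency: int) -> int:
    """Estimate instances required to serve a steady request rate."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    # Little's law: requests in flight = arrival rate * time in system.
    in_flight = requests_per_second * avg_latency_s
    return math.ceil(in_flight / concurrency)
```

At 400 requests per second with 0.25 s average latency there are about 100 requests in flight. With a concurrency of 80 that needs 2 instances, while a one-request-per-instance model needs 100. With no traffic the estimate is 0, which matches scale to zero.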
Serverless-Adjacent Services
These are not compute services but operate on a serverless model:
| Service | What It Does |
|---|---|
| BigQuery | Serverless data warehouse. Run SQL queries over petabytes without managing infrastructure. |
| Dataflow | Serverless stream and batch data processing (Apache Beam). |
| Eventarc | Unified eventing platform that routes events from GCP services and custom sources to Cloud Run, Cloud Run functions, and GKE. |
Decision Guide — When to Use What
| You Need… | Use This | Why |
|---|---|---|
| Full OS control, specific kernel, or licensed software | Compute Engine | You manage the entire stack; closest to on-premises. |
| Simple web app, deploy source code, no Docker | App Engine Standard | Fastest path from code to running app. |
| Custom runtime, longer timeouts, but still want platform management | App Engine Flexible | Docker flexibility with platform-managed scaling. |
| Microservices, service mesh, advanced orchestration | GKE | Full Kubernetes power with managed control plane. |
| Containerized app that should scale to zero | Cloud Run | Serverless containers, any language, any runtime. |
| Function reacting to events (file upload, message queue) | Cloud Run functions | Lightweight, event-driven, single-purpose. |
| Oracle database or workloads needing direct hardware access | Bare Metal Solution | Dedicated hardware with GCP network integration. |
| CI/CD pipeline for building containers | Cloud Build | Serverless build system, pay per build minute. |
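
The compute rows of this table can be sketched as a simple lookup. Everything here is hypothetical scaffolding: the `Needs` fields, the `recommend` function, and especially the order in which requirements are checked, which is one reasonable reading of the table (most restrictive requirement first) rather than an official rule.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    direct_hardware: bool = False           # e.g. Oracle DB, custom hypervisor
    full_os_control: bool = False           # specific OS, kernel, licensed software
    kubernetes_orchestration: bool = False  # service mesh, advanced orchestration
    event_driven_function: bool = False     # reacts to uploads, messages
    containerized: bool = False             # ships as a container image
    custom_runtime: bool = False            # needs a runtime Standard lacks


def recommend(needs: Needs) -> str:
    """Map requirements to a service, checking the strictest needs first."""
    if needs.direct_hardware:
        return "Bare Metal Solution"
    if needs.full_os_control:
        return "Compute Engine"
    if needs.kubernetes_orchestration:
        return "GKE"
    if needs.event_driven_function:
        return "Cloud Run functions"
    if needs.containerized:
        return "Cloud Run"
    if needs.custom_runtime:
        return "App Engine Flexible"
    # No special requirement: simple web app deployed from source.
    return "App Engine Standard"
```

A call such as `recommend(Needs(containerized=True))` returns `"Cloud Run"`. Real decisions usually weigh several factors at once, such as team skills, cost, and existing tooling, which a first-match rule like this cannot capture.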
Tip: You can combine these services. A common pattern: Cloud Build builds container images, pushes them to Artifact Registry, and deploys to Cloud Run or GKE. Compute Engine VMs handle stateful workloads while Cloud Run handles stateless APIs.
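
The build, push, and deploy pattern above can be sketched as two `gcloud` invocations. The helper names and the project, repository, and service values are hypothetical, and the code only assembles the command lines without running them. It assumes the source directory contains a Dockerfile and that the Artifact Registry repository already exists.

```python
def artifact_registry_image(location: str, project: str, repository: str,
                            image: str, tag: str = "latest") -> str:
    """Build an Artifact Registry Docker image path."""
    return f"{location}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def pipeline_commands(image_uri: str, service: str, region: str):
    """Return the build-and-push step followed by the deploy step."""
    return [
        # Cloud Build builds the image and pushes it to Artifact Registry.
        ["gcloud", "builds", "submit", "--tag", image_uri],
        # Cloud Run deploys the pushed image as a service.
        ["gcloud", "run", "deploy", service,
         "--image", image_uri, "--region", region],
    ]


# Placeholder values; substitute your own project and names.
uri = artifact_registry_image("us-central1", "my-project", "apps", "api", "v1")
commands = pipeline_commands(uri, service="api", region="us-central1")
```

On a machine with the gcloud CLI installed and authenticated, each entry in `commands` could be passed to `subprocess.run(cmd, check=True)`. Deploying to GKE instead would replace the second step with a Kubernetes rollout.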
TL;DR
- GCP offers compute services across four abstraction levels: IaaS (Compute Engine, Bare Metal Solution), PaaS (App Engine), Containers (GKE, Artifact Registry, Cloud Build), and Serverless (Cloud Run, Cloud Run functions).
- Start with the highest level of abstraction that meets your requirements. Move down only when you need more control.
- For new projects, consider Cloud Run first for stateless workloads. Use GKE Autopilot for complex microservices. Use Compute Engine when you need full OS control.
- Many production environments use multiple services together. No single option covers every use case.
Resources
- Google Cloud Compute Options: Official overview of all Google Cloud compute services.
- Where Should I Run My Stuff?: Google Cloud blog post comparing compute options by use case, abstraction level, and billing model.
- App Engine Documentation: Official documentation for App Engine Standard and Flexible environments.
- GKE Documentation: Official documentation for Google Kubernetes Engine, including Autopilot and Standard modes.
- Cloud Run Documentation: Official documentation for Cloud Run and Cloud Run functions.
- Artifact Registry Documentation: Official documentation for container image and package management.
- Cloud Build Documentation: Official documentation for serverless CI/CD.
- Google Cloud Overview: Foundational overview of Google Cloud, its history, and market position.
- Google Compute Engine: Deep dive into GCE, covering VMs, machine types, instance groups, and more.