Understanding the core building blocks of App Engine — instances, services, versions, and how requests are routed between them.
Architecture Overview
App Engine organizes your application in a hierarchy:
flowchart TD
    subgraph App["App Engine Application"]
        subgraph S1["Service: default"]
            V1a["Version: v1<br/>Traffic: 90%"]
            V1b["Version: v2<br/>Traffic: 10%"]
        end
        subgraph S2["Service: api"]
            V2a["Version: v1<br/>Traffic: 100%"]
        end
        subgraph S3["Service: worker"]
            V3a["Version: v1<br/>Traffic: 100%"]
        end
    end
    V1a --> I1["Instance 1"]
    V1a --> I2["Instance 2"]
    V1b --> I3["Instance 1"]
    V2a --> I4["Instance 1"]
    V3a --> I5["Instance 1"]
    V3a --> I6["Instance 2"]
The hierarchy is: Application > Service > Version > Instance.
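To make the hierarchy concrete, here is a minimal sketch that models it with plain Python dataclasses. The class and field names are illustrative assumptions, not part of any App Engine API; the sample data mirrors the diagram above.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    version_id: str
    traffic: float      # fraction of the service's traffic, 0.0 to 1.0
    instances: int = 0  # number of running instances


@dataclass
class Service:
    service_id: str
    versions: list[Version] = field(default_factory=list)

    def traffic_is_valid(self) -> bool:
        # Traffic fractions across a service's versions should add up to 100%.
        return abs(sum(v.traffic for v in self.versions) - 1.0) < 1e-9


@dataclass
class Application:
    project_id: str
    services: list[Service] = field(default_factory=list)

    def total_instances(self) -> int:
        # Walk Application > Service > Version and count instances.
        return sum(v.instances for s in self.services for v in s.versions)


# Hypothetical project mirroring the diagram: three services, four versions.
app = Application(
    project_id="my-project",
    services=[
        Service("default", [Version("v1", 0.9, 2), Version("v2", 0.1, 1)]),
        Service("api", [Version("v1", 1.0, 1)]),
        Service("worker", [Version("v1", 1.0, 2)]),
    ],
)
```

Each level owns the level below it, which is why deleting a service removes its versions and stopping a version stops its instances.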
Instances
Instances are the basic compute units that run your application code. Each instance has memory, CPU, and a runtime environment. App Engine creates and destroys instances based on your scaling configuration.
Instance Lifecycle
An instance moves through these states:
| State | Description |
|---|---|
| Pending | Starting up, loading your application code |
| Running | Actively serving requests |
| Stopped | Not running, not consuming resources (manual/basic scaling only) |
Lifecycle Events
| Event | Trigger | Notes |
|---|---|---|
| Startup | Instance created | /_ah/start request sent (manual/basic scaling) |
| Warmup | Pre-creating instances | /_ah/warmup request (automatic scaling only) |
| Loading request | First request to new instance | Takes longer due to initialization |
| Shutdown | Instance being terminated | SIGTERM sent, ~2 seconds to clean up before SIGKILL |
Tip: Use warmup requests (inbound_services: [warmup] in app.yaml) to pre-load your application before real traffic arrives. This reduces cold-start latency for automatically scaled services.
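The lifecycle events above could be handled along these lines. This is a minimal sketch using a bare WSGI callable and the standard signal module; the state dictionary and the load_caches helper are hypothetical placeholders, and a real service would more likely use a web framework.

```python
import signal

# Hypothetical in-process state, used only for illustration.
state = {"warmed_up": False, "shutting_down": False}


def load_caches():
    # Placeholder for expensive start-up work
    # (opening connections, loading models, priming caches).
    state["warmed_up"] = True


def handle_sigterm(signum, frame):
    # Keep shutdown work short: the instance is killed soon after SIGTERM.
    state["shutting_down"] = True


def app(environ, start_response):
    """Minimal WSGI app that answers warmup and startup requests."""
    path = environ.get("PATH_INFO", "/")
    if path in ("/_ah/warmup", "/_ah/start"):
        load_caches()
        body = b"ready"
    else:
        if not state["warmed_up"]:
            load_caches()  # loading request: pay the start-up cost now
        body = b"hello"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


if __name__ == "__main__":
    # Register the shutdown hook when run as the main program.
    signal.signal(signal.SIGTERM, handle_sigterm)
```

Doing the expensive work in the warmup handler means the first user-facing request does not have to act as the loading request.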
Instance Classes
Automatic scaling (F-class):
| Class | Memory | CPU | Price/Hour |
|---|---|---|---|
| F1 | 128 MB | 600 MHz | $0.05 |
| F2 | 256 MB | 1.2 GHz | $0.10 |
| F4 | 512 MB | 2.4 GHz | $0.20 |
| F4_1G | 1024 MB | 2.4 GHz | $0.30 |
Basic and Manual scaling (B-class):
| Class | Memory | CPU | Price/Hour |
|---|---|---|---|
| B1 | 128 MB | 600 MHz | $0.05 |
| B2 | 256 MB | 1.2 GHz | $0.10 |
| B4 | 512 MB | 2.4 GHz | $0.20 |
| B8 | 1024 MB | 4.8 GHz | $0.40 |
Note: F-class instances are for automatic scaling. B-class instances are for basic and manual scaling. You cannot mix them — the instance class must match the scaling type.
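The matching rule in the note can be expressed as a small pre-deployment check. The sketch below copies memory and price values from the tables above (they may not reflect current pricing); the function names are assumptions made for illustration.

```python
# Values copied from the tables above: class -> (memory in MB, USD per hour).
F_CLASSES = {"F1": (128, 0.05), "F2": (256, 0.10), "F4": (512, 0.20), "F4_1G": (1024, 0.30)}
B_CLASSES = {"B1": (128, 0.05), "B2": (256, 0.10), "B4": (512, 0.20), "B8": (1024, 0.40)}


def validate_instance_class(instance_class: str, scaling: str) -> None:
    """Raise ValueError if the instance class does not match the scaling type."""
    if scaling not in ("automatic", "basic", "manual"):
        raise ValueError(f"unknown scaling type: {scaling}")
    # F-class goes with automatic scaling, B-class with basic and manual.
    allowed = F_CLASSES if scaling == "automatic" else B_CLASSES
    if instance_class not in allowed:
        raise ValueError(f"{instance_class} cannot be used with {scaling} scaling")


def hourly_cost(instance_class: str, instances: int) -> float:
    """Estimated cost per hour for several instances of one class."""
    price = {**F_CLASSES, **B_CLASSES}[instance_class][1]
    return round(price * instances, 2)
```

For example, validate_instance_class("B4", "automatic") raises ValueError, while hourly_cost("F2", 3) estimates the hourly price of three F2 instances from the table.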
Services
A service (formerly called a “module”) is a logical component of your application. Each service has its own app.yaml, its own scaling configuration, and can even use a different runtime.
Key facts
- Free apps can deploy up to 5 services; paid apps can deploy up to 105 services (210 is the paid-app limit for versions, not services)
- The first service deployed is always called default and cannot be deleted
- Each service can use a different runtime and environment (Standard or Flexible)
- Services are independently deployable and scalable
Service addressing
Each service gets its own URL:
https://SERVICE_ID-dot-PROJECT_ID.REGION_ID.r.appspot.com
Examples:
- default service: https://my-project.uc.r.appspot.com
- api service: https://api-dot-my-project.uc.r.appspot.com
- worker service: https://worker-dot-my-project.uc.r.appspot.com
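The addressing scheme above can be captured in a small helper. This is only a sketch; service_url is a hypothetical function name, and it assumes the default service is addressed without a SERVICE_ID prefix, as in the first example.

```python
def service_url(project_id: str, region_id: str, service_id: str = "default") -> str:
    """Build the appspot.com URL for a service, following the pattern above."""
    host = f"{project_id}.{region_id}.r.appspot.com"
    if service_id != "default":
        # Non-default services are prefixed with SERVICE_ID-dot-.
        host = f"{service_id}-dot-{host}"
    return f"https://{host}"
```

The -dot- separator stands in for a literal dot so that every service shares the single-level subdomain of the project.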
Common service patterns
| Pattern | Description |
|---|---|
| default | Web frontend or primary application |
| api | REST or gRPC API backend |
| worker | Background task processing |
| admin | Internal admin panel |
| cron | Scheduled task handlers |
Versions
A version is a specific deployment of a service’s code and configuration. Each time you deploy, you create a new version.
Key facts
- A service can have multiple versions deployed simultaneously
- Versions that receive no traffic typically scale down to zero instances; manually scaled versions keep their instances running until you stop them
- Versions that are stopped still exist and consume storage — delete them explicitly
- Version IDs are auto-generated (e.g., 20240501t120000) or specified with the --version flag
Version addressing
Each version gets its own URL:
https://VERSION_ID-dot-SERVICE_ID-dot-PROJECT_ID.REGION_ID.r.appspot.com
Example: https://v2-dot-api-dot-my-project.uc.r.appspot.com
Version lifecycle
# Deploy a new version
gcloud app deploy --version v2
# Deploy without sending traffic
gcloud app deploy --no-promote --version v3
# List all versions
gcloud app versions list
# Stop a version (stops all its instances)
gcloud app versions stop v1 --service=default
# Delete a version
gcloud app versions delete v1 --service=default
Warning: Stopped versions still exist and use storage. Delete old versions you no longer need to avoid unnecessary storage costs.
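If you script these steps, for example in a CI job, one option is to build the gcloud argument lists in code and pass them to a process runner. The sketch below only constructs the commands shown above and does not execute them; deploy_command and version_command are hypothetical helper names.

```python
def deploy_command(version: str, promote: bool = True) -> list[str]:
    """Argument list for deploying a version, optionally without traffic."""
    cmd = ["gcloud", "app", "deploy", "--version", version]
    if not promote:
        # --no-promote deploys the version without routing traffic to it.
        cmd.append("--no-promote")
    return cmd


def version_command(action: str, version: str, service: str = "default") -> list[str]:
    """Argument list for stopping or deleting a version of a service."""
    if action not in ("stop", "delete"):
        raise ValueError(f"unsupported action: {action}")
    return ["gcloud", "app", "versions", action, version, f"--service={service}"]
```

A list such as deploy_command("v3", promote=False) can then be handed to subprocess.run(cmd, check=True), which avoids shell quoting problems.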
Request Routing
Requests to your application are routed to the correct service and version based on URL patterns.
Routing methods
| Method | URL Pattern | Routes To |
|---|---|---|
| Default | PROJECT_ID.REGION_ID.r.appspot.com | Default service, traffic-serving version |
| Service-targeted | SERVICE_ID-dot-PROJECT_ID.REGION_ID.r.appspot.com | Specific service, traffic-serving version |
| Version-targeted | VERSION_ID-dot-SERVICE_ID-dot-PROJECT_ID.REGION_ID.r.appspot.com | Specific version of a specific service |
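Read in reverse, the three URL forms in the table tell you which target a hostname addresses. The following sketch parses only those three forms; parse_target is a hypothetical name, and a version of None stands for "whichever version is serving traffic".

```python
def parse_target(host: str) -> dict:
    """Split an appspot.com hostname into project, service, and version."""
    # The first DNS label carries everything joined by -dot-.
    parts = host.split(".", 1)[0].split("-dot-")
    if len(parts) > 3:
        raise ValueError(f"unexpected hostname form: {host}")
    return {
        "project": parts[-1],
        "service": parts[-2] if len(parts) >= 2 else "default",
        "version": parts[-3] if len(parts) == 3 else None,
    }
```

For instance, v2-dot-api-dot-my-project.uc.r.appspot.com resolves to version v2 of the api service in my-project.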
dispatch.yaml
For more control, use a dispatch.yaml file to define routing rules based on URL patterns:
dispatch:
  - url: "example.com/api/*"
    service: api
  - url: "*/tasks/*"
    service: worker
  - url: "*/*"
    service: default
Rules are evaluated in order, and the first match wins. Maximum 20 dispatch rules per application.
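First-match-wins evaluation can be illustrated with a short simulation of the rules above. This is an approximation, not App Engine's actual matcher: it splits each pattern into a host glob and a path glob and uses the standard fnmatch module, which also treats ? and [ ] as wildcards. The route function and DISPATCH_RULES list are names chosen for this sketch.

```python
from fnmatch import fnmatchcase

# (url pattern, service) pairs in the same order as the dispatch.yaml above.
DISPATCH_RULES = [
    ("example.com/api/*", "api"),
    ("*/tasks/*", "worker"),
    ("*/*", "default"),
]


def route(host: str, path: str, rules=DISPATCH_RULES) -> str:
    """Return the service of the first rule matching host and path."""
    for pattern, service in rules:
        # Everything before the first slash is the host part of the pattern.
        host_glob, _, path_glob = pattern.partition("/")
        if fnmatchcase(host, host_glob) and fnmatchcase(path, "/" + path_glob):
            return service
    # Requests that match no rule fall through to the default service.
    return "default"
```

Because order matters, a request for example.com/api/tasks/x goes to api rather than worker: the first rule matches before the second is ever tried.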
# Deploy dispatch rules
gcloud app deploy dispatch.yaml
The default Service Constraint
Every App Engine application must have a default service. It’s the first service you deploy and cannot be deleted.
| Rule | Detail |
|---|---|
| Must exist | The default service is created with your first deployment |
| Cannot be deleted | Only non-default services can be deleted |
| Receives unrouted traffic | Requests that don’t match any dispatch rule go to default |
| Created first | Deploy the default service before any other service |
Note: If you delete all versions of the default service, you must deploy a new version before the app can serve requests.
TL;DR
- App Engine is organized as: Application > Services > Versions > Instances.
- Instances run your code. F-class for automatic scaling, B-class for basic/manual.
- Services group related functionality. Limits are 5 per free app or 105 per paid app; each service has its own config and runtime.
- Versions are deployments of your code. Multiple versions can coexist, and traffic can be split between them.
- Routing uses URL patterns: default, targeted, or dispatch.yaml rules.
- The default service must always exist and cannot be deleted.
Resources
- How Instances are Managed: Instance lifecycle, classes, and scaling behavior.
- How Requests are Routed: Request routing to services and versions.
- dispatch.yaml Reference: Configuration reference for dispatch routing rules.
- Scaling Options: Detailed guide to automatic, basic, and manual scaling.
- Services and Versions: Managing services, versions, and zero-downtime deployments.