Complete guide to AWS Lambda — Understanding serverless compute, pricing, event-driven architecture, and best practices.
What is AWS Lambda?
AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you.
In practice: With Lambda, you write a function and define a trigger. AWS runs and scales the infrastructure, and you pay only for execution time.
Lambda Architecture
```mermaid
flowchart TD
    subgraph Sources["Event Sources"]
        APIGW["API Gateway"]
        S3["Amazon S3"]
        SNS["Amazon SNS"]
        EBR["EventBridge"]
        SQS["Amazon SQS"]
    end
    subgraph LambdaSvc["Lambda Service"]
        ESM["Event Source Mapping"]
        F1["Function 1<br/>Python 3.11 | 512 MB | 30s"]
        F2["Function 2<br/>Node.js 20 | 1024 MB | 60s"]
        FN["Function N<br/>Java 21 | 256 MB | 15 min"]
    end
    APIGW --> F1
    S3 --> F1
    SNS --> F2
    EBR --> F2
    SQS --> ESM --> FN
    subgraph Targets["Downstream Services"]
        DDB["DynamoDB"]
        S3Out["Amazon S3"]
        SQSOut["Amazon SQS"]
    end
    F1 --> DDB
    F2 --> S3Out
    FN --> SQSOut
```
Lambda Function Structure
Handler Function
```python
def handler_name(event, context):
    # event   = input data (trigger-specific)
    # context = runtime information (request ID, memory, time remaining)
    # Your code here
    return response  # optional (for sync invocations)
```
Event Object
Contains data from the triggering service.
| Source | Event Structure Example |
|---|---|
| API Gateway | {body, pathParameters, queryStringParameters, headers} |
| S3 | {Records: [{bucket, key, size, ...}]} |
| SNS | {Records: [{Sns: {Message, Subject, ...}}]} |
| DynamoDB Streams | {Records: [{eventName, dynamodb: {Keys, NewImage}}]} |
| Scheduled | {time, version, account} |
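As a concrete illustration of the API Gateway row above, here is a minimal proxy-integration handler sketch; the `name` query parameter is a hypothetical example, not part of the event spec:

```python
import json

def handler(event, context):
    """Sketch of an API Gateway (REST proxy integration) handler.

    Reads queryStringParameters from the event shape shown above and
    returns the proxy-integration response format (statusCode + body).
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Note that `queryStringParameters` is `null` (not `{}`) when the request has no query string, hence the `or {}` guard.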
Context Object
Provides runtime information.
| Property | Description |
|---|---|
| function_name | Function name |
| function_version | Function version |
| invoked_function_arn | ARN of invoked function |
| memory_limit_in_mb | Configured memory |
| aws_request_id | Request ID |
| log_group_name | CloudWatch log group |
| log_stream_name | CloudWatch log stream |
| get_remaining_time_in_millis() | Time before execution timeout |
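The context properties above can be used to log request metadata and bail out before the timeout hits. A sketch, with a `FakeContext` stand-in (my own test helper, not an AWS class) for running the handler locally:

```python
import time

def handler(event, context):
    """Log request metadata and guard against the execution timeout."""
    print(f"request={context.aws_request_id} fn={context.function_name} "
          f"mem={context.memory_limit_in_mb}MB")
    # Skip the work if fewer than 5 seconds remain before the timeout.
    if context.get_remaining_time_in_millis() < 5000:
        return {"status": "skipped", "reason": "not enough time left"}
    return {"status": "ok"}

class FakeContext:
    """Local stand-in for the Lambda context object (unit tests only)."""
    function_name = "demo-fn"
    memory_limit_in_mb = 512
    aws_request_id = "00000000-0000-0000-0000-000000000000"

    def __init__(self, deadline_s):
        self._deadline = time.time() + deadline_s

    def get_remaining_time_in_millis(self):
        return int((self._deadline - time.time()) * 1000)
```

Faking the context like this is a common way to unit-test handlers without deploying them.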
Lambda Execution Model
Cold Start vs Warm Start
```mermaid
flowchart LR
    Req["Invocation Request"] --> Reuse{"Warm execution environment available?"}
    Reuse -->|No| Init["INIT phase<br/>1) Create environment<br/>2) Load code/layers<br/>3) Start runtime<br/>4) Run init code"]
    Reuse -->|Yes| Run["RUN phase<br/>Execute handler"]
    Init --> Run
    Run --> Freeze["Environment frozen and kept for reuse (best effort)"]
```
| Path | Phases Executed | Practical Latency Impact |
|---|---|---|
| Cold start | INIT + RUN | Highest latency; can be noticeable (often 100ms to seconds) |
| Warm start | RUN only | Lowest latency; no initialization penalty |
Important: When Lambda scales out to new concurrent instances, each new instance runs its own INIT phase once, so scale-out traffic can still hit cold starts even while other instances are warm.
What Affects Cold Start Time?
| Factor | Impact |
|---|---|
| Language | Java/.NET typically slower than Python/Node.js/Go |
| Memory | More memory = faster cold starts (more CPU) |
| Package Size | Larger packages = slower download |
| VPC | VPC functions have additional ENI setup |
| Provisioned Concurrency | Eliminates cold starts (costs more) |
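One cheap way to blunt cold starts is to do expensive setup at module scope, where it runs once per execution environment during INIT and is reused by every warm invocation. A sketch, where `_expensive_init` and the returned config are placeholders for real SDK clients or model loads:

```python
import time

def _expensive_init():
    """Placeholder for slow startup work: SDK clients, config, model loads."""
    time.sleep(0.05)                   # stands in for real init latency
    return {"table": "example-table"}  # hypothetical config value

# Runs once per execution environment during INIT; every warm
# invocation of this environment reuses the result.
CONFIG = _expensive_init()

def handler(event, context):
    # Warm invocations skip _expensive_init() entirely.
    return {"table": CONFIG["table"]}
```

The flip side: with INIT now billed, anything you hoist to module scope adds to cold-start cost, so hoist only what is actually reused.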
Lambda Pricing (reference)
Pricing Components
| Component | Rate | Notes |
|---|---|---|
| Requests | $0.20 per 1M requests | First 1M free |
| Compute Time | $0.0000166667 per GB-second (x86 baseline) | First 400K GB-seconds free; rate varies by architecture/region |
| INIT Phase | Same as compute | NEW: Charged starting Aug 1, 2025 |
| Ephemeral Storage | $0.0000000309 per GB-second | 512 MB free |
| Data Transfer | Standard EC2 rates | First 1 GB free |
INIT Phase Billing (Critical Update)
Starting August 1, 2025: AWS charges for the INIT phase of Lambda functions.
| Phase | Description | Previously | Now (Aug 2025+) |
|---|---|---|---|
| INIT | Provision container, download code, start runtime | Free | Charged |
| RUN | Execute your handler | Charged | Charged |
Impact:
- Functions with frequent cold starts will cost more
- Long-running functions: minimal impact (INIT is small portion)
- Short-running functions with frequent cold starts: significant impact
Mitigation Strategies:
- Use Provisioned Concurrency (eliminates cold starts)
- Keep functions warm with scheduled pings
- Optimize package size (faster INIT = lower INIT cost)
- Use SnapStart (where supported) and reduce initialization work
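The "scheduled pings" strategy usually pairs an EventBridge rule with a marker payload the handler short-circuits on. A sketch, where `{"warmer": true}` is a hypothetical convention you define yourself, not an AWS field:

```python
def handler(event, context):
    # A scheduled EventBridge rule can send a marker payload such as
    # {"warmer": true} (our own convention) to keep environments warm.
    if isinstance(event, dict) and event.get("warmer"):
        return {"warmed": True}  # return immediately: minimal billed time
    # ... real work for genuine events would go here ...
    return {"processed": True}
```

Returning immediately keeps the warming invocations near the minimum billed duration.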
Pricing Examples
Example 1: Web API Function
- Memory: 512 MB
- Average execution: 500ms
- Requests: 10 million/month
| Component | Calculation | Cost |
|---|---|---|
| Requests | (10M - 1M free) × $0.20/M | $1.80 |
| Compute | 10M × 0.5s × 0.5 GB = 2.5M GB-s (2.1M billable after free tier) | ~$35.00 |
| INIT (with cold starts) | Assume 20% cold rate: 2M × 0.5s × 0.5 GB = 0.5M GB-s | ~$8.33 |
| Total | Requests + Compute + INIT | ~$45.13/month |
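The arithmetic in Example 1 can be sketched as a small estimator. This assumes the free tier offsets RUN GB-seconds only (matching the tables here) and treats the INIT duration (`init_s`) as a guess; the rates are the x86 baseline figures listed above and vary by architecture and region:

```python
REQ_RATE = 0.20 / 1_000_000   # $ per request after free tier
GBS_RATE = 0.0000166667       # $ per GB-second (x86 baseline)
FREE_REQUESTS = 1_000_000
FREE_GBS = 400_000

def monthly_cost(requests, run_s, memory_gb, cold_rate=0.0, init_s=0.5):
    """Rough monthly estimate; cold_rate and init_s are assumptions."""
    req_cost = max(requests - FREE_REQUESTS, 0) * REQ_RATE
    run_gbs = requests * run_s * memory_gb
    init_gbs = requests * cold_rate * init_s * memory_gb
    # Assumption: free-tier GB-seconds offset RUN time only.
    billable_gbs = max(run_gbs - FREE_GBS, 0) + init_gbs
    return req_cost + billable_gbs * GBS_RATE
```

Plugging in Example 1 (10M requests, 500ms, 512 MB, 20% cold rate) reproduces roughly $45/month, and Example 2 (100K requests, 10s, 2 GB) roughly $30/month.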
Example 2: Image Processing
- Memory: 2048 MB
- Average execution: 10 seconds
- Requests: 100,000/month
| Component | Calculation | Cost |
|---|---|---|
| Requests | All in free tier | $0 |
| Compute | 100K × 10s × 2 GB = 2M GB-s | ~$33.33 |
| INIT | Assume 100% cold: 100K × ~1s × 2 GB = 0.2M GB-s | ~$3.33 |
| Total | After 400K GB-s free tier offset | ~$30.00/month |
Note: Free tier covers first 1M requests and 400K GB-seconds.
Lambda Concurrency
What is Concurrency?
Number of simultaneously executing function instances.
Example:
- 1,000 requests arrive simultaneously
- Each request takes 1 second
- Concurrency = 1,000 (1,000 functions running at once)
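The example above is an instance of Little's law: required concurrency is approximately arrival rate times average duration. A one-liner makes the relationship explicit:

```python
def required_concurrency(requests_per_second, avg_duration_s):
    """Little's law estimate: concurrency ≈ arrival rate × avg duration."""
    return requests_per_second * avg_duration_s
```

So 1,000 req/s at 1s each needs ~1,000 concurrent instances, while halving the duration to 500ms halves the concurrency demand, which is one reason shaving execution time also relieves throttling pressure.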
Concurrency Limits
| Limit Type | Default | Can Increase |
|---|---|---|
| Account Concurrency | 1,000 | Yes (quota increase) |
| Reserved Concurrency | 0 (none reserved) | Up to account limit |
| Provisioned Concurrency | 0 (none provisioned) | Yes (bounded by account concurrency and function settings) |
Concurrency Behaviors
| Setting | Behavior |
|---|---|
| No limits | Functions scale up to account limit, then throttled |
| Reserved Concurrency | Guaranteed concurrency for specific function |
| Provisioned Concurrency | Pre-warmed instances (no cold starts, costs apply) |
Provisioned Concurrency
| Aspect | Details |
|---|---|
| Purpose | Eliminate cold starts |
| Cost | Pay for provisioned concurrency even when idle |
| Billing | $0.0000041667 per GB-second for configured concurrency, billed continuously whether or not it is invoked |
| Use Cases | Latency-sensitive applications |
Lambda Event Sources
Push-Based
The event source invokes your function directly; API Gateway and Lambda URLs invoke synchronously, while SNS invokes asynchronously.
| Source | Description |
|---|---|
| API Gateway | HTTP API endpoint |
| Lambda URL | Dedicated HTTPS endpoint |
| SNS | Pub/sub messaging |
| CloudFront | Edge computing (Lambda@Edge) |
| Cognito | User authentication triggers |
| Alexa | Voice skill backend |
Pull-Based (Event Source Mappings)
Lambda polls the source on your behalf and invokes your function with batches of records.
| Source | Description |
|---|---|
| DynamoDB Streams | Database changes |
| Kinesis | Streaming data |
| SQS | Message queue |
| MSK | Kafka-compatible streaming |
Scheduled
| Source | Description |
|---|---|
| EventBridge | Scheduled (cron) events |
| EventBridge Scheduler | One-time or scheduled events |
Lambda Networking
VPC Configuration
| Configuration | Behavior |
|---|---|
| No VPC | Direct internet access, no private resources |
| VPC Attached (Hyperplane ENIs) | Private resource access; internet egress requires NAT GW or VPC endpoints |
Lambda in VPC
```mermaid
flowchart TD
    Internet["🌐 Internet"]
    subgraph VPC["VPC"]
        subgraph Public["Public Subnet"]
            NAT["🔀 NAT Gateway"]
        end
        subgraph Private1["Private Subnet"]
            Lambda["⚡ Lambda"]
        end
        subgraph Private2["Private Subnet"]
            RDS["🗄️ RDS DB"]
        end
    end
    Lambda --> NAT
    NAT --> Internet
    Lambda --> RDS
```
Note: Lambda in VPC needs ENIs (Elastic Network Interfaces) which can cause cold start delays.
Lambda Versions and Aliases
Versions
Immutable versions of your function.
$Latest ─────▶ Development version
│
├─ 1 ───────▶ Production version (stable)
│
└─ 2 ───────▶ Staging version (testing)
Aliases
Pointers to specific versions.
live ─────▶ Points to version 1
test ─────▶ Points to $Latest
Use Case: Deploy new version, test with “test” alias, then switch “live” alias when ready.
Lambda Layers
Shared libraries and dependencies that can be used across multiple functions.
| Benefit | Description |
|---|---|
| Code Reuse | Share libraries across functions |
| Smaller Packages | Reduce deployment package size |
| Version Control | Manage library versions independently |
Layer Limits:
- Up to 5 layers per function
- Combined unzipped size of function package + layers + custom runtime must be ≤ 250 MB
- Lambda processes layers in order (overwrites files with same name)
Lambda Best Practices
| Practice | Why |
|---|---|
| Minimize package size | Faster cold starts |
| Use environment variables | Configuration without code changes |
| Set appropriate timeouts | Prevent incomplete executions |
| Use Provisioned Concurrency for latency-sensitive apps | Eliminate cold starts |
| Implement dead-letter queues (DLQ) | Catch failed invocations |
| Monitor with CloudWatch | Visibility into function health |
| Use X-Ray for tracing | Debug performance issues |
| Use retries and idempotency | Handle transient failures |
| Optimize memory | More memory = more CPU = faster |
| Keep stateless | Functions should not rely on local state |
Lambda Use Case Patterns
1. Web APIs
```mermaid
flowchart LR
    Client["Client"] --> APIGW["API Gateway"]
    APIGW --> LAPI["Lambda API Handler"]
    LAPI --> DDB["DynamoDB"]
    LAPI --> S3["Amazon S3"]
    LAPI --> RDSP["RDS Proxy / Aurora"]
```
- Best for: low-latency request/response APIs and microservices.
- Watchouts: cold starts on spiky traffic; use Provisioned Concurrency for strict latency SLOs.
2. File Processing
```mermaid
flowchart LR
    Upload["S3 Object Upload"] --> S3Event["S3 Event Notification"]
    S3Event --> LFile["Lambda File Processor"]
    LFile --> S3Out["S3 Processed Output"]
    LFile --> Meta["DynamoDB Metadata"]
    LFile --> Notify["SNS Notification"]
```
- Best for: image/document transforms, thumbnail generation, metadata extraction.
- Watchouts: make processing idempotent because retries can happen.
3. Stream Processing
```mermaid
flowchart LR
    Kin["Kinesis Stream"] --> ESM["Event Source Mapping (batch)"]
    ESM --> LStream["Lambda Stream Consumer"]
    LStream --> DDB["DynamoDB"]
    LStream --> S3Agg["S3 Aggregated Output"]
    LStream --> CWM["CloudWatch Metrics"]
```
- Best for: near-real-time analytics and enrichment pipelines.
- Watchouts: tune batch size/window and handle partial batch failures correctly.
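Handling partial batch failures means returning the `batchItemFailures` report so only the failed records are retried (this requires `ReportBatchItemFailures` enabled on the event source mapping). A sketch for a Kinesis batch, where `process_record` is a hypothetical per-record business function:

```python
def handler(event, context):
    """Report partial batch failures so only failed records are retried.

    Kinesis records carry a sequenceNumber; for SQS the identifier
    would be the record's messageId instead.
    """
    failures = []
    for record in event.get("Records", []):
        try:
            process_record(record)  # hypothetical business logic
        except Exception:
            failures.append(
                {"itemIdentifier": record["kinesis"]["sequenceNumber"]}
            )
    return {"batchItemFailures": failures}

def process_record(record):
    # Stand-in: fail on records whose payload marks them as bad.
    if record["kinesis"]["data"] == "bad":
        raise ValueError("cannot process record")
```

Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded; raising instead of reporting would cause the entire batch to be retried.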
4. Scheduled Jobs
```mermaid
flowchart LR
    Sched["EventBridge Rule / Scheduler"] --> LJob["Lambda Scheduled Job"]
    LJob --> Cleanup["RDS Cleanup"]
    LJob --> Lifecycle["S3 Lifecycle Tasks"]
    LJob --> Health["API Health Checks"]
```
- Best for: cron-like automation, periodic cleanups, and maintenance tasks.
- Watchouts: if job runtime can exceed 15 minutes, move to Step Functions or ECS/Fargate.
TL;DR
Lambda Decision Tree
Task Characteristics
├── Short duration (< 15 min)
├── Event-driven or API-based
├── Can be stateless
└── Need automatic scaling
✓ Use Lambda
├── Long-running (> 15 min)
├── Need full OS control
├── Stateful application
└── Specific hardware requirements
✓ Use EC2
Key Points
- Lambda = Serverless compute (run code without managing servers)
- Pricing = Per request ($0.20/million) + compute time (GB-seconds) + INIT phase (NEW Aug 2025)
- Free Tier = 1M requests + 400K GB-seconds/month
- Cold Start = Provision container + start runtime (can cause delay)
- Provisioned Concurrency = Eliminates cold starts (costs more)
- Triggers = API Gateway, S3, SNS, DynamoDB Streams, Kinesis, EventBridge
- Memory = 128 MB - 10 GB (also affects CPU)
- Timeout = Max 15 minutes
- VPC = Functions need ENIs (slower cold starts)
- Best For = Event-driven, short tasks, microservices
Resources
AWS Lambda Documentation Complete Lambda developer guide.
Lambda Pricing Detailed pricing breakdown.
Lambda Runtimes Supported runtimes.
Lambda Limits Service limits.
Lambda Handlers Handler function syntax by language.