Complete guide to Amazon S3 — Understanding object storage, all storage classes, security, and best practices.
What is Amazon S3?
Amazon S3 (Simple Storage Service) is object storage built to store and retrieve any amount of data from anywhere on the internet. Unlike traditional file systems, S3 stores data as objects consisting of:
- Data: The actual file content
- Key: The unique identifier (acts like filename)
- Metadata: Information about the object (content type, creation date, custom tags)
Key Insight: S3 uses a flat object namespace. Folder-like paths are just key names with slashes. For example, photos/2024/january.jpg is stored as a single object key.
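The flat namespace can be illustrated without calling AWS at all. The following is a minimal sketch, assuming a hypothetical `list_keys` helper and made-up keys, of how prefix-and-delimiter grouping makes a flat set of keys look like folders. It imitates the idea behind S3 listing, not the real API response.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Mimic prefix/delimiter listing over a flat set of object keys.

    Returns (objects, common_prefixes): keys directly under the prefix,
    and the folder-like prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is grouped like a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)

# Hypothetical keys for illustration.
keys = [
    "photos/2024/january.jpg",
    "photos/2024/february.jpg",
    "photos/readme.txt",
]
print(list_keys(keys, prefix="photos/"))
```

No directory named `photos/2024/` exists anywhere in this model; it only appears because two keys share that prefix.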
S3 Architecture
flowchart LR
  subgraph Account["☁️ AWS Account"]
    direction TB
    BA["📦 Bucket: app-media (us-east-1)"]
    BB["📦 Bucket: analytics-data (eu-west-2)"]
    A1["Object key: images/file1.jpg"]
    A2["Object key: images/banner.png"]
    B1["Object key: raw/2026-02-14/events.json"]
    B2["Object key: curated/daily-report.parquet"]
    BA --> A1
    BA --> A2
    BB --> B1
    BB --> B2
  end
  Focus["Selected object<br/>images/file1.jpg"]
  subgraph Anatomy["Object Anatomy"]
    direction TB
    Data["Data (bytes)"]
    Key["Key (unique in bucket)"]
    Meta["Metadata"]
    Tags["Tags (up to 10)"]
    Class["Storage class"]
    Version["Version ID (if versioning enabled)"]
  end
  A1 --> Focus
  Focus --> Data
  Focus --> Key
  Focus --> Meta
  Focus --> Tags
  Focus --> Class
  Focus --> Version
Object anatomy is shown for the selected object (images/file1.jpg). Each object belongs to exactly one bucket. Bucket names are globally unique, but bucket contents are isolated per bucket.
S3 Components
Bucket
| Attribute | Description |
|---|---|
| Name | Globally unique across ALL AWS accounts |
| Region | Choose geographic location |
| Quota | Up to 10,000 general purpose buckets per account (default) |
| Objects | Unlimited number of objects per bucket |
| Size | No maximum total bucket size |
Object
| Attribute | Limit |
|---|---|
| Size | Up to 50 TB per object (multipart upload path; practical multipart limit is ~53.7 TB) |
| Metadata | Up to 2 KB of user-defined metadata |
| Tags | Up to 10 tags per object |
Note: For objects larger than 100 MB, use multipart upload.
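To make the multipart note concrete, here is a small sketch of how a part size might be chosen. It assumes the commonly documented multipart constraints (at most 10,000 parts, each between 5 MiB and 5 GiB except the last); the `plan_parts` helper and the 64 MiB preferred size are illustrative choices, not an AWS default.

```python
import math

MIB = 1024 ** 2
MIN_PART = 5 * MIB            # smallest allowed part (except the last one)
MAX_PART = 5 * 1024 * MIB     # largest allowed part (5 GiB)
MAX_PARTS = 10_000            # maximum number of parts per upload

def plan_parts(object_size, preferred_part=64 * MIB):
    """Return (part_size, part_count) in bytes for a multipart upload."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when the preferred size would need too many parts.
    part_size = max(preferred_part, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object too large for 10,000 parts of 5 GiB")
    return part_size, math.ceil(object_size / part_size)

part_size, part_count = plan_parts(1024 * MIB)   # a 1 GiB object
print(part_size // MIB, "MiB parts x", part_count)
```

For small objects the preferred size wins; for very large ones the part size grows so the upload stays within the part-count limit.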
S3 Storage Classes Deep Dive
Quick Comparison Table
| Storage Class | Availability | Minimum Storage | Minimum Charge | Retrieval Fee | Cost (per GB/month) |
|---|---|---|---|---|---|
| S3 Express One Zone | 99.95% | None | None | None | Varies by region |
| S3 Standard | 99.99% | None | None | None | $0.023 |
| S3 Intelligent-Tiering | 99.9% | None | None | None | $0.023 |
| S3 Standard-IA | 99.9% | 30 days | 30 days | $0.01/GB | $0.0125 |
| S3 One Zone-IA | 99.5% | 30 days | 30 days | $0.01/GB | $0.01 |
| Glacier Instant Retrieval | 99.9% | 90 days | 90 days | Yes (per-GB retrieval charges) | $0.004 |
| Glacier Flexible Retrieval | 99.99% (after restore) | 90 days | 90 days | Yes (varies by retrieval option) | $0.0036 |
| Glacier Deep Archive | 99.99% (after restore) | 180 days | 180 days | Yes (varies by retrieval option) | $0.00099 |
⚠️ Pricing Disclaimer: AWS pricing varies by region and is subject to change. Always verify current pricing at the official S3 pricing page.
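As a rough worked example of reading the table, the sketch below estimates a monthly bill from the per-GB prices and retrieval fees listed above. The `PRICES` mapping and `monthly_cost` helper are hypothetical names, the figures are only the illustrative ones from this table, and request, transfer, and minimum-duration charges are ignored.

```python
# (storage price per GB-month, retrieval fee per GB), copied from the table above.
# Illustrative only: real prices vary by region and change over time.
PRICES = {
    "STANDARD": (0.023, 0.0),
    "STANDARD_IA": (0.0125, 0.01),
    "ONEZONE_IA": (0.01, 0.01),
}

def monthly_cost(storage_class, stored_gb, retrieved_gb=0.0):
    """Storage plus retrieval cost for one month, in US dollars."""
    storage_price, retrieval_fee = PRICES[storage_class]
    return round(stored_gb * storage_price + retrieved_gb * retrieval_fee, 2)

print(monthly_cost("STANDARD", 1000))          # 1,000 GB kept in Standard
print(monthly_cost("STANDARD_IA", 1000, 100))  # same data in Standard-IA, 100 GB read back
```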
1. S3 Express One Zone
Best For: Latency-sensitive, frequently accessed data requiring single-digit millisecond performance
| Characteristic | Value |
|---|---|
| Availability | 99.95% (single AZ) |
| Durability | High (within single AZ) |
| Retrieval | Single-digit milliseconds |
| Minimum Storage | None |
| Use Cases | ML training, real-time analytics, AI inferencing, high-frequency trading |
Features:
- Up to 10x faster data access compared to S3 Standard
- Up to 80% lower request costs compared to S3 Standard
- Single Availability Zone storage (co-locate with compute)
- Uses directory buckets (different bucket type)
- Handles up to 2 million requests per second
- Integrates with SageMaker, Athena, EMR, Glue
Note: S3 Express One Zone stores data in a single AZ. For critical data, maintain a copy in another location.
2. S3 Standard
Best For: Frequently accessed data
| Characteristic | Value |
|---|---|
| Availability | 99.99% |
| Durability | 99.999999999% (11 nines) |
| Retrieval | Milliseconds |
| Minimum Storage | None |
| Use Cases | Active workloads, websites, analytics, mobile apps |
Features:
- Designed for 99.99% availability
- Stores data redundantly across at least three Availability Zones
- Cross-region replication available
3. S3 Intelligent-Tiering
Best For: Data with unknown or changing access patterns
| Characteristic | Value |
|---|---|
| Availability | 99.9% |
| Retrieval | Milliseconds |
| Minimum Storage | None |
| Monitoring Fee | $0.0025 per 1,000 objects/month |
How It Works — S3 Intelligent-Tiering Flow:
flowchart TD
  FA["📁 Frequent Access Tier<br/>$0.023/GB"]
  IA["📁 Infrequent Access Tier<br/>$0.0125/GB (no retrieval fee)"]
  AR["📁 Archive Access Tier<br/>$0.0036/GB + retrieval"]
  FA -->|"30 days no access"| IA
  IA -->|"90 days no access"| AR
Key Benefit: Automatic cost optimization, with no retrieval fees and no performance impact when objects move between the Frequent and Infrequent Access tiers.
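The tiering flow can be expressed as a simple threshold check. This is a sketch under the assumptions of the diagram above (30 and 90 days without access) and the monitoring fee quoted in the table; the function names are hypothetical and the real service makes these moves for you.

```python
def tier_for_idle_days(days_since_last_access):
    """Tier an object would sit in, using the 30/90-day thresholds from the flow above."""
    if days_since_last_access >= 90:
        return "Archive Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"

def monitoring_fee(object_count, fee_per_1000=0.0025):
    """Monthly monitoring fee at the per-1,000-object rate quoted above."""
    return object_count / 1000 * fee_per_1000

print(tier_for_idle_days(45))        # Infrequent Access
print(monitoring_fee(1_000_000))     # fee for one million monitored objects
```

Because the fee is charged per object, buckets with very many tiny objects can see the monitoring cost outweigh the storage savings.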
4. S3 Standard-IA (Infrequent Access)
Best For: Data that is accessed less frequently but requires rapid access when needed
| Characteristic | Value |
|---|---|
| Availability | 99.9% |
| Durability | 99.999999999% (11 nines) |
| Retrieval | Milliseconds |
| Minimum Storage | 30 days |
| Minimum Charge | 30 days |
| Retrieval Fee | $0.01 per GB |
Use Cases:
- Backup snapshots
- Older data that needs quick access
- Disaster recovery files
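Whether Standard-IA actually saves money depends on how much of the data is read back. The sketch below compares per-GB monthly cost using the illustrative prices from this guide; `cheaper_class` is a hypothetical helper, and it deliberately ignores request charges and the 30-day minimum.

```python
def cheaper_class(retrieved_fraction, standard=0.023, ia=0.0125, ia_retrieval=0.01):
    """Pick the cheaper class per GB-month.

    retrieved_fraction is GB retrieved per GB stored in a month
    (0.1 means 10% of the data is read back).
    """
    ia_total = ia + ia_retrieval * retrieved_fraction
    return "STANDARD_IA" if ia_total < standard else "STANDARD"

print(cheaper_class(0.1))   # light reads favour Standard-IA
print(cheaper_class(2.0))   # reading everything twice a month favours Standard
```

With these numbers the break-even point is about 1.05 GB retrieved per GB stored per month, since 0.0125 + 0.01 × 1.05 = 0.023.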
5. S3 One Zone-IA
Best For: Infrequently accessed data where resilience is not critical
| Characteristic | Value |
|---|---|
| Availability | 99.5% |
| Durability | 99.999999999% (within single AZ) |
| Retrieval | Milliseconds |
| Minimum Storage | 30 days |
| Minimum Charge | 30 days |
| Retrieval Fee | $0.01 per GB |
Use Cases:
- Secondary backup copies
- Reproducible data
- Cost-sensitive workloads
Warning: Data stored in a single AZ — if the AZ is destroyed, data is lost.
6. S3 Glacier Instant Retrieval
Best For: Long-term data that needs immediate access
| Characteristic | Value |
|---|---|
| Availability | 99.9% |
| Retrieval | Milliseconds |
| Minimum Storage | 90 days |
| Minimum Charge | 90 days |
| Retrieval Fee | Per-GB retrieval charges apply |
Use Cases:
- Medical imaging (immediate access required)
- Legal documents with quick retrieval needs
- Older analytics data
7. S3 Glacier Flexible Retrieval (formerly S3 Glacier)
Best For: Archive data that can wait minutes to hours for retrieval
| Characteristic | Value |
|---|---|
| Availability | 99.99% (after restore) |
| Retrieval Options | Expedited (1-5 min), Standard (3-5 hours), Bulk (5-12 hours) |
| Minimum Storage | 90 days |
| Minimum Charge | 90 days |
| Retrieval Fee | Varies by retrieval option |
Retrieval Options:
| Option | Time | Cost | Use Case |
|---|---|---|---|
| Expedited | 1-5 minutes | Highest | Urgent data needs |
| Standard | 3-5 hours | Medium | Normal archive retrieval |
| Bulk | 5-12 hours | Lowest | Large data migrations |
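One way to use the table is to pick the cheapest option that still meets a deadline. The sketch below does that with the worst-case times listed above and builds a request body in the shape that boto3's `restore_object` accepts for its `RestoreRequest` parameter (the API spells the fastest tier `Expedited`). The helper names and the 7-day `keep_days` default are assumptions for illustration.

```python
# Worst-case retrieval time in hours, from the table above, cheapest option first.
TIERS = [("Bulk", 12.0), ("Standard", 5.0), ("Expedited", 5 / 60)]

def pick_tier(deadline_hours):
    """Return the cheapest tier whose worst-case time fits the deadline."""
    for name, worst_case_hours in TIERS:
        if worst_case_hours <= deadline_hours:
            return name
    raise ValueError("no retrieval option is fast enough")

def restore_request(deadline_hours, keep_days=7):
    """Build a RestoreRequest-style dict; keep_days is how long the copy stays available."""
    return {
        "Days": keep_days,
        "GlacierJobParameters": {"Tier": pick_tier(deadline_hours)},
    }

print(restore_request(deadline_hours=6))
```

The dict would be passed as `RestoreRequest=` together with `Bucket` and `Key`; no AWS call is made here.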
Use Cases:
- Compliance archives
- Digital preservation
- Older backups
8. S3 Glacier Deep Archive
Best For: Data archived for 7+ years with rare access needs
| Characteristic | Value |
|---|---|
| Availability | 99.99% |
| Retrieval Time | Within 12 hours (Standard), within 48 hours (Bulk) |
| Minimum Storage | 180 days |
| Minimum Charge | 180 days |
| Retrieval Fee | Varies by retrieval option and region |
Use Cases:
- Regulatory compliance (7-10 year retention)
- Historical records
- Data that MUST be kept but rarely accessed
Cost Comparison: Deep Archive costs ~$1 per TB per month, making it extremely cost-effective for long-term storage.
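The per-TB figure and the 180-day minimum interact: deleting early does not reduce the bill. The following sketch works through that arithmetic, assuming 1 TB = 1,000 GB, a 30-day billing month, and the illustrative $0.00099 per GB price from the comparison table. `deep_archive_storage_cost` is a hypothetical helper.

```python
def deep_archive_storage_cost(stored_gb, stored_days, price_per_gb=0.00099, min_days=180):
    """Storage-only cost in US dollars, billing at least the minimum duration."""
    billable_days = max(stored_days, min_days)
    return round(stored_gb * price_per_gb * billable_days / 30, 2)

# 1 TB deleted after 30 days is still billed for 180 days.
print(deep_archive_storage_cost(1000, 30))
# 1 TB kept for 360 days (twelve 30-day months).
print(deep_archive_storage_cost(1000, 360))
```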
S3 Versioning
What It Does
Versioning keeps multiple versions of an object in the same bucket.
Benefits
| Benefit | Description |
|---|---|
| Protection | Recover from accidental deletion or overwrite |
| Audit Trail | Track all changes to objects |
| Rollback | Restore previous versions of objects |
How It Works
With Versioning
flowchart TB
  V1["PUT file.jpg (v1)"] --> V2["PUT file.jpg (v2)"] --> V3["PUT file.jpg (v3)"]
  V3 --> DelNoVid["DELETE (no versionId)"]
  DelNoVid --> Marker["DeleteMarker created<br/>current object hidden"]
  Marker --> Recover["Older versions (v1/v2/v3)<br/>still recoverable"]
  V3 --> DelVid["DELETE with versionId=v3"]
  DelVid --> Removed["Only v3 permanently removed"]
Without Versioning
flowchart LR
  NV1["file.jpg"] --> NVDel["DELETE"] --> NVGone["Permanently removed"]
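The two delete paths are easier to see in a toy model. The class below is an in-memory sketch of the behaviour described above, not the S3 API: `VersionedBucket` and its short `v1`, `v2` IDs are hypothetical (real version IDs are opaque strings).

```python
import itertools

class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}                 # key -> list of (version_id, data)
        self._counter = itertools.count(1)

    def _append(self, key, data):
        version_id = f"v{next(self._counter)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def put(self, key, data):
        return self._append(key, data)

    def delete(self, key, version_id=None):
        if version_id is None:
            # Plain DELETE: add a delete marker (data=None); history is kept.
            return self._append(key, None)
        # DELETE with a version ID: permanently remove only that version.
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v[0] != version_id
        ]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is None:
            return None                     # absent, or hidden by a delete marker
        return versions[-1][1]

bucket = VersionedBucket()
v1 = bucket.put("file.jpg", b"first")
v2 = bucket.put("file.jpg", b"second")
marker = bucket.delete("file.jpg")              # object now appears deleted
hidden = bucket.get("file.jpg")
bucket.delete("file.jpg", version_id=marker)    # removing the marker restores it
restored = bucket.get("file.jpg")
```

In this model, as in a versioned bucket, recovering from an accidental delete means removing the delete marker rather than re-uploading the data.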
Enabling Versioning
# Enable versioning on a bucket
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled

# List versions of an object
aws s3api list-object-versions \
  --bucket my-bucket \
  --prefix file.jpg

# Delete a specific version
aws s3api delete-object \
  --bucket my-bucket \
  --key file.jpg \
  --version-id 123456789abcdef
Cost Consideration
You pay for ALL versions stored. Use lifecycle policies to delete old versions.
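A lifecycle rule that cleans up old versions might look like the sketch below. It builds the configuration as a Python dict using the `NoncurrentVersionExpiration` element of the S3 lifecycle schema; the helper name, rule ID, and 30-day window are illustrative assumptions, and nothing is sent to AWS.

```python
import json

def noncurrent_cleanup_rule(days=30, prefix=""):
    """Lifecycle rule that expires versions N days after they become noncurrent."""
    return {
        "ID": f"ExpireNoncurrentAfter{days}Days",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},          # empty prefix applies to the whole bucket
        "NoncurrentVersionExpiration": {"NoncurrentDays": days},
    }

config = {"Rules": [noncurrent_cleanup_rule(days=30)]}
print(json.dumps(config, indent=2))
```

The current version of each object is untouched by this rule; only superseded versions age out.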
S3 Lifecycle Policies
What They Do
Automatically transition objects to cheaper storage classes or expire/delete objects.
Common Rules
{
  "Rules": [
    {
      "ID": "TransitionToIA",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 180,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {"Days": 365}
    }
  ]
}
Example Policy Flow
Day 0: Create object (Standard)
│
▼
Day 30: Transition to Standard-IA
│
▼
Day 90: Transition to Glacier Flexible Retrieval
│
▼
Day 180: Transition to Glacier Deep Archive
│
▼
Day 365: Expire (delete object)
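The same flow can be evaluated in code. This sketch takes the transitions and expiration from the example rule and reports where an object of a given age would be; `state_on_day` is a hypothetical helper, and it assumes actions take effect exactly on the listed day, whereas S3 applies lifecycle actions asynchronously.

```python
RULE = {
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
    ],
    "Expiration": {"Days": 365},
}

def state_on_day(age_days, rule=RULE, initial="STANDARD"):
    """Storage class for an object of the given age, or None once it has expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = initial
    for step in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current

for day in (0, 30, 100, 200, 365):
    print(day, state_on_day(day))
```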
S3 Security
1. Bucket Policies
JSON-based policies attached to buckets to control access.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:user/bob"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
2. Block Public Access
| Setting | What It Blocks | Important Note |
|---|---|---|
| BlockPublicAcls | New public bucket/object ACLs and public ACL updates | Existing public ACLs are not removed automatically |
| IgnorePublicAcls | Public access granted through ACLs | ACLs can still exist, but S3 ignores public grants |
| BlockPublicPolicy | New bucket policies that make the bucket public | Existing public bucket policies are not deleted automatically |
| RestrictPublicBuckets | Public/cross-account access from a public bucket policy | If a bucket policy is public, access is limited to the bucket owner’s account and AWS service principals |
Recommendation: Enable all four Block Public Access settings at the account level, then create explicit bucket-level exceptions only when required.
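The recommendation can be sketched as a configuration dict plus a quick check. The four flag names come from the table above; `public_access_block`, `is_fully_blocked`, and `apply_to_account` are hypothetical helpers. The last one shows where boto3's `s3control` client would be called, but it is not run here because it needs boto3 and AWS credentials.

```python
def public_access_block(enabled=True):
    """All four Block Public Access flags, in the shape boto3 expects."""
    return {
        "BlockPublicAcls": enabled,
        "IgnorePublicAcls": enabled,
        "BlockPublicPolicy": enabled,
        "RestrictPublicBuckets": enabled,
    }

def is_fully_blocked(config):
    """True only when every one of the four settings is switched on."""
    return all(config.get(name, False) for name in public_access_block())

def apply_to_account(account_id):
    """Account-level call; requires boto3 and credentials, so not invoked here."""
    import boto3
    boto3.client("s3control").put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration=public_access_block(),
    )

print(is_fully_blocked(public_access_block()))
```

A check like `is_fully_blocked` could be used in an audit script to flag configurations where any of the four settings has been relaxed.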
3. Encryption Options
| Type | Key Management | Description |
|---|---|---|
| SSE-S3 | AWS-managed | Simplest, AES-256 |
| SSE-KMS | AWS KMS | Use KMS keys for encryption |
| SSE-C | Customer-provided | You provide the encryption key |
| Client-side | Your application | Encrypt before sending to S3 |
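For the three server-side options, the choice shows up as extra parameters on the upload request. The sketch below maps each type to the keyword arguments that boto3's `put_object` accepts; `encryption_args` is a hypothetical helper and the key alias in the example is made up.

```python
def encryption_args(mode, kms_key_id=None, customer_key=None):
    """Extra keyword arguments for a boto3 put_object call, by encryption type."""
    if mode == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            args["SSEKMSKeyId"] = kms_key_id   # omit to use the AWS managed key
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C needs a customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unsupported mode: {mode}")

# "alias/my-key" is a hypothetical KMS key alias.
print(encryption_args("SSE-KMS", kms_key_id="alias/my-key"))
```

These would be merged into a call such as `s3.put_object(Bucket=..., Key=..., Body=..., **encryption_args("SSE-S3"))`. Client-side encryption has no entry because the data is already encrypted before it reaches S3.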
4. Access Control Lists (ACLs)
Note: ACLs are legacy. Use bucket policies and IAM policies instead. ACLs are disabled by default on new buckets (Object Ownership: Bucket Owner Enforced).
S3 Event Notifications
Supported Destinations
| Destination | Use Case |
|---|---|
| SNS | Fan-out to multiple subscribers |
| SQS | Queue-based processing |
| Lambda | Trigger serverless processing |
| EventBridge | Integration with other AWS services |
Event Types
| Event | Description |
|---|---|
| s3:ObjectCreated:* | PUT, POST, Copy, CompleteMultipartUpload |
| s3:ObjectRemoved:* | Delete |
| s3:ObjectRestore:* | Glacier restore completed/expired |
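On the receiving side, a Lambda function gets these events as a JSON document with a `Records` list. The handler below is a minimal sketch of unpacking it; the sample event is trimmed to the fields used and its bucket and key are made up. Object keys arrive URL-encoded, which is why `unquote_plus` is applied.

```python
from urllib.parse import unquote_plus

def handler(event, context=None):
    """Collect (event name, bucket, key) from an S3 notification delivered to Lambda."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in the payload (spaces arrive as '+').
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], s3["bucket"]["name"], key))
    return results

# Trimmed, hypothetical event for illustration.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-bucket"},
                "object": {"key": "uploads/my+photo.jpg"},
            },
        }
    ]
}
print(handler(sample_event))
```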
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "ProcessUploads",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:ProcessS3",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "uploads/"},
            {"Name": "suffix", "Value": ".jpg"}
          ]
        }
      }
    }
  ]
}
S3 Replication
Cross-Region Replication (CRR)
Replicates objects to a bucket in a different region.
| Use Case | Benefit |
|---|---|
| Disaster Recovery | Data survives regional outage |
| Compliance | Data residency requirements |
| Latency | Access data from closer region |
Same-Region Replication (SRR)
Replicates objects to a bucket in the same region.
| Use Case | Benefit |
|---|---|
| Log Aggregation | Copy logs to centralized bucket |
| Security | Copy to secured account |
| Testing | Replicate to test environment |
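CRR and SRR use the same kind of configuration; only the destination's region differs. The sketch below builds a rule in the general shape of the S3 replication configuration. The helper names, rule ID, and ARNs are hypothetical, and it assumes versioning is already enabled on both buckets, which replication requires.

```python
def replication_kind(source_region, destination_region):
    """Same-region replication if the regions match, otherwise cross-region."""
    return "SRR" if source_region == destination_region else "CRR"

def replication_config(role_arn, destination_bucket_arn, prefix=""):
    """Single-rule replication configuration as a plain dict."""
    return {
        "Role": role_arn,                      # IAM role S3 assumes to replicate
        "Rules": [
            {
                "ID": "ReplicateAll",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }

# Hypothetical ARNs for illustration.
config = replication_config(
    "arn:aws:iam::123456789012:role/replication-role",
    "arn:aws:s3:::my-replica-bucket",
)
print(replication_kind("us-east-1", "eu-west-2"), config["Rules"][0]["Destination"])
```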
Presigned URLs
What They Are
Time-limited URLs that grant temporary access to private objects. The recipient needs no AWS credentials or IAM permissions; the URL carries the permissions of the identity that signed it.
Use Cases
| Use Case | Description |
|---|---|
| Private downloads | Share files without making them public |
| App uploads | Allow users to upload directly to S3 |
| Temporary access | Grant access for limited time |
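Because the time limit is encoded in the URL's query string, a client can check it before attempting a request. The sketch below reads either the `X-Amz-Date` and `X-Amz-Expires` pair used by Signature Version 4 or the older absolute `Expires` timestamp. `presigned_expiry` is a hypothetical helper and the URLs are placeholders, not real signed links.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

def presigned_expiry(url):
    """Return the UTC expiry time encoded in a presigned URL, or None if absent."""
    query = parse_qs(urlparse(url).query)
    if "X-Amz-Date" in query and "X-Amz-Expires" in query:
        # Signature Version 4: signing time plus a lifetime in seconds.
        signed = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
        signed = signed.replace(tzinfo=timezone.utc)
        return signed + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    if "Expires" in query:
        # Older query-string signing: absolute Unix timestamp.
        return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)
    return None

# Placeholder URL with made-up signature values.
v4_url = "https://example.invalid/file.pdf?X-Amz-Date=20240201T000000Z&X-Amz-Expires=3600&X-Amz-Signature=abc"
print(presigned_expiry(v4_url))
```

This only reads the advertised expiry; S3 still enforces it server-side, and a URL signed with temporary credentials can stop working earlier when those credentials expire.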
Generate Presigned URL
# CLI - 1 hour validity
aws s3 presign s3://my-bucket/private-file.txt --expires-in 3600
# Output:
# https://my-bucket.s3.amazonaws.com/private-file.txt?...
# ...&Signature=...&Expires=1706840400

# Python (boto3)
import boto3

s3 = boto3.client('s3')
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'file.pdf'},
    ExpiresIn=3600  # 1 hour
)
S3 Select and Glacier Select
Query data directly from S3 or Glacier without downloading entire objects.
Availability note: S3 Select and Glacier Select are not available to new customers. Existing customers can continue to use them.
| Feature | Use Case | Benefit |
|---|---|---|
| S3 Select | CSV, JSON, Parquet | Up to 400% faster, lower cost |
| Glacier Select | Query archives | No full retrieval needed |
S3 Performance
Performance Guidelines
| Guideline | Benefit |
|---|---|
| No prefix randomization required | S3 automatically scales request rates across prefixes |
| Use CloudFront | Cache content at edge |
| Use multipart upload | Faster uploads, better reliability |
| Use Transfer Acceleration | Faster long-distance uploads |
| S3 Transfer Manager | Automatic optimization |
Prefix Best Practices
- S3 automatically scales to high request rates: at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned prefix.
- You generally do not need to randomize key prefixes for performance.
- Prefer key naming that matches operational access patterns, for example: bucket/log/2024/01/01/file1.log, bucket/log/2024/01/01/file2.log
- Use partitioned naming only when it helps query/layout patterns, for example: bucket/year=2026/month=02/day=08/file1.parquet
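Both naming styles are easy to generate from a date. The helpers below are a small sketch with hypothetical names; they return the object key only, since the bucket name is not part of the key.

```python
from datetime import date

def log_key(day, filename):
    """Date-based key such as log/2024/01/01/file1.log."""
    return f"log/{day:%Y/%m/%d}/{filename}"

def partitioned_key(day, filename):
    """Partition-style key such as year=2026/month=02/day=08/file1.parquet."""
    return f"year={day:%Y}/month={day:%m}/day={day:%d}/{filename}"

print(log_key(date(2024, 1, 1), "file1.log"))
print(partitioned_key(date(2026, 2, 8), "file1.parquet"))
```

Zero-padding the month and day keeps keys in chronological order when they are listed lexicographically.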
TL;DR
S3 Storage Classes Decision Tree
Need immediate access?
YES
├── Access pattern unknown? → S3 Intelligent-Tiering
├── Frequently accessed? → S3 Standard
├── Infrequently accessed? → S3 Standard-IA
└── Rarely accessed, long-term? → Glacier Instant Retrieval
NO (Can wait)
├── Can wait minutes to hours? → Glacier Flexible Retrieval
└── Can wait 12+ hours? → Glacier Deep Archive
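The selection logic can also be written as a function. This is one possible encoding, not an official rule set: `choose_storage_class` and its parameter names are hypothetical, and it treats Glacier Instant Retrieval as an immediate-access class, in line with the millisecond retrieval time listed for it earlier in this guide.

```python
def choose_storage_class(needs_immediate_access, access="unknown", max_wait_hours=None):
    """Suggest a storage class.

    access: "unknown", "frequent", "infrequent", or "archive"
    max_wait_hours: how long a retrieval may take when immediate access is not needed
    """
    if needs_immediate_access:
        return {
            "unknown": "S3 Intelligent-Tiering",
            "frequent": "S3 Standard",
            "infrequent": "S3 Standard-IA",
            "archive": "Glacier Instant Retrieval",
        }[access]
    if max_wait_hours is not None and max_wait_hours < 12:
        return "Glacier Flexible Retrieval"
    return "Glacier Deep Archive"

print(choose_storage_class(True, access="frequent"))
print(choose_storage_class(False, max_wait_hours=4))
print(choose_storage_class(False, max_wait_hours=48))
```

It leaves out S3 One Zone-IA and S3 Express One Zone, which depend on resilience and latency requirements rather than on access frequency alone.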
Key Points
- S3 = Object storage (not filesystem)
- Buckets = Globally unique containers in specific regions
- Objects = Data + Key + Metadata (up to 50 TB via multipart upload; practical multipart limit is ~53.7 TB)
- Storage Classes = Trade off access frequency vs cost
- Versioning = Keep multiple versions, recover from deletions
- Lifecycle Policies = Auto-transition or expire objects
- Encryption = SSE-S3, SSE-KMS, SSE-C, or client-side
- Security = Bucket policies + IAM + Block Public Access
- Replication = Cross-region (CRR) or same-region (SRR)
- Presigned URLs = Temporary access to private objects
Resources
- Amazon S3 Documentation: Complete S3 user guide.
- S3 Storage Classes: Detailed comparison of all storage classes.
- S3 Pricing: Detailed pricing breakdown.
- S3 Security Best Practices: Official security recommendations.