S3

Complete guide to Amazon S3 — Understanding object storage, all storage classes, security, and best practices.

What is Amazon S3?

Amazon S3 (Simple Storage Service) is object storage built to store and retrieve any amount of data from anywhere on the internet. Unlike traditional file systems, S3 stores data as objects consisting of:

Data: The actual file content
Key: The unique identifier within the bucket (acts like a filename)
Metadata: Information about the object (content type, creation date, custom tags)

Key Insight: S3 uses a flat object namespace. Folder-like paths are just key names with slashes. For example, photos/2024/january.jpg is stored as a single object key.
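
To make the flat namespace concrete, here is a minimal sketch of how a "folder" view can be derived from plain key strings using a prefix and a delimiter. The function name `list_folder` and the sample keys are hypothetical and only illustrate the idea; this is not the S3 API itself.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split flat keys under `prefix` into direct objects and 'subfolder' prefixes."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the key sits under a deeper "folder".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = ["photos/2024/january.jpg", "photos/2024/february.jpg", "photos/readme.txt"]
objects, folders = list_folder(keys, prefix="photos/")
print(objects)  # ['photos/readme.txt']
print(folders)  # ['photos/2024/']
```

The "folder" photos/2024/ never exists as an object here; it is inferred from the keys that share that prefix.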

S3 Architecture

flowchart LR
    subgraph Account["☁️ AWS Account"]
        direction TB
        BA["📦 Bucket: app-media (us-east-1)"]
        BB["📦 Bucket: analytics-data (eu-west-2)"]

        A1["Object key: images/file1.jpg"]
        A2["Object key: images/banner.png"]
        B1["Object key: raw/2026-02-14/events.json"]
        B2["Object key: curated/daily-report.parquet"]

        BA --> A1
        BA --> A2
        BB --> B1
        BB --> B2
    end

    Focus["Selected object<br/>images/file1.jpg"]

    subgraph Anatomy["Object Anatomy"]
        direction TB
        Data["Data (bytes)"]
        Key["Key (unique in bucket)"]
        Meta["Metadata"]
        Tags["Tags (up to 10)"]
        Class["Storage class"]
        Version["Version ID (if versioning enabled)"]
    end

    A1 --> Focus
    Focus --> Data
    Focus --> Key
    Focus --> Meta
    Focus --> Tags
    Focus --> Class
    Focus --> Version

Object anatomy is shown for the selected object (images/file1.jpg). Each object belongs to exactly one bucket. Bucket names are globally unique, but bucket contents are isolated per bucket.

S3 Components

Bucket

Attribute	Description
Name	Globally unique across ALL AWS accounts
Region	Choose geographic location
Quota	Up to 10,000 general purpose buckets per account (default)
Objects	Unlimited number of objects per bucket
Size	No maximum total bucket size

Object

Attribute	Limit
Size	Up to 50 TB per object (multipart upload path; practical multipart limit is ~53.7 TB)
Metadata	Up to 2 KB of system/user metadata
Tags	Up to 10 tags per object

Note: For objects larger than about 100 MB, consider multipart upload. A single PUT request is limited to 5 GB, so larger objects must use multipart upload.
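
The sketch below shows the arithmetic behind planning a multipart upload. It assumes the commonly documented multipart limits (at most 10,000 parts, each between 5 MiB and 5 GiB); the function name `plan_parts` and the 100 MiB default part size are hypothetical choices for illustration.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB    # assumed minimum size for every part except the last
MAX_PART = 5 * GIB    # assumed maximum size of a single part
MAX_PARTS = 10_000    # assumed maximum number of parts per upload


def plan_parts(object_size, part_size=100 * MIB):
    """Return (part_count, part_size), growing part_size if 10,000 parts are not enough."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object is too large for a single multipart upload")
    return math.ceil(object_size / part_size), part_size


print(plan_parts(1 * GIB))          # (11, 104857600)
print(MAX_PARTS * MAX_PART / 1e12)  # 53.6870912
```

The second line of output is where the ~53.7 TB figure in the table comes from: 10,000 parts of 5 GiB each.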

S3 Storage Classes Deep Dive

Quick Comparison Table

Storage Class	Availability	Minimum Storage Duration	Minimum Billable Duration	Retrieval Fee	Storage Cost (USD per GB/month)
S3 Express One Zone	99.95%	None	None	None	Varies by region
S3 Standard	99.99%	None	None	None	$0.023
S3 Intelligent-Tiering	99.9%	None	None	None	$0.023
S3 Standard-IA	99.9%	30 days	30 days	$0.01/GB	$0.0125
S3 One Zone-IA	99.5%	30 days	30 days	$0.01/GB	$0.01
Glacier Instant Retrieval	99.9%	90 days	90 days	Yes (per-GB retrieval charges)	$0.004
Glacier Flexible Retrieval	99.99% (after restore)	90 days	90 days	Yes (varies by retrieval option)	$0.0036
Glacier Deep Archive	99.99% (after restore)	180 days	180 days	Yes (varies by retrieval option)	$0.00099

⚠️ Pricing Disclaimer: AWS pricing varies by region and is subject to change. Always verify current pricing at the official S3 pricing page.
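
As a rough illustration of how the minimum storage duration interacts with the per-GB price, the sketch below estimates the storage charge for data that is deleted early. The prices and minimum durations are copied from the table above and are illustrative only; the function name `storage_cost` and the 30-day month are assumptions, and request and retrieval fees are ignored.

```python
# (price in USD per GB-month, minimum storage duration in days), from the table above
STORAGE_CLASSES = {
    "STANDARD": (0.023, 0),
    "STANDARD_IA": (0.0125, 30),
    "ONEZONE_IA": (0.01, 30),
    "GLACIER_IR": (0.004, 90),
    "GLACIER": (0.0036, 90),
    "DEEP_ARCHIVE": (0.00099, 180),
}


def storage_cost(size_gb, days_stored, storage_class, days_per_month=30):
    """Estimate the storage charge, billing at least the class's minimum duration."""
    price, min_days = STORAGE_CLASSES[storage_class]
    billable_days = max(days_stored, min_days)
    return size_gb * price * billable_days / days_per_month


# 1,000 GB kept for only 10 days
print(round(storage_cost(1000, 10, "STANDARD"), 2))     # 7.67
print(round(storage_cost(1000, 10, "STANDARD_IA"), 2))  # 12.5
```

For short-lived data the cheaper per-GB class ends up costing more, because the full 30 days are billed regardless.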

1. S3 Express One Zone

Best For: Latency-sensitive, frequently accessed data requiring single-digit millisecond performance

Characteristic	Value
Availability	99.95% (single AZ)
Durability	High (within single AZ)
Retrieval	Single-digit milliseconds
Minimum Storage	None
Use Cases	ML training, real-time analytics, AI inferencing, high-frequency trading

Features:

Up to 10x faster data access compared to S3 Standard
Up to 80% lower request costs compared to S3 Standard
Single Availability Zone storage (co-locate with compute)
Uses directory buckets (different bucket type)
Handles up to 2 million requests per second
Integrates with SageMaker, Athena, EMR, Glue

Note: S3 Express One Zone stores data in a single AZ. For critical data, maintain a copy in another location.

2. S3 Standard

Best For: Frequently accessed data

Characteristic	Value
Availability	99.99%
Durability	99.999999999% (11 nines)
Retrieval	Milliseconds
Minimum Storage	None
Use Cases	Active workloads, websites, analytics, mobile apps

Features:

Designed for 99.99% availability
Stores data redundantly across at least three Availability Zones and is designed to withstand the loss of an entire AZ
Cross-region replication available
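
To put the 11 nines figure in perspective, the following sketch converts it into an expected number of lost objects per year. It assumes the durability figure is an average annual per-object probability; the function name is hypothetical.

```python
ANNUAL_LOSS_PROBABILITY = 1e-11  # 1 - 0.99999999999, i.e. "11 nines" of durability


def expected_objects_lost_per_year(object_count):
    """Average number of objects lost per year under the stated durability."""
    return object_count * ANNUAL_LOSS_PROBABILITY


lost = expected_objects_lost_per_year(10_000_000)
print(f"{lost:.4f} objects/year, about one every {1 / lost:,.0f} years")
# 0.0001 objects/year, about one every 10,000 years
```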

3. S3 Intelligent-Tiering

Best For: Data with unknown or changing access patterns

Characteristic	Value
Availability	99.9%
Retrieval	Milliseconds
Minimum Storage	None
Monitoring Fee	$0.0025 per 1,000 objects/month

How It Works — S3 Intelligent-Tiering Flow:

flowchart TD
    FA["📁 Frequent Access Tier<br/>$0.023/GB"]
    IA["📁 Infrequent Access Tier<br/>$0.0125/GB (no retrieval fee)"]
    AR["📁 Archive Access Tier (optional, opt-in)<br/>$0.0036/GB, restore required before access"]
    
    FA -->|"30 days no access"| IA
    IA -->|"90 days no access"| AR

Key Benefit: Automatic cost optimization with no retrieval fees, and no performance impact when objects move between the Frequent and Infrequent Access tiers.
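
Whether Intelligent-Tiering pays off depends on object size, because the monitoring fee is charged per object while the saving is per GB. The sketch below weighs the two using the prices quoted in this section; the function name `monthly_net_saving` and the example workloads are hypothetical, and only the Frequent and Infrequent Access tiers are modeled.

```python
MONITORING_FEE_PER_OBJECT = 0.0025 / 1000  # USD per object per month
FREQUENT_PRICE = 0.023                     # USD per GB-month
INFREQUENT_PRICE = 0.0125                  # USD per GB-month


def monthly_net_saving(object_count, avg_object_gb, idle_fraction):
    """Saving from idle data in the Infrequent Access tier, minus monitoring fees."""
    idle_gb = object_count * avg_object_gb * idle_fraction
    saving = idle_gb * (FREQUENT_PRICE - INFREQUENT_PRICE)
    fee = object_count * MONITORING_FEE_PER_OBJECT
    return saving - fee


# 1 million objects of 5 MB each, 60% of them untouched for 30+ days
print(round(monthly_net_saving(1_000_000, 0.005, 0.6), 2))  # 29.0
# The same count of 100 KB objects: the monitoring fee outweighs the saving
print(monthly_net_saving(1_000_000, 0.0001, 0.6) < 0)       # True
```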

4. S3 Standard-IA (Infrequent Access)

Best For: Data accessed less frequently but requires rapid access

Characteristic	Value
Availability	99.9%
Durability	99.999999999% (11 nines)
Retrieval	Milliseconds
Minimum Storage	30 days
Minimum Charge	30 days
Retrieval Fee	$0.01 per GB

Use Cases:

Backup snapshots
Older data that needs quick access
Disaster recovery files

5. S3 One Zone-IA

Best For: Infrequently accessed data where resilience is not critical

Characteristic	Value
Availability	99.5%
Durability	99.999999999% (within single AZ)
Retrieval	Milliseconds
Minimum Storage	30 days
Minimum Charge	30 days
Retrieval Fee	$0.01 per GB

Use Cases:

Secondary backup copies
Reproducible data
Cost-sensitive workloads

Warning: Data stored in a single AZ — if the AZ is destroyed, data is lost.

6. S3 Glacier Instant Retrieval

Best For: Long-term data that needs immediate access

Characteristic	Value
Availability	99.9%
Retrieval	Milliseconds
Minimum Storage	90 days
Minimum Charge	90 days
Retrieval Fee	Per-GB retrieval charges apply

Use Cases:

Medical imaging (immediate access required)
Legal documents with quick retrieval needs
Older analytics data

7. S3 Glacier Flexible Retrieval (formerly S3 Glacier)

Best For: Archive data that can wait minutes to hours for retrieval

Characteristic	Value
Availability	99.99% (after restore)
Retrieval Options	Expedited (1-5 min), Standard (3-5 hours), Bulk (5-12 hours)
Minimum Storage	90 days
Minimum Charge	90 days
Retrieval Fee	Varies by retrieval option

Retrieval Options:

Option	Time	Cost	Use Case
Expedited	1-5 minutes	Highest	Urgent data needs
Standard	3-5 hours	Medium	Normal archive retrieval
Bulk	5-12 hours	Lowest	Large data migrations
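
Choosing a retrieval option comes down to picking the cheapest one whose worst-case time still meets the deadline. A minimal sketch, using the upper bounds from the table above (the function name `cheapest_option` is hypothetical):

```python
# (option name, worst-case hours from the table), ordered from cheapest to most expensive
RETRIEVAL_OPTIONS = [("Bulk", 12), ("Standard", 5), ("Expedited", 5 / 60)]


def cheapest_option(deadline_hours):
    """Return the cheapest option that finishes within the deadline, or None."""
    for name, worst_case_hours in RETRIEVAL_OPTIONS:
        if worst_case_hours <= deadline_hours:
            return name
    return None  # no option is fast enough


print(cheapest_option(24))  # Bulk
print(cheapest_option(6))   # Standard
print(cheapest_option(1))   # Expedited
```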

Use Cases:

Compliance archives
Digital preservation
Older backups

8. S3 Glacier Deep Archive

Best For: Data archived for 7+ years with rare access needs

Characteristic	Value
Availability	99.99%
Retrieval Time	Within 12 hours (Standard), within 48 hours (Bulk)
Minimum Storage	180 days
Minimum Charge	180 days
Retrieval Fee	Varies by retrieval option and region

Use Cases:

Regulatory compliance (7-10 year retention)
Historical records
Data that MUST be kept but rarely accessed

Cost Comparison: Deep Archive costs ~$1 per TB per month, making it extremely cost-effective for long-term storage.

S3 Versioning

What It Does

Versioning keeps multiple versions of an object in the same bucket.

Benefits

Benefit	Description
Protection	Recover from accidental deletion or overwrite
Audit Trail	Track all changes to objects
Rollback	Restore previous versions of objects

How It Works

With Versioning

flowchart TB
    V1["PUT file.jpg (v1)"] --> V2["PUT file.jpg (v2)"] --> V3["PUT file.jpg (v3)"]
    V3 --> DelNoVid["DELETE (no versionId)"]
    DelNoVid --> Marker["DeleteMarker created<br/>current object hidden"]
    Marker --> Recover["Older versions (v1/v2/v3)<br/>still recoverable"]

    V3 --> DelVid["DELETE with versionId=v3"]
    DelVid --> Removed["Only v3 permanently removed"]

Without Versioning

flowchart LR
    NV1["file.jpg"] --> NVDel["DELETE"] --> NVGone["Permanently removed"]
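
The two flows can be mimicked with a small in-memory model. This is a toy sketch, not the S3 API: the class `VersionedBucket`, its method names, and the `v1`, `v2` style version IDs are all hypothetical.

```python
import itertools

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first
        self._counter = itertools.count(1)

    def _append(self, key, body):
        version_id = f"v{next(self._counter)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key, body):
        return self._append(key, body)

    def delete(self, key, version_id=None):
        if version_id is None:
            # No version ID: hide the object behind a new delete marker.
            return self._append(key, DELETE_MARKER)
        # With a version ID: permanently remove only that version.
        self._versions[key] = [v for v in self._versions.get(key, []) if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is DELETE_MARKER:
            return None  # missing, or hidden by a delete marker
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("file.jpg", b"first")
v2 = bucket.put("file.jpg", b"second")
marker = bucket.delete("file.jpg")            # adds a delete marker
print(bucket.get("file.jpg"))                 # None
bucket.delete("file.jpg", version_id=marker)  # removing the marker restores v2
print(bucket.get("file.jpg"))                 # b'second'
```

Deleting the marker by its version ID is what "recovering" a deleted object amounts to in this model.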

Enabling Versioning

# Enable versioning on a bucket
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled
 
# List versions of an object
aws s3api list-object-versions \
  --bucket my-bucket \
  --prefix file.jpg
 
# Delete specific version
aws s3api delete-object \
  --bucket my-bucket \
  --key file.jpg \
  --version-id 123456789abcdef

Cost Consideration

You pay for ALL versions stored, including noncurrent ones. Use lifecycle rules (for example, a NoncurrentVersionExpiration action) to delete old versions.

S3 Lifecycle Policies

What They Do

Automatically transition objects to cheaper storage classes or expire/delete objects.

Common Rules

{
  "Rules": [
    {
      "ID": "TransitionToIA",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 180,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {"Days": 365}
    }
  ]
}

Example Policy Flow

Day 0: Create object (Standard)
         │
         ▼
Day 30: Transition to Standard-IA
         │
         ▼
Day 90: Transition to Glacier Flexible Retrieval
         │
         ▼
Day 180: Transition to Glacier Deep Archive
         │
         ▼
Day 365: Expire (delete object)
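
The flow can be checked with a few lines of Python that evaluate the rule for a given object age. This is a simplified sketch: the function name `state_on_day` is hypothetical, objects are assumed to start in Standard, and real lifecycle actions run asynchronously, so actual transitions may lag the configured day.

```python
RULE = {
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
    ],
    "Expiration": {"Days": 365},
}


def state_on_day(age_days, rule=RULE):
    """Return the storage class an object of this age should be in, or 'EXPIRED'."""
    if age_days >= rule["Expiration"]["Days"]:
        return "EXPIRED"
    current = "STANDARD"  # assumed starting class
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for day in (0, 45, 100, 200, 365):
    print(day, state_on_day(day))
# 0 STANDARD
# 45 STANDARD_IA
# 100 GLACIER
# 200 DEEP_ARCHIVE
# 365 EXPIRED
```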

S3 Security

1. Bucket Policies

JSON-based policies attached to buckets to control access.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:user/bob"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
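
To see what the policy above does and does not permit, here is a deliberately simplified matcher. It is only a sketch: real policy evaluation also considers explicit denies, conditions, IAM identity policies, and Block Public Access, none of which are modeled. The function `is_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase

POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/bob"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
        }
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_allowed(policy, principal_arn, action, resource_arn):
    """Return True if any Allow statement matches (simplified, Allow-only model)."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        principals = _as_list(statement["Principal"]["AWS"])
        actions = _as_list(statement["Action"])
        resources = _as_list(statement["Resource"])
        if (principal_arn in principals
                and any(fnmatchcase(action, a) for a in actions)
                and any(fnmatchcase(resource_arn, r) for r in resources)):
            return True
    return False


bob = "arn:aws:iam::123456789012:user/bob"
print(is_allowed(POLICY, bob, "s3:GetObject", "arn:aws:s3:::my-bucket/report.pdf"))  # True
print(is_allowed(POLICY, bob, "s3:PutObject", "arn:aws:s3:::my-bucket/report.pdf"))  # False
print(is_allowed(POLICY, bob, "s3:GetObject", "arn:aws:s3:::my-bucket"))             # False
```

The last case shows that the `/*` resource covers objects in the bucket, not the bucket itself.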

2. Block Public Access

Setting	What It Blocks	Important Note
BlockPublicAcls	New public bucket/object ACLs and public ACL updates	Existing public ACLs are not removed automatically
IgnorePublicAcls	Public access granted through ACLs	ACLs can still exist, but S3 ignores public grants
BlockPublicPolicy	New bucket policies that make the bucket public	Existing public bucket policies are not deleted automatically
RestrictPublicBuckets	Public/cross-account access from a public bucket policy	If a bucket policy is public, access is limited to the bucket owner’s account and AWS service principals

Recommendation: Enable all four Block Public Access settings at the account level, then create explicit bucket-level exceptions only when required.

3. Encryption Options

Type	Key Management	Description
SSE-S3	AWS-managed	Simplest, AES-256
SSE-KMS	AWS KMS	Use KMS keys for encryption
SSE-C	Customer-provided	You provide the encryption key
Client-side	Your application	Encrypt before sending to S3

4. Access Control Lists (ACLs)

Note: ACLs are legacy. Use bucket policies and IAM policies instead. ACLs are disabled by default on new buckets (Object Ownership: Bucket Owner Enforced).

S3 Event Notifications

Supported Destinations

Destination	Use Case
SNS	Fan-out to multiple subscribers
SQS	Queue-based processing
Lambda	Trigger serverless processing
EventBridge	Integration with other AWS services

Event Types

Event	Description
`s3:ObjectCreated:*`	PUT, POST, Copy, CompleteMultipartUpload
`s3:ObjectRemoved:*`	Delete (permanent delete or delete marker created)
`s3:ObjectRestore:*`	Glacier restore initiated, completed, or restored copy expired

{
  "LambdaFunctionConfigurations": [
    {
      "Id": "ProcessUploads",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:ProcessS3",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "uploads/"},
            {"Name": "suffix", "Value": ".jpg"}
          ]
        }
      }
    }
  ]
}
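
The prefix and suffix rules in the configuration above combine with AND semantics and are case-sensitive. A minimal sketch of that matching logic, with a hypothetical helper `matches_filter`:

```python
def matches_filter(key, filter_rules):
    """Return True if the object key satisfies every prefix/suffix rule."""
    for rule in filter_rules:
        name, value = rule["Name"].lower(), rule["Value"]
        if name == "prefix" and not key.startswith(value):
            return False
        if name == "suffix" and not key.endswith(value):
            return False
    return True


RULES = [
    {"Name": "prefix", "Value": "uploads/"},
    {"Name": "suffix", "Value": ".jpg"},
]

print(matches_filter("uploads/cat.jpg", RULES))  # True
print(matches_filter("uploads/cat.png", RULES))  # False
print(matches_filter("images/cat.jpg", RULES))   # False
print(matches_filter("uploads/cat.JPG", RULES))  # False (matching is case-sensitive)
```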

S3 Replication

Cross-Region Replication (CRR)

Replicates objects to a bucket in a different region.

Use Case	Benefit
Disaster Recovery	Data survives regional outage
Compliance	Keep copies in a distant region when regulations require geographic separation
Latency	Access data from closer region

Same-Region Replication (SRR)

Replicates objects to a bucket in the same region.

Use Case	Benefit
Log Aggregation	Copy logs to centralized bucket
Security	Copy to secured account
Testing	Replicate to test environment

Presigned URLs

What They Are

Time-limited URLs that grant temporary access to private objects. The recipient needs no AWS credentials of their own; the URL carries the permissions of the identity that signed it.

Use Cases

Use Case	Description
Private downloads	Share files without making them public
App uploads	Allow users to upload directly to S3
Temporary access	Grant access for limited time

Generate Presigned URL

# CLI - 1 hour validity
aws s3 presign s3://my-bucket/private-file.txt --expires-in 3600
 
# Output:
# https://my-bucket.s3.amazonaws.com/private-file.txt?...
#     ...&Signature=...&Expires=1706840400

# Python (boto3)
import boto3
 
s3 = boto3.client('s3')
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'file.pdf'},
    ExpiresIn=3600  # 1 hour
)
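
The idea behind a presigned URL (a signature plus an expiry time embedded in the query string) can be illustrated with a toy HMAC scheme. This is not AWS Signature Version 4 and cannot be used against S3; the secret, the host `example.invalid`, and the functions `sign_url` and `verify_url` are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-key"  # hypothetical secret held only by the signer


def _signature(path, expires):
    return hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()


def sign_url(path, expires_in, now=None):
    """Build a URL whose query string carries an expiry time and a signature."""
    expires = int(time.time() if now is None else now) + expires_in
    query = urlencode({"Expires": expires, "Signature": _signature(path, expires)})
    return f"https://example.invalid{path}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    expires = int(query["Expires"][0])
    current = time.time() if now is None else now
    valid = hmac.compare_digest(_signature(parsed.path, expires), query["Signature"][0])
    return valid and current < expires


url = sign_url("/private-file.txt", expires_in=3600, now=1000)
print(verify_url(url, now=2000))  # True: inside the one-hour window
print(verify_url(url, now=5000))  # False: expired
print(verify_url(url.replace("private-file", "other-file"), now=2000))  # False: tampered
```

Because the path and expiry are covered by the signature, the holder of the URL cannot extend its lifetime or point it at a different object.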

S3 Select and Glacier Select

Query data directly from S3 or Glacier without downloading entire objects.

Availability note: S3 Select and Glacier Select are not available to new customers. Existing customers can continue to use them.

Feature	Use Case	Benefit
S3 Select	CSV, JSON, Parquet	Up to 400% faster, lower cost
Glacier Select	Query archives	No full retrieval needed

S3 Performance

Performance Guidelines

Guideline	Benefit
No prefix randomization required	S3 automatically scales request rates across prefixes
Use CloudFront	Cache content at edge
Use multipart upload	Faster uploads, better reliability
Use Transfer Acceleration	Faster long-distance uploads
Use the SDK Transfer Manager	Automatic multipart and parallel transfers

Prefix Best Practices

Modern S3 automatically scales request rates per prefix, supporting at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.
You generally do not need to randomize key prefixes for performance.
Prefer key naming that matches operational access patterns, for example: bucket/log/2024/01/01/file1.log, bucket/log/2024/01/01/file2.log
Use partitioned naming only when it helps query/layout patterns, for example: bucket/year=2026/month=02/day=08/file1.parquet
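
Both naming styles are easy to generate from a date. A small sketch, with hypothetical helper names, that reproduces the two example layouts (without the bucket name):

```python
from datetime import date


def log_key(day, filename):
    """Date-based key, e.g. log/2024/01/01/file1.log."""
    return f"log/{day:%Y/%m/%d}/{filename}"


def partitioned_key(day, filename):
    """Hive-style partitioned key, e.g. year=2026/month=02/day=08/file1.parquet."""
    return f"year={day:%Y}/month={day:%m}/day={day:%d}/{filename}"


print(log_key(date(2024, 1, 1), "file1.log"))
# log/2024/01/01/file1.log
print(partitioned_key(date(2026, 2, 8), "file1.parquet"))
# year=2026/month=02/day=08/file1.parquet
```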

TL;DR

S3 Storage Classes Decision Tree

Need immediate (millisecond) access?
    YES
    ├── Access pattern unknown? → S3 Intelligent-Tiering
    ├── Frequently accessed? → S3 Standard
    ├── Infrequently accessed? → S3 Standard-IA
    └── Rarely accessed archive data? → Glacier Instant Retrieval
    NO (Can wait)
    ├── Can wait minutes to hours? → Glacier Flexible Retrieval
    └── Can wait 12+ hours? → Glacier Deep Archive

Key Points

S3 = Object storage (not filesystem)
Buckets = Globally unique containers in specific regions
Objects = Data + Key + Metadata (up to 50 TB via multipart upload; practical multipart limit is ~53.7 TB)
Storage Classes = Trade off access frequency vs cost
Versioning = Keep multiple versions, recover from deletions
Lifecycle Policies = Auto-transition or expire objects
Encryption = SSE-S3, SSE-KMS, SSE-C, or client-side
Security = Bucket policies + IAM + Block Public Access
Replication = Cross-region (CRR) or same-region (SRR)
Presigned URLs = Temporary access to private objects

Resources

Amazon S3 Documentation: Complete S3 user guide.

S3 Storage Classes: Detailed comparison of all storage classes.

S3 Pricing: Detailed pricing breakdown.

S3 Security Best Practices: Official security recommendations.

Lalit's Cloud & DevOps notes