EC2

Complete guide to Amazon EC2 — Understanding instance types, pricing models, storage, and best practices.

What is Amazon EC2?

Amazon EC2 (Elastic Compute Cloud) provides scalable computing capacity in the AWS Cloud. It eliminates the need to invest in hardware up front, so you can develop and deploy applications faster.

In practice: EC2 gives you a virtual machine in the cloud with full OS-level control, including software installation and sizing changes as your workload evolves.

EC2 Architecture

flowchart TD
    subgraph EC2["EC2 Service"]
        I1["🖥️ Instance 1<br/>AMI: ami-01<br/>Type: m5.large<br/>State: running"]
        I2["🖥️ Instance 2<br/>AMI: ami-02<br/>Type: t3.medium<br/>State: stopped"]
        IN["🖥️ Instance N<br/>AMI: ami-01<br/>Type: c5.xlarge<br/>State: running"]
        
        SG["🔐 Security Group"]
        EBS["💾 EBS Volumes"]
    end
    
    I1 --> SG
    I2 --> SG
    IN --> SG
    I1 --> EBS
    I2 --> EBS
    IN --> EBS

EC2 Components

1. AMI (Amazon Machine Image)

An AMI is a template for launching instances.

AMIs exist so you can launch servers in a repeatable, controlled way. Instead of manually installing OS packages and app dependencies every time, you launch from a known image and get consistent results across environments.

An AMI contains:

Component	Description
Root Volume Template	Operating system (Linux, Windows, etc.)
Launch Permissions	Which AWS accounts can use the AMI
Block Device Mapping	Volumes to attach at launch

AMI Types:

Type	Source	Use Case
AWS-Provided	Amazon	Quick start, tested, supported
Community	Other AWS users	Publicly shared images; vet the publisher before use
Marketplace	Third-party vendors	Pre-configured software
Custom	You create	Your specific configuration

Note: AMIs are region-specific. To use an AMI in another region, copy it to that region.

When to use what: Start with AWS-provided AMIs for learning and early builds, move to custom AMIs (golden images) for production standardization, and use Marketplace AMIs when you need pre-licensed or vendor-managed software.
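
Because AMIs are region-specific, copying one to another region is a single AWS CLI call. The sketch below assumes the AWS CLI is installed and configured with suitable permissions; the helper name `copy_ami_to_region` and every ID in the usage line are placeholders, not values from this guide.

```bash
#!/usr/bin/env bash
# Copy an AMI from one region to another with the AWS CLI.
# copy_ami_to_region is a hypothetical helper name used for illustration.
copy_ami_to_region() {
  local source_region="$1" source_ami="$2" target_region="$3" name="$4"
  aws ec2 copy-image \
    --source-region "$source_region" \
    --source-image-id "$source_ami" \
    --region "$target_region" \
    --name "$name" \
    --query 'ImageId' --output text   # prints the ID of the new AMI
}

# Usage (placeholder values):
#   copy_ami_to_region us-east-1 ami-0123456789abcdef0 eu-west-1 "web-golden-copy"
```

The copy receives a new AMI ID in the target region, so launch templates there need to reference the new ID.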

2. Instance Types

Common Pattern: {family-token}.{size}

Instance types exist because workloads fail for different reasons: some are CPU-bound, some memory-bound, some network-bound. Picking the right family and size improves both performance and cost efficiency.

Example: m5.large
         │   │
         │   └─ Size (nano, micro, small, medium, large, xlarge, 2xlarge, etc.)
         └───── Family token (family + generation + options; e.g., m5, c7g, c7gn)
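
The naming pattern above can be taken apart with plain shell parameter expansion. This is a minimal sketch; the function names `split_instance_type` and `generation_of` are made up for illustration, and it assumes the generation is the first run of digits in the family token.

```bash
#!/usr/bin/env bash
# Split an instance type such as "m5.large" into its family token and size.
split_instance_type() {
  local type="$1"
  local family="${type%%.*}"   # text before the first dot
  local size="${type#*.}"      # text after the first dot
  echo "family_token=${family} size=${size}"
}

# Extract the generation number from a family token such as "c7gn".
generation_of() {
  local family="$1"
  local letters="${family%%[0-9]*}"   # leading letters, e.g. "c"
  local rest="${family#"$letters"}"   # remainder, e.g. "7gn"
  echo "${rest%%[!0-9]*}"             # leading digits only, e.g. "7"
}

split_instance_type m5.large   # -> family_token=m5 size=large
generation_of c7gn             # -> 7
```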

Instance Families

Family	Optimized For	vCPU:Memory Ratio	Use Cases
General Purpose (m)	Balanced	1:4	Web servers, app servers
Burstable (t)	Low cost, burstable	Variable	Dev/test, microservices
Compute Optimized (c)	High CPU	1:2	Batch processing, gaming
Memory Optimized (r)	High memory	1:8/1:16	Databases, caches, in-memory
Accelerated (p, g, inf)	GPUs/FPGAs	Variable	ML, graphics, HPC
Storage Optimized (i, d, h)	High disk I/O	Variable	Data warehouses, file servers
Graviton (t4g, m6g, c6g)	ARM-based, cost-effective	Variable	Linux workloads

Popular Instance Sizes

Size	vCPUs	Memory	Network	Example Uses
nano	1	0.5 GB	Low	Microservices
micro	1	1 GB	Low	Dev environments
small	1	2 GB	Low to moderate	Small apps
medium	2	4 GB	Low to moderate	Web servers
large	2	8 GB	Low to moderate	App servers
xlarge	4	16 GB	Moderate	Production apps
2xlarge	8	32 GB	Moderate	High-performance apps
4xlarge	16	64 GB	High	Databases
8xlarge	32	128 GB	High	Big data
16xlarge	64	256 GB	High	ML training

Generation Comparison

Family	Gen 4	Gen 5	Gen 6
General	m4	m5	m6i/m6g
Compute	c4	c5	c6i/c6g
Memory	r4	r5	r6i/r6g

Rule: Newer generations typically offer better price-performance.

Selection tip: Choose family first (workload profile), then size (capacity), then generation (price-performance). Start slightly above your expected baseline and right-size using CloudWatch metrics.

EC2 Pricing Models

EC2 pricing models exist to let you trade flexibility vs commitment. The more predictable your usage, the more you can usually save.

1. On-Demand Instances

Attribute	Value
Commitment	None
Upfront Cost	None
Savings	None (base price)
Term	Pay by second/hour
Best For	Short-term, spiky, unpredictable workloads

Use On-Demand when requirements are still changing. It is the safest default for new systems because there is no lock-in, and later you can move steady portions to Savings Plans or RIs.

Advantages:

No commitment
Easy to start/stop

Disadvantage:

Highest cost per hour of all the pricing models

2. Reserved Instances (RI)

Attribute	Value
Commitment	1 or 3 years
Upfront Cost	All, partial, or no upfront
Savings	Up to 72%
Term	Fixed term
Best For	Steady-state, predictable usage

RIs exist to reward long-term commitment on stable workloads. Use them only after you have measured a consistent baseline; otherwise you risk paying for unused commitment.
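
The risk of paying for unused commitment comes down to simple arithmetic: a commitment bills for every hour, while On-Demand bills only for hours used. The sketch below is illustrative only. The helper name `cheaper_option` is hypothetical, the rates are made-up numbers rather than real AWS prices, and 730 is an assumed average number of hours in a month.

```bash
#!/usr/bin/env bash
# Compare one month of On-Demand cost against a reserved commitment.
# Arguments: on-demand hourly rate, effective reserved hourly rate, hours actually used.
cheaper_option() {
  awk -v od="$1" -v ri="$2" -v h="$3" 'BEGIN {
    od_cost = od * h      # On-Demand: pay only for hours used
    ri_cost = ri * 730    # commitment: pay for every hour of the month (assumed 730)
    printf "on_demand=%.2f reserved=%.2f cheaper=%s\n",
           od_cost, ri_cost, (ri_cost < od_cost ? "reserved" : "on_demand")
  }'
}

cheaper_option 0.10 0.06 730   # -> on_demand=73.00 reserved=43.80 cheaper=reserved
cheaper_option 0.10 0.06 300   # -> on_demand=30.00 reserved=43.80 cheaper=on_demand
```

With these example rates the break-even point is 438 hours per month (60% utilization); below that, the commitment costs more than staying On-Demand.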

RI Purchase Options:

Option	Upfront	Savings	Use Case
All Upfront	100%	Highest (up to 72%)	Have budget, know usage
Partial Upfront	~50%	High (~60-70%)	Balance upfront vs monthly
No Upfront	0%	Moderate (~40-50%)	No upfront budget

RI Scope:

Scope	Description	Flexibility
Regional	Discount applies to any instance in region (within family)	Highest
Zonal	Discount applies to specific AZ	Lower
Dedicated	Tenancy dedicated	Specific use case

Note: A regional Reserved Instance is a billing discount, not a capacity reservation. A zonal Reserved Instance also reserves capacity in its Availability Zone.

3. Spot Instances

Attribute	Value
Commitment	None
Upfront Cost	None
Savings	Up to 90%
Term	Can be interrupted with 2-minute warning
Best For	Fault-tolerant, flexible workloads

Spot exists so AWS can sell spare capacity at deep discounts. Use Spot when interruption is acceptable and your workload can retry, checkpoint, or restart safely.

How Spot Works:

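Spot Price = set by AWS from long-term supply and demand for spare capacity

Your Max Price ≥ Spot Price and capacity is available → Instance runs
Your Max Price < Spot Price, or EC2 needs the capacity back → Instance interrupted (2-min warning)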

Spot Best Practices:

Use with Auto Scaling Groups
Implement graceful shutdown
Use Spot Instance Interruption Notices
Don’t use for stateful workloads
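
Graceful shutdown depends on noticing the interruption warning in time. One common approach is to poll the instance metadata endpoint `spot/instance-action`, which returns HTTP 404 until an interruption is scheduled. The sketch below assumes it runs on the Spot instance itself with `curl` available; `interruption_pending` and `drain-and-checkpoint.sh` are hypothetical names.

```bash
#!/usr/bin/env bash
# Check the instance metadata service for a Spot interruption notice.
# The endpoint answers 404 while nothing is scheduled and 200 once a notice exists.
IMDS="http://169.254.169.254/latest"

interruption_pending() {
  local token code
  # IMDSv2: fetch a short-lived session token first.
  token="$(curl -s -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")"
  # Ask only for the HTTP status code of the interruption notice endpoint.
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/spot/instance-action")"
  [ "$code" = "200" ]
}

# Usage: poll every 5 seconds, then run your own shutdown logic (placeholder script).
#   while ! interruption_pending; do sleep 5; done
#   ./drain-and-checkpoint.sh
```

Polling every few seconds leaves most of the 2-minute window for draining connections and saving checkpoints.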

Use Cases:

Batch processing
Big data processing
CI/CD
Containerized workloads
Web crawling

4. Savings Plans

Attribute	Value
Commitment	1 or 3 years
Upfront Cost	None or flexible
Savings	Up to 72%
Scope	Compute or instance family
Best For	Flexible compute commitment

Savings Plans are often the easiest way to reduce cost without tightly locking instance attributes. They are usually a better first optimization than RIs for teams that change instance families or move between EC2, Fargate, and Lambda.

Savings Plan Types:

Type	Scope	Savings	Flexibility
Compute SP	All EC2, Fargate, Lambda	Up to 66%	Highest
EC2 Instance SP	Specific instance family in region	Up to 72%	Lower

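Savings Plans vs RI: Savings Plans offer more flexibility and apply automatically to eligible usage. RIs are tied to specific instance attributes, and zonal RIs can also reserve capacity in an Availability Zone.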

5. Dedicated Hosts

Attribute	Value
Commitment	None for On-Demand hosts; 1 or 3 years with Dedicated Host Reservations
Upfront Cost	Varies
Savings	Varies
Term	On-demand or reservation term based
Best For	Compliance, licensing (BYOL)

Dedicated Hosts exist for physical isolation and software licensing controls. Use them when regulations or BYOL contracts require visibility or affinity to a specific host.

Use Cases:

Per-core software licensing
Strong compliance requirements
Full control over instance placement

EC2 Storage Options

Storage options exist because compute and data have different lifecycles. Some data must survive restarts and replacement; some data is temporary and only needs speed.

EBS (Elastic Block Store)

Network-attached block storage for EC2 instances.

EBS helps by decoupling storage from a single host so data can persist even when an instance is stopped, replaced, or recovered.

Volume Type	Use Case	IOPS	Throughput
gp2/gp3	General purpose SSD	Up to 16,000	Up to 250 MB/s (gp2) / 1,000 MB/s (gp3)
io1/io2	High-performance SSD	Up to 64,000/256,000	Up to 1,000/4,000 MB/s
st1	Throughput-optimized HDD	~500	Up to 500 MB/s
sc1	Cold HDD	~250	Up to 250 MB/s

In practice, gp3 is the default for most application and boot volumes. Use io1/io2 for latency-sensitive databases with consistently high IOPS needs. Use st1/sc1 only for large sequential workloads where low cost per GB is more important than latency.

Instance Store

Physically-attached ephemeral storage.

Characteristic	Value
Persistence	Persists for the instance lifetime on that host (lost on stop/hibernate/terminate or host failure)
Performance	Very high
Cost	Included with instance
Use Cases	Temporary data, caching, buffers

Instance Store exists for very fast local disk with no separate volume billing. Use it for scratch space, caches, temporary buffers, or intermediate processing outputs. Avoid it for irreplaceable data, because data is lost when the instance stops/terminates or host fails.

Quick decision: If data must survive instance replacement, use EBS. If data is disposable and speed matters most, use Instance Store.

EC2 Networking

Security Groups

Virtual firewall at instance level.

Security Groups exist to enforce least-privilege network access close to the workload. They help by making allowed traffic explicit and stateful, which simplifies return traffic handling.

Feature	Value
Type	Stateful
Rules	Allow only
Scope	Instance level
Return Traffic	Automatic

Use separate Security Groups by role (for example, web, app, db) and allow traffic between groups instead of broad CIDR ranges whenever possible.
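
Allowing traffic between groups means the ingress rule names another Security Group as its source instead of a CIDR range. A minimal sketch with the AWS CLI follows, assuming configured credentials; `allow_from_group` is a hypothetical helper, and the group IDs and port in the usage line are placeholders.

```bash
#!/usr/bin/env bash
# Allow TCP traffic on one port into a target Security Group,
# but only from members of a source Security Group.
allow_from_group() {
  local target_sg="$1" source_sg="$2" port="$3"
  aws ec2 authorize-security-group-ingress \
    --group-id "$target_sg" \
    --protocol tcp \
    --port "$port" \
    --source-group "$source_sg"
}

# Usage (placeholder IDs): let the app tier reach the db tier on PostgreSQL's port.
#   allow_from_group sg-0db0000000000000a sg-0app000000000000b 5432
```

Because the rule follows group membership, instances added to the source group later are covered without further rule changes.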

Elastic IP

Static public IP address.

Elastic IPs exist so public endpoints can stay stable when instances are replaced. This is helpful for legacy allowlists and fixed external integrations, but use them sparingly because public IPv4 addresses are billed.

Attribute	Value
Cost (In Use)	Charged hourly for public IPv4 addresses (including Elastic IPs)
Cost (Idle/Unattached)	Charged hourly (same public IPv4 charge model)
Limit	5 per region (soft limit, can increase)

For most production apps, prefer an ALB/NLB, Route 53, or Global Accelerator rather than attaching many Elastic IPs directly to instances.

Placement Groups

Logical grouping of instances.

Placement Groups exist to control how EC2 places instances on hardware, which lets you optimize for either performance or fault isolation.

Type	Description	Use Case	Limit
Cluster	Low latency, high bandwidth	HPC, ML	Single AZ
Partition	Spread across partitions (reduced failure blast radius)	Hadoop, Kafka	Up to 7 partitions per AZ
Spread	Distinct underlying hardware per instance for failure isolation	Critical apps	Up to 7 running instances per AZ per group

Choose Cluster for low-latency east-west traffic, Partition for distributed systems that need blast-radius separation, and Spread for small sets of critical instances that must not share hardware.

EC2 Lifecycle

Instance States

flowchart LR
    pending["⏳ pending"] --> running["▶️ running"]
    running --> stopping["⏸️ stopping"]
    stopping --> stopped["⏹️ stopped"]
    stopped --> pending
    running --> shutting["🔄 shutting-down"]
    stopped --> shutting
    shutting --> terminated["❌ terminated"]

State	Billing	Data Persistence
pending	No	—
running	Yes	EBS yes, Instance Store yes (persists across reboot)
stopping	No (except hibernation)	—
stopped	No for compute (EBS storage is still billed)	EBS yes, Instance Store no (data lost on stop)
shutting-down	No	—
terminated	No	Root EBS volume deleted by default (can change), Instance Store no

Lifecycle states matter because they directly affect both billing and data durability. Stopping an instance can reduce compute cost while preserving EBS data, whereas termination is intended for permanent removal.
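
The state diagram above can be encoded as a small lookup, which is handy when a script has to decide whether an action makes sense in the current state. This is only a sketch of the transitions drawn in this guide; `can_transition` is a hypothetical helper, not an AWS API.

```bash
#!/usr/bin/env bash
# Return success (0) if the lifecycle diagram allows moving from state $1 to state $2.
can_transition() {
  case "$1->$2" in
    "pending->running" | \
    "running->stopping" | "stopping->stopped" | "stopped->pending" | \
    "running->shutting-down" | "stopped->shutting-down" | \
    "shutting-down->terminated") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: a stopped instance can be started again, a terminated one cannot.
can_transition stopped pending     && echo "stopped -> pending: allowed"
can_transition terminated pending  || echo "terminated -> pending: not allowed"
```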

EC2 Features

Auto Scaling

Automatically scale EC2 capacity based on conditions.

Auto Scaling exists to keep performance stable under changing load while avoiding overprovisioning. Use target tracking first for most workloads, then layer scheduled scaling when traffic patterns are predictable.

Scaling Policy	Description
Target Tracking	Maintain target metric (e.g., CPU 50%)
Simple Scaling	Add/remove instances based on threshold
Step Scaling	Scale based on step adjustments
Scheduled Scaling	Scale at specific times
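
To build intuition for target tracking, it helps to see a proportional estimate of how many instances are needed to bring a metric back to its target. The formula below is a simplified model for illustration only, not the algorithm AWS Auto Scaling actually uses; `desired_capacity` is a hypothetical helper working on whole-number percentages.

```bash
#!/usr/bin/env bash
# Estimate capacity needed to return a metric to its target value.
# Arguments: current instance count, current metric value, target metric value.
desired_capacity() {
  local current="$1" metric="$2" target="$3"
  # Integer ceiling of current * metric / target.
  echo $(( (current * metric + target - 1) / target ))
}

desired_capacity 4 75 50   # -> 6  (CPU at 75% against a 50% target: scale out)
desired_capacity 4 25 50   # -> 2  (CPU at 25% against a 50% target: scale in)
```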

Elastic Load Balancing (ELB)

Distribute incoming traffic across multiple targets.

ELB helps by decoupling traffic entry from individual instances, enabling health checks, rolling deployments, and horizontal scaling.

Type	Use Case	Features
Application LB	HTTP/HTTPS	Content-based routing, path-based routing
Network LB	TCP/UDP/TLS	Extreme performance, static IPs
Gateway LB	Layer 3 applications	IP protocol routing

Use ALB for web applications and microservices, NLB for high-throughput low-latency TCP/UDP, and GWLB when inserting virtual network appliances.

User Data

Script executed at first launch.

User Data exists for bootstrap automation so instances become usable immediately after launch. It is best for lightweight setup; for complex configuration, pair it with configuration management or image-based workflows.

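#!/bin/bash
# Bootstrap script for a yum-based distribution such as Amazon Linux
# Stop at the first failing command so a broken setup is easy to spot
set -euo pipefail

# Install updates
yum update -y

# Install Apache
yum install -y httpd

# Start Apache now and on every boot
systemctl enable --now httpd

# Create webpage
echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html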
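
A script like the one above is passed to the instance at launch time. One way to do that is the AWS CLI's `run-instances` command with `--user-data`. This sketch assumes configured credentials and default networking; `launch_with_user_data`, the AMI ID, and the file name `bootstrap.sh` are placeholders.

```bash
#!/usr/bin/env bash
# Launch one instance and hand it a local script as User Data.
launch_with_user_data() {
  local ami="$1" type="$2" script="$3"
  aws ec2 run-instances \
    --image-id "$ami" \
    --instance-type "$type" \
    --user-data "file://$script" \
    --query 'Instances[0].InstanceId' --output text   # prints the new instance ID
}

# Usage (placeholder values):
#   launch_with_user_data ami-0123456789abcdef0 t3.micro bootstrap.sh
```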

Instance Metadata

Retrieve instance information from within the instance.

Instance metadata helps applications discover runtime context (instance ID, IAM role credentials, region/AZ) without hardcoding values.

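# Request an IMDSv2 session token (valid here for 6 hours)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Get instance ID
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id

# Get availability zone
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone

# Get local IP
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/local-ipv4

# List the attached IAM role name (append the name to the URL to get its temporary credentials)
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/iam/security-credentials/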

For security, require IMDSv2 and block unnecessary metadata access from untrusted processes.
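
Requiring IMDSv2 on an existing instance is a metadata-options change. A minimal sketch with the AWS CLI follows, assuming configured credentials; `require_imdsv2` is a hypothetical helper and the instance ID in the usage line is a placeholder.

```bash
#!/usr/bin/env bash
# Require session tokens (IMDSv2) for all metadata requests on one instance.
require_imdsv2() {
  local instance_id="$1"
  aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-tokens required \
    --http-endpoint enabled
}

# Usage (placeholder ID):
#   require_imdsv2 i-0123456789abcdef0
```

Once tokens are required, metadata requests sent without the `X-aws-ec2-metadata-token` header are rejected, so update any scripts that still use plain requests first.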

EC2 vs Lambda: When to Use

Use EC2 When…	Use Lambda When…
Long-running processes	Event-driven workloads
Full OS control needed	Stateless functions
Complex applications	Simple tasks
Consistent performance needed	Spiky/unpredictable traffic
Need direct hardware access	Want to avoid managing servers
Running containers (without ECS/EKS)	Quick deployments
Need specialized hardware (GPU)	Short execution time

EC2 Best Practices

Practice	Why
Use AMIs for consistency	Reproducible deployments
Use Security Groups, not security by obscurity	Explicit access control
Tag your instances	Cost allocation, organization
Use IAM roles, not access keys	No long-lived credentials to store or rotate
Monitor with CloudWatch	Visibility and alerting
Use Auto Scaling	Cost optimization + availability
Use placement groups for performance	Lower latency
Use Reserved Instances for steady workloads	Cost savings
Use Spot for fault-tolerant workloads	Maximum savings
Enable detailed monitoring	1-minute metrics

TL;DR

EC2 Pricing Decision Tree

Workload Type
    ├── Predictable, steady-state
    │   └── Reserved Instances (up to 72% savings)
    ├── Flexible, changing compute needs
    │   └── Savings Plans (up to 72% savings)
    ├── Fault-tolerant, can be interrupted
    │   └── Spot Instances (up to 90% savings)
    └── Short-term, unpredictable
        └── On-Demand (base price)

Key Points

EC2 = Virtual machines in AWS (IaaS)
AMI = Template (OS + software)
Instance Types = CPU/memory/storage combo (e.g., m5.large)
Pricing = On-Demand (base), Reserved (save up to 72%), Spot (save up to 90%), Savings Plans (flexible, save up to 72%)
EBS = Persistent network storage; Instance Store = High-speed local storage (ephemeral across stop/terminate)
Security Groups = Stateful firewall for instances
Auto Scaling = Automatic scaling based on demand
ELB = Distribute traffic across instances

Resources

Amazon EC2 Documentation: Complete EC2 user guide.

EC2 Instance Types: Detailed comparison of all instance types.

EC2 Pricing: Detailed pricing for all purchasing options.

EC2 AMI Catalog: Browse and select AMIs in the AWS Console.

Lalit's Cloud & DevOps notes
