Amazon Macie — A fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect sensitive data in AWS.
Overview
Macie automatically discovers and classifies sensitive data (PII, PHI, financial data) in your S3 buckets, helping you meet compliance requirements like GDPR, HIPAA, and PCI DSS.
Key Insight: Macie is like an automated data auditor — it scans your S3 buckets to find sensitive information you might not even know exists, helping prevent data leaks.
Core Macie Concepts
| Concept | Description | Key Point |
|---|---|---|
| Sensitive Data | PII, PHI, financial data, credentials | Automatically discovered |
| Data Classification | Categorizes data by sensitivity | ML-powered detection |
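| Finding | Report of sensitive data in an S3 object, or of a security or privacy issue with a bucket | Two types (sensitive data findings, policy findings), each with a severity |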
| Managed Data Identifiers | Pre-built patterns for common data types | PII, credentials, financial |
| Custom Data Identifiers | Your own regex patterns for specific data | Company-specific formats |
| S3 Bucket Inventory | List of all buckets with sensitivity scores | Prioritize remediation |
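
To make the custom data identifier row concrete, here is a minimal local sketch of how a regex combined with nearby keywords narrows matches. The field names loosely mirror the parameters a custom data identifier accepts (a regex, keywords, a maximum match distance). The employee ID format, the `EMPLOYEE_ID` dictionary, and the `find_matches` helper are hypothetical, and the proximity logic is only a rough approximation, not Macie's actual matching engine.

```python
import re

# Hypothetical identifier definition. The "EMP-" format is invented for
# illustration; a real identifier would use your organization's own format.
EMPLOYEE_ID = {
    "regex": r"\bEMP-\d{6}\b",
    "keywords": ["employee", "staff id"],
    "maximum_match_distance": 50,
}


def find_matches(text, identifier):
    """Return regex matches that have a keyword shortly before them."""
    pattern = re.compile(identifier["regex"])
    keywords = [k.lower() for k in identifier["keywords"]]
    distance = identifier["maximum_match_distance"]
    hits = []
    for match in pattern.finditer(text):
        # Inspect only the characters immediately preceding the match.
        window = text[max(0, match.start() - distance):match.start()].lower()
        if any(keyword in window for keyword in keywords):
            hits.append(match.group())
    return hits


print(find_matches("Employee badge EMP-004217 was issued.", EMPLOYEE_ID))
print(find_matches("Order reference EMP-004217 shipped.", EMPLOYEE_ID))
```

The second call returns an empty list because no keyword appears near the match, which is the point of pairing a pattern with keywords: it reduces false positives from strings that merely look like the target format.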
How Macie Works
flowchart TD
    subgraph Buckets["Your S3 Buckets"]
        B1["Bucket A"]
        B2["Bucket B"]
        B3["Bucket C"]
        BN["Bucket N"]
    end
    Scan["Macie scans S3 objects"]
    Detect["ML + pattern matching identify sensitive data"]
    Classify["Classify by data type<br/>(PII, PHI, financial, credentials)"]
    Findings["Generate findings with severity and context"]
    B1 --> Scan
    B2 --> Scan
    B3 --> Scan
    BN --> Scan
    Scan --> Detect --> Classify --> Findings
    Findings --> Hub["Security Hub"]
    Findings --> SNS["SNS alerts"]
    Findings --> EB["EventBridge automation"]
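
The last step of the flow, routing findings to EventBridge, can be sketched as an event pattern. Macie publishes finding events with the source `aws.macie` and the detail-type `Macie Finding`. The severity filter, the sample event (including its bucket name), and the simplified `matches` helper below are assumptions for illustration; EventBridge's real matching supports more operators than this sketch.

```python
# Event pattern for an EventBridge rule. The severity filter is an optional
# choice made for this example, limiting the rule to High-severity findings.
HIGH_SEVERITY_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"severity": {"description": ["High"]}},
}

# Hypothetical, heavily trimmed finding event used to exercise the pattern.
SAMPLE_EVENT = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {
        "type": "SensitiveData:S3Object/Financial",
        "severity": {"description": "High", "score": 3},
        "resourcesAffected": {"s3Bucket": {"name": "example-bucket"}},
    },
}


def matches(pattern, event):
    """Simplified matcher: dicts recurse, lists mean 'value is one of these'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


print(matches(HIGH_SEVERITY_PATTERN, SAMPLE_EVENT))
```

A rule with a pattern like this could target an SNS topic or a Lambda function, so that only the findings you care about trigger alerts or automated remediation.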
Data Types Macie Detects
Managed Data Identifiers
| Category | Examples |
|---|---|
| PII | Names, addresses, phone numbers, email addresses |
| Financial | Credit card numbers, bank account numbers, tax IDs |
| Credentials | AWS secret access keys, API keys, private keys (for example OpenSSH and PGP) |
| Health | Medical record numbers, diagnoses |
| Government IDs | Passport numbers, driver’s license numbers |
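
Pattern-based detection of financial data usually pairs a loose regex with a checksum so that arbitrary digit strings are not reported. The sketch below illustrates that general technique with the Luhn formula for card numbers. The `CARD_CANDIDATE` pattern and both helper functions are illustrative assumptions, not Macie's implementation; the exact criteria each managed data identifier applies are described in the Macie managed data identifier reference.

```python
import re

# Loose candidate pattern: 13 to 19 digits, optionally separated by single
# spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text):
    """Return candidate strings that match the pattern and pass the checksum."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text) if luhn_valid(m.group())]


text = "card 4111 1111 1111 1111 on file, ref 1234567890123456"
print(find_card_numbers(text))  # ['4111 1111 1111 1111']
```

The 16-digit reference number matches the regex but fails the checksum, so only the well-known test card number is reported.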
Recent Enhancement (January 2025)
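Macie is reported to use Amazon Textract to detect sensitive data in image-based content stored in S3 (scanned PDFs, photos, scanned documents). Confirm this against the list of supported file and storage formats in the Macie User Guide, which has historically excluded image files.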
Key Features
| Feature | Description |
|---|---|
| Automated Discovery | Continually samples and analyzes S3 objects for sensitive data |
| Machine Learning | Works alongside pattern matching to detect sensitive data |
| PII Detection | Finds personally identifiable information |
| Bucket Inventory | View all buckets with sensitivity scores |
| Findings | Detailed alerts with severity and remediation |
| Integration with Security Hub | Centralized findings management |
Finding Severity Levels
| Severity | Description | Example |
|---|---|---|
| High | Large amount of sensitive data exposed publicly | 10,000+ credit card numbers in public bucket |
| Medium | Sensitive data in accessible location | PII in bucket with known access |
| Low | Small amount or limited exposure | Few records in private bucket |
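
Macie assigns every finding one of three severity levels: Low, Medium, or High. There is no separate Informational level, so discovered data that is not necessarily a risk, such as PII in an internal bucket, still receives one of these three.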
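
Severity is most useful as a sort key when triaging. The sketch below assumes the three severity labels Macie uses in finding data (Low, Medium, High, with scores 1 to 3). The flattened finding dictionaries, their `count` field, and the bucket names are hypothetical simplifications of the real finding JSON.

```python
# Severity labels mapped to their numeric scores.
SEVERITY_SCORE = {"Low": 1, "Medium": 2, "High": 3}


def triage(findings):
    """Order findings by severity, then by occurrence count, highest first."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_SCORE.get(f["severity"], 0), f["count"]),
        reverse=True,
    )


# Hypothetical findings, reduced to the fields this sketch needs.
findings = [
    {"bucket": "logs-archive", "severity": "Low", "count": 3},
    {"bucket": "public-exports", "severity": "High", "count": 12000},
    {"bucket": "hr-uploads", "severity": "Medium", "count": 40},
    {"bucket": "billing-dump", "severity": "High", "count": 150},
]

for finding in triage(findings):
    print(finding["severity"], finding["bucket"], finding["count"])
```

Unknown labels sort last because they fall back to a score of 0, which keeps the function from failing on unexpected input.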
Use Cases
| Use Case | Description |
|---|---|
| GDPR Compliance | Discover and protect EU citizen data |
| HIPAA Compliance | Find protected health information (PHI) |
| Data Loss Prevention | Prevent accidental data exposure |
| Data Inventory | Know what sensitive data you have |
| Incident Response | Investigate potential data breaches |
Pricing
| Component | Price | Free Trial |
|---|---|---|
| Data Classification | Per GB evaluated | 30-day free trial |
| Bucket Monitoring | Per bucket per month | Included in the 30-day free trial |
⚠️ Pricing Disclaimer: AWS pricing is subject to change. Always verify current pricing at the official Macie pricing page.
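
The two pricing components combine into a simple estimate. The function below is a back-of-the-envelope sketch. Every rate in it is a placeholder argument, not an actual AWS price, and it ignores volume tiers and any per-object charges, so substitute current figures from the pricing page before relying on the result.

```python
def estimate_monthly_cost(bucket_count, gb_scanned, price_per_bucket, price_per_gb, free_gb=0.0):
    """Estimate a monthly bill from bucket monitoring plus per-GB evaluation.

    All prices are caller-supplied placeholders; `free_gb` models any data
    volume that is not billed (for example during a trial).
    """
    billable_gb = max(0.0, gb_scanned - free_gb)
    return bucket_count * price_per_bucket + billable_gb * price_per_gb


# Placeholder rates chosen only to keep the arithmetic easy to follow.
cost = estimate_monthly_cost(
    bucket_count=10,
    gb_scanned=500,
    price_per_bucket=0.10,
    price_per_gb=1.00,
)
print(f"Estimated monthly cost: ${cost:.2f}")
```

Because the cost scales with the volume of data evaluated, scoping scans to the buckets and prefixes most likely to hold sensitive data is the main lever for controlling spend.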
Macie vs Other AWS Security Services
| Service | Focus | Complementarity |
|---|---|---|
| Macie | Sensitive data discovery | Finds WHAT sensitive data exists |
| GuardDuty | Threat detection | Detects active attacks/anomalies |
| AWS Config | Configuration compliance | Checks whether S3 bucket settings comply with your rules |
| AWS Shield | DDoS protection | Protects applications against DDoS attacks |
TL;DR
- Amazon Macie = ML-powered sensitive data discovery for S3
- Scans = S3 buckets for PII, PHI, financial data, credentials
- Detection = Machine learning + pattern matching
- Enhanced 2025 = Reported image scanning with Amazon Textract (confirm against the supported formats list)
- Pricing = Per GB evaluated; 30-day free trial
- Use Cases = GDPR/HIPAA compliance, data loss prevention, data inventory
- Integrates with = Security Hub (centralized findings)
Resources
- Amazon Macie Documentation: Complete Macie user guide.
- Macie Pricing: Detailed pricing breakdown.
- Macie Data Identifiers: List of all managed data identifiers.