Amazon Transcribe — Speech-to-text API with 100+ language support, speaker diarization, and real-time transcription.
What is Amazon Transcribe?
Amazon Transcribe is an automatic speech recognition (ASR) service that converts speech to text. It uses a multi-billion parameter speech foundation model to deliver high accuracy for streaming and recorded audio in over 100 languages.
Key Insight: Transcribe converts speech to text and also supports speaker identification, automatic punctuation, content filtering, and (through Call Analytics) sentiment analysis.
Key Features
| Feature | Description |
|---|---|
| 100+ Languages | Support for global languages and dialects |
| Automatic Punctuation | Adds proper punctuation and capitalization |
| Speaker Diarization | Identifies and separates different speakers (“Speaker 1”, “Speaker 2”) |
| Custom Vocabulary | Add domain-specific terms (product names, acronyms) |
| Language Identification | Auto-detects language spoken in audio |
| Content Redaction | Redacts PII (credit cards, SSNs) from transcripts |
| Toxicity Detection | Detects toxic or inappropriate content in audio |
| Call Analytics | Extracts sentiment, call categories, summaries for contact centers |
| Streaming | Real-time transcription for live events |
| Batch Processing | Transcribe pre-recorded audio files |
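Several of the features above are enabled through request parameters on a batch job rather than through separate APIs. The sketch below builds the keyword arguments for a `StartTranscriptionJob` call as exposed by boto3. The helper name `build_job_request`, the bucket URI, and the job name are hypothetical, and parameter names and limits should be checked against the current API reference before use.

```python
from typing import Optional


def build_job_request(
    job_name: str,
    media_uri: str,
    language_code: Optional[str] = "en-US",
    max_speakers: Optional[int] = None,
    vocabulary_name: Optional[str] = None,
    redact_pii: bool = False,
) -> dict:
    """Return keyword arguments for a batch transcription request.

    Passing language_code=None asks the service to identify the language.
    """
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
    }

    if language_code is None:
        # Language Identification: let the service detect the language.
        if vocabulary_name is not None:
            # Assumption: this sketch only supports a vocabulary with a fixed language.
            raise ValueError("vocabulary_name requires an explicit language_code")
        request["IdentifyLanguage"] = True
    else:
        request["LanguageCode"] = language_code

    settings = {}
    if max_speakers is not None:
        if max_speakers < 2:
            raise ValueError("speaker diarization needs at least 2 speakers")
        # Speaker Diarization: both keys are set together.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name is not None:
        # Custom Vocabulary: refers to a vocabulary created beforehand.
        settings["VocabularyName"] = vocabulary_name
    if settings:
        request["Settings"] = settings

    if redact_pii:
        # Content Redaction: supported only for some languages; check the docs.
        request["ContentRedaction"] = {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",
        }
    return request


# Hypothetical job name and S3 URI, for illustration only.
request = build_job_request(
    "demo-job", "s3://example-bucket/call.mp3", max_speakers=2, redact_pii=True
)
# With boto3 installed and credentials configured, the request could be sent as:
# boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request builder separate from the network call makes the feature flags easy to inspect and test without AWS credentials.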
Use Cases
Contact Centers
Transcribe customer calls for quality assurance, compliance, and agent training. Use Call Analytics to extract sentiment and call summaries.
Meeting & Conference Transcription
Generate real-time captions for meetings, webinars, and broadcasts for accessibility.
Media & Broadcasting
Create subtitles for video content, podcasts, and on-demand videos.
Clinical Documentation
Use Transcribe Medical (HIPAA-eligible) to transcribe doctor-patient conversations for entry into EHR systems.
Content Moderation
Detect toxic language in gaming, social media, and peer-to-peer conversations.
How It Works
1. Provide Audio: upload an MP3, WAV, or FLAC file to Amazon S3 for batch jobs, or stream audio directly
2. Choose API:
   - StartTranscriptionJob: batch processing of recorded audio
   - StartStreamTranscription: real-time streaming
3. Receive Transcript (simplified illustration; the JSON the service actually returns nests these fields under a results object):
{
  "transcript": "[Speaker 1] Hello, how can I help you today? [Speaker 2] I'm calling about my order.",
  "speakers": 2,
  "items": [...]
}
4. Use Results: Store in databases, analyze with Comprehend, display as captions
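The batch path through these steps can be sketched as a submit-and-poll loop. This is a minimal sketch, assuming a boto3-style Transcribe client is passed in (for example `boto3.client("transcribe")`) and that the audio is already in S3. The function names, polling interval, and retry limit are illustrative choices. `split_speakers` works only on the simplified `[Speaker N]` string shown in step 3, not on the service's real output file.

```python
import re
import time


def transcribe_and_wait(client, job_name, media_uri,
                        poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Start a batch job, poll until it finishes, and return the transcript file URI."""
    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        LanguageCode="en-US",
    )
    for _ in range(max_polls):
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription job failed"))
        sleep(poll_seconds)  # still queued or in progress
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")


def split_speakers(transcript):
    """Split a simplified '[Speaker N] text' string into (speaker, text) pairs."""
    # re.split with a capturing group keeps the speaker labels in the result:
    # ['', 'Speaker 1', 'text ', 'Speaker 2', 'text']
    parts = re.split(r"\[(Speaker \d+)\]\s*", transcript)
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]


sample = ("[Speaker 1] Hello, how can I help you today? "
          "[Speaker 2] I'm calling about my order.")
turns = split_speakers(sample)
```

Passing the client and the `sleep` function as arguments keeps the loop testable with a stub. In production, an event-driven notification when the job completes may be preferable to polling.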
Pricing & Free Tier
| Aspect | Details |
|---|---|
| Free Tier (first 12 months) | 60 minutes/month for batch transcription |
| Batch Transcription | $0.024 per minute (entry-tier rate; lower per-minute rates apply at high monthly volumes) |
| Streaming | $0.024 per minute + data transfer costs |
| Speaker Diarization | +$0.003 per minute |
| Call Analytics | $0.04 per minute |
| Transcribe Medical | $0.075 per minute |
Cost Tip: Use the free tier for development. Standard batch and streaming transcription costs about $0.024 per minute, which works out to $0.024 × 60 = $1.44 per hour of audio.
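For rough budgeting, the per-minute rates in the table can be turned into a small estimator. This is a sketch only: the rates are copied from the table above and may be out of date, the function name `estimate_monthly_cost` is made up for illustration, and data transfer charges and volume discounts are ignored.

```python
# Per-minute rates in USD, copied from the pricing table above (may be stale).
RATES_PER_MINUTE = {
    "batch": 0.024,
    "streaming": 0.024,
    "call_analytics": 0.04,
    "medical": 0.075,
}
DIARIZATION_SURCHARGE = 0.003  # per minute, as listed in the table
FREE_TIER_MINUTES = 60         # per month, batch only, first 12 months


def estimate_monthly_cost(minutes, mode="batch", diarization=False, free_tier=False):
    """Estimate one month's transcription cost in USD for the given audio minutes."""
    billable = float(minutes)
    if free_tier and mode == "batch":
        billable = max(0.0, billable - FREE_TIER_MINUTES)
    rate = RATES_PER_MINUTE[mode]
    if diarization:
        rate += DIARIZATION_SURCHARGE
    return round(billable * rate, 2)


one_hour = estimate_monthly_cost(60)                      # one hour of batch audio
first_year = estimate_monthly_cost(100, free_tier=True)   # 40 billable minutes
```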
⚠️ Pricing Disclaimer: AWS pricing is subject to change. Always verify current pricing at the official Amazon Transcribe pricing page.
When to Use Transcribe
| Use Transcribe for | Use something else for |
|---|---|
| Audio/video transcription | Text generation (use LLMs) |
| Call center analytics | Real-time translation (use Translate) |
| Subtitles & captions | Voice commands (use Lex) |
| Meeting documentation | Very low latency (<50ms) |
Transcribe vs Amazon Lex
| Aspect | Transcribe | Lex |
|---|---|---|
| Purpose | Speech-to-text | Conversational AI |
| Output | Transcript | Responses/actions |
| Use Case | Transcription | Chatbots/voice assistants |
| Together? | Handles ASR when the two are combined in voice-bot architectures | Handles dialog when the two are combined in voice-bot architectures |
Important Notes
- Speech Foundation Model: Now powered by a multi-billion-parameter model for higher accuracy
- Call Analytics: Extracts sentiment, categories, and generative AI-powered summaries
- Transcribe Medical: HIPAA-eligible, trained on medical terminology
- Toxicity Detection: Available for gaming and social media use cases
TL;DR
- Transcribe = Speech-to-text API (ASR)
- Features: 100+ languages, speaker diarization, punctuation, custom vocabulary, Call Analytics
- Free Tier: 60 minutes/month for first 12 months
- Pricing: ~$0.024 per minute (batch and streaming)
- Best for: Call center analytics, subtitles, meeting transcription, clinical documentation
- Special: Transcribe Medical for healthcare (HIPAA-eligible)
Resources
- Amazon Transcribe: Official product page and overview.
- Transcribe Documentation: Complete API reference and guides.
- Transcribe Pricing: Detailed pricing breakdown.