Deep dive into fine-tuning foundation models in Amazon Bedrock — why, what’s possible, limitations, and pricing.
What is Fine-Tuning?
Fine-tuning adapts a pre-trained foundation model to your specific use case by training it on your own data. The result is a custom model version that performs better on your tasks while retaining the base model’s general capabilities.
Why Fine-Tune?
| Scenario | Solution |
|---|---|
| Model doesn’t know your domain terminology | Fine-tune on domain documents |
| Inconsistent output format/style | Fine-tune on examples with desired format |
| Task-specific performance needed | Fine-tune on labeled examples |
| Prompt engineering isn’t enough | Fine-tuning provides deeper customization |
Fine-Tuning vs Alternatives
| Approach | Effort | Customization | When to Use |
|---|---|---|---|
| Prompt Engineering | Low | Surface-level | Try first — often sufficient |
| RAG (Knowledge Bases) | Medium | Adds knowledge | Model needs access to your data |
| Fine-Tuning | High | Deep behavior change | Need consistent style/format/domain expertise |
| Continued Pre-Training | Highest | Domain adaptation | Model needs to “speak” your industry language |
Types of Customization in Bedrock
1. Continued Pre-Training
- Train on unlabeled domain-specific data
- Model learns domain vocabulary and patterns
- Example: Training on medical literature so model understands clinical terms
2. Fine-Tuning (Supervised)
- Train on labeled prompt-completion pairs
- Model learns specific task behavior
- Example: Training on customer support tickets to generate consistent responses
Supported Models for Fine-Tuning
⚠️ Not all Bedrock models support fine-tuning. Always check the Bedrock Model Support page.
| Provider | Model | Fine-Tuning Support |
|---|---|---|
| Amazon | Titan Text | ✅ Supported |
| Amazon | Titan Image Generator G1 | ✅ Supported (style/brand adaptation) |
| Amazon | Titan Embeddings | ❌ Not supported |
| Meta | Llama 2 | ✅ Supported |
| Meta | Llama 3.1 (8B, 70B) | ✅ Supported (128K context) |
| Meta | Llama 3.2 (1B, 3B, 11B, 90B) | ✅ Supported (multimodal for 11B/90B) |
| Anthropic | Claude | ❌ Not supported via Bedrock |
| Mistral | Mistral models | ✅ Some supported |
| Cohere | Command | ✅ Supported |
Multimodal Fine-Tuning
Llama 3.2 11B and 90B are multimodal — fine-tune for:
- Visual question answering
- Image captioning
- Document analysis with images
Fine-Tuning Process
1. Prepare Data → 2. Upload to S3 → 3. Create Job → 4. Training → 5. Deploy → 6. Inference
Step-by-Step
| Step | Details |
|---|---|
| 1. Prepare training data | JSONL format with prompt-completion pairs |
| 2. Upload to S3 | Training data in your S3 bucket |
| 3. Configure job | Select base model, hyperparameters, output location |
| 4. Training runs | AWS manages compute, typically hours to complete |
| 5. Custom model created | Stored in your account |
| 6. Deploy and use | Invoke like any Bedrock model |
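The steps above can be sketched with boto3. This is a minimal, hedged example: the bucket, role ARN, job names, and base-model choice are all hypothetical placeholders, and the actual call requires AWS credentials in a Region that supports model customization.

```python
# Hypothetical names and ARNs -- replace with your own bucket, role, and base model.
job_params = {
    "jobName": "support-summarizer-ft",
    "customModelName": "support-summarizer-v1",
    "roleArn": "arn:aws:iam::123456789012:role/BedrockFineTuneRole",
    "baseModelIdentifier": "meta.llama3-1-8b-instruct-v1:0",
    "customizationType": "FINE_TUNING",
    "trainingDataConfig": {"s3Uri": "s3://my-bucket/train.jsonl"},
    "outputDataConfig": {"s3Uri": "s3://my-bucket/output/"},
    # Bedrock expects hyperparameter values as strings
    "hyperParameters": {"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
}

def start_fine_tune(params):
    """Submit the customization job (requires AWS credentials)."""
    import boto3
    bedrock = boto3.client("bedrock")  # control-plane client, not bedrock-runtime
    return bedrock.create_model_customization_job(**params)
```

Once submitted, the job appears under Custom models in the Bedrock console, where you can monitor training progress.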
Important Point: Fine-tuned/custom models require Provisioned Throughput for deployment. You cannot use On-Demand mode with custom models — you must purchase reserved capacity to test and deploy them.
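A sketch of what that deployment looks like in code, under the same caveats as above: the custom-model ARN is a made-up placeholder, and the request body shown assumes a Llama-style schema (`prompt`, `max_gen_len`), which varies by model family.

```python
import json

# Hypothetical ARN -- substitute the ARN of your own custom model.
provision_params = {
    "provisionedModelName": "support-summarizer-pt",
    "modelId": "arn:aws:bedrock:us-east-1:123456789012:custom-model/placeholder",
    "modelUnits": 1,  # smallest option; billed hourly while the capacity exists
}

def provision_and_invoke(params, prompt):
    """Purchase Provisioned Throughput and invoke the custom model (requires AWS credentials)."""
    import boto3
    bedrock = boto3.client("bedrock")
    pt = bedrock.create_provisioned_model_throughput(**params)
    runtime = boto3.client("bedrock-runtime")
    resp = runtime.invoke_model(
        modelId=pt["provisionedModelArn"],
        body=json.dumps({"prompt": prompt, "max_gen_len": 256}),
        contentType="application/json",
    )
    return json.loads(resp["body"].read())
```

Remember to delete the Provisioned Throughput when you finish testing; it accrues hourly charges whether or not you invoke the model.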
Training Data Format
```json
{"prompt": "Summarize this ticket:", "completion": "Customer requests refund for..."}
{"prompt": "Summarize this ticket:", "completion": "User reports login issue..."}
```
Limitations & Constraints
| Limitation | Details |
|---|---|
| Model availability | Only specific models support fine-tuning |
| Minimum data | Typically need hundreds to thousands of examples |
| Training time | Hours to complete (varies by data size, model) |
| Region availability | Fine-tuning may not be available in all regions |
| No real-time updates | Can’t update model incrementally — retrain fully |
| Storage costs | Custom models incur storage fees |
| Higher inference cost | Custom models require Provisioned Throughput, billed per model unit per hour |
Data Requirements
| Consideration | Recommendation |
|---|---|
| Quality over quantity | Clean, consistent examples matter more than volume |
| Format consistency | Use consistent prompt/completion structure |
| Diversity | Cover edge cases and variations |
| Validation set | Hold out data for evaluation |
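The format and validation-set recommendations above can be enforced with a small stdlib-only helper. This is an illustrative sketch: the field names match the JSONL format shown earlier, and the holdout fraction is an arbitrary choice.

```python
import json
import random

def validate_and_split(lines, holdout=0.1, seed=7):
    """Check each JSONL record has prompt/completion fields, then split train/validation."""
    records = []
    for i, line in enumerate(lines, 1):
        rec = json.loads(line)
        missing = {"prompt", "completion"} - rec.keys()
        if missing:
            raise ValueError(f"line {i} missing fields: {missing}")
        records.append(rec)
    random.Random(seed).shuffle(records)  # fixed seed for a reproducible split
    cut = max(1, int(len(records) * holdout))
    return records[cut:], records[:cut]  # (train, validation)

sample = [
    '{"prompt": "Summarize this ticket:", "completion": "Refund requested."}',
    '{"prompt": "Summarize this ticket:", "completion": "Login issue."}',
]
train, val = validate_and_split(sample, holdout=0.5)
```

Failing fast on malformed records locally is cheaper than discovering the problem after a training job has already started.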
Pricing
Fine-tuning has three cost components:
| Component | Pricing Basis |
|---|---|
| Training | Per token processed during training (tokens × epochs) |
| Storage | Monthly fee per stored custom model |
| Inference | Provisioned Throughput, billed per model unit per hour |
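A back-of-the-envelope estimate shows how the components compare. Every rate below is a hypothetical placeholder, not an actual AWS price; since custom models are deployed through Provisioned Throughput (see the deployment note above), inference is modeled as an hourly charge.

```python
# All rates are hypothetical placeholders -- check the Bedrock pricing page.
TRAIN_PER_1K_TOKENS = 0.008    # training cost per 1K tokens processed
STORAGE_PER_MONTH = 1.95       # monthly storage fee per custom model
PT_PER_HOUR = 20.00            # Provisioned Throughput, per model unit per hour

train_tokens = 5_000_000       # tokens in the training set
epochs = 2                     # each epoch reprocesses the full set
months = 1
pt_hours = 24 * 30             # one model unit running for a month

training = train_tokens * epochs / 1000 * TRAIN_PER_1K_TOKENS
storage = STORAGE_PER_MONTH * months
inference = pt_hours * PT_PER_HOUR

print(f"training={training:.2f} storage={storage:.2f} inference={inference:.2f}")
```

Under these assumed rates, an always-on model unit dwarfs the one-time training cost, which is why deleting idle Provisioned Throughput matters so much.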
Cost Considerations
| Factor | Impact |
|---|---|
| Training data size | More data = higher training cost |
| Number of epochs | More passes = higher cost, potentially better results |
| Model size | Larger models cost more to train |
| Inference volume | Consider if cost premium is worth the improvement |
Cost Tip: Start with prompt engineering and RAG. Only fine-tune when those approaches aren’t sufficient — fine-tuning is the most expensive customization option.
When to Fine-Tune (and When Not To)
✅ Good Use Cases
- Consistent output format across all responses
- Domain-specific terminology (legal, medical, technical)
- Brand voice and style consistency
- Task-specific optimization (classification, extraction)
- Multimodal tasks with your image data
❌ When to Avoid
- Just need to add knowledge → Use RAG instead
- Simple formatting needs → Use system prompts
- Small dataset (<100 examples) → Likely won’t help
- Rapidly changing information → RAG is more flexible
- Budget constraints → Explore cheaper options first
TL;DR
- Fine-tuning = train a base model on your data for better task-specific performance
- Supported models: Titan, Llama 2, Llama 3.1 (8B/70B), Llama 3.2 (1B/3B/11B/90B), Cohere Command, some Mistral
- Not supported: Claude (via Bedrock), Llama 4 (not yet confirmed)
- Process: Prepare JSONL data → Upload to S3 → Create training job → Deploy custom model
- Costs: Training (per token) + Storage (monthly fee per model) + Provisioned Throughput for inference
- Try first: Prompt engineering → RAG → Fine-tuning (in that order)
Resources
Bedrock Model Customization
Official documentation for fine-tuning and continued pre-training.
Supported Models for Fine-Tuning
Check which models support customization.