Deep dive into Amazon Bedrock’s core capabilities — Knowledge Bases, Agents, Guardrails, Fine-tuning, and Model Evaluation.
Model Inference
Basic usage — send prompts, get responses. Bedrock offers three inference modes:
| Mode | Description | Best For |
|---|---|---|
| On-demand | Pay per token, no commitment | Variable workloads, experimentation |
| Provisioned Throughput | Reserved capacity, consistent performance | High-volume production workloads |
| Batch Inference | Process large datasets offline | Bulk processing, lower cost |
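On-demand inference is just an API call. Below is a minimal sketch using boto3's Converse API; the region and model ID are examples, so substitute a model you have enabled access to.

```python
import boto3

# Bedrock has two clients: "bedrock" for control-plane operations and
# "bedrock-runtime" for inference. Sending prompts uses the runtime client.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# The Converse API gives a uniform request/response shape across models.
# The model ID below is only an example.
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Summarize what Amazon Bedrock is in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 256, "temperature": 0.5},
)

print(response["output"]["message"]["content"][0]["text"])
```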
Knowledge Bases (RAG)
Retrieval-Augmented Generation — ground model responses in your data.
How RAG Works
User Query → Retrieve relevant chunks from vector store →
Include in prompt → Model generates grounded response
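With a Knowledge Base in place, the whole retrieve-then-generate loop above collapses into a single RetrieveAndGenerate call. A minimal sketch, assuming a Knowledge Base ID and model ARN that you would replace with your own:

```python
import boto3

# Knowledge Base queries go through the bedrock-agent-runtime client.
agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Placeholder IDs: substitute your Knowledge Base ID and a model ARN
# you have access to.
response = agent_runtime.retrieve_and_generate(
    input={"text": "What is our refund policy for damaged items?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# The grounded answer, plus citations pointing back to the retrieved chunks.
print(response["output"]["text"])
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        print("Source:", ref.get("location"))
```

If you only need the retrieval half (to build the prompt yourself), the separate `retrieve` call returns the matching chunks without generation.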
Supported Data Sources
| Source Type | Examples |
|---|---|
| Storage | Amazon S3 |
| Web | Web crawlers |
| Enterprise | Confluence, SharePoint, Salesforce |
Supported Vector Stores
| Vector Store | Notes |
|---|---|
| Amazon OpenSearch Serverless | Fully managed, default option |
| Amazon OpenSearch Managed Cluster | Self-managed option (March 2025) |
| Amazon Aurora PostgreSQL | pgvector extension |
| Pinecone | Third-party |
| MongoDB Atlas | Third-party |
| Redis Enterprise | Third-party |
Advanced Knowledge Base Features
| Feature | Description |
|---|---|
| Automatic chunking | Parses and splits documents automatically |
| Multimodal | Process text + images in documents |
| GraphRAG | Combine RAG with graph databases (Neptune) — GA March 2025 |
| Structured data | Query data warehouses with natural language (text-to-SQL) |
| Rerank API | Re-score retrieved chunks for better relevance |
| Custom connectors | Direct ingestion without full sync |
| Data Automation | Process video content (AVI, MKV, WEBM) for knowledge bases |
Important Point: Knowledge Bases handle the entire RAG pipeline — chunking, embedding, storage, retrieval. You just provide the data.
Important Point: For RAG vector stores, OpenSearch Serverless is the default and recommended choice — it’s purpose-built for similarity search over vector embeddings. Don’t confuse it with DynamoDB (key-value) or DocumentDB (document store); Aurora (relational) qualifies only via the pgvector extension.
Agents
Build autonomous, multi-step workflows that can reason and take actions.
Agent Components
| Component | Description |
|---|---|
| Foundation Model | The LLM that powers reasoning |
| Instructions | System prompt defining agent behavior |
| Action Groups | Lambda functions or APIs the agent can call |
| Knowledge Bases | Data the agent can query |
| Guardrails | Safety policies (optional) |
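These components map fairly directly onto the control-plane API. A rough sketch with boto3 (role ARN, Lambda ARN, bucket, and IDs are placeholders; action groups and knowledge bases are attached with separate calls after the agent exists):

```python
import boto3

# Agent creation is a control-plane operation on the bedrock-agent client.
bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Foundation model + instructions define the agent's reasoning and behavior.
agent = bedrock_agent.create_agent(
    agentName="order-support-agent",
    foundationModel="anthropic.claude-3-haiku-20240307-v1:0",
    instruction="You are a support agent. Look up orders and schedule callbacks politely.",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",  # placeholder role
)
agent_id = agent["agent"]["agentId"]

# Action group: a Lambda function (plus an API schema) the agent may call.
bedrock_agent.create_agent_action_group(
    agentId=agent_id,
    agentVersion="DRAFT",
    actionGroupName="order-status",
    actionGroupExecutor={"lambda": "arn:aws:lambda:us-east-1:123456789012:function:order-status"},
    apiSchema={"s3": {"s3BucketName": "my-schemas", "s3ObjectKey": "order-status-openapi.json"}},
)

# Knowledge Base association gives the agent data to query.
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion="DRAFT",
    knowledgeBaseId="KB1234567890",
    description="Customer data the agent can look up.",
)
```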
Agent Capabilities
| Capability | Description |
|---|---|
| Multi-step reasoning | Agent plans and executes complex tasks |
| Tool use | Calls external APIs and Lambda functions |
| Memory | Maintains conversation context |
| Multi-agent collaboration | Multiple specialized agents work together |
Multi-Agent Collaboration (GA March 2025)
Enterprise-grade agent deployment:
- Multiple specialized agents work together on complex tasks
- Memory management across agent sessions
- Identity controls (IAM integration)
- Tool integration framework
- Observability and monitoring
Use Case Example: An agent that retrieves customer data (knowledge base), checks order status (API), and schedules a callback (action) — all from a single user request.
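Invoking an agent is a single runtime call; the agent handles the planning, tool calls, and knowledge base lookups behind the scenes. A sketch with placeholder agent and alias IDs:

```python
import boto3
import uuid

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Placeholder IDs: use your agent ID and a prepared alias.
# The session ID ties multi-turn conversations together (agent memory).
response = agent_runtime.invoke_agent(
    agentId="AGENT12345",
    agentAliasId="ALIAS12345",
    sessionId=str(uuid.uuid4()),
    inputText="Check the status of order 4521 and schedule a callback for tomorrow.",
)

# The response is an event stream; the answer arrives in chunks.
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")
print(answer)
```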
Guardrails
Implement responsible AI policies — filter both inputs and outputs.
Policy Types
| Policy | What It Does |
|---|---|
| Content filters | Block hate, violence, sexual content, insults, misconduct |
| Denied topics | Define topics the model should refuse (e.g., competitor info) |
| Word filters | Block specific offensive words |
| PII filtering | Detect and redact personally identifiable information |
| Prompt attacks | Detect jailbreak/injection attempts |
| Multimodal toxicity | Filter harmful image + text content |
| Automated Reasoning | Prevent hallucinations using logical verification |
Key Points
- Guardrails apply to both input and output
- Can be attached to models, agents, and knowledge bases
- Automated Reasoning checks use mathematical verification to prevent factual errors (GA August 2025)
- Natural language test Q&A generation for Automated Reasoning (November 2025)
- Multimodal toxicity can evaluate images (GA April 2025)
- 85% price reduction announced December 2024
Important Point: Know that Guardrails can filter PII, block topics, and prevent prompt injection attacks.
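Once a guardrail is created and versioned, attaching it to an inference call is one extra parameter. A sketch using the Converse API, with placeholder guardrail ID and version:

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# guardrailConfig applies the configured policies to both input and output.
# The identifier and version below are placeholders for your own guardrail.
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": "What is our competitor's pricing?"}]}],
    guardrailConfig={
        "guardrailIdentifier": "gr-abc123",
        "guardrailVersion": "1",
    },
)

# If a policy is triggered, stopReason is "guardrail_intervened" and the
# output contains the configured blocked message instead of a model answer.
print(response["stopReason"])
print(response["output"]["message"]["content"][0]["text"])
```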
Fine-Tuning & Customization
Customize models for your specific use case:
| Method | Description | When to Use |
|---|---|---|
| Prompt engineering | Craft system prompts for consistent behavior | Quick, no training needed |
| Continued Pre-training | Train on domain-specific unlabeled data | Domain adaptation (legal, medical) |
| Fine-tuning | Train on labeled examples (prompt-response pairs) | Task-specific optimization |
Fine-Tuning Process
1. Prepare training data (JSONL format)
2. Upload it to S3
3. Create a fine-tuning job in Bedrock
4. Deploy the custom model version
5. Use it via the same API
Note: Not all models support fine-tuning. Check model documentation.
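A rough sketch of the flow, assuming an IAM role with S3 access and a base model that supports fine-tuning. The JSONL line shows the common prompt/completion record format, though the exact schema varies by model; all names and ARNs are placeholders.

```python
import boto3

# Training data: one JSON object per line, uploaded to S3. Prompt/completion
# pairs are the common fine-tuning format (schema varies by model), e.g.:
# {"prompt": "Classify the ticket: 'My package never arrived'", "completion": "shipping_issue"}

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Role ARN, bucket names, and model identifier are placeholders.
bedrock.create_model_customization_job(
    jobName="ticket-classifier-ft-001",
    customModelName="ticket-classifier-v1",
    roleArn="arn:aws:iam::123456789012:role/BedrockFineTuneRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="FINE_TUNING",  # or "CONTINUED_PRE_TRAINING" for unlabeled data
    trainingDataConfig={"s3Uri": "s3://my-training-bucket/tickets/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-training-bucket/tickets/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)
```

Once the job completes, deploy the resulting custom model (typically via Provisioned Throughput) and call it through the same runtime API as any other model.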
Model Evaluation
Compare model outputs before deployment:
| Evaluation Type | Description |
|---|---|
| Automatic metrics | Accuracy, toxicity, robustness scores |
| Human evaluation | Set up human review workflows |
| LLM-as-a-judge | Use another model to evaluate outputs |
| Custom criteria | Define your own evaluation metrics |
| RAG evaluation | Evaluate retrieval + generation quality |
Use case: Compare Claude vs Llama on your specific prompts before choosing a production model.
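A quick way to get a feel for this before setting up a formal evaluation job is to run the same prompts through both candidates with the Converse API and compare the outputs side by side. A minimal sketch; the model IDs are examples.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Example model IDs; swap in whichever candidates you are comparing.
candidates = [
    "anthropic.claude-3-haiku-20240307-v1:0",
    "meta.llama3-8b-instruct-v1:0",
]
prompts = [
    "Summarize our return policy in two sentences.",
    "Draft a polite reply to a customer whose order arrived late.",
]

# Same prompt, same settings, different models: review the outputs manually
# or feed them into a formal evaluation job later.
for prompt in prompts:
    for model_id in candidates:
        response = bedrock_runtime.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={"maxTokens": 200, "temperature": 0.2},
        )
        text = response["output"]["message"]["content"][0]["text"]
        print(f"--- {model_id} ---\n{text}\n")
```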
TL;DR
| Feature | One-liner |
|---|---|
| Knowledge Bases | Managed RAG — connect your data to models |
| Agents | Multi-step workflows with tool use |
| Guardrails | Content filtering, PII, anti-hallucination |
| Fine-tuning | Customize models on your data |
| Evaluation | Compare models before deployment |
Resources
- Bedrock Knowledge Bases: documentation for RAG implementation.
- Bedrock Agents: building autonomous workflows.
- Bedrock Guardrails: implementing responsible AI policies.