Agentic RAG in Practice: Strategies for Smarter Retrieval
TL;DR
- Standard RAG struggles when query complexity increases.
- Advanced RAG adds pre/post retrieval logic plus branching strategies.
- Agentic RAG introduces an orchestrator that selects strategies and self-corrects.
- Metadata filtering, re-ranking, and hybrid search dramatically improve accuracy.
- AWS now provides notebooks, APIs, and patterns for production-grade agentic RAG.
This session centered on a consistent theme: most RAG issues aren't LLM issues—they’re retrieval issues. AWS laid out how adding structure, branching, and agentic reasoning to retrieval pipelines drastically improves quality and reduces retries.
Foundational RAG Patterns
Standard RAG is simple: embed content, chop it into chunks, retrieve the top matches, and feed them to the LLM. It's straightforward, scalable, and works great for direct, narrow queries. But as soon as questions become multi-step, ambiguous, or domain-specific, the system often falls apart.
Standard RAG Flow
- User issues a query.
- Retriever pulls chunks based on embedding similarity.
- LLM generates an answer using whatever context it was given.
The problem: every query gets exactly the same retrieval strategy.
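As a concrete baseline, the standard flow can be sketched in a few lines of Python. The character-frequency `embed` and the string-formatting `answer` below are toy stand-ins for a real embedding model and LLM call, included only to make the shape of the pipeline explicit.

```python
# Minimal sketch of standard RAG: embed, retrieve top-k by similarity,
# then generate from whatever context came back.

def embed(text: str) -> list[float]:
    # Toy embedding: character-frequency vector (real systems use a model).
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    qv = embed(query)
    ranked = sorted(chunks, key=lambda c: similarity(qv, embed(c)), reverse=True)
    return ranked[:top_k]

def answer(query: str, chunks: list[str]) -> str:
    context = "\n".join(retrieve(query, chunks))
    # A real system would call an LLM here with the query plus context.
    return f"Answer based on:\n{context}"
```

Note that nothing in this flow inspects the query: a one-word lookup and a multi-part research question take the identical path, which is exactly the limitation the rest of the session addresses.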
Advanced RAG
Advanced RAG inserts additional intelligence on both sides of retrieval:
Pre-retrieval steps
- Query rewriting
- Classification
- Applying metadata constraints
- Relevance expansion
Post-retrieval steps
- Re-ranking
- Filtering
- Merging and deduplicating
The result: a far more context-aware pipeline that can adapt retrieval to the shape of the user's question.
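Sketched in Python, with the pre-retrieval constraint expressed as a metadata filter and the post-retrieval steps as re-ranking plus deduplication. The `Doc` schema and the `rewrite_query` heuristic are illustrative assumptions, not a specific AWS API.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    domain: str   # metadata used as a pre-retrieval constraint
    score: float  # raw retriever score

def rewrite_query(query: str) -> str:
    # Placeholder: real systems use an LLM or rules to expand/normalize.
    return query.strip().lower()

def pre_filter(docs: list[Doc], domain: str) -> list[Doc]:
    # Pre-retrieval: restrict the candidate pool via metadata.
    return [d for d in docs if d.domain == domain]

def post_process(docs: list[Doc], top_k: int = 3) -> list[Doc]:
    # Post-retrieval: re-rank by score, drop duplicates, keep top-k.
    seen, out = set(), []
    for d in sorted(docs, key=lambda d: d.score, reverse=True):
        if d.text not in seen:
            seen.add(d.text)
            out.append(d)
    return out[:top_k]
```

The key design point is that both halves are independent of the retriever itself, so they can be layered onto an existing vector store without reindexing.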
Advanced RAG Techniques
Conditional Branching
This technique routes a query to the single most appropriate vector store using rules or heuristics. For example, internal HR policy questions, product documentation lookups, and code-snippet searches can each go to a separate store.
A lightweight routing step drastically improves relevance when you have multiple domains.
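A routing step can be as simple as keyword rules over the query text. The store names and keyword lists below are invented for illustration; production routers often use a small classifier or an LLM call instead of hand-written rules.

```python
# Rule-based router: pick one vector store per query.
# Store names and keywords are hypothetical examples.
ROUTES = {
    "hr_policies": ("vacation", "benefits", "payroll", "leave"),
    "product_docs": ("install", "configure", "api", "error"),
    "code_snippets": ("function", "class", "snippet", "example"),
}

def route(query: str, default: str = "product_docs") -> str:
    q = query.lower()
    for store, keywords in ROUTES.items():
        if any(k in q for k in keywords):
            return store
    return default  # fall back when no rule matches
```

Even this crude version prevents the common failure mode of HR answers being polluted by product-documentation chunks, at the cost of one cheap string scan per query.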
Parallel Branching & Retrieval Fusion
Instead of routing to just one store, the system can:
- Reformulate the query in multiple ways.
- Send each version to different vector stores (or different retrieval methods).
- Combine the results through a fusion step.
This pattern is ideal for broad questions, underspecified queries, or highly heterogeneous knowledge bases.
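One common choice for the fusion step is reciprocal rank fusion (RRF), which rewards documents that rank well across several result lists without needing comparable raw scores. The session did not prescribe a specific fusion method, so RRF here is an assumption; the constant `k = 60` is the conventional default from the RRF literature.

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    # Each document earns 1 / (k + rank) from every list it appears in;
    # documents ranked highly in multiple lists rise to the top.
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only consumes rank positions, it can merge results from heterogeneous retrievers (dense, sparse, different stores) whose scores live on incompatible scales.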
Query Reformulation in Bedrock
AWS demonstrated the RAG API’s ability to automatically generate multiple sub-queries from a single user query. Each sub-query is independently retrieved, and the system then pools and ranks the results.
The benefit is improved recall without manually authoring prompt templates or hand-tuning retrieval logic. It’s especially effective when the initial user query lacks specificity.
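The underlying pattern can be illustrated with a deliberately naive decomposition heuristic. Bedrock performs this reformulation with a model; the simple "and"-splitting below is an assumption standing in for that step, and `pooled_retrieve` shows the pooling half.

```python
def decompose(query: str) -> list[str]:
    # Naive stand-in for model-driven sub-query generation:
    # split a compound question on " and ".
    parts = [p.strip() for p in query.split(" and ") if p.strip()]
    return parts or [query]

def pooled_retrieve(query: str, retrieve) -> list[str]:
    # Retrieve each sub-query independently, then pool unique results
    # (a real system would also rank the pooled set).
    pooled: list[str] = []
    for sub in decompose(query):
        for doc in retrieve(sub):
            if doc not in pooled:
                pooled.append(doc)
    return pooled
```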
Self-Corrective Agentic RAG
This was the highlight of the session. Instead of retrieval being a fixed pipeline, you introduce a central agent that orchestrates the entire workflow.
A self-corrective loop looks like this:
- User posts a question.
- The agent retrieves context and evaluates relevance.
- The agent selects a strategy:
  - Query expansion
  - Query decomposition
  - Combined strategies
- The LLM generates a response.
- A quality check evaluates:
  - Relevance
  - Completeness
  - Factual accuracy
- If the response fails, the agent loops and adapts.
- Once it produces a satisfactory answer, or exhausts its retry budget, it finalizes.
This is the emerging canonical pattern for reliable RAG—dynamic, adaptive, and quality-aware.
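A minimal sketch of that loop, with `retrieve`, `generate`, and `check` injected as callables so any retriever, LLM, and quality gate can be plugged in. The strategy ladder (expand, then decompose, then combined) and the retry budget are assumptions about one reasonable ordering, not a prescribed sequence.

```python
def self_corrective_rag(query, retrieve, generate, check, max_attempts=3):
    # Escalate through strategies until the quality gate passes
    # or the retry budget runs out.
    strategies = ["expand", "decompose", "combined"]
    answer = None
    for attempt in range(max_attempts):
        strategy = strategies[min(attempt, len(strategies) - 1)]
        context = retrieve(query, strategy)
        answer = generate(query, context)
        if check(answer):  # relevance / completeness / accuracy gate
            return answer, attempt + 1
    return answer, max_attempts  # finalize best effort after retries
```

The important property is that failure is cheap: a rejected answer costs one more loop iteration with a different strategy, rather than surfacing a bad response to the user.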
Enhancing RAG Accuracy
The talk separated accuracy into two flows: ingestion and retrieval.
Ingestion Improvements
- Better chunking strategy (structural + semantic)
- Parsing using foundation models for accuracy
- Multimodal parsing for scanned documents or images
- Metadata labeling (critical for filtering and access control)
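For instance, a toy structural chunker that splits on paragraph boundaries and packs paragraphs into a size budget. Real pipelines layer semantic boundary detection on top and attach metadata to each chunk; the character budget here is an arbitrary illustrative choice.

```python
def chunk(text: str, max_chars: int = 200) -> list[str]:
    # Structural pass: split on blank-line paragraph boundaries,
    # then merge adjacent paragraphs until the size budget is hit.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks
```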
Retrieval Improvements
- Metadata filtering (including tenant isolation)
- Re-ranking using cross-encoders or LLM-based scoring
- Hybrid search (sparse + dense)
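Hybrid search can be sketched as a weighted sum of a sparse (keyword-overlap) score and a dense (embedding) score. The Jaccard sparse scorer, the injected `dense_score` callable, and the 0.5/0.5 weighting below are illustrative choices, not the specific implementation AWS showed.

```python
def sparse_score(query: str, doc: str) -> float:
    # Jaccard overlap of word sets: a crude stand-in for BM25.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

def hybrid_rank(query, docs, dense_score, alpha=0.5):
    # Blend dense and sparse signals; alpha trades one off against the other.
    scored = [
        (alpha * dense_score(query, d) + (1 - alpha) * sparse_score(query, d), d)
        for d in docs
    ]
    return [d for s, d in sorted(scored, reverse=True)]
```

The sparse term keeps exact keywords (IDs, error codes, product names) from being washed out by embedding similarity, which is why hybrid search tends to outperform either signal alone on technical corpora.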
A notable callout: access control for vector stores using metadata filtering with Amazon Bedrock Knowledge Bases.
Key Takeaways
- RAG failures typically arise from misaligned retrieval strategies, not bad LLMs.
- An orchestrator agent can analyze query complexity and select the appropriate retrieval strategy upfront.
- This reduces retries and significantly improves output quality.
- There is an emerging abstract RAG workflow that production teams can adopt to “right size” retrieval based on query type.
- Chunking, metadata, hybrid search, and re-ranking remain high-leverage accuracy tools.
- AWS now provides a complete notebook demonstrating self-corrective agentic RAG patterns.
Notebook reference:
https://github.com/aws-samples/amazon-bedrock-samples/tree/main/rag/knowledge-bases/use-case-examples/agentic-self-corrective-rag-kb-langraph
Further Reading & Resources
- AWS re:Invent session catalog: https://reinvent.awsevents.com