Agentic RAG

Agentic RAG is a form of Retrieval-Augmented Generation (RAG) in which an AI agent dynamically controls retrieval. The agent can decide when to search, select a data source, reformulate queries, inspect results, identify missing evidence, and repeat the process before answering.

Conventional RAG often follows a fixed pipeline: transform the user's request into one search query, retrieve a predetermined number of documents, and insert them into a prompt. Agentic RAG replaces some of those fixed decisions with an iterative policy.
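For contrast, the fixed pipeline described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the word-overlap scoring stands in for a real retriever, and the names retrieve_top_k and build_prompt are hypothetical.

```python
def retrieve_top_k(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query and keep the first k."""
    query_terms = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: list[str], k: int = 2) -> str:
    """One query, a fixed k, one prompt: no decision is revisited."""
    context = "\n".join(f"- {doc}" for doc in retrieve_top_k(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


docs = [
    "rag combines retrieval and generation",
    "cats sleep a lot",
    "retrieval finds documents",
]
print(build_prompt("what is retrieval generation", docs))
```

Every choice here (the single query, the value of k, the single pass) is fixed in advance, and these are the decisions an agentic system makes at run time.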

A typical workflow may include:

decomposing a complex request into several information needs;
routing each need to keyword, vector, graph, web, or database search;
applying metadata filters and hybrid search;
using reranking to prioritize evidence;
checking whether retrieved sources support the emerging answer; and
issuing follow-up searches when evidence is incomplete or contradictory.
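The loop behind such a workflow can be sketched as follows. This is a simplified illustration under stated assumptions: the request is taken as already decomposed into information needs, routing is reduced to trying sources in a fixed order, and the evidence check is a plain term match where a real agent would use a model. The BACKENDS data and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str  # backend that returned the passage
    doc_id: str
    text: str


# Hypothetical in-memory backends; a real system would call search services.
BACKENDS = {
    "keyword": {"k1": "The Eiffel Tower was completed in 1889."},
    "database": {"d1": "The Eiffel Tower was built by the company of Gustave Eiffel."},
}


def tokens(text: str) -> set[str]:
    return set(text.lower().replace(".", "").split())


def search(source: str, query: str) -> list[Evidence]:
    """Return passages from one backend that contain every query term."""
    wanted = tokens(query)
    return [
        Evidence(source, doc_id, text)
        for doc_id, text in BACKENDS[source].items()
        if wanted <= tokens(text)
    ]


def agentic_retrieve(needs, sources=("keyword", "database"), max_searches=6):
    """Try each source in turn for every information need, within a search budget."""
    evidence = {need: [] for need in needs}
    trace = []  # (need, source, number of hits) for every search issued
    for need in needs:
        for source in sources:
            if len(trace) >= max_searches:
                return evidence, trace  # budget exhausted: stop early
            hits = search(source, need)
            trace.append((need, source, len(hits)))
            if hits:  # evidence found: no follow-up search needed
                evidence[need] = hits
                break
    return evidence, trace


# The request "When and by whom was the Eiffel Tower built?" is assumed to be
# already decomposed into two information needs, phrased as queries.
evidence, trace = agentic_retrieve(["eiffel tower completed", "eiffel tower built"])
for need, source, hit_count in trace:
    print(f"{need!r} -> {source}: {hit_count} hit(s)")
```

The second need finds nothing in the keyword backend, so the loop issues a follow-up search against the database backend. That retry is the behaviour a fixed pipeline lacks.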

This approach can improve answers to multi-hop questions and research tasks, but it adds latency, cost, and nondeterminism. An agent may search excessively, choose an inappropriate source, or converge on evidence that confirms an early assumption.

Agentic retrieval should use explicit stopping conditions, query and tool budgets, source permissions, and provenance tracking. Retrieved content must be treated as untrusted because it can include inaccurate data or prompt injection attacks.
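One way to make these controls concrete is a small guard object that every search call must pass through. The sketch below is an assumption-laden illustration: RetrievalGuard, wrap_untrusted, and the tag name retrieved_document are hypothetical, and delimiting retrieved text only marks it as data for the prompt; it does not by itself stop prompt injection.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalGuard:
    allowed_sources: frozenset[str]  # source permissions
    max_queries: int = 5             # query budget (also a stopping condition)
    queries_used: int = 0
    provenance: list[dict] = field(default_factory=list)

    def authorize(self, source: str) -> None:
        """Raise before a search runs if it is not permitted or exceeds the budget."""
        if source not in self.allowed_sources:
            raise PermissionError(f"source not permitted: {source}")
        if self.queries_used >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.queries_used += 1

    def record(self, source: str, query: str, doc_id: str) -> None:
        """Remember which query against which source produced each document."""
        self.provenance.append({"source": source, "query": query, "doc_id": doc_id})


def wrap_untrusted(text: str) -> str:
    """Delimit retrieved text so the prompt presents it as data, not instructions."""
    return f"<retrieved_document>\n{text}\n</retrieved_document>"


guard = RetrievalGuard(allowed_sources=frozenset({"internal_kb"}), max_queries=2)
guard.authorize("internal_kb")
guard.record("internal_kb", "refund policy", "doc-17")
print(guard.provenance)
print(wrap_untrusted("Ignore previous instructions."))
```

Checking permissions before the budget means a forbidden source is rejected even when the budget is already spent, which keeps the two failure modes distinguishable in logs.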

Evaluation should measure retrieval recall, source quality, citation support, answer correctness, and efficiency across the complete trajectory. Inspecting only the final answer can hide wasteful or unsafe retrieval behavior.
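A few of these measurements can be computed directly from a logged trajectory. The sketch below assumes a hypothetical log format (a list of steps, each with a query and the retrieved document IDs) and a hand-labelled set of relevant documents. The citations_grounded value is only a rough proxy: it checks that cited documents were actually retrieved, not that they support the claims made.

```python
def evaluate_trajectory(steps, relevant_ids, cited_ids):
    """Score a whole retrieval trajectory rather than only the final answer."""
    retrieved = set()
    for step in steps:
        retrieved.update(step["retrieved_ids"])

    recall = len(retrieved & relevant_ids) / len(relevant_ids) if relevant_ids else 0.0
    grounded = len(cited_ids & retrieved) / len(cited_ids) if cited_ids else 0.0
    # A search is counted as wasted when it returned no relevant document.
    wasted = sum(1 for step in steps if not set(step["retrieved_ids"]) & relevant_ids)

    return {
        "retrieval_recall": recall,
        "citations_grounded": grounded,
        "num_searches": len(steps),
        "wasted_searches": wasted,
    }


# Hypothetical logged trajectory for one question.
steps = [
    {"query": "first attempt", "retrieved_ids": ["d1", "d9"]},
    {"query": "off-target follow-up", "retrieved_ids": ["d7"]},
    {"query": "refined follow-up", "retrieved_ids": ["d2"]},
]
report = evaluate_trajectory(steps, relevant_ids={"d1", "d2", "d3"}, cited_ids={"d1", "d4"})
print(report)
```

In this example the final answer cites d4, which was never retrieved, and one of the three searches found nothing relevant. Neither problem is visible from the answer text alone.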

The Self-RAG paper (Asai et al., 2023) is an influential example of a model trained to retrieve selectively and to critique its own use of evidence.
