Agentic RAG turns AI into a smarter digital sleuth

10 December 2024

Author

Sascha Brodsky

Tech Reporter, IBM

When generative AI burst onto the scene, its ability to answer questions and create content was lauded as transformative. However, enterprises quickly realized that the technology had limitations, including hallucinations, a lack of explainability and an inability to act autonomously on retrieved information. Enter agentic retrieval-augmented generation (agentic RAG), a new AI development that promises to address these challenges while unlocking new business applications.

“Content-grounded question and answering is the most popular use case for generative AI in enterprises,” says Maryam Ashoori, a Senior Director of Product Management for IBM watsonx who has been closely involved in developing and refining agentic RAG workflows. “Imagine you’re a customer who just bought a camera and needs help troubleshooting. Instead of relying on the model’s internal knowledge—which might be incomplete or inaccurate—the AI retrieves information from user manuals and other relevant documents to answer your question.”

This process, known as retrieval-augmented generation (RAG), helps minimize one of generative AI’s most persistent problems: hallucination. “If the model doesn’t find anything relevant, it can simply say, ‘I don’t know,’ instead of generating a potentially misleading answer,” Ashoori says.
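
To make the idea concrete, here is a minimal Python sketch of the retrieve-then-answer pattern with an "I don't know" fallback. It is an illustration only, not IBM's implementation: the two-sentence manual, the stopword list, the word-overlap scoring and the `min_overlap` threshold are all assumptions, and a production system would use a real document index and pass the retrieved passage to a language model.

```python
import re

# Hypothetical mini "manual"; a real system would index actual documents.
MANUAL = [
    "If the camera will not power on, charge the battery for two hours.",
    "To format the memory card, open Settings and choose Format Card.",
]

# Small illustrative stopword list so common words do not count as matches.
STOPWORDS = {"a", "the", "to", "if", "i", "do", "how", "is", "and", "for", "my", "of", "what"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question, passages, min_overlap=2):
    """Return the passage sharing the most words with the question, or None."""
    question_words = tokenize(question)
    best, best_score = None, 0
    for passage in passages:
        score = len(question_words & tokenize(passage))
        if score > best_score:
            best, best_score = passage, score
    # Below the (assumed) threshold, treat the search as having found nothing.
    return best if best_score >= min_overlap else None


def answer(question, passages=MANUAL):
    """Answer from retrieved text only; refuse when nothing relevant is found."""
    passage = retrieve(question, passages)
    if passage is None:
        return "I don't know."
    # A real pipeline would hand the passage to a language model as context.
    return f"According to the manual: {passage}"


print(answer("How do I format the memory card?"))
print(answer("What is the warranty period?"))
```

The second question shares no meaningful words with the manual, so the sketch declines to answer rather than guessing.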

Beyond retrieval: Adding agency

While RAG has been widely adopted for enterprise use, its capabilities are inherently limited to information retrieval and generation. The addition of “agency”—the ability to plan, reason and take actions autonomously—represents a significant leap forward.

“The true definition of an agent is an intelligent system that can reason, plan and act,” Ashoori says. “For example, say a customer’s troubleshooting query involves multiple steps. An agentic system can break the problem down, retrieve relevant information and even perform web searches or database queries if the initial resources don’t suffice.”

To illustrate the power of agentic RAG, Ashoori offers a hypothetical scenario: “Imagine a customer says, ‘My camera isn’t taking pictures.’ The agent might first search the user manual for troubleshooting steps. If the answer isn’t there, it could then perform a web search or query a database to find a solution. This kind of reasoning and action loop is what sets agentic RAG apart.”
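
The camera scenario can be sketched in a few lines of Python. This is a deliberately simplified, fixed-order fallback and not a description of any IBM product: the three tool functions, their canned data and the `run_agent` name are hypothetical, and a real agent would typically let a language model decide which tool to call next.

```python
def _lookup(table, query):
    """Return the first entry whose keyword appears in the query, else None."""
    q = query.lower()
    for keyword, text in table.items():
        if keyword in q:
            return text
    return None


# Hypothetical tools with canned data; each returns an answer or None.
def search_manual(query):
    return _lookup({"battery": "Charge the battery for two hours."}, query)


def search_web(query):
    return _lookup({"pictures": "Check that the memory card is not full or locked."}, query)


def query_database(query):
    return None  # stands in for a database that has no matching record


TOOLS = [("manual", search_manual), ("web", search_web), ("database", query_database)]


def run_agent(query, tools=TOOLS):
    """Try each tool in order and stop at the first one that finds an answer."""
    tried = []
    for name, tool in tools:
        tried.append(name)
        result = tool(query)
        if result is not None:
            return {"answer": result, "source": name, "tried": tried}
    return {"answer": "I don't know.", "source": None, "tried": tried}


outcome = run_agent("My camera isn't taking pictures")
print(outcome["source"], "->", outcome["answer"])
```

Here the manual has nothing on the problem, so the sketch moves on to the web search, which supplies the answer.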

Agentic RAG still stumbles, with room to grow

Despite its promise, agentic RAG has its challenges. Hallucination remains a risk, albeit reduced. “Even with the additional tools and feedback loops, we can’t guarantee that the model won’t hallucinate,” Ashoori says. “However, by incorporating mechanisms like confidence thresholds and citation requirements, we’re able to minimize the risk.”

Other potential hurdles are mostly related to the autonomy agentic RAG grants to AI systems. “In traditional RAG workflows, everything happens within a closed system,” Ashoori says. “But with agents, you’re allowing the AI to autonomously interact with external tools and data sources. That raises questions about data security and access control.”

For example, an agent tasked with retrieving information from a database must be restricted to the datasets it is authorized to access. “You also need to control what actions the agent can take,” Ashoori says. “It’s not enough to allow access; you have to specify whether the agent can retrieve, edit or delete data. Otherwise, you risk creating a system that could inadvertently cause harm.”
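
One way to picture this kind of control is a policy table that lists, per agent and per dataset, exactly which actions are permitted. The Python sketch below is an assumption-laden illustration: the agent name, the dataset names and the `POLICY` structure are invented for the example, and real deployments would rely on the access controls of the underlying data platform.

```python
from enum import Enum


class Action(Enum):
    RETRIEVE = "retrieve"
    EDIT = "edit"
    DELETE = "delete"


# Hypothetical policy: agent -> dataset -> set of permitted actions.
POLICY = {
    "support_agent": {
        "product_manuals": {Action.RETRIEVE},
        "support_tickets": {Action.RETRIEVE, Action.EDIT},
    }
}


def is_allowed(agent, dataset, action, policy=POLICY):
    """Deny by default: anything not listed in the policy is refused."""
    return action in policy.get(agent, {}).get(dataset, set())


def perform(agent, dataset, action):
    """Check the policy before acting; raise if the action is not permitted."""
    if not is_allowed(agent, dataset, action):
        raise PermissionError(f"{agent} may not {action.value} {dataset}")
    return f"{agent} performed {action.value} on {dataset}"


print(perform("support_agent", "product_manuals", Action.RETRIEVE))
try:
    perform("support_agent", "product_manuals", Action.DELETE)
except PermissionError as err:
    print("Blocked:", err)
```

The deny-by-default check means an agent gains no ability, including deletion, unless someone has explicitly granted it.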

Explainability is another critical issue. Large language models (LLMs) often produce outputs that are difficult to trace back to their origins. In contrast, agentic RAG offers greater transparency.

“With agents, you can have a chance to observe the behavior of the agent and trace every action,” Ashoori says. “You know whether the information came from a document search, a web search or a database query. This level of observability is crucial for enterprises that need to ensure compliance and accountability.”

A myriad of possibilities

While still an emerging technology, agentic RAG is already finding its way into enterprise workflows. In addition to customer service applications, Ashoori highlights business process automation as a key use case. Agentic RAG’s combination of reasoning capabilities and action tools enables it to handle complex, multi-step workflows. “Imagine automating the processing of loan applications or supply chain queries,” Ashoori says. “The agent can analyze the problem, retrieve the necessary data and even execute predefined actions to resolve it.”

As companies explore the possibilities of agentic RAG, IBM is focusing on building trust and control into these systems, Ashoori says. “Observability and explainability are non-negotiable,” she explains. “Enterprises need to see what the agent is doing at every step and have the ability to intervene if something goes wrong.”
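
Seeing what an agent does at every step can be as simple as recording each tool call in order. The following Python sketch shows one possible shape for such a trace. The `Step` and `Trace` classes and the tool names are hypothetical and are not taken from any IBM product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    tool: str    # e.g. "document_search", "web_search", "database_query"
    query: str   # what the agent asked the tool
    found: bool  # whether the tool returned anything useful


@dataclass
class Trace:
    steps: List[Step] = field(default_factory=list)

    def record(self, tool: str, query: str, found: bool) -> None:
        """Append one tool call to the trace, preserving order."""
        self.steps.append(Step(tool, query, found))

    def answer_source(self) -> Optional[str]:
        """Return the first tool that produced a hit, or None if none did."""
        for step in self.steps:
            if step.found:
                return step.tool
        return None

    def summary(self) -> List[str]:
        """Return a numbered, human-readable line for each recorded step."""
        return [
            f"{i}. {s.tool}: {'hit' if s.found else 'miss'}"
            for i, s in enumerate(self.steps, 1)
        ]


trace = Trace()
trace.record("document_search", "camera not taking pictures", found=False)
trace.record("web_search", "camera not taking pictures", found=True)
print("\n".join(trace.summary()))
print("Answer came from:", trace.answer_source())
```

A reviewer reading the trace can see both the failed document search and the web search that supplied the answer, which is the kind of record a human would need before deciding whether to intervene.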

For Ashoori, the future of AI lies in striking the right balance between autonomy and oversight. “We’re just scratching the surface of what’s possible with agentic RAG,” she says. “But with the right guardrails in place, this technology has the potential to transform the way businesses operate.”