The explosion of the term “agentic AI” in the media might make you wonder: is it simply a new marketing phrase? After all, the term “agent” (or “virtual agent”) has long been used in artificial intelligence (AI) to refer to an automated helper, such as chatbots for customer service or HR functions.
More broadly, in robotic process automation, a software robot could act as an agent for a user or application to run specific tasks. Use cases include supply chain management, payroll processing and database administration, to name a few. Microservices can also be considered multiagent systems.
We’ve had agents for as long as we’ve had AI. What’s new is that agentic AI is now defined by its dependence on generative AI. Its buzz stems from the potential to extend the emergent, probabilistic abilities of large language models (LLMs) to new action spaces. While generative AI creates answers or content, agentic AI autonomously acts upon those answers. These capabilities raise new ethical risks and governance concerns, which we’ll explore.
The benefits of agentic AI are compelling. “Agentic AI gets us closer to the use cases that we, until recently, thought of as science fiction, where machines can complete complex tasks involving complex workflows, data-driven decision-making and action-taking with minimal human intervention,” writes IBM’s Cole Stryker.
AI agents can analyze the tools at their disposal, call on AI assistants to gather additional information and guide users through the resulting steps. Multimodal AI is also accelerating agentic adoption by increasing the ability of agentic systems to analyze the world across various mediums, project multiple steps ahead and act on the user’s behalf.
In traditional multiagent architectures, humans design the control logic to solve clearly defined problems. By contrast, AI agents (or LLM agents) use control logic created by generative AI, with the LLM acting as an orchestrator or coordinator. The new IBM® Granite® 3.0 8B Instruct model exemplifies this approach by supporting agentic use cases requiring tool-calling, and IBM watsonx Orchestrate™ extends these capabilities to support agentic functionality for human resources and chat functions.
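The orchestration pattern described above can be sketched in miniature. This is a minimal, illustrative example of tool-calling dispatch, not IBM's implementation: the tool name and function here are hypothetical, and a real system would use the structured tool-call format emitted by the model.

```python
import json

# Hypothetical tool registry: the LLM selects a tool by name and the
# orchestrator dispatches the call. All names here are illustrative.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub standing in for a real API call

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A tool-calling model might emit a structured call like this:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)  # Sunny in Paris
```

The LLM never executes anything itself; it only produces structured requests, which the orchestrator validates and routes to real functions.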
Fully realized agentic AI systems have the autonomous capability to design workflows and use tools to solve complex problems—whether anticipated, loosely defined or unforeseen. Through application programming interfaces (APIs), these tools can interact with external environments while also drawing upon the data on which their models were trained.
“Agentic systems have a notion of planning, loops, reflection and other control structures that heavily use the model’s inherent reasoning capabilities to accomplish a task end-to-end,” write IBM researchers. “Paired with the ability to use tools, plug-ins and function calling, agents are empowered to do more general-purpose work.”
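The planning, loops and reflection the researchers describe can be sketched as a single control loop. This is a simplified sketch under stated assumptions: `llm` stands in for any text-generation call, and the plan/act/reflect protocol shown here is illustrative rather than any specific framework's API.

```python
# Minimal sketch of an agentic control loop (plan -> act -> reflect).
# `llm` and the dict-shaped plan it returns are assumptions for illustration.
def run_agent(goal: str, llm, tools: dict, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):            # loop: bounded iterations
        plan = llm("plan", history)       # plan: decide the next action
        if plan["action"] == "finish":
            return plan["answer"]
        # act: call a tool with the model-chosen arguments
        observation = tools[plan["action"]](**plan["args"])
        history.append(f"Did {plan['action']}, saw: {observation}")
        # reflect: ask the model to critique progress before the next step
        history.append(llm("reflect", history))
    return "Stopped: step budget exhausted"
```

The `max_steps` bound is the kind of control structure that keeps an agent's autonomy from becoming an unbounded loop.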
Consider some use cases in the financial services sector: agentic AI systems could autonomously optimize client communications and tailor engagement strategies. They could assess creditworthiness, customize loan offerings and autonomously manage high-risk accounts. They could also track real-time market threats and recommend risk mitigation. Such systems might be able to greatly improve productivity, but only if concerns of safety and observability are addressed.
Agentic AI amplifies all the risks that apply to traditional AI, predictive AI and generative AI: greater agency means more autonomy, and therefore less human oversight.
These risks must be addressed both through technological means and through human accountability for testing and outcomes. A robust operational framework for governance and lifecycle management is required.
“Giving LLMs more freedom to interact with the outside world has the potential to magnify their risks,” says Maya Murad, technical product manager at IBM Research®. “So many things could go wrong when you give an agent the power to both create and then run code as part of the path to answering a query. You could end up deleting your entire file system or outputting proprietary information.”
“Agents have specifically been shown to be ‘less robust, prone to more harmful behaviors and capable of generating stealthier content than LLMs, highlighting significant safety challenges,’” write IBM researchers, citing recent research.
Murad recommends limiting these risks by executing code in a secure sandbox, installing security guardrails and performing offensive security research through adversarial simulations, malware analysis and red-teaming. Company policies about how and where to share data must also be enforced.
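The sandboxing Murad recommends can be illustrated at its simplest: run model-generated code in a separate process with a hard timeout. This is only a sketch of the pattern; a production sandbox would add container isolation, resource limits and network restrictions on top of it.

```python
import os
import subprocess
import sys
import tempfile

def run_untrusted(code: str, timeout_s: int = 5) -> str:
    """Run model-generated code in a separate process with a hard timeout.

    Illustrative only: a real deployment would layer on container or
    seccomp sandboxing, resource limits and network isolation.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        out = subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode, ignores user env
            capture_output=True, text=True, timeout=timeout_s,
        )
        return out.stdout
    finally:
        os.unlink(path)  # never leave generated code on disk

print(run_untrusted("print(2 + 2)"))  # prints 4
```

Even this thin layer prevents the failure Murad describes, where generated code runs with the agent's own privileges against the host file system.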
New strategies are emerging to test and monitor agentic AI systems, including adversarial attack templates and tracking methods such as guardian agents. However, these strategies must be evaluated to align with your organizational governance on ethics, security and risk.
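The guardian-agent pattern mentioned above can be sketched as a second checker that screens every proposed action before it executes. The blocklist here is a stand-in assumption for what would, in practice, be a policy model or rule engine tuned to your organization's governance requirements.

```python
# Illustrative guardian-agent pattern: a separate checker vets each
# proposed action before execution. The patterns below are placeholders
# for a real policy model or rule engine.
BLOCKED_PATTERNS = ["rm -rf", "drop table", "password"]

def guardian_approves(proposed_action: str) -> bool:
    """Return False if the action matches any blocked pattern."""
    action = proposed_action.lower()
    return not any(p in action for p in BLOCKED_PATTERNS)

def execute_with_guardian(action: str, executor) -> str:
    """Route an agent's action through the guardian before running it."""
    if not guardian_approves(action):
        return "Blocked by guardian agent"
    return executor(action)
```

Keeping the guardian separate from the acting agent means a single compromised or misaligned model cannot both propose and approve a harmful action.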
Automated AI governance, which spans development, deployment and operation, helps ensure that AI models stay aligned to their intents. Flexible, end-to-end toolkits such as IBM watsonx.governance™ can accelerate responsible workflows and aid in regulatory compliance.
Agentic AI is advancing so quickly that organizations might have difficulty finding precedents or best practices for minimizing harms. As it has the potential to magnify the impact of biased data or algorithms, organizations must take the ethical lead and carefully develop organizational AI governance frameworks alongside automated AI governance.
As we’ve written, the groundwork for all AI governance is human-centered. Organizational governance includes defining processes for AI model intake and inventory. It also involves managing and maintaining employee communication and literacy programs and designating accountable leaders to oversee governance and stay updated on evolving regulations.
Effectively and responsibly operationalizing AI governance requires certain steps, and particular attention must be given to considerations unique to agentic AI. These steps and considerations are:
Accountability for the impacts of agentic AI systems spans LLM creators, model adapters, deployers and AI application users. Best practices for safety are emerging, which include:
Agentic architecture multiplies both the opportunities and the risks of generative AI. In the face of this complexity, pilot programs of agentic AI should involve your organization’s lowest-risk use cases.
Comprehensive risk exploration demands a structured approach. Last year, MIT researchers created the AI Risk Repository, a database of more than 700 risks cited in AI literature. It identifies and categorizes risks drawn from 43 AI frameworks produced by research, industry and government organizations. It includes useful taxonomies for classifying how, when and why risks occur, and organizes them across seven domains, with subdomains within each:
The infancy of agentic AI governance is reflected in a note from the repository’s authors:
“Several areas of risk seem underexplored… [AI agents are] explored in [only] two included documents. Agentic AI may be particularly important to consider as it presents new classes of risks associated with the possession and use of dangerous capabilities, such as recursive self-improvement…”
Participants in risk exploration exercises are advised to familiarize themselves with such taxonomies and risk frameworks. They should also be reassured that their real-world experiences and perceptions of risks might be valid, given the paucity of risk research across certain domains.
Organizations aiming to realize the enormous potential of agentic AI will require thoughtful investment and planning. Consider that generative AI projects have taken years to achieve broad success, and many never do. According to Gartner, “At least 30% of generative AI projects will be abandoned after proof of concept by the end of 2025, due to poor data quality, inadequate risk controls, escalating costs or unclear business value.”2
How can your organization beat the odds in this next phase of AI? A Gartner report states that “Generative AI consulting and implementation services accelerate outcomes that are measurable, derisked, democratized and specific to the organization that is buying them.”3
The report makes the following recommendations:
We believe that IBM Consulting® meets all these criteria. It helps clients implement AI through thoughtful, informed approaches to strategy, data, architecture, security and governance. If your organization is looking to explore or implement an agentic AI solution, our experts can help you mitigate risk and bolster its business value.
Easily design scalable AI assistants and agents, automate repetitive tasks and simplify complex processes with IBM® watsonx Orchestrate™.
Create breakthrough productivity with one of the industry's most comprehensive sets of capabilities for helping businesses build, customize and manage AI agents and assistants.
Achieve over 90% cost savings with Granite's smaller and open models, designed for developer efficiency. These enterprise-ready models deliver exceptional performance against safety benchmarks and across a wide range of enterprise tasks, from cybersecurity to RAG.