The quest to create human-like thinking machines—AI systems that can reason, adapt and tackle a broad range of challenges—is dominating headlines and reshaping tech agendas. But Ruchir Puri, IBM Research’s Chief Scientist, is pushing for a different kind of breakthrough: AI that earns its keep by being useful now, not at some hypothetical point in the future.
The idea of a machine that thinks like a person has been a part of tech culture for more than half a century. In recent years, it has moved from the margins of computer science conferences to the center of public and political conversation. Artificial general intelligence (AGI) has become a catchall for both the promise and peril of modern AI.
“Every time we achieve a milestone, we tend to push that boundary forward,” says Puri, speaking of AI in an interview with IBM Think. “We thought chess was it. Then we thought it was language. Now it’s coding or reasoning. But general intelligence remains undefined.”
The phrase “artificial general intelligence” first gained traction in the 2000s as a way to distinguish between narrow AI systems—those designed to perform one task very well—and the kind of broad, adaptable intelligence seen in humans. For decades, the goal remained on the fringe, a topic of theoretical research and science fiction more than mainstream engineering.
That began to shift with the advent of deep learning in the 2010s and the emergence of foundation models, or large-scale neural networks trained on broad datasets. These models didn’t think in any meaningful sense, but they could generate convincing text, identify patterns and perform tasks that once seemed well out of reach for machines.
Recent years have accelerated those developments. GPT-4, Claude 3, Gemini and other models can now pass standardized tests, write code, summarize research papers and even explain jokes. They’re still brittle, often get answers wrong and lack any real understanding. But they blur the boundary between narrow AI and something more flexible—and to some researchers, this feels like early-stage AGI. Others caution that it’s still an illusion of competence, and that the term “AGI” remains more rhetorical than technical.
At its most basic, AGI refers to a machine that can perform any intellectual task that a human being can—learning, reasoning, problem-solving—with the same adaptability and generalization ability. Unlike current AI systems, which are narrow and highly specialized, AGI would operate with flexibility across a wide range of domains, without needing to be explicitly programmed for each one.
The past year has seen a wave of increasingly capable models that have only fueled expectations. OpenAI CEO Sam Altman has described AGI as imminent. His public statements—and the company’s strategic direction—suggest an unambiguous commitment to building machines that can reason with broad competence. Google DeepMind and Anthropic have followed suit, releasing systems optimized for more general reasoning, memory and tool use.
High-profile investors and research labs are backing the AGI vision, with some declaring it achievable within five to ten years. Policy debates have also entered the mainstream, as lawmakers and watchdogs raise concerns about how to regulate a technology that, by definition, has no historical precedent. For some, the future feels tantalizingly close; for others, dangerously premature.
Puri’s response isn’t to dismiss the ambition outright, but to reframe the question: what if the real frontier isn’t building a machine that can do everything, but one that can do a few essential things well?
Puri was instrumental in building the systems that powered IBM Watson and the watsonx.ai platform, which set benchmarks for what machines could accomplish. Today, he believes that the field needs a new organizing principle—something less mythic and more practical. That idea, which Puri refers to as artificial useful intelligence (AUI), is beginning to influence how IBM and other research institutions frame their efforts.
“I like this perspective on ‘Let’s make artificial intelligence useful,’ because if we make it useful, nobody will object to it,” he says.
Unlike AGI, which implies a machine that can match or exceed human intelligence across all domains, AUI aims lower—but with a potentially more immediate payoff. It’s not about creating something that can write symphonies or replace philosophers. It’s about building tools that work reliably in clearly defined contexts, such as code copilots, compliance checkers and workflow optimizers.
Puri spoke to me from his office at IBM’s Thomas J. Watson Research Center in Yorktown Heights, New York. The building—an expansive, modernist structure nestled in the wooded hills about an hour north of Manhattan—has the quiet gravity of a place where scientific developments happen just out of view. Visitors pass blue nameplates etched with mathematical formulas and whiteboards marked with half-resolved equations. Designed by modernist architect Eero Saarinen in the early 1960s, the lab has been home to some of IBM’s most famous breakthroughs. Puri has worked here for decades.
With its long hallways and forest views, it’s the kind of place where long-range thinking feels not only possible, but expected—a setting well-suited to the way Puri approaches his work. As he tells it, his shift toward AUI isn’t a pivot so much as a continuation of the questions he’s been pursuing for years. He has long gravitated toward systems that show practical value at scale, such as AI tools for software development, which he describes as “surprisingly useful”—not just for experts, but also for beginners learning to code. That kind of broad accessibility, he argues, reflects the real promise of AI.
And when models are embedded in real-world operations, Puri says, the stakes are high. In banking, transportation and healthcare, for example, small errors can cascade quickly.
“I think the public got the wrong idea from all the flashy demos,” Puri says. “ChatGPT, Midjourney, Sora—all amazing. But in the enterprise world, you can’t hallucinate a number or misread a regulation. It’s got to be right.”
That concern has prompted IBM to adopt a hybrid model. Rather than relying solely on probabilistic systems, the company pairs models with verified tools, such as calculators, search APIs and structured data validators, to minimize risk. These approaches do not promise human-like cognition. They aim for consistency. The goal is to ensure that AI systems can be trusted in high-stakes environments, where accuracy and auditability matter more than creativity or conversational flair.
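A minimal sketch of that pairing pattern, in Python, can make the idea concrete. The helper names here (call_llm, safe_calculate, validate_record) are illustrative assumptions, not IBM's actual implementation: the model only proposes an arithmetic expression and a draft record, while a deterministic calculator produces the final number and a validator checks the structure before anything moves downstream.

```python
# Sketch of the "model + verified tools" pattern: the model proposes,
# deterministic tools dispose. All names are hypothetical, and the model
# call is stubbed out so the example is self-contained.
import ast
import operator

# Deterministic calculator: evaluate only +, -, *, / over numeric literals,
# so no figure in the final answer ever comes from the model itself.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_calculate(expression: str) -> float:
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

def validate_record(record: dict, required: dict) -> list[str]:
    # Structured-data validator: flag missing fields or wrong types
    # before a model-produced record is accepted.
    problems = []
    for field, expected_type in required.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}: {type(record[field]).__name__}")
    return problems

def call_llm(prompt: str) -> dict:
    # Stub standing in for any chat model. In the hybrid pattern the model
    # drafts an expression and a record; it never supplies the final numbers.
    return {"expression": "1250 * 0.0375",
            "record": {"amount": 46.875, "currency": "USD"}}

if __name__ == "__main__":
    draft = call_llm("What is the annual interest on a $1,250 deposit at 3.75%?")
    interest = safe_calculate(draft["expression"])  # number comes from the tool
    issues = validate_record(draft["record"], {"amount": float, "currency": str})
    print(f"interest = {interest:.2f}", "| validation:", issues or "ok")
```

The design choice is the point: auditability comes from the fact that the arithmetic and the schema check are ordinary, inspectable code, while the model is confined to the parts of the task where language handling genuinely helps.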
“There are use cases where creativity is an asset,” he says, “and others where it’s a liability.”
Puri references IBM’s experience in software development and automation as a proof of concept. “It has become surprisingly useful,” he says of AI. “From developers finding it useful in their day-to-day lives, to students who are learning to code and do things, finding help from automated coding agents as well—it has become not just useful for experts, but for people across the board.”
He recalls that some researchers initially doubted machines could handle software development, a task often seen as highly creative and complex. “It was thought that it was a very, very hard task,” he says. But thanks to LLMs like ChatGPT and in-house specialized models, that assumption didn’t hold. “That was surprising—not just to me, but to many people.”
The bigger implication, Puri says, is that since “every enterprise is a software enterprise now,” the usefulness of AI in coding could translate into broad relevance across industries.
For all the noise surrounding AI—its promise, its pitfalls, its mythology—Puri says he often returns to a quieter idea: trust. Not just in the ethical sense, but in the mechanical one. Can the system do what it claims to do? Can it be relied upon, even when no one is watching?
He’s referring to the proliferation of open-source models—Meta’s Llama, Mistral’s mixture-of-experts (MoE) systems, IBM’s own Granite line—and the broader shift toward transparent, collaborative research. These systems aren’t built for spectacle. They’re meant to be tested, tuned and challenged. They allow for scrutiny, and they invite modification.
IBM, for example, is pitching its Granite models not as general-purpose minds. Instead, the models are tools, developed with enterprise customers in mind and optimized to handle specific tasks in finance, logistics and other highly structured domains. Puri emphasizes their intentional focus and targeted utility.
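That openness is easy to see in practice. The short sketch below shows how an openly published model in the Granite line might be pulled down and queried locally with the Hugging Face transformers library; the model identifier is an assumption for illustration, and the names actually published on the hub may differ, so treat this as a pattern rather than a prescribed setup.

```python
# Illustrative only: loading an openly published model for local inspection.
# The model ID below is an assumption; check the ibm-granite organization on
# Hugging Face for the identifiers that are actually published.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.0-2b-instruct"  # assumed name, may differ
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user",
             "content": "Extract the invoice number and total from: Invoice 4417, total $982.50."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt itself.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The output matters less than the access: anyone can run the same prompt, inspect the weights, fine-tune the model or challenge its behavior, which is the kind of scrutiny Puri has in mind.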
“These models aren’t designed to replace humans,” he says. “They’re designed to help people do more.”
Rather than imitating human intelligence wholesale, he stresses, the real test is whether AI delivers outcomes that matter to people and institutions—consistently, predictably and on a large scale.
Looking ahead, Puri sees the future of AI as something that will be shaped less by top-down breakthroughs and more by broad, iterative progress, driven by collaboration, experimentation and shared infrastructure.
Notably, he believes the most meaningful advances won’t come from sealed labs or secret demos, but from communities building in the open. As he puts it: “This is one of the first sets of technologies where innovation is taking place truly in the open.”