September 24, 2024 By Sascha Brodsky 3 min read

OpenAI’s latest o1-preview models tackle complex problems by breaking them into steps, potentially changing how industries approach challenges.

This approach, known as chain of thought reasoning, marks a shift from earlier AI models, which often produced answers without explaining their reasoning. It could reshape how businesses and researchers handle complex problem-solving tasks.

“These models are better at tasks requiring more logic and reasoning because they take time to think through the problem,” said IBM Distinguished Engineer Chris Hay. “It’s like they’re showing their work, step by step.”

Chain of thought reasoning

The chain of thought approach allows users to see how the AI arrives at its conclusions. Hay explained the process: “If you ask a child, for example, what’s 25 multiplied by 10 plus five, there’s three steps there. They might just throw that blurred answer. But you said, no, no, you need to break this down… it’s like in school, you’re showing your work.”
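Hay's arithmetic example can be written out as a small Python sketch. It is only an illustration of "showing your work", not a description of how o1-preview operates internally. The function name `solve_step_by_step` is hypothetical, and the sketch assumes the expression is read as (25 × 10) + 5.

```python
def solve_step_by_step(a, b, c):
    """Compute a * b + c, recording each intermediate step as text."""
    steps = []

    # First sub-problem: the multiplication.
    product = a * b
    steps.append(f"Step 1: {a} x {b} = {product}")

    # Second sub-problem: add the remaining term to the intermediate result.
    total = product + c
    steps.append(f"Step 2: {product} + {c} = {total}")

    # Final step: state the answer, so the result and the reasoning travel together.
    steps.append(f"Answer: {total}")
    return steps, total


steps, answer = solve_step_by_step(25, 10, 5)
for line in steps:
    print(line)
```

Returning the intermediate steps alongside the final value is what lets a reader check where a wrong answer came from, which is the kind of transparency described above.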

Nathalie Baracaldo, an IBM AI Security Senior Research Scientist, emphasized the significance of this development: “The main difference is related to how we can know how the model arrived at a decision. We have explanations about what the agent did that are very useful for understanding why something happened.”

This level of transparency could have far-reaching implications across industries. In software development, for instance, the models are showing improved coding abilities with fewer errors. “They’re coding better and hallucinating less,” Hay noted. Hallucinations are instances where an AI produces plausible but incorrect information.

The new models also incorporate reinforcement learning in their training process. Hay explained, “They’ve also changed the way that they are trained in the base models. They talk about how they’re using reinforcement learning… to teach and train those models.”

Human-AI collaboration

The most effective use of these advanced AI models will likely involve a partnership between human expertise and machine capability. “The human will always have to provide input, be okay with the planning, and verify these things,” Hay said.

Hay cautioned against overestimating the models’ capabilities: “I think you can get great outputs. I think when people hear the words AGI, they’re thinking of this big pulsating head in the clouds… actually, if I think about it, the models, as they are, with their next token prediction and good training data and their planning, etc., they do a pretty good job—better than humans in quite a lot of tasks.”

The development of these models raises questions about the nature of artificial intelligence and its comparison to human cognition. The new models have demonstrated remarkable prowess in certain areas—outperforming humans on standardized tests like the bar exam and SATs. Yet they still struggle with tasks that most humans find intuitive.

Hay described where the models fall short: “The model excels at specific, individual tasks. However, it currently has difficulty distinguishing between different parts of a conversation. This leads to confusion in its ability to handle multiple concepts simultaneously. The model overemphasizes context, often considering too much irrelevant information when processing requests.”

Baracaldo added a note of caution: “Even though this model is super impressive, sometimes it makes mistakes. And if you read the technical report, sometimes it creates solutions that a real expert, a human being, will think are not feasible, but the model does not know all the assumptions.”

The implications of these advancements extend beyond the tech industry. In research and academia, they might accelerate the pace of discovery by assisting in complex data analysis and hypothesis generation. In fields like medicine and law, they could serve as tools to augment human expertise, potentially leading to more accurate diagnoses or more comprehensive legal analyses.

Hay summarized the practical value of the new models for enterprises: “They are a lot better coders than they were before.”
