Traditional data management approaches store data in disparate databases, often with data duplicated across systems and with time-consuming, risky, and expensive data integration and processing. Getting reliable data without friction is key to succeeding with generative AI. Watsonx.data is a data lakehouse architecture built with open standards. It supports both traditional SQL-derived analytics and AI-driven insights with automation in a single platform, serving the needs of different data users and a broad variety of enterprise workloads.

Think 2024 announcements, now GA! 

IBM announced several upcoming updates for new and exciting capabilities in IBM® watsonx.data™ at Think 2024, our annual event that brings together over 5,000 technology pioneers and leaders. These new features are now generally available in watsonx.data.

IBM introduced a new query engine within IBM watsonx.data, Presto C++, along with an integrated query optimizer featuring enterprise-proven query compilation technology and advanced query rewrite and cost-based optimization techniques. In other words, watsonx.data has been enhanced for fast query performance at optimized costs. 

IBM watsonx.data with Presto C++ v0.286 and query optimizer on IBM Storage Fusion HCI, tested internally by IBM, was able to deliver better price performance compared to Databricks' Photon engine, with equal query runtime at less than 60% of the cost, derived from public 100 TB TPC-DS Query benchmarks.*

IBM clients can now unlock transactional mainframe data for artificial intelligence (AI) and analytics with IBM Data Gate for watsonx™ integrated with IBM watsonx.data. This revolutionizes the way organizations synchronize, analyze and build AI models from data originating on IBM Z®.  

By bringing transactional data from the mainframe into an open, governed data lakehouse such as watsonx.data, enterprises can readily build AI models to grow revenue, enhance productivity and manage costs. 

IBM Knowledge Catalog announced a Gen-AI infused semantic layer, embeddable into IBM watsonx.data. When embedded, the semantic layer generates data enrichments that enable clients to find and understand previously cryptic, structured data across their data estate in natural language through semantic search. This accelerates data discovery to unlock insights faster, without requiring SQL. 

IBM also announced enhanced integrations with IBM® Db2® database, Db2® Warehouse, IBM® Netezza® and Informix® with watsonx.data, and support for open formats such as Apache Iceberg to unify and share a single copy of data and metadata across the hybrid cloud without needing to migrate or re-catalog. With these integrations, clients can query data from their IBM databases across multiple engines to prepare data for AI. 

IBM also announced IBM Data Product Hub, a new solution for repeatable data sharing between internal data producers and data consumers, now generally available. IBM Data Product Hub users can connect to IBM watsonx.data and package relevant metadata to create a repeatable, governed data product. That data product can then deliver the right data for various AI use cases across the organization at scale, without the need for repeated, manual workflows.  

What our clients say about watsonx.data 

IBM was thrilled to highlight some of the amazing work our clients are doing with watsonx.data on stage at Think this year. Themes included scalability, data governance and management, and speed to value.  

“Our architecture was a monolithic data lake, and then we started a migration to IBM watsonx.data. Many of the things we were struggling with, we got from the platform: workload isolation, data governance, and the most important part, having the confidence that IBM watsonx.data scales really well with our growth.”   —Toomas Römer, VP of Engineering at Bolt 
“One of the things IBM does really well is scale. They have great data governance and data management strategies that are prebuilt into the platform so that we can enable every data steward, every data engineer, and every data scientist to very quickly get access to all the data that’s required.”  —Dr. Stephan Gerali, Senior Fellow at Lockheed Martin 
“IBM’s hybrid cloud and Bhuma’s real-time analytics and app platform are paving the way for us to leverage watsonx.data and IBM® watsonx.ai™ without the need to transition away from our existing AWS environment. This seamless integration will be a game-changer for us, enabling us to maintain operational efficiency, cost-effectiveness and at the same time leverage IBM and Bhuma’s tech stacks.”   —Srikanth Radhakrishnan, Senior Vice President of Engineering at Tyfone 
“Bhuma is thrilled to be an IBM watsonx.data ecosystem partner, helping our customers build real-time data apps, analytics, and AI agents delivering modern experiences with insights-actions-outcomes on a modern lakehouse. Many of our customers, such as Tyfone, are deploying IBM watsonx.data for the best price-performance for real-time queries, flexibility to deploy on-prem and in the cloud of your choice to meet security and regulation, and the ability to integrate into the watsonx ecosystem with assistants and governance.”  —Srini Gurrapu, Founder and CEO at Bhuma Inc. 
“The combination of AI and mainframe data has many clients intrigued, as many of them now have the integrated AI acceleration unit of the IBM® z16®, which they would like to exploit. I am seeing a lot of discussion and activity in individual workshops. Clients want to understand how technologies like Data Gate for watsonx can help them. It’s becoming top of mind in many mainframe shops today.” —Leonard J. Santalucia, CTO and Business Development Manager at Vicom Infinity 

Get started with IBM watsonx.data  

Try watsonx.data yourself with a free trial or book a meeting with an IBM watsonx.data product specialist. Interested in diving deeper into Think announcements? Watch the watsonx keynote.

Learn more at watsonx.data

*Based on IBM internal testing of Presto C++ 0.286 on a hyperconverged infrastructure setup with 1 master + 75 worker nodes, 1009 vCPUs, 18 TB memory, 344.8 TB of file system storage, distributed RAID and 50 GB network, compared to public Databricks 100 TB TPC-DS Query benchmarks published in 2021 with 1 master + 256 worker nodes, 2112 vCPUs, 16.1 TB memory, 528.2 TB of total storage and 10 GB network. Pricing calculations are based on IBM watsonx.data pricing as of 7 May 2024 and Databricks published pricing for Photon as of 7 May 2024. Results are based on testing conditions and pricing as of the dates shown. Actual costs and performance can vary depending on individual client configurations and conditions. Results are derived from the Databricks SQL 8.3 benchmark and as such are not comparable to published Databricks SQL 8.3 benchmark results, as results do not comply with the Databricks SQL 8.3 benchmark specification.
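
For readers who want to see how a claim of this shape is put together, the arithmetic can be sketched as follows. This is a minimal illustration, not IBM's actual methodology: the runtime and hourly rates are hypothetical placeholders chosen only to show the calculation, and the function names are invented for this example. Only the vCPU counts are taken from the footnote above.

```python
# Minimal sketch of a price-performance comparison at equal query runtime.
# All runtime and price values below are HYPOTHETICAL placeholders,
# not figures published by IBM or Databricks.

def run_cost(runtime_hours: float, hourly_rate: float) -> float:
    """Cost of one benchmark run: runtime multiplied by the cluster's hourly rate."""
    return runtime_hours * hourly_rate


def cost_ratio(cost_a: float, cost_b: float) -> float:
    """Cost of system A expressed as a fraction of the cost of system B."""
    return cost_a / cost_b


# Hypothetical inputs: both systems finish the query set in the same time.
RUNTIME_HOURS = 2.0        # assumed equal runtime for both systems
HOURLY_RATE_A = 55.0       # assumed hourly rate for system A
HOURLY_RATE_B = 100.0      # assumed hourly rate for system B

cost_a = run_cost(RUNTIME_HOURS, HOURLY_RATE_A)
cost_b = run_cost(RUNTIME_HOURS, HOURLY_RATE_B)
ratio = cost_ratio(cost_a, cost_b)

# vCPU counts quoted in the footnote, shown only to compare cluster sizes.
vcpu_ratio = 1009 / 2112

print(f"Cost of A relative to B: {ratio:.0%}")
print(f"vCPUs of A relative to B: {vcpu_ratio:.1%}")
```

With equal runtime, the cost ratio reduces to the ratio of the hourly rates, which is why the comparison depends on the pricing dates stated in the footnote.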
