The explosion in interest and adoption of AI has led IT leaders to revisit their capacity plans. They are facing demand for compute resources at a scale that has rarely been seen before.

As businesses look to adopt AI pervasively across their organizations, both to improve operational productivity and efficiency and to create new business value, they need new ways to meet the demand for increased compute resources.

Many enterprises are realizing that to maximize the value of their AI investments, infrastructure must scale rapidly and efficiently to deliver AI insights where and when the business needs them. It must also provide the security and resiliency that mission-critical workloads demand.

In 2022, IBM introduced the groundbreaking IBM® z16™, featuring an on-chip AI inference accelerator focused on accelerating AI model execution to deliver real-time insights for the most demanding business workloads. With IBM z16, clients can process up to 3.5 million inference requests per second with 1 ms response time using a credit card fraud detection model.1

Today, we are announcing the AI Bundle for IBM Z® and LinuxONE. It builds on the success of IBM z16 and LinuxONE 4 by combining new dedicated capacity for AI workloads with a highly optimized software stack.

What is the AI Bundle for IBM Z and LinuxONE?

The AI Bundle for IBM Z and LinuxONE is dedicated AI hardware infrastructure with an optimized core software stack. It allows clients to pursue their AI journey with streamlined AI deployment on IBM Z and LinuxONE. Dedicated hardware gives enterprises control over their infrastructure and data, so they can manage processes within their organization’s data center environment and gain business insights.

Distinguishing features

With a curated suite of AI software (AI Toolkit for IBM Z and LinuxONE and IBM Cloud Pak® for Data on IBM Z and LinuxONE), clients can manage AI model lifecycles in one place, allowing for quick deployment of a wide range of use cases.

Leveraging the IBM Telum® processor with the Integrated Accelerator for AI, enterprises can run inferencing for high-volume workloads at scale. For digital currency transactions, fraud inferencing runs 85% faster when the application is colocated with Snap ML on IBM z16 than when inferencing runs remotely using Scikit-learn on a compared x86 server.2
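To illustrate the colocation idea, here is a minimal sketch of in-process batch scoring with a scikit-learn Random Forest. The model, feature count and threshold are illustrative stand-ins (the actual tests used Snap ML and the public Elliptic data set); the point is that a colocated application calls the model directly, with no network hop to a remote scoring service.

```python
# Minimal sketch: colocated (in-process) batch inference with a Random Forest
# fraud model. Synthetic features stand in for real transaction data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(seed=42)
X_train = rng.normal(size=(1000, 16))    # 16 synthetic transaction features
y_train = rng.integers(0, 2, size=1000)  # 0 = licit, 1 = fraudulent

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Colocated scoring: the application calls predict_proba() in-process,
# avoiding the serialization and network latency of a remote endpoint.
batch = rng.normal(size=(128, 16))         # batch of 128 transactions
scores = model.predict_proba(batch)[:, 1]  # fraud probability per transaction
flagged = np.where(scores > 0.5)[0]        # indices to route for review
```

In a remote-scoring setup, each `predict_proba` call would instead be an HTTP round trip to a model server, which is the overhead the colocated configuration removes.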

A wide range of use cases can be implemented with the AI Bundle software components for the latest IBM hardware platforms:

  1. Claims fraud: A state government in the US realized that its process for determining fraudulent claims was manual and labor-intensive, taking up to 40 hours per case. In support of this use case, IBM has demonstrated that on IBM z16 with the Integrated Accelerator for AI, adding in-transaction fraud detection to OLTP transactions resulted in only 2 ms of additional response time.3
  2. Clearing and settlement: A card processor explored using AI to determine which trades and/or transactions have a high-risk exposure before settlement, to reduce liability, chargebacks and costly investigations. In support of this use case, IBM has validated that IBM z16 is designed to score business transactions at scale, delivering the capacity to process up to 300 billion deep inferencing requests per day with 1 ms of latency.4
  3. Anti-money laundering (AML): A large European bank needed to introduce AML screening into its instant-payments operational flow, because its existing end-of-day AML screening was no longer sufficient under stricter regulations. In support of this use case, IBM has demonstrated that IBM z16 with the Integrated Accelerator for AI provides 4x faster response time than a compared IBM z15® when both run equivalent OLTP workloads with batched fraud detection.5

Get started today

The AI Bundle for IBM Z and LinuxONE will be generally available from IBM and certified Business Partners on 26 April 2024.

In addition, IBM offers a no-charge AI on IBM Z and LinuxONE Discovery Workshop. The workshop is a great starting point: it can help you evaluate potential use cases, define a project plan and leverage the AI Bundle for IBM Z and LinuxONE effectively. To learn more or to schedule a workshop on any of the AI use cases or products, email us at aionz@us.ibm.com.


1 DISCLAIMER: The performance result is extrapolated from IBM internal tests running local inference operations in an IBM z16 LPAR with 48 IFLs and 128 GB memory on Ubuntu 20.04 (SMT mode) using a synthetic credit card fraud detection model exploiting the Integrated Accelerator for AI. The benchmark was running with 8 parallel threads each pinned to the first core of a different chip. The lscpu command was used to identify the core-chip topology. A batch size of 128 inference operations was used. Results were also reproduced using a z/OS® V2R4 LPAR with 24 CPs and 256 GB memory on IBM z16. The same credit card fraud detection model was used. The benchmark was executed with a single thread performing inference operations. A batch size of 128 inference operations was used. Results may vary.

2 DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Scikit-learn Random Forest model with Snap ML v1.9.0 (tech preview) backend on IBM z16 and with Scikit-learn v1.0.2 backend on compared x86 server. The model was trained on the following public data set: https://www.kaggle.com/datasets/ellipticco/elliptic-data-set. BentoML v0.13.1 was used on both platforms as a model serving framework. IBM z16 configuration: Ubuntu 20.04 in an LPAR with 2 dedicated IFLs, 256 GB memory. x86 configuration: Ubuntu 20.04 on 9 IceLake Intel® Xeon® Gold CPU @ 2.80 GHz with hyperthreading turned on, 1 TB memory.

3 DISCLAIMER: Performance results were extrapolated from IBM internal tests running an OLTP workload with credit card transactions using the credit card fraud detection model on IBM z16 vs running it without credit card fraud detection. IBM z16 configuration: Ubuntu 20.04 in an LPAR with 12 dedicated IFLs, 256 GB memory and IBM FlashSystem® 9200 storage. System utilization was above 70% in both cases. Results may vary.

4 DISCLAIMER: Performance result is extrapolated from IBM internal tests running local inference operations in an IBM z16 LPAR with 48 IFLs and 128 GB memory on Ubuntu 20.04 (SMT mode) using a synthetic credit card fraud detection model (https://github.com/IBM/ai-on-z-fraud-detection) exploiting the Integrated Accelerator for AI. The benchmark was running with 8 parallel threads each pinned to the first core of a different chip. The lscpu command was used to identify the core-chip topology. A batch size of 128 inference operations was used. Results were also reproduced using a z/OS V2R4 LPAR with 24 CPs and 256 GB memory on IBM z16. The same credit card fraud detection model was used. The benchmark was executed with a single thread performing inference operations. A batch size of 128 inference operations was used. Results may vary.

5 DISCLAIMER: Performance results based on IBM internal tests running online transaction processing (OLTP) credit card workloads with in-transaction fraud detection (https://github.com/IBM/ai-on-z-fraud-detection). On IBM z16 A01 and z15 T01, both systems ran z/OS® 2.4, had 4 central processors, 8 z Systems® Integrated Information Processors (zIIPs) with simultaneous multithreading 2, and 16 GB memory. Inferencing was done in IBM Watson® Machine Learning for z/OS Online Scoring Community Edition v1.0.0 in a single IBM z/OS Container Extensions (zCX) container. zCX was version V2R4 with APAR 0A59865. The application ran in CICS v5.4 on WebSphere® Application Server Version v8.5 Liberty with Java 8.0.6.20 and IBM Enterprise COBOL for z/OS 6.2.0 P190522. The database for the application was a colocated Db2 for z/OS v12. The workload driver, JMeter, was based on an initial workload that targeted 10,000 transactions per second on z15 without fraud detection. This same driver configuration was then used with fraud detection on both systems where the 32 most recent transactions for that credit card were batched client-side for fraud detection.
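The client-side batching described in this last test can be sketched as a per-card sliding window: keep only the newest 32 transactions for each card and submit that window as one scoring batch. The names below are illustrative, not part of the actual benchmark code.

```python
# Sketch: per-card sliding window of the 32 most recent transactions,
# batched client-side before being sent for fraud scoring.
from collections import defaultdict, deque

WINDOW = 32  # most recent transactions batched per card, as in the test

history = defaultdict(lambda: deque(maxlen=WINDOW))

def record_and_batch(card_id, txn):
    """Append a transaction; return the current window as a scoring batch."""
    history[card_id].append(txn)        # deque(maxlen=...) drops the oldest
    return list(history[card_id])

# After 40 transactions, only the newest 32 remain in the batch.
for i in range(40):
    batch = record_and_batch("card-123", {"amount": float(i)})

print(len(batch))  # prints 32
```

A bounded `deque` keeps the window update O(1) per transaction, so the batching itself adds negligible client-side cost compared with the scoring call.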
