The powerful, cost-effective and open AI accelerator for generative AI workloads.
Unlock, innovate and deploy new AI solutions with Intel® Gaudi® 3 AI accelerators on IBM Cloud®—designed to help you cost-effectively scale for enterprise AI demands with high-performance, flexibility in deployment and open development.
Support a broad range of generative AI inferencing applications and frameworks, including large language models (LLM) and multi-modal models (MMM). Start quickly with IBM Cloud Virtual Servers for VPC (with or without Red Hat Linux AI), Red Hat OpenShift on IBM Cloud, Intel Deployable Architectures, and IBM watsonx® via bring-your-own-license or on-premises deployment.
Intel® Gaudi® 3 AI accelerators are paired with 5th Gen Intel® Xeon® processors on IBM Cloud Virtual Servers for VPC.
Intel Gaudi 3 AI accelerators can be deployed through IBM Cloud Virtual Servers for VPC cloud instances. IBM Cloud VPC is designed for high resiliency and security inside a software-defined network where clients can build isolated private clouds while maintaining essential public cloud benefits. The Intel Gaudi 3 cloud instance, which also supports Red Hat Enterprise Linux AI images, is ideal for clients with highly specialized software stacks, or those who require full control over their underlying server.
Leverage a fully managed, containerized infrastructure designed for cloud-native security and scale with Red Hat OpenShift clusters. Intel Gaudi 3 AI Accelerators are available on IBM Cloud Virtual Servers for VPC, clusters v4.18 with Red Hat CoreOS – ideal for cluster management, load balancing, automation and orchestration.
Run and scale a low-cost generative AI inferencing solution with standards-based APIs. Intel AI for Enterprise Inference is a deployable architecture that pairs Intel Gaudi 3 and Intel Xeon performance with OpenAI-compatible APIs to help maintain integration with your existing applications.
Intel® Gaudi® 3 AI accelerators on IBM Cloud are designed for high-performance AI workloads, featuring 64 Tensor Processor Cores (TPCs) and eight Matrix Multiplication Engines (MMEs) to help accelerate deep neural network computations. Intel® Gaudi® 3 AI accelerators on IBM Cloud are also equipped with 128 GB of HBM2E memory and offer up to 3.7 TB/s of memory bandwidth, and support industry-standard Ethernet networking with 24x200 GbE ports, providing 9.6 Tbps of bi-directional bandwidth for scalable system interconnectivity.
Intel® Gaudi® 3 AI accelerators deliver broad AI application support, including inferencing, 3D generation, text generation, classification, video generation, sentiment, translation, image generation, summarization, and Q&A – with focus on multi-modal, large language modals (LLM), and retrieval-augmented generation (RAG).
With 128 GB of HBM2E memory and up to 3.7 TB/s of memory bandwidth, Intel® Gaudi® 3 AI accelerators on IBM Cloud help ensure fast data throughput, reducing bottlenecks and enabling developers to process massive datasets more quickly and efficiently.
Intel® Gaudi® 3 AI accelerators on IBM Cloud are fitted within IBM Cloud Virtual Servers on the IBM Cloud Virtual Private Cloud (VPC). The IBM Cloud VPC is a highly resilient and highly secure software-defined network (SDN) on which you can build isolated private clouds while maintaining essential public cloud benefits. The Intel® Gaudi® 3 virtual server profile on IBM Cloud VPC is a pre-configured combination of vCPU, RAM, and storage to quickly to start a virtual server instance.
Intel® Gaudi® 3 AI
accelerators on IBM Cloud support popular frameworks, including,
PyTorch, ONNX, and DeepSpeed. Over 400k models are available on Hugging Face, optimized for use with the
Optimum Habana software library. The full Intel® Gaudi®
software suite and framework support is designed to facilitate easy migration,
enabling developers to integrate existing models with minimal code changes.