Many enterprises today are moving from generative AI (gen AI) experimentation to production deployment and scaling. Code generation and modernization are now among the top enterprise use cases that offer a clear path to value creation, cost reduction and return on investment (ROI).

IBM® Granite™ is a family of enterprise-grade models developed by IBM Research® with rigorous data governance and regulatory compliance. Granite currently supports multilingual language and code modalities. And as of the NVIDIA AI Summit in Taiwan this month, IBM Granite code models, 8b-code-instruct and 34b-code-instruct, are available as NVIDIA-hosted NIM inference microservices on the NVIDIA API catalog.

These models are optimized for higher throughput and performance, powered by NVIDIA NIM. The new availability of these models comes after IBM’s announcement of its collaboration with NVIDIA to drive enterprise gen AI adoption by pairing NVIDIA AI Enterprise software and accelerated computing with industry solutions from IBM Consulting®.

Enterprise decision makers are facing the challenge of scaling gen AI faster while mitigating foundation model-related risks. They are looking for truly enterprise-grade foundation models and software capabilities to bring trusted, performant and cost-effective generative AI to key business workflows and processes.

Based on HumanEvalPack evaluation, Granite code models can outperform some models twice their size. In that evaluation, no single model apart from Granite performed at a high level across all three tasks of code generation, fixing and explanation. The family of models was also recognized by the Stanford Transparency Index as one of the most transparent models in the industry, with a perfect score in several categories designed to measure how open models really are. In fact, since the recognition, IBM has taken further steps to enhance Granite's transparency by releasing Granite code models into open source, aimed at making coding as easy as possible for the developer community. Granite models are trained on 116 programming languages, including Python, JavaScript, Java, Go, C++ and Rust.

Granite models are available in a curated foundation model library within the IBM watsonx™ data and AI platform; on open-source platforms such as Hugging Face and GitHub; on watsonx.ai™ and RHEL AI (the new foundation model platform from Red Hat®); and now on the NVIDIA API catalog, making coding easy and accessible for as many developers as possible.

IBM Granite code models on the API catalog are planned to be offered as downloadable NIM inference microservices, designed to simplify and accelerate the deployment of AI models across GPU-accelerated workstations, data centers and cloud platforms. The flexibility to deploy on your preferred infrastructure helps keep your data private and secure. Containerized for easy deployment, NVIDIA NIM microservices deliver superior throughput to power more responses on the same infrastructure and support industry-standard APIs that can be easily incorporated into existing workflows.

Furthermore, as part of the NVIDIA AI Enterprise software platform, self-hosted NIM models include ongoing security updates and are backed by enterprise-grade support. Developers can access NIM free of charge to start testing IBM Granite code models at scale and build a proof of concept (POC) by connecting applications to the NVIDIA-hosted API endpoint running on a fully accelerated stack.
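As a rough illustration of what connecting an application to the hosted endpoint can look like, the sketch below builds a chat-completion request for a Granite code model. It assumes the endpoint follows the OpenAI-compatible `chat/completions` path on `integrate.api.nvidia.com` and uses the model ID `ibm/granite-34b-code-instruct`; check the NVIDIA API catalog for the exact URL, model names and authentication details before relying on them.

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against the NVIDIA API catalog.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_granite_request(prompt: str, api_key: str,
                          model: str = "ibm/granite-34b-code-instruct") -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for a Granite code model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # low temperature favors deterministic code output
        "max_tokens": 512,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request requires a valid NVIDIA API key ("nvapi-..."):
# req = build_granite_request("Write a Python function that reverses a string.", "nvapi-...")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the request body follows the widely used chat-completions schema, the same application code can later point at a self-hosted NIM container by swapping the base URL.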

By bringing Granite code models to the NVIDIA API catalog, IBM is enabling enterprises to easily use industry-leading models for trusted code generation and translation, GPU infrastructure, and inference management software capabilities for price-performance optimization. More Granite models will soon be available on the NVIDIA API catalog as IBM and NVIDIA continue to expand their collaboration.

Explore what Granite and NVIDIA can do for you
