
Introducing Qiskit Code Assistant

A new assistant for writing quantum code is now available as a preview for IBM Quantum™ Premium Plan users.


9 Oct 2024

Daniella Garcia Almeida

Juan Cruz-Benito

Robert Davis

To bring useful quantum computing to the world, it isn’t enough to build the world’s most performant quantum software, or its largest fleet of utility-scale quantum processors. We also have to empower users to make efficient and effective use of the tools we’ve created. With the release of Qiskit Code Assistant, now available as a private preview via the IBM Quantum Premium Plan, we’re doing exactly that.

Qiskit Code Assistant combines the sophisticated large language models (LLMs) of IBM® watsonx™ with the collective expertise of Qiskit users all across the quantum community to help you write better Qiskit code with less effort. Its quantum code-generation capabilities serve not only to make quantum computing more accessible and efficient, but also to provide users with a new, hands-on way of learning to write Qiskit code.

Our hope is that Qiskit Code Assistant will open up the world of quantum computing, helping users learn to write better code, simplify their development process, optimize their quantum programs to generate better quantum circuits, and finish their projects more quickly. Users on the IBM Quantum Premium Plan who wish to get started with the Qiskit Code Assistant right away can do so by reviewing its documentation.

We first previewed the Qiskit Code Assistant at IBM Quantum Summit 2023, alongside other AI-enhanced quantum software capabilities. That presentation also gave viewers an early look at the Qiskit HumanEval evaluation benchmark we developed for assessing the performance of the generative AI models we trained to produce quantum code. Since then, our testing with the Qiskit HumanEval benchmark has indicated that the model powering Qiskit Code Assistant is the best available for writing usable, high-quality Qiskit code.

We believe classical AI tools like Qiskit Code Assistant will play a crucial role in the evolution of the quantum software stack. In the future, we plan to release key components of the Qiskit Code Assistant project as open source, including the Qiskit Granite model upon which Qiskit Code Assistant is based, and the Qiskit HumanEval dataset. We hope this will encourage others in the quantum community to collaborate with us in making these tools even better.

In the meantime, we’ve been working to make Qiskit Code Assistant more performant, more accessible, and easier to use. Earlier this year, we invited participants in the 2024 IBM Quantum Challenge to try out the code assistant as part of the Challenge labs, and collaborated with volunteers across the IBM organization to test its safety and security. We also wrote two papers detailing our work developing the AI model that powers Qiskit Code Assistant and the Qiskit HumanEval benchmarking method.

This blog post will delve into some of the most important takeaways from each of those papers. Before we do that, however, let’s take a closer look at what the Qiskit Code Assistant project can do for you, and how Qiskit users can get started with it.

Getting started with Qiskit Code Assistant

Qiskit Code Assistant integrates with popular development environments like Visual Studio Code (VS Code) and JupyterLab so that you can easily access it via your preferred user interface.

Once you’ve installed it in your chosen environment, you can ask the code assistant to generate Qiskit code in response to natural language prompts or function definitions — for example, “#define a Bell circuit and run it on ibm_brisbane using the Qiskit Runtime Service”. Alternatively, you can input your own rough or partial code and ask the code assistant to clean it up or fill in the missing pieces with its autocomplete functionality. In either case, Qiskit Code Assistant returns high-quality suggested code that you can easily integrate into your existing work with just a few keystrokes.
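To give a concrete sense of what a prompt like the Bell-circuit example above describes, here is a hedged, plain-Python sketch of the underlying statevector math (H on qubit 0, then CNOT from qubit 0 to qubit 1). The assistant's actual output would use Qiskit and the Qiskit Runtime Service; this dependency-free sketch only illustrates the expected result of such a circuit.

```python
import math

def apply_h(state, q):
    """Apply a Hadamard gate to qubit q of a little-endian statevector."""
    h = 1 / math.sqrt(2)
    out = state[:]
    for i in range(len(state)):
        if not (i >> q) & 1:
            j = i | (1 << q)
            out[i], out[j] = h * (state[i] + state[j]), h * (state[i] - state[j])
    return out

def apply_cnot(state, control, target):
    """Flip the target qubit wherever the control qubit is 1."""
    out = state[:]
    for i in range(len(state)):
        if (i >> control) & 1:
            j = i ^ (1 << target)
            if i < j:
                out[i], out[j] = state[j], state[i]
    return out

state = [1.0, 0.0, 0.0, 0.0]      # two qubits, starting in |00>
state = apply_h(state, 0)
state = apply_cnot(state, 0, 1)
probs = [round(abs(a) ** 2, 3) for a in state]
print(probs)  # [0.5, 0.0, 0.0, 0.5] -- only |00> and |11> are observed
```

The equal 50/50 weight on |00> and |11> is the entangled Bell state the generated Qiskit code would prepare and sample on hardware.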

VIDEO: See Qiskit Code Assistant at work in VS Code and JupyterLab.

To get started, VS Code users will need to install the Qiskit Code Assistant VS Code extension either by using the VS Code “Extensions” tool or by searching for it in the VS Code Marketplace. JupyterLab users will install the Qiskit Code Assistant JupyterLab extension by executing pip install qiskit_code_assistant_jupyterlab.

After you install Qiskit Code Assistant, it will automatically attempt to authenticate you as a user of IBM Quantum services. Refer to the documentation for details on how to authenticate manually. From there, first-time users will see a modal with the End User License Agreement, which details important rules and restrictions to keep in mind when using the code assistant. After you accept the license agreement, you can begin generating code.

For the most detailed, up-to-date instructions and guidance on using Qiskit Code Assistant, be sure to refer to the main documentation page, and the platform-specific documentation for Visual Studio Code and JupyterLab.

Qiskit Code Assistant under the hood

Earlier this year, we published a paper on arXiv that offers insight into how the original Qiskit Code Assistant model was built, how it performed, and the unique challenges that we encounter in the field of quantum code generation. Before we dive into that, however, let’s review some of the basics of how LLMs like Qiskit Code Assistant actually work.

Large language models are a type of generative AI: models that use statistical data analysis to generate text, images, and many other kinds of data. LLMs are trained on large data sets to predict the next word in a text sequence, based partly on the words leading up to the end of the input sequence and partly on the model’s analysis of language patterns in its training data. The model essentially assigns a probability to every word that could come next, and outputs the one with the highest probability. This has allowed LLMs to quickly emerge as a powerful tool for generating classical software code.
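The prediction step described above can be sketched in a few lines. This toy function takes raw scores (logits) that a model might assign to candidate next words, converts them to probabilities with a softmax, and returns the most likely word. The vocabulary and scores here are invented for illustration; a real LLM works over tens of thousands of tokens and often samples rather than taking a plain argmax.

```python
import math

def next_word(candidates):
    """candidates: dict mapping candidate word -> raw model score (logit)."""
    exps = {w: math.exp(s) for w, s in candidates.items()}
    total = sum(exps.values())
    probs = {w: e / total for w, e in exps.items()}  # softmax
    return max(probs, key=probs.get), probs

# Hypothetical scores for completing "The qubit is in a ...":
word, probs = next_word({"superposition": 3.1, "box": 0.2, "hurry": -1.0})
print(word)  # superposition
```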

Generating quantum code, however, has proven to be a more complex task. For one thing, training an LLM solely on code examples is insufficient for building a model that outputs high-quality quantum code; the LLM also needs basic contextual knowledge of quantum computing itself. Otherwise, it may not be able to make the connection between a user’s input and the desired output. For example, you may ask an LLM to write the code for the Deutsch-Jozsa algorithm, but the words “Deutsch-Jozsa algorithm” usually don’t appear anywhere in that code, so the model needs background knowledge to connect the user request with the code it needs to generate.
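The Deutsch-Jozsa point is worth seeing concretely. Below is a hedged, plain-Python sketch of Deutsch's algorithm (the single-qubit case of Deutsch-Jozsa), which decides with one oracle query whether a function is constant or balanced. Notice that nothing in the code mentions the algorithm's name: a model must carry background knowledge of quantum computing to connect the phrase "Deutsch-Jozsa" in a prompt to code of this shape.

```python
import math

H = 1 / math.sqrt(2)

def apply_h(state, q):
    """Hadamard on qubit q of a 2-qubit statevector (bit 0 = x, bit 1 = y)."""
    out = state[:]
    for i in range(4):
        if not (i >> q) & 1:
            j = i | (1 << q)
            out[i], out[j] = H * (state[i] + state[j]), H * (state[i] - state[j])
    return out

def apply_oracle(state, f):
    """|x, y> -> |x, y XOR f(x)>."""
    out = [0.0] * 4
    for i in range(4):
        x, y = i & 1, (i >> 1) & 1
        out[x | ((y ^ f(x)) << 1)] += state[i]
    return out

def deutsch(f):
    state = [0.0] * 4
    state[2] = 1.0                   # start in |x=0, y=1>
    state = apply_h(state, 0)
    state = apply_h(state, 1)
    state = apply_oracle(state, f)   # a single query to f
    state = apply_h(state, 0)
    # Measuring qubit 0 as 1 means f is balanced; 0 means constant.
    p1 = sum(abs(state[i]) ** 2 for i in range(4) if i & 1)
    return "balanced" if p1 > 0.5 else "constant"

print(deutsch(lambda x: 0))   # constant
print(deutsch(lambda x: x))   # balanced
```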

Another challenge in quantum code generation is that quantum computing is a much smaller field than classical computing, so the availability of training data and code examples is extremely limited. At the same time, the field evolves even faster than many areas of classical computing, and libraries are constantly being updated with new techniques. This means quantum code generation models also need to be updated regularly, and we have to be careful when using training data that’s more than a few years old.

Training LLMs to generate quantum computing code has proven so challenging that the task has become a useful benchmark for classical developers working to measure the performance of LLMs for general coding tasks. For example, Qiskit GitHub issues make up a whopping 13% of the SWE-bench benchmarking dataset published last year by a team of researchers at Princeton and the University of Chicago.

The granite-8b-qiskit model underlying the Qiskit Code Assistant was trained on top of an IBM Granite™ Code model, one of many developed by the IBM watsonx team for code generation tasks. The “8b” in granite-8b-qiskit represents the 8 billion parameters that influence its code output. To improve the performance of the model further, we extended its training with additional Qiskit data filled with a variety of Python scripts and Jupyter notebooks, as well as data taken from open-source GitHub repositories containing the word Qiskit.

All data was collected in August 2024, and we were careful to filter out deprecated code and other outdated material. We also added synthetic data — thousands of question-and-answer pairs specific to Qiskit, all synthetically generated from Qiskit Tutorials — to shore up its general quantum computing knowledge. Moving forward, the Qiskit Code Assistant will continue to be trained on the latest code examples and Qiskit releases to ensure it remains valuable and relevant for all users.

Qiskit HumanEval

Beyond building Qiskit Code Assistant itself, we also needed a way to evaluate its performance. This presented its own challenges, since there seems to be no prior research specifically concerning the evaluation of quantum code generated by LLMs. To fill this gap, we created the Qiskit HumanEval dataset, our spin on the popular HumanEval dataset used to evaluate classical code generated by LLMs. Qiskit HumanEval comprises a collection of tasks, each designed to evaluate the performance of quantum code LLMs.

One of the advantages to working with code LLMs over natural language LLMs is that, rather than generating human speech, they’re meant to generate executable code. This means we can evaluate their performance by simply asking the model to generate code for different tasks, running the generated code, and seeing how well it works.
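A minimal sketch of this execution-based style of evaluation: run each model completion, then run the task's unit test, and count how many completions pass. The task, function name, and completions below are invented for illustration; Qiskit HumanEval applies the same idea to quantum programming tasks.

```python
def passes(candidate_code, test_code):
    """Execute a model-generated candidate, then its unit test.

    Returns True only if both the candidate and its test run cleanly.
    """
    namespace = {}
    try:
        exec(candidate_code, namespace)
        exec(test_code, namespace)
        return True
    except Exception:
        return False

# A hypothetical task: "write add_two(x), which returns x + 2".
task_test = "assert add_two(3) == 5"
completions = [
    "def add_two(x): return x + 2",   # correct
    "def add_two(x): return x * 2",   # wrong: fails the unit test
    "def add_two(x) return x + 2",    # wrong: doesn't even parse
]
pass_rate = sum(passes(c, task_test) for c in completions) / len(completions)
print(pass_rate)
```

Aggregating this pass rate over every task in the dataset yields the percentages reported in the comparison table below.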

The Qiskit HumanEval dataset includes ~150 distinct tests across eight categories to evaluate how a model performs on different elements of quantum programming. These include fundamental quantum programming tasks in areas like quantum circuit generation, execution of circuits, state preparation and analysis, algorithm implementation, and more.

The tests in Qiskit HumanEval were developed by a panel of real-world quantum computing experts both within and outside of IBM, including Qiskit advocates, members of the Qiskit community, and members of our support and documentation teams. The panel worked together to ensure each task included in the dataset is novel and original, with a clear relationship between the prompt or test and the intended task.

We compared the performance of the Qiskit Code Assistant against state-of-the-art open source code LLMs including CodeLlama, DeepseekCoder, and Starcoder. The table below shows the results of this comparison in terms of the percentage of benchmarking tests the different LLMs passed in both the standard HumanEval dataset of classical coding benchmarks, and the Qiskit HumanEval dataset of quantum coding tasks. As you can see, Qiskit Code Assistant’s granite-8b-qiskit model significantly outperformed all other models at Qiskit HumanEval’s quantum code generation tasks.

Model                    | HumanEval | Qiskit HumanEval
CODELLAMA-34B-PYTHON-HF  | 52.43%    | 26.73%
DEEPSEEK-CODER-33B-BASE  | 49.39%    | 39.60%
STARCODER2-15B           | 45.12%    | 37.62%
CODEGEMMA-7B             | 42.68%    | 24.75%
GRANITE-8B-CODE-BASE     | 39.02%    | 28.71%
GRANITE-8B-QISKIT        | 38.41%    | 46.53%

Up next: Making Qiskit Code Assistant and Qiskit HumanEval open source

In the near future, we plan to open source both the granite-8b-qiskit LLM and the Qiskit HumanEval dataset so researchers and developers all around the world can benefit from them. Our hope is that making these resources publicly available will enable and encourage other code LLMs to assess their performance in the quantum coding domain. We also plan to strongly encourage feedback from all users, so we can identify areas for improvement and ensure the dataset meets the diverse needs of the community.


View documentation