Technical Blog Post
nvidia-docker on POWER: GPUs Inside Docker Containers
By now everyone has heard of the microservices revolution that is sweeping the industry. Microservices are here to change the world, and developers and companies alike are rushing to transform their workflows to this new model.
Some workflows require the use of host devices, so docker provides a --device flag that will pass through a host device (such as a block device) into a container. However, for NVIDIA GPUs it’s not as simple as using the --device flag. To allow developers to isolate GPUs in docker containers, NVIDIA has created a wrapper for the docker command aptly named nvidia-docker. You can read more about why --device isn’t sufficient here: https://github.com/NVIDIA/nvidia-docker/wiki/Why%20NVIDIA%20Docker. Primarily, using a GPU “requires the installation of the NVIDIA driver.”
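To illustrate the --device approach described above, here is a minimal sketch that assembles one --device flag per GPU device node and prints the resulting command instead of running it. The device node names are assumptions about a typical Linux host with the NVIDIA driver loaded (list /dev/nvidia* on your own system to confirm), and device_flags is a hypothetical helper, not part of docker or nvidia-docker. The cuda:8.0 image name is the one built later in this walk-through.

```shell
# Hypothetical helper: emit one --device flag per device node given as an argument.
device_flags() {
  flags=""
  for dev in "$@"; do
    # Append with a separating space once flags is non-empty.
    flags="${flags:+$flags }--device=$dev"
  done
  printf '%s\n' "$flags"
}

# Assumed device nodes; adjust to match what exists under /dev on your host.
# The command is only printed, not executed.
echo "docker run --rm $(device_flags /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia0) cuda:8.0 nvidia-smi"
```

Even with every device node passed through this way, the container would still need user-space driver libraries that match the host driver, which is the gap nvidia-docker is meant to fill.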
nvidia-docker is a great tool for developers using NVIDIA GPUs, and NVIDIA is a big part of the OpenPOWER Foundation – so it’s obvious that we would want to get ppc64le support into the nvidia-docker project. Luckily, the project was well laid-out and it was a piece of cake to get ppc64le support added.
Let’s walk through building the required packages and docker images, and install the nvidia-docker plugin.
If you're unfamiliar with the nvidia-docker project's components, here is a quick summary. There are two main components: 1) the plugin and command-wrapper, and 2) the docker image build scripts (makefiles and dockerfiles). The plugin and command can be installed by building a package, or by using make and make install. If you use make, you'll have to start the nvidia-docker service manually (e.g. $ systemctl start nvidia-docker). On POWER, you can install nvidia-docker either by building a deb package or by running make and make install. You can also build docker images for CUDA 7.5 (based on Ubuntu 14.04) and CUDA 8.0 (based on Ubuntu 16.04).
This walk-through is using Ubuntu 16.04-based images with CUDA 8.0.
Pre-Req: You have to install the nvidia drivers for the GPUs you’re going to use on your host system in order to use nvidia-docker. This article assumes that you have been running GPU workloads on your system and have those libraries pre-installed.
Check that your installed nvidia driver supports the CUDA 8.0 toolkit version here: https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements
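The driver check above can be scripted. The following is a minimal sketch, assuming GNU sort (for its -V version ordering) and an nvidia-smi on the host that supports --query-gpu. version_ge is a hypothetical helper, and REQUIRED_DRIVER is a placeholder variable: set it to the minimum version listed on the requirements page linked above, since this sketch deliberately does not hard-code one.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to version $2.
# Relies on sort -V placing the smaller version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder minimum (assumption); take the real value from the requirements page.
required_driver="${REQUIRED_DRIVER:-0}"

if command -v nvidia-smi >/dev/null 2>&1; then
  # Ask the driver for its version; use the first GPU's answer.
  installed_driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if version_ge "$installed_driver" "$required_driver"; then
    echo "driver $installed_driver satisfies minimum $required_driver"
  else
    echo "driver $installed_driver is older than $required_driver"
  fi
else
  echo "nvidia-smi not found; install the NVIDIA driver first"
fi
```

The same helper could be reused to compare an installed docker version against 1.9.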
Build the nvidia-docker plugin deb
Install ‘docker-engine’
Optional: If you already have docker installed, and it’s version 1.9 or later, you can choose to skip this step and force-install the deb we create next.
Docker, Inc. has its own repositories for those who want to use a more recent docker version than their distro might ship. Those repositories contain the docker-engine package. If you want to install docker-engine on ppc64le, use the following:
$ echo 'deb http://ftp.unicamp.br/pub/ppc64el/ubuntu/16_04/docker-1.13.1-ppc64el/ xenial main' | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update && sudo apt-get install docker-engine
To use docker as a non-root user (e.g. your-username), execute the following to add the user to the docker group.
$ sudo usermod -aG docker your-username
Perform the rest of the tasks as the user you added to the docker group.
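Before continuing, it may help to confirm that the group change is visible in your current session. This is a small sketch using id -nG; in_group is a hypothetical helper that does a whole-word match against the group list.

```shell
# Hypothetical helper: succeed when group $1 appears in the space-separated list $2.
in_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG prints the names of all groups for the current user.
if in_group docker "$(id -nG)"; then
  echo "current user is in the docker group"
else
  echo "not in the docker group yet; log out and back in after running usermod"
fi
```

Group membership is read at login, so a shell that was already open when usermod ran will not show the new group.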
Clone the repository and build the deb
First clone the nvidia-docker repository and checkout the ppc64le branch:
$ git clone https://github.com/NVIDIA/nvidia-docker.git && cd nvidia-docker && git fetch --all
$ git checkout ppc64le
Next, create the installable plugin package. This uses docker, so you’ll see some docker image builds happen. When the images have built and the deb has been built (inside a docker image), you’ll have a deb in your local filesystem.
$ make deb
When prompted to update the changelog, choose ‘n’.
Note: If you’re not running on Ubuntu, you can also run ‘make’ and ‘make install’ and then start the plugin manually by starting the nvidia-docker service. Building and installing the deb does this all for you.
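For the non-deb route, the steps could look roughly like the sketch below. It defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute them from the root of the cloned repository. The run and install_without_deb helpers are hypothetical, and the use of sudo for make install and systemctl is an assumption about where files are installed and how your system is configured.

```shell
# Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

install_without_deb() {
  run make
  run sudo make install                    # sudo is an assumption
  run sudo systemctl start nvidia-docker   # start the plugin service manually
}

install_without_deb
```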
Install and verify the deb
$ cd tools/dist && sudo dpkg -i nvidia-docker_1.0.0~rc.3-1_ppc64el.deb
Installing the deb installs the nvidia-docker plugin, as well as a wrapper for the docker command. The wrapper accepts all docker commands, so from now on, if you’re working with GPUs, use ‘nvidia-docker’ instead of ‘docker’. For example, $ nvidia-docker images will list your docker images, just like $ docker images does.
Build the cuda images using Ubuntu 16.04 (xenial) as the image base. This will also use docker, so you will see docker images being created.
$ OS=ubuntu-16.04 make cuda
Note: If your build errors out complaining that some packages couldn’t be installed, scroll up and see if there was a warning that the repository didn’t have a Release file. Delete your cuda images and try again if so. Sometimes a network hiccup causes issues.
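If the build fails this way more than once, the delete-and-retry cycle can be wrapped in a small script. The sketch below is one possible approach, not something the nvidia-docker project provides: retry and build_cuda are hypothetical helpers, and unlike the manual advice above it retries on any failure rather than only after a missing Release file warning. Nothing runs until you call retry yourself from the repository root.

```shell
# Hypothetical helper: run a command up to $1 times, stopping at the first success.
retry() {
  attempts="$1"; shift
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed" >&2
    n=$((n + 1))
  done
  return 1
}

# Hypothetical helper: remove any existing cuda images, then rebuild them.
build_cuda() {
  ids="$(docker images -q cuda)"
  if [ -n "$ids" ]; then
    # $ids is intentionally unquoted so each image ID becomes its own argument.
    docker rmi $ids
  fi
  OS=ubuntu-16.04 make cuda
}

# Example invocation (left commented out so nothing is built by accident):
# retry 3 build_cuda
```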
Now you’ll have new docker images:
The nvidia-docker images were created during the ‘make deb’ step, and the cuda images were just created. The cuda images will be used to run containers with GPU workloads.
Using Your Images
Let’s make sure the images built correctly and that the nvidia-docker plugin is working.
$ nvidia-docker run --rm cuda:8.0 nvidia-smi
This is your GPU inside a container!
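To check the result in a script rather than by eye, one option is to count the GPUs that the container reports. This sketch assumes that nvidia-smi -L prints one line per GPU beginning with "GPU <index>:"; count_gpus is a hypothetical helper, and the container is only started if nvidia-docker is found on the PATH.

```shell
# Hypothetical helper: count stdin lines that look like "GPU 0: ...", "GPU 1: ...", etc.
count_gpus() {
  grep -c '^GPU [0-9][0-9]*:'
}

if command -v nvidia-docker >/dev/null 2>&1; then
  # -L asks nvidia-smi to list GPUs, one per line.
  gpus="$(nvidia-docker run --rm cuda:8.0 nvidia-smi -L | count_gpus)"
  echo "GPUs visible inside the container: $gpus"
else
  echo "nvidia-docker not found; skipping the container check"
fi
```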
Now that you have a base cuda image, you can use it to create your own GPU workloads. You can create Dockerfiles using 'FROM cuda:8.0' to use the image you just created.
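As a starting point, the sketch below writes a two-line Dockerfile based on cuda:8.0 into a temporary directory, then builds and runs it if nvidia-docker is available. The image name my-gpu-app and the temporary build directory are placeholders chosen for this example; a real workload would replace the CMD with its own application.

```shell
# Create a throwaway build context (hypothetical location).
workdir="$(mktemp -d)"

# Minimal Dockerfile: start from the locally built cuda image and run nvidia-smi.
cat > "$workdir/Dockerfile" <<'EOF'
FROM cuda:8.0
CMD ["nvidia-smi"]
EOF

if command -v nvidia-docker >/dev/null 2>&1; then
  # my-gpu-app is a placeholder image name.
  nvidia-docker build -t my-gpu-app "$workdir"
  nvidia-docker run --rm my-gpu-app
else
  echo "nvidia-docker not found; Dockerfile written to $workdir/Dockerfile"
fi
```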
You can reach out to us by filing an issue in the nvidia-docker repo, or leaving a comment on this blog post.
UID
ibm16169899