Installing the NVIDIA Docker component and Python packages

After Docker has been installed, install the NVIDIA Docker component and the Python packages required to run the check service script.

Procedure

  1. Install nvidia-docker

    Follow the instructions below for your computer architecture.

    • For x86_64:

      • If nvidia-docker 1.0 is already installed, remove it along with all existing GPU containers:

        # docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
        # sudo yum remove nvidia-docker
        
      • Add the package repositories:

        # distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
        # curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
        tee /etc/yum.repos.d/nvidia-docker.repo
        
      • Install nvidia-docker2 and reload the Docker daemon configuration:

        # sudo yum install -y nvidia-docker2
        # sudo pkill -SIGHUP dockerd
        
      • Set up the container runtime and tell Docker to use it as the default. Type the following, pressing Enter after each line, or copy the contents from the opening { through the closing } into /etc/docker/daemon.json:

        # cat > /etc/docker/daemon.json << EOF
        {
            "default-runtime": "nvidia",
            "runtimes": {
                "nvidia": {
                    "path": "/usr/bin/nvidia-container-runtime",
                    "runtimeArgs": []
                }
            }
        }
        EOF
        
      • Restart Docker:

        # systemctl restart docker
        
      • Test that it worked:

        # docker run --rm nvidia/cuda:10.0-runtime-ubuntu18.04 nvidia-smi
        
    • For POWER9:

      • Install the nvidia-docker and nvidia-container-runtime repositories. Type the following, pressing Enter after each line, or copy the contents starting with [nvidia-docker] into /etc/yum.repos.d/nvidia-docker.repo. The quoted 'EOF' delimiter keeps the shell from expanding $basearch, which yum must resolve itself:

        # cat > /etc/yum.repos.d/nvidia-docker.repo << 'EOF'
        [nvidia-docker]
        name=nvidia-docker
        baseurl=https://nvidia.github.io/nvidia-docker/centos7/ppc64le
        repo_gpgcheck=1
        gpgcheck=0
        enabled=1
        gpgkey=https://nvidia.github.io/nvidia-docker/gpgkey
        sslverify=1
        sslcacert=/etc/pki/tls/certs/ca-bundle.crt
        
        [nvidia-container-runtime]
        name=nvidia-container-runtime
        baseurl=https://nvidia.github.io/nvidia-container-runtime/centos7/$basearch
        repo_gpgcheck=1
        gpgcheck=0
        enabled=1
        gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
        sslverify=1
        sslcacert=/etc/pki/tls/certs/ca-bundle.crt
        EOF
        
      • Install the container runtime and runtime hook:

        # yum install -y nvidia-container-runtime-hook
        # yum install -y nvidia-container-runtime
        # mkdir -p /usr/libexec/oci/hooks.d
        # echo -e '#!/bin/sh\nPATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" exec nvidia-container-runtime-hook "$@"' | sudo tee /usr/libexec/oci/hooks.d/nvidia
        # chmod +x /usr/libexec/oci/hooks.d/nvidia
        
      • Set up the container runtime and tell Docker to use it as the default:

        # cat > /etc/docker/daemon.json << EOF
        {
            "default-runtime": "nvidia",
            "runtimes": {
                "nvidia": {
                    "path": "/usr/bin/nvidia-container-runtime",
                    "runtimeArgs": []
                }
            }
        }
        EOF
        
      • Restart Docker:

        # systemctl restart docker
        
      • Test that it worked:

        # docker run --rm nvidia/cuda-ppc64le:10.0-runtime-ubuntu18.04 nvidia-smi
        
      • The output should resemble what nvidia-smi prints when you run it directly on the host.

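As a quick sanity check after editing /etc/docker/daemon.json, the following sketch tests whether the file names nvidia as the default runtime. The function name has_nvidia_default_runtime is hypothetical and is not part of nvidia-docker or this product. It only pattern-matches the file; it does not validate the JSON or query the running Docker daemon.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not part of nvidia-docker):
# returns 0 when the given daemon.json sets "default-runtime" to "nvidia",
# 1 when it does not, and 2 when the file cannot be read.
has_nvidia_default_runtime() {
    config_file="${1:-/etc/docker/daemon.json}"
    [ -r "$config_file" ] || return 2
    # Match the key/value pair, allowing any whitespace around the colon.
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config_file"
}
```

Called without an argument, the function inspects /etc/docker/daemon.json, so `has_nvidia_default_runtime && echo "default runtime is nvidia"` can be run before restarting Docker. A return status of 2 means the file could not be read.
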
  2. Install the Python packages for the check service script:

    # pip install PyYAML requests colorama
    
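To confirm that the packages are importable, a check along the following lines may help. This is a minimal sketch: missing_python_modules is a hypothetical helper, and it assumes that the interpreter it finds (PYTHON if set, otherwise python3 or python on the PATH) is the same one that pip installed into. Note that PyYAML is imported under the name yaml.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of each listed Python module that
# cannot be imported. Prints nothing when every module imports cleanly.
missing_python_modules() {
    # Assumption: use $PYTHON if set, else the first of python3/python on PATH.
    py="${PYTHON:-$(command -v python3 || command -v python)}"
    for module in "$@"; do
        "$py" -c "import $module" >/dev/null 2>&1 || echo "$module"
    done
}

# Example call for the packages installed above (PyYAML imports as "yaml"):
#   missing_python_modules yaml requests colorama
```

Any name printed by the example call indicates a package that the chosen interpreter cannot import, which usually means that pip installed it for a different Python installation.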
