Using nvidia-docker 2.0 with RHEL 7

While running deep learning frameworks (TensorFlow, PyTorch, Caffe, and so on) in a containerized environment has a lot of advantages, getting nvidia-docker installed and working correctly can be a source of frustration. This is especially evident when trying to work in a Red Hat® Enterprise Linux (RHEL) environment, where the RHEL-shipped version of Docker differs slightly from the upstream Docker editions. In this tutorial, I'll walk through why RHEL's Docker is different, why we can't just install nvidia-docker 2.0 in this environment, and why that isn't really a bad thing.

Note: For those of you who aren't interested in the background and just want to install nvidia-docker 2 on RHEL, you can jump ahead to the Installing nvidia-docker 2.0 on RHEL section.

Red Hat versus upstream Docker

Most Linux® distributions include a Docker package in their upstream repositories. For non-Red Hat distributions, a version of Docker is chosen, tested, and shipped with minimal changes, while upstream Docker development continues along with regular releases shipped as docker-ce. This differs, however, when you get to RHEL. Because RHEL requires subscriptions, entitlements, and so on, Red Hat has additional requirements for running RHEL in a containerized environment. To accommodate these differences, Red Hat provides a modified Docker container run time, or runc (named docker-runc). We will discuss two of the additions to this container run time before moving on.

  • Access to Red Hat specific Docker registries

    If you're using the upstream Docker packages, docker-ce or docker-ee, on RHEL and want to build RHEL-based images, you'd have to either produce your own RHEL base image or use CentOS. With the RHEL version of Docker, you have access to the full list of images that Red Hat provides (https://access.redhat.com/containers/).

  • Passing the host server’s entitlement credentials to a container at runtime

    The second addition, and arguably the more important one, is the ability to have your host machine's entitlement passed along to any Red Hat-based containers. This is a huge help when you have containers that need to periodically install packages. With the automatic entitlement support, there's no need to manually configure repository links or write scripts to entitle the container.
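
To make the entitlement pass-through concrete, here is a rough sanity check. It assumes the RHEL version of Docker from the next section is already installed and that you can pull a Red Hat base image such as registry.access.redhat.com/rhel7 (substitute the image that matches your architecture); inside the container, the host's entitled repositories should show up without any manual subscription work:

# List the yum repositories visible inside a RHEL-based container. With
# RHEL's Docker, the host entitlement is passed through automatically, so
# the Red Hat repos should appear without running subscription-manager in
# the container. (The image name is illustrative.)
docker run --rm registry.access.redhat.com/rhel7 yum repolist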

Installing Docker for IBM POWER on RHEL 7

Before we move on to nvidia-docker, let's cover how to install RHEL's flavor of Docker. You first need to enable the Extras repository; after that, it is as simple as installing the docker package.

# Enable the Extras repository
sudo yum-config-manager --enable extras
sudo subscription-manager repos --enable=rhel-7-for-power-le-extras-rpms

# Refresh the yum metadata and install the docker package
sudo yum makecache fast
sudo yum install -y docker
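
As a quick sanity check (a sketch, assuming the service is not already running), you can start Docker and confirm that Red Hat's run time is in place; the exact docker info output varies by release, so treat the grep as a rough guide:

# Start the Docker service and enable it at boot
sudo systemctl start docker
sudo systemctl enable docker

# The run time section of docker info should mention docker-runc
docker info | grep -i runtime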

nvidia-docker 1.0 versus nvidia-docker 2.0

Now that you have the correct version of Docker installed, I’ll give you an abridged rundown of what nvidia-docker provides, and the difference between nvidia-docker 1.0 and nvidia-docker 2.0. You can get more information about nvidia-docker from their GitHub page (https://github.com/NVIDIA/nvidia-docker).

Initially, nvidia-docker was created to address a critical need in the container world: the ability to share a kernel module and devices with a running container. As most folks know, containers are glorified user spaces running on the host kernel. This means that no matter what you're running in your container, it is still dependent on what's available in the host kernel, including kernel modules. In our case, we're interested in the NVIDIA module that provides communication with the GPUs on the system. While you can install the drivers in a container to get access to a GPU, you run into problems if the host and container versions of the driver ever differ, which makes managing containers almost impossible. On a similar note, devices aren't passed into containers by default. This is intentional, and any needed device can be manually added to a container using the --device argument. Because NVIDIA GPUs are all named the same way (/dev/nvidia0, /dev/nvidia1, and so on), it became busy work to constantly add them to a docker run statement. So nvidia-docker set out to solve both of these problems without requiring too much upfront work from the end user.
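
To make that busy work concrete, here is roughly what exposing a GPU by hand looks like without nvidia-docker. The device list and image name are illustrative and assume the NVIDIA driver is already installed on the host and inside the image:

# Manually passing GPU 0 and the NVIDIA control devices into a container --
# exactly the boilerplate nvidia-docker was written to eliminate.
docker run --rm \
  --device /dev/nvidiactl \
  --device /dev/nvidia-uvm \
  --device /dev/nvidia0 \
  some-cuda-image nvidia-smi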

nvidia-docker 1.0: The first pass

The original nvidia-docker package (no longer maintained) accomplished its task using a combination of a Docker plug-in (https://docs.docker.com/engine/extend/plugin_api/) and a Docker volume. All of the relevant libraries and binaries needed from the NVIDIA driver were copied over to the Docker volume, and the nvidia-docker plug-in mounted both the volume and all necessary NVIDIA devices into a container when it was executed. While this method worked, it had a few drawbacks. First, every time a driver was updated, nvidia-docker had to create a new volume and copy all the necessary files over again; and because they were copies, any changes to the host files weren't reflected in the container. The second problem was that the plug-in often collided with plug-ins for additional tools such as Kubernetes. This prompted a rewrite, and nvidia-docker 2.0 was born.
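
For historical context, the nvidia-docker 1.0 workflow looked roughly like this; the wrapper command invoked the plug-in, which mounted the driver volume and devices for you:

# nvidia-docker 1.0: the wrapper handled the volume and device mounts
nvidia-docker run --rm nvidia/cuda-ppc64le nvidia-smi

# The driver files lived in a Docker volume owned by the nvidia-docker plug-in
docker volume ls -f driver=nvidia-docker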

nvidia-docker 2.0: Where we are now

nvidia-docker 2 does away with plug-ins and volumes, opting instead to utilize OCI runtime hooks (https://github.com/opencontainers/runtime-spec/blob/master/config.md#prestart) to add the necessary NVIDIA libraries and devices.

The nvidia-docker project was split into three repositories: libnvidia-container, nvidia-container-runtime, and the original nvidia-docker.

libnvidia-container: https://github.com/NVIDIA/libnvidia-container

libnvidia-container is the lowest level, and provides the nvidia-container-cli binary, which allows for the manipulation of NVIDIA GPU containers. While it is useful to know this exists, it is not necessary to invoke the CLI by hand; the nvidia-container-runtime-hook will manage this for you.
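
If you're curious, you can poke at the CLI directly once it is installed; the two subcommands below are a reasonable sanity check (shown for exploration only, and the exact output and options vary by version):

# Print driver and GPU information as seen by libnvidia-container
nvidia-container-cli info

# List the driver libraries, binaries, and devices the CLI would inject
nvidia-container-cli list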

nvidia-container-runtime: https://github.com/NVIDIA/nvidia-container-runtime

The nvidia-container-runtime repository contains the main piece of nvidia-docker 2 code, and it provides two features. First, it contains the prestart hook, nvidia-container-runtime-hook. This hook uses nvidia-container-cli to set up a container with the necessary NVIDIA drivers and GPUs. The second piece is a package named nvidia-container-runtime, which provides a new run time named nvidia. This run time is essentially the same as runc, with a call to the prestart hook linked in.

nvidia-docker: https://github.com/NVIDIA/nvidia-docker

nvidia-docker2 is the final package, and at this point it is just a wrapper that sets up the nvidia run time to be the default when invoking docker commands.
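
For context, on upstream docker-ce the nvidia-docker2 package does this by registering the run time in /etc/docker/daemon.json. The snippet below is a rough sketch of that configuration and is not something we will use on RHEL's Docker:

# Upstream docker-ce only: nvidia-docker2 registers the nvidia run time
# in /etc/docker/daemon.json, roughly like this
cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}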

Now we can get to the entire purpose of this tutorial. RHEL has made many changes to its version of Docker, including a new runc named docker-runc. Because of this, we are not able to install the nvidia run time and, by extension, the nvidia-docker2 package. But this really isn't a setback at all. As I've alluded to above, the key to nvidia-docker 2's functionality isn't the nvidia-docker binary, or even the nvidia run time; it is the OCI runtime hook support, and despite RHEL's use of a custom runc, it supports the use of these hooks.

Installing nvidia-docker 2.0 on RHEL

With all of this background on nvidia-docker 2.0, we have enough to dive right into enabling NVIDIA's runtime hook directly. We won't be installing nvidia-docker2 or nvidia-container-runtime, but we will still be installing the key pieces that make up nvidia-docker 2.0's functionality.

The installation steps are also listed on NVIDIA's nvidia-docker GitHub page (https://github.com/NVIDIA/nvidia-docker#centos-7-docker-rhel-7475-docker). The only piece not accounted for there is SELinux; we will get to that in the next section.

  1. Remove nvidia-docker 1.0.
  2. Enable access to the nvidia-container-runtime repository.
  3. Install nvidia-container-runtime-hook.
  4. Add hook to the path that OCI understands.
  5. (Optional) Change SELinux file permissions for NVIDIA devices.
# 1. If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo yum remove nvidia-docker

# 2. Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-runtime.repo

# 3. Install the nvidia runtime hook
sudo yum install -y nvidia-container-runtime-hook

# NOTE: Step 4 is only needed if you're using the older nvidia-container-runtime-hook 1.3.0. The default (1.4.0) now includes this file.
# 4. Add hook to OCI path
#sudo mkdir -p /usr/libexec/oci/hooks.d

#echo -e '#!/bin/sh\nPATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" exec nvidia-container-runtime-hook "$@"' | \
#  sudo tee /usr/libexec/oci/hooks.d/nvidia

#sudo chmod +x /usr/libexec/oci/hooks.d/nvidia

# 5. Adjust SELinux permissions
sudo chcon -t container_file_t /dev/nvidia*

After you've followed these steps, the nvidia-container-runtime-hook will be invoked whenever you issue a docker run command. The hook looks for specific NVIDIA environment variables set when the image was built, such as NVIDIA_VISIBLE_DEVICES and NVIDIA_DRIVER_CAPABILITIES, and acts on them accordingly.

To test out this functionality, you can run a simple nvidia-smi container.

# Test nvidia-smi capabilities using the default CUDA container
docker run --rm nvidia/cuda-ppc64le nvidia-smi
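
You can also steer the hook from the command line by overriding those environment variables. For example, the following (illustrative) invocation exposes only the first GPU and limits the driver capabilities to what nvidia-smi needs:

# Expose only GPU 0 and request just the compute and utility capabilities
docker run --rm \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  nvidia/cuda-ppc64le nvidia-smi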

SELinux considerations

If you're running with SELinux in Enforcing mode, you will have to take a few extra steps to use nvidia-docker 2. This is necessary regardless of whether you're using RHEL's version of Docker or upstream Docker. As you may have noticed in the steps above, there is a final chcon command that is run before starting any containers. Because the physical NVIDIA devices are passed through to a running container at run time, their file context (fcontext) determines whether the Docker service has permission to access them. By default, the /dev/nvidia* devices have an fcontext of xserver_misc_device_t, which does not grant permission to Docker. While there are many fcontext options that would grant containers access, we've chosen to relabel the devices as container_file_t. You can do this manually with the chcon command used in the example above, or edit your SELinux configuration to specifically target those devices and alter their file contexts. The RHEL documentation describes how to make fcontext changes permanent (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/security-enhanced_linux/sect-security-enhanced_linux-selinux_contexts_labeling_files-persistent_changes_semanage_fcontext).
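
If you would rather not re-run chcon after every reboot, a persistent rule along these lines should work (a sketch, assuming the policycoreutils-python package that provides semanage is installed):

# Record a persistent file-context rule for the NVIDIA device nodes...
sudo semanage fcontext -a -t container_file_t '/dev/nvidia.*'

# ...and relabel the existing device nodes to match
sudo restorecon -v /dev/nvidia*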

After reading this tutorial, I hope you have a more comfortable understanding of nvidia-docker 2 and RHEL's version of Docker, and how they interact with each other. It is easy to write off RHEL's Docker when the nvidia-docker2 package won't install, but as the previous paragraphs should make clear, the meat of the new nvidia-docker isn't the nvidia-docker2 wrapper, but rather its OCI runtime hook, which RHEL supports natively, while it's upstream Docker that needs additional container run time modifications to support it.