Write once, run anywhere with multi-architecture CRI-O container images for Red Hat OpenShift

Containers bring true “write once, deploy anywhere” functionality to the enterprise world. And “anywhere” doesn’t just imply laptops and servers, but any architecture, like the amd64, arm, and s390x architectures.

IBM® and Red Hat® have been active contributors to the open source community for decades, and over the past five years both organizations have focused on improving incubating and graduated Cloud Native Computing Foundation (CNCF) projects like Istio. IBM has specifically focused on “write once, deploy anywhere” contributions, such as the multi-architecture manifests added in 2017, which have been the key building block for this capability across the stack.

In this tutorial, I describe the steps required to build multi-architecture images (that is, images that can run on amd64, s390x, arm, ppc64le, etc.) that can be deployed on OpenShift®, Red Hat’s enterprise Kubernetes distribution.

Figure 1. Evolution of the open source container and Red Hat ecosystem

Figure 1 shows the evolution of the container space in the OSS community and the Red Hat ecosystem. Note that while the underlying technology for containers has been around for more than two decades, containers weren’t popular until Docker came along, and weren’t as popular in the enterprise until OpenShift came along. The Kubernetes re-base of OpenShift significantly improved enterprise adoption, and the latest changes (including CRI-O adoption and CoreOS) have improved it further. Several other important ideas, like BSD Jails, are skipped here for brevity.

To enable multi-architecture images, IBM made large contributions to the Docker codebase in 2017, which added support for manifests that let you link a platform to an image (while exposing the end result as the same image). For example, docker run hello-world first looks at the version (latest is implied if no version tag is specified), then checks the local operating system and architecture (such as linux and s390x) and queries the registry for that combination. Once it finds that combination, it pulls only that specific image locally. Multi-architecture images are similar to “fat binaries” at the container registry level, but they are single OS- and architecture-specific images at the Docker daemon level.
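The lookup key in that flow is an os/architecture pair. As a rough illustration (not Docker's actual implementation), the following shell sketch derives such a pair from uname. The function names to_goarch and host_platform are hypothetical, and the mapping covers only a handful of architectures.

```shell
#!/bin/sh
# Map a kernel machine name (as printed by `uname -m`) to the Go-style
# architecture name used in image manifests. Illustrative subset only.
to_goarch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        armv7l)        echo arm ;;
        s390x)         echo s390x ;;
        ppc64le)       echo ppc64le ;;
        *)             echo "$1" ;;   # fall back to the raw name
    esac
}

# Build the os/arch pair a client would look for in a manifest list.
host_platform() {
    os=$(uname -s | tr '[:upper:]' '[:lower:]')
    echo "${os}/$(to_goarch "$(uname -m)")"
}

host_platform
```

The printed value depends on the machine you run it on, which is exactly why the same image name can resolve to different images on different hosts.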

Figure 2. Relationship between Kubernetes and containers via CRI-O

By default, the Docker daemon looks at its current operating system and architecture. However, it is possible to force the download of a specific platform/architecture using the --platform flag, which is available in Docker API 1.32+ and requires that experimental features be turned on in the Docker daemon.
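As a small illustration of that flag, the sketch below composes a pull command for an explicit platform. The helper names run and pull_for_platform are hypothetical, and the command is only printed unless DRY_RUN=0 is set, since actually running it needs a Docker daemon configured as described above.

```shell
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Pull IMAGE ($2) for an explicit os/arch pair ($1) instead of the host's own.
pull_for_platform() {
    run docker pull --platform "$1" "$2"
}

pull_for_platform linux/s390x hello-world
```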

For background, you can read the full specification of multi-architecture manifests along with more information on docker pull in the official Docker docs.

Figure 3. Multi-architecture manifests

(Figure 3 was first used at DockerCon 2017 to describe multi-arch manifests.)

Prerequisites

To complete this tutorial, you’ll need to install:

  • Buildah
  • Skopeo
  • Podman

These can be installed as follows:

dnf -y install buildah skopeo podman

Note: If this is a newly deployed RHEL instance, you will need to add git and vim to the above.

All of these tools are also available through the container-tools package. The install command will vary depending on your Linux distribution’s package manager (zypper, apt, etc.).
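Before moving on, it can help to confirm that the tools are actually on your PATH. This is a minimal sketch; the function name missing_tools is hypothetical, and git is included only because the steps later clone a repository.

```shell
#!/bin/sh
# Print the name of each listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools buildah skopeo podman git)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined on one line.
    echo "Not installed:" $missing
else
    echo "All prerequisite tools found"
fi
```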

Estimated time

It should take you about one hour to complete this tutorial.

Relationship between the tools

For developers coming from a Docker background, it is important to understand the difference between Podman, Buildah, and Skopeo. Figure 4 shows a side-by-side comparison between Docker and Podman/Buildah/Skopeo. Notice the lack of containerd or a daemon.

Figure 4. Comparison of Docker and Podman/Buildah/Skopeo

Podman

Podman is a daemonless container engine for developing, managing, and running OCI containers on your Linux system. Containers can be run either as root or in rootless mode. Most Docker commands work with Podman; in fact, alias docker=podman provides the same developer experience.

Podman currently provides the following:

  • Support for multiple image formats including OCI and Docker
  • Support for multiple means of securely downloading images, including trust and image verification
  • Container image management (managing image layers, overlaying filesystems, etc.)
  • Full management of container lifecycle
  • Support for pods to manage groups of containers together
  • Resource isolation of containers and pods

Buildah

Buildah is much more than just a third-party tool for processing Dockerfiles. Buildah allows you to build container images one step at a time interactively. It does this by spawning an instance of the container from a base image. You can then use this container to execute all the necessary steps to get to your final image or some intermediate layer. Once you are done with a layer of the build, you can commit the container up to that point as an image tag to Buildah and restart the process from that tag as the base image. Once you are completely done, commit the final tag and remove the working containers. (Fun fact: The name of the tool was going to be “Builder,” but the creator of the project had a Boston accent — thus “Buildah” was born!)

  • buildah from — builds up a container root filesystem from an image or scratch
  • buildah config — adjusts defaults in the image’s configuration blob
  • buildah run — runs a command in the container’s filesystem using runc
  • buildah mount — mounts the container’s root filesystem on the host
  • buildah commit — commits the container’s changes to a new image

There is an important distinction between buildah run and docker run: The latter runs a Docker container, whereas buildah run is the equivalent of RUN in a Dockerfile.
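To make the step-by-step workflow concrete, here is a minimal sketch that strings the commands above together. The container name hello-build, the alpine base image, the installed package, and the localhost/hello-base tag are all assumptions chosen for illustration. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build an image one step at a time; every name below is illustrative.
build_interactively() {
    ctr=hello-build   # name of the working container
    run buildah from --name "$ctr" docker.io/library/alpine:latest
    run buildah run "$ctr" -- apk add --no-cache ca-certificates
    run buildah config --workingdir /app --port 8080 "$ctr"
    run buildah commit "$ctr" localhost/hello-base:latest
    run buildah rm "$ctr"   # remove the working container when done
}

build_interactively
```

Each buildah run here plays the role of a RUN line in a Dockerfile, and the commit step is where the working container becomes an image you can use as the next base.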

Skopeo

Skopeo is all about working with images once they’re built, even in remote repositories — transferring them, inspecting them, and even deleting them. The remote repository is important because prior tooling required containers to be downloaded or “pulled” locally before inspecting.
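As a sketch of what that looks like in practice, the following prints a few Skopeo commands that work directly against a registry. The helper names, the hello-world reference, and the /tmp/hello-world destination are assumptions for illustration. Nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Look at a remote image without pulling it, then copy it to a directory.
inspect_remote() {
    ref=$1    # image reference without a transport prefix
    dest=$2   # local directory that receives the copy
    run skopeo inspect "docker://$ref"            # image metadata
    run skopeo inspect --raw "docker://$ref"      # raw manifest as stored in the registry
    run skopeo copy "docker://$ref" "dir:$dest"   # registry-to-directory copy
}

inspect_remote docker.io/library/hello-world:latest /tmp/hello-world
```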

Build the app using Buildah

Buildah provides the following commands for working with manifest lists:

  • buildah manifest create List — creates a new “image” that’s actually an image index/manifest list

  • buildah manifest add List image — adds an entry to the list; handles either local (names or IDs) or remote (docker://...) images

  • buildah manifest push List registry/repository:tag — pushes just the list

  • buildah manifest push --all List registry/repository:tag — pushes the list and everything it references

  • buildah rmi List — removes the list from local storage

The manifest subcommands take the following forms:

  • create localhost/list
  • add localhost/list localhost/image
  • annotate --annotation A=B localhost/list localhost/image (or sha256:entryManifestDigest)
  • remove localhost/list sha256:entryManifestDigest
  • inspect localhost/list
  • push localhost/list transport:destination

Each entry in a manifest list can carry the following fields:

  • arch: mostly Go arch names (amd64, s390x, arm64, ppc64le, etc.)
  • os: linux, windows
  • os-version: mostly unused, except maybe when the OS is Windows
  • variant: mostly unused, except for ARM
  • features: unused in Docker format, reserved in OCI format
  • os-features: mostly unused (unless Windows)
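These fields can be set on an existing entry with buildah manifest annotate. The sketch below is illustrative only: annotate_entry is a hypothetical helper, localhost/list and localhost/image are the placeholder names used above, and the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: annotate_entry LIST IMAGE OS ARCH [VARIANT]
annotate_entry() {
    list=$1 image=$2 os=$3 arch=$4 variant=${5:-}
    if [ -n "$variant" ]; then
        run buildah manifest annotate --os "$os" --arch "$arch" \
            --variant "$variant" "$list" "$image"
    else
        run buildah manifest annotate --os "$os" --arch "$arch" "$list" "$image"
    fi
}

annotate_entry localhost/list localhost/image linux arm v7   # variant matters on ARM
annotate_entry localhost/list localhost/image linux s390x
```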

In the “good ol’ days,” we had just one mechanism for building multi-architecture containers — build on the target architecture; this was no different than building source from scratch on each architecture. With modern virtualization technology, however, we have several elegant mechanisms for building cross-architecture images. Here they are, listed from slowest to fastest:

  • Build in a VM, push to registry, for each architecture. Build list, push list. Cross-arch emulation is really slow.
  • Build for runtime, emulate for RUN (qemu-user-static), and push along with list.
  • Cross-compile on build host for runtime arch, install onto a suitable base image, push along with list. Mark the image with the correct architecture using multi-arch manifests.
  • Build on actual hardware, push to registry, for each architecture. Build list, push list. Add images using their digests or architecture-specific tags. This is the mechanism most used in production due to its efficiency: no architecture has to be emulated, so none of them slows down the whole DevOps pipeline.
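For a Go program, the cross-compile option can be sketched with the toolchain's GOOS and GOARCH environment variables. The function name cross_build, the bin/hello-ARCH output paths, and the use of CGO_ENABLED=0 are assumptions for illustration. The commands are printed unless DRY_RUN=0 is set, in which case a Go toolchain and a Go package in the current directory are required.

```shell
#!/bin/sh
# Emit (or run) one cross-compile command per target architecture.
cross_build() {
    for arch in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "+ GOOS=linux GOARCH=$arch CGO_ENABLED=0 go build -o bin/hello-$arch ."
        else
            GOOS=linux GOARCH="$arch" CGO_ENABLED=0 go build -o "bin/hello-$arch" .
        fi
    done
}

cross_build amd64 s390x arm64 ppc64le
```

Each resulting binary would then be installed onto a base image of the matching architecture and tagged accordingly before being added to the manifest list.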

Steps

Now that you have sufficient background on the tooling and its differences from Docker, it’s time to build and run the images using the recommended tooling from Red Hat.

To build a simple image, you’ll use go-hello-world, which deploys a Go HTTP server listening on port 8080.

  1. Download the source from GitHub:

    git clone https://github.com/e-desouza/go-hello-world.git
    
  2. Navigate into the source and build the image from the Dockerfile:

    cd go-hello-world
    buildah build-using-dockerfile --tag go-hello-world --override-arch s390x --file Dockerfile .
    

    Note: By default, Buildah doesn’t use Dockerfiles for building containers — thus the use of build-using-dockerfile; this can also be shortened to bud.

  3. Inspect the image as a learning exercise with buildah inspect go-hello-world; this will have a lot of text in JSON format (274 lines for this image).

    {
        "Type": "buildah 0.0.1",
        "FromImage": "localhost/go-hello-world:latest",
        "FromImageID": "...",
        "FromImageDigest": "...",
        "Config": "...",
        "Manifest": "...",
        "Container": "",
        "ContainerID": "",
        "MountPoint": "",
        "ProcessLabel": "",
        "MountLabel": "",
        "ImageAnnotations": null,
        "ImageCreatedBy": "",
        "OCIv1": {
            "created": "...",
            "architecture": "s390x",
            "os": "linux",
            "config": {
                "User": "nonrootuser:nonrootuser",
                "Env": "...",
                "Entrypoint":"...",
                "WorkingDir": "/app"
            },
            "rootfs": "...",
            "history": ["..."]
        },
        "Docker": "...",
        "DefaultMountsFilePath": "",
        "Isolation": "IsolationDefault",
        "NamespaceOptions": "...",
        "Capabilities": null,
        "ConfigureNetwork": "NetworkDefault",
        "CNIPluginPath": "",
        "CNIConfigDir": "",
        "IDMappingOptions": {
            "HostUIDMapping": true,
            "HostGIDMapping": true,
            "UIDMap": [],
            "GIDMap": []
        },
        "History": "...",
        "Devices": null
    }
    

    I have elided the irrelevant content and left in the two key values (pun intended). The most important keys are the os and the architecture keys, and our image shows:

    "architecture": "s390x",
    "os": "linux",
    

    A manifest can contain multiple os and architecture permutations. This is just a single image created on one architecture, so you can only see one permutation.

  4. You can do the same on an Intel-based server. I have an RHEL 8.2 VM on IBM Cloud and I ran this:

    buildah build-using-dockerfile --tag go-hello-world --override-arch amd64 --file Dockerfile .
    

    This time, the only difference is the architecture. I don’t need to change code, Dockerfiles, or anything else; I just run git clone and buildah build, and I have the same container image built for the amd64 architecture.

    {
        "..."
        "OCIv1": {
            "created": "...",
            "architecture": "amd64",
            "os": "linux",
        "..."
    }
    

    Some images (such as hello-world on Docker Hub) are published for many permutations of architecture and os:

    podman pull hello-world
    buildah inspect hello-world
    

    There are nine different combinations of architecture and platform at the time of this writing (see Figure 5).

    Figure 5. Architecture/OS combinations

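Reading the architecture and os keys by eye gets tedious when you inspect several images. Here is a minimal sketch for pulling them out, assuming the JSON keeps each key and value on a single line as in the step 3 output. The function name json_string_field is hypothetical, and the sed expression is not a general JSON parser.

```shell
#!/bin/sh
# Print the value of the first string field named $1 in JSON read from stdin.
# Only handles simple one-line "key": "value" pairs.
json_string_field() {
    sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample input mirroring the two keys shown in step 3.
sample='{
    "OCIv1": {
        "architecture": "s390x",
        "os": "linux"
    }
}'

arch=$(printf '%s\n' "$sample" | json_string_field architecture)
os=$(printf '%s\n' "$sample" | json_string_field os)
echo "$os/$arch"   # prints linux/s390x
```

With Buildah installed, the same function could be fed real output, for example buildah inspect go-hello-world | json_string_field architecture.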

Manifest manipulation

Manifests are just metadata, so they can be manipulated using Buildah without the need to rebuild any container images. First, create an empty manifest list:

buildah manifest create thinklab/go-hello-world:latest

Next, add an existing amd64 image to the manifest:

buildah manifest add --override-arch=amd64 --override-os=linux  --os=linux --arch=amd64 thinklab/go-hello-world:latest docker://thinklab/go-hello-world:amd64-latest

Then do the same for the s390x image:

buildah manifest add --override-arch=s390x --override-os=linux  --os=linux --arch=s390x thinklab/go-hello-world:latest docker://thinklab/go-hello-world:s390x-latest

Sometimes you may need to amend existing manifests. You can do this using the --amend option:

buildah manifest create thinklab/go-hello-world:latest \
--amend thinklab/go-hello-world:amd64-latest

Finally, the push command pushes your manifest to the registry:

buildah manifest push --all thinklab/go-hello-world:latest docker://thinklab/go-hello-world:latest

Note the docker:// requirement. You need to specify which transport is being used; in this case, Buildah talks to the registry using the Docker protocol. This was not needed in native Docker, but it is needed here because several transports other than docker:// can be used.
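The create, add, and push commands in this section can be collected into one loop. This is a sketch under the assumption that each architecture's image is already in the registry under a tag of the form ARCH-latest, as in the commands above. The helper names run and publish_manifest are hypothetical, and commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: publish_manifest REPO ARCH [ARCH...]
# Creates REPO:latest as a manifest list, adds one entry per architecture,
# then pushes the list and everything it references.
publish_manifest() {
    repo=$1; shift
    run buildah manifest create "$repo:latest"
    for arch in "$@"; do
        run buildah manifest add --override-arch="$arch" --override-os=linux \
            --os=linux --arch="$arch" \
            "$repo:latest" "docker://$repo:$arch-latest"
    done
    run buildah manifest push --all "$repo:latest" "docker://$repo:latest"
}

publish_manifest thinklab/go-hello-world amd64 s390x
```

Adding another architecture then only means appending its name to the argument list, provided the matching ARCH-latest image exists.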

Troubleshooting

You may encounter some issues when doing this for the first time. Here are a couple of different scenarios:

  • Trying to create a manifest when one already exists

     buildah manifest create go-hello-world
     error creating image to hold manifest list: image name "localhost/go-hello-world:latest" is already associated with image "2f354eb335212aa8d9ed89eac0417d364fd3fbc3f41ed046ef45c1ad3bebf316": that name is already in use.
    

    Solution: Make use of the --amend CLI option if a manifest already exists.

  • Trying to build on the wrong architecture

      STEP 1: FROM golang:alpine AS builder
      ..
      Writing manifest to image destination
      Storing signatures
      STEP 2: RUN apk update
      standard_init_linux.go:211: exec user process caused "exec format error"
      error building at STEP "RUN apk update": error while running runtime: exit status 1
    

    exec format error is a dead giveaway that Go binaries were compiled for one architecture but executed on another.

    This might even happen at run time if you pull containers built for one architecture but deploy them on another. In the run-time scenario, you’ll most likely see CrashLoopBackOff during container spawning. You can look at the logs using the k8s console or via the CLI.

    Solution: Ensure the binaries are built for the architecture they run on, using one of the methods mentioned earlier (building natively, emulation, cross-compilation, etc.).
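A quick pre-flight check can catch this mismatch before a build or deployment fails. The sketch below compares an image's architecture value (as reported in the inspect output) with the machine name from uname -m. The function names and the mapping table are hypothetical and cover only a few architectures.

```shell
#!/bin/sh
# Map a `uname -m` machine name to the Go-style name used in image metadata.
to_goarch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        s390x)         echo s390x ;;
        ppc64le)       echo ppc64le ;;
        *)             echo "$1" ;;
    esac
}

# Usage: arch_matches IMAGE_ARCH MACHINE_NAME
# Succeeds only when the image architecture matches the machine.
arch_matches() {
    [ "$1" = "$(to_goarch "$2")" ]
}

if arch_matches s390x "$(uname -m)"; then
    echo "image and host architectures match"
else
    echo "mismatch: expect an exec format error unless emulation is set up"
fi
```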

Summary

This tutorial walked you through the differences between Docker-based tooling and newer, more efficient and purpose-built tooling such as Podman, Skopeo, and Buildah. My next tutorial will show you how to use this knowledge to build multi-architecture containers using s2i and Red Hat OpenShift buildconfigs.

To experiment with OpenShift on IBM LinuxONE/IBM Z, head over to our LinuxONE Community Cloud for a no-cost sandbox environment. This is a great way to experiment with an alternative architecture used in production in almost all of the Fortune 100 companies without needing to buy hardware or pay for instances on an ARM based cloud.