Automate the deployment of pod dependencies in Kubernetes

Do you often need to synchronize the deployment of a new design document or an API update with related application code? This article offers inspiration for DevOps teams who want to automate Kubernetes pod dependencies.

Prerequisites

To get the most out of this article, you should have a general understanding of Kubernetes, containers, continuous deployment and integration, and Cloudant design documents.

Estimated time

Take about 15 minutes to read this article.

How far can we go to automate the Ops part of DevOps?

As a developer, I want to be in full control. I want to define changes to external dependencies directly in my application code. I want to deploy updates to my Node.js or Java application and see the new version rolling out alongside the old one. I want to avoid downtime and conflicts between the database views that my web service needed before and those it needs now.

At the end of the day, the trend of cloud automation enables DevOps teams to own the end-to-end deployment.

Luckily, Kubernetes makes rolling upgrades a straightforward process. Together with continuous deployment and continuous integration tools such as Travis CI or the Toolchain developer tool for IBM Cloud, the process gets easier and easier to set up. Multiple deployments a day are now routine. However, it’s not always so easy to synchronize custom database or API updates with application code. Often you need to orchestrate deployments.

This article describes one of the ways to synchronize pod dependencies using Kubernetes Init Containers.

Automation scope

Applications like microservices typically need a limited set of dependencies, such as their own storage and API definition. Storage- or API-related upgrades often evolve into a routine, and they beg for automation. Updating a Cloudant design document is a great example. The critical part of continuous deployment for high-availability applications is ensuring that the updated view is available at the same time the service is upgraded.

An automated deployment process is suitable when you have the following kinds of dependencies:

  • Databases (Cloudant, object storage)
  • Database design documents (indexes, views)
  • Seed documents
  • APIs (Swagger/OpenAPI definitions)
  • API changelogs
  • Automated integration tests

The configurations for all of these dependencies can be stored in the application code repository and packaged into the Docker image.

Note: By configuration I mean code, not credentials. Credentials must be injected as environment variables or provided by configuration services.
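
For example, a minimal sketch of injecting a Cloudant credential from a Kubernetes Secret instead of the repository (the Secret name cloudant-credentials and key url are assumptions for illustration):

containers:
- name: app
  image: registry.ng.bluemix.net/services/app:v1.0.1
  env:
  - name: CLOUDANT_URL
    valueFrom:
      secretKeyRef:
        name: cloudant-credentials   # Secret created outside the code repository
        key: url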

Review the following structure of an example config directory:

config
│ cloudant
│ │ {databaseSuffixOrFullName}
│ │ │ designs
│ │ │ │ {designName}.json
│ │ │ seeds
│ │ │ │ {seedName}.json
│ │ parameters.json (optional)
│ apis
│ │ {apiName}
│ │ │ {version}.yaml
│ │ │ {version}.md
│ │ │ {version}.test.json
│ │ │ {version}.test.js
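
The files under designs can hold standard Cloudant (CouchDB) design documents. The following is a minimal sketch of what a {designName}.json file might contain; the view name and map function are made up for illustration:

{
  "_id": "_design/search",
  "language": "javascript",
  "views": {
    "by_channel": {
      "map": "function (doc) { if (doc.channel) { emit(doc.channel, doc._id); } }"
    }
  }
}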

Do we really need to store all of this information with the application code? No, but complete control of all the parts is extremely liberating for developers. Fully self-contained services give you the peace of mind that none of the dependencies can create a conflict between different service versions.

Init containers: the Kubernetes answer to application dependencies

The recommended way to install application dependencies in Kubernetes is through Init Containers. Init Containers are defined in the pod deployment, and they block the start of the application until they run successfully.

In the following, very basic, example, the Init Container creates a new Cloudant database:

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: registry.ng.bluemix.net/services/app:v1.0.1
    ports:
    - containerPort: 8080
  initContainers:
  - name: deploy
    image: appropriate/curl
    command: [ "sh" ]
    args: [ "-c", "curl -X PUT $URL/dbname" ]
    env:
    - name: URL
      valueFrom:
        configMapKeyRef:
          name: config
          key: cloudantApiKeyUrl
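
While the Init Container runs, the pod reports an Init status, so you can watch the dependency step before the application starts. For example, with standard kubectl commands (the exact output varies by Kubernetes version):

kubectl get pod app          # STATUS shows Init:0/1 until the deploy container succeeds
kubectl logs app -c deploy   # inspect the output of the Init Container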

However, true dependency logic is far more complex and requires a new application. Let’s call it the “Deployment Manager”. The Init Container script could call the Deployment Manager running as a service, but a much more self-contained approach is to build the Deployment Manager as another Docker image in the registry and integrate that image in the deployment scripts or Helm charts.

The deployment file will then look like this:

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: registry.ng.bluemix.net/services/app:v1.0.1
    ports:
    - containerPort: 8080
  initContainers:
  - name: deploy
    image: registry.ng.bluemix.net/utilities/deployment_manager:v0.0.1
    command: [ "sh" ]
    args: [ "-c", "cd /usr/src/app/;npm run start-as-init-container" ]
    env:
    - name: config
      valueFrom:
        configMapKeyRef:
          name: config
          key: config

How to keep all dependency inputs in the application code

In the previous example, the Deployment Manager gets all its inputs from an environment variable. However, the goal is to take the inputs from the application code itself, which means extracting the inputs (all but credentials) out of the application’s container.

The Deployment Manager Init Container does not see the application container in the pod, and it cannot communicate with it directly. The main trick that makes this approach work is to load the application container beforehand and extract the necessary inputs into a shared volume that the Deployment Manager uses later.

The following example shows how to use another Init Container for that purpose:

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  volumes:
    - name: deployment-volume
      emptyDir: {}
  containers:
  - name: app
    image: registry.ng.bluemix.net/services/app:v1.0.1
    ports:
    - containerPort: 8080
    volumeMounts:
      - name: deployment-volume
        mountPath: "/init"
  initContainers:
  - name: copy
    image: registry.ng.bluemix.net/services/app:v1.0.1
    command: [ "sh" ]
    args: [ "-c", "set -e;cp -v -r /usr/src/app/config/* /init/" ]
    volumeMounts:
      - name: deployment-volume
        mountPath: "/init"
  - name: deploy
    image: registry.ng.bluemix.net/utilities/deployment_manager:v0.0.1
    command: [ "sh" ]
    args: [ "-c", "cd /usr/src/app/;npm run start-as-init-container" ]
    env:
    - name: config
      valueFrom:
        configMapKeyRef:
          name: config
          key: config
    volumeMounts:
      - name: deployment-volume
        mountPath: "/init"

Why it works

The containers start sequentially:

  1. The first Init Container, named copy, loads the application image but overrides the way the application normally starts by using a custom copy command. As long as the application Docker image provides the sh shell, the copy command extracts all configuration files into the new deployment-volume shared volume under the /init path and then exits. Any failure at this point produces an error and blocks the next steps.

  2. The next Init Container is your Deployment Manager application. It uses the same shared volume and finds all the required dependency inputs left there by the first container. The deploy container can take as long as it needs to install the dependencies. The rollout of the pod waits until the process exits successfully.

  3. Finally, the main application container also loads the same shared volume. This step is necessary because the Deployment Manager generates an init.json file as the output of the initialization. The init file contains the details about which version of a specific resource (for example, which Cloudant design document) the application should use.

To make the same process reusable in polyglot environments, use a standardized naming convention and structure for the config directory. JSON is one of the ideal formats for both input and output, but you can also use other formats.

The following example shows the init.json output:

{
  "init": {
    "cloudant": {
      "channels": {
        "designs": {
          "search": "search~2019-06-04T13:19:49.745Z"
        },
        "seeds": {
          "taxonomy": "taxonomy"
        }
      }
    },
    "apis": {
      "search": {
        "v3": {
          "api": "API~channels~v3~2019-06-01T23:15:18.609Z"
        }
      }
    }
  }
}
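
For example, a Node.js application might read this file at startup to resolve exactly which design document version to query. The following is a minimal sketch; the /init mount path and the init.json structure come from the examples above, while the channels database name, the CLOUDANT_URL environment variable, and the view name are assumptions for illustration:

// Minimal sketch: resolve the deployed design document version at startup.
const fs = require('fs');

// The copy and deploy Init Containers left init.json on the shared volume.
const initFile = JSON.parse(fs.readFileSync('/init/init.json', 'utf8'));

// Pick the exact design document version this pod was deployed with,
// for example "search~2019-06-04T13:19:49.745Z".
const designName = initFile.init.cloudant.channels.designs.search;

// Query the versioned view instead of a hard-coded design document name.
const viewUrl = `${process.env.CLOUDANT_URL}/channels/_design/${designName}/_view/by_channel`;
console.log('Using view:', viewUrl);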

Rolling out a new application

Assume that an updated Cloudant design document is required in a new version (v1.0.2) of the application that is currently running in the environment.

You don’t want any downtime during the rolling update of several pods. Therefore, you must ensure that the two application versions can run at the same time.

This situation means that you can’t just remove old databases and views. Instead, you first need to create new ones and wait until the change is rolled out to all application pods in the replica set. Then, only after the successful deployment, you must ensure that the old, unused dependencies are removed.

In practice, you need to choose a different name for each version, such as using a time stamp suffix added to the name of the design document.

Example: A Cloudant design document called search required by v1.0.1 is replaced by a different version in v1.0.2. The initial document, named search~2019-06-04T13:19:49.745Z, is installed in the database. At the time of the v1.0.2 installation, the Deployment Manager application compares the view using a deep diff. If it finds that the deployed view doesn’t match the view in the application code, it installs the second version, named search~2019-06-05T08:12:33.123Z.
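
The following is a minimal sketch of that comparison logic, assuming plain HTTP calls against the Cloudant (CouchDB) API, Node.js 18+ for the global fetch function, and a naive JSON comparison in place of a full deep diff. The function and variable names are illustrative, not the article’s actual Deployment Manager code:

// Ensure a timestamp-suffixed version of a design document is installed.
const fs = require('fs');

async function ensureDesign(dbUrl, designBase, localDocPath) {
  // The design document bundled with this application version.
  const localDoc = JSON.parse(fs.readFileSync(localDocPath, 'utf8'));

  // List the design documents already installed in the database,
  // for example _design/search~2019-06-04T13:19:49.745Z.
  const res = await fetch(`${dbUrl}/_design_docs?include_docs=true`);
  const rows = (await res.json()).rows || [];
  const installed = rows
    .map(row => row.doc)
    .filter(doc => doc && doc._id.startsWith(`_design/${designBase}~`));

  // Naive comparison: reuse an existing version if its views already match.
  const match = installed.find(doc =>
    JSON.stringify(doc.views) === JSON.stringify(localDoc.views));
  if (match) {
    return match._id.replace('_design/', '');
  }

  // No match: install a new, timestamp-suffixed version alongside the old ones.
  const versionedName = `${designBase}~${new Date().toISOString()}`;
  const put = await fetch(`${dbUrl}/_design/${encodeURIComponent(versionedName)}`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ ...localDoc, _id: `_design/${versionedName}` }),
  });
  if (!put.ok) {
    throw new Error(`Design document install failed: ${put.status}`);
  }
  return versionedName; // recorded in init.json for the application to use
}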

During the rollout, the old pods still use the first design document, and the new pods start using the new one. There is no conflict between the pods, and the transition can happen without any downtime. Moreover, if the application needs to roll back, the same process is applied. Again, at each moment, each pod uses exactly the view code it contains, the one it was tested with.

Cleaning up

So far so good. You have a beautiful separation of dependencies. But what do you do with dependencies you no longer need?

Because the goal is a fully automated system, you need an awareness of all the dependencies currently in use by all pods. Consider the following ways of tracking the dependencies in use:

  • Maintain logs of dependencies used by the Deployment Manager.
  • Expose dependencies used by the application on request.
  • Set an expiration on each resource and require regular renewal.

You can use any of these approaches, as long as they can easily recover from failure. However, in most cases, the clean-up process must be able to read the list of all active pods to establish whether the deployed dependencies are still needed.
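
In practice, that usually means asking Kubernetes which application versions are still running before deleting anything. For example, a standard kubectl query such as the following lists the distinct application images in use (the app=app label and single-container pods are assumptions for illustration):

kubectl get pods -l app=app -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}' | sort -u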

If required, the DevOps team might invest in an administration user interface to oversee the clean-up process, like the following example:

Figure: Administrator interface for the clean-up process

Alternative solutions

You can also use continuous deployment and integration scripts to extract dependency configurations out of the application code. A custom-built manager can then ensure the dependencies are available before the application starts.

However, using Init Containers is usually more efficient. The full deployment can take as little as a couple of seconds, because all the work happens directly in the Kubernetes cluster with all the images readily available. Also, a definite benefit of Init Containers is that no pod starts without a full verification of all its dependencies.

Summary

Including pod dependencies in the application image can dramatically simplify the lifecycle management of an application. The application simply does not start until the right dependencies are available, and upgrades and downgrades are seamless. This suggested approach can help DevOps teams concentrate more on development and business value, instead of operations.

To learn more, see Kubernetes Init Containers.

Acknowledgements

Many thanks to Vojtech Bartos, who came up with the idea of the copy container.

Ondrej Lehota