Docker became popular because it created a unified way to package, run, and maintain containers from convenient command line interface (CLI) and HTTP API tools. This simplification lowered the barrier to entry: it became feasible to package applications and their runtime environments into self-contained images with a single Dockerfile. Docker empowers you to deliver more complex projects, yet you still have to configure those containers. In this article, I show how Ansible can bring the features of configuration managers to Docker with a clearer syntax. You'll learn how to build any stack with just Python and Docker installed.

Before I get into the details of Ansible, consider these points from an analysis of the configuration-management landscape:

  • Despite the rise of new workflows that were brought on by containers, orchestration and configuration tools are thriving.
  • New players such as Ansible and Salt are challenging existing tools such as Chef and Puppet.
  • Many developers who are involved with Docker also care about these tools.

To be clear, with Docker you can spin up fully isolated stack environments in a matter of seconds or replicate an exact setup between servers. However, Docker does not include robust tools that provide an end-to-end experience for both development and production. The Docker team is addressing these evolving needs with new clustering tools, trying to morph Docker into a reliable solution for running containers at scale. Nonetheless, Docker still requires that you manually hardcode tasks and repeat common setups, so the key processes of orchestration and configuration management for containers remain unsolved. In this article, you'll learn how to use Ansible with Docker to help address these issues.
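
For example, a couple of commands are enough to spin up and tear down a disposable, isolated environment. The image and container name here are arbitrary illustrations, not part of this article's setup:

#start a throwaway Redis instance, isolated from the host
docker run -d --name cache -p 6379:6379 redis:2.8
#tear it down just as quickly
docker rm -f cache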

The rise of DevOps

Modern applications usually involve a complex deployment pipeline before they move to production. Best practices suggest that you release code early and often, following each small iteration. Manually performing the deployment tasks does not scale, so organizations have started to refine the process halfway between development and system administration; thus, DevOps was born. Since then, agile teams have been trying to strengthen and automate the way code is tested and delivered to their users.

By implementing state-of-the-art technologies and methodologies, companies gain confidence in the code that runs on their servers. Nevertheless, developers and system administrators continue to face numerous challenges as applications grow in size and complexity. More than ever, community-driven tooling is needed to support products.

The extensible design of Ansible

In this environment, Ansible offers an interesting framework to manage infrastructures. You can gain control over a server's definition, such as the packages to install or the files to copy, and scale the configuration to thousands of servers. Ansible playbooks constitute a safe representation of the desired state of the cluster. Their YAML syntax and Ansible's extensive list of modules produce readable configuration files that any developer can quickly understand. Unlike Chef or Puppet, Ansible is agentless, which means all you need to run commands on remote hosts is an SSH connection. Ansible is well equipped to handle DevOps complexity.
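
As a minimal illustration of that readability, here is a sketch of a playbook; the web host group and the nginx package are arbitrary examples, not part of this article's setup:

#a declarative description of the desired state
- name: configure web servers
  hosts: web
  tasks:
    - name: ensure nginx is installed
      apt: name=nginx state=present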

Ansible, however, was designed before the fast rise of containers and their revolution in the cloud development environment. So, is Ansible still relevant? The paradigms and complex development environments of microservices have introduced new requirements:

  • Lightweight images. For ease of transportation or cost savings, images are stripped down to their minimal dependencies.
  • Single purpose, single process. The SSH daemon does not need to run if it’s not strictly needed by the application.
  • Ephemeral. Containers are expected to die, move, and resurrect all of the time.

In this context, the extensible architecture of Ansible addresses these issues. A Docker module manages hosts and containers at a higher level. Although you might debate which orchestration tool (Kubernetes, from Google, or Centurion, from New Relic) is best suited to this environment, the Docker module performs efficiently, which is why I use it in this article. Alternatively, you can build containers that start from an official Ansible base image and run playbooks in local mode from the inside. Although this approach fits remarkably well with Packer and certainly suits many use cases, its drawbacks are often deal breakers:

  • You’re locked in to one base image and can no longer take advantage of special recipes or other base stacks.
  • The resulting artifact ships with Ansible and its dependencies installed, which have nothing to do with the actual application and make the artifact heavier.
  • Although Ansible can manage thousands of servers, this approach only provisions a single container.

This approach treats containers as small VMs, when a container-specific solution is more appropriate. Fortunately, Ansible has a modular design. Modules are spread among different repositories, and most of the capabilities of Ansible can be extended through plugins.

In the next section, you’re going to set up an effective environment to adapt Ansible to your needs.

Setting up an Ansible environment

Let’s say you want a tool that is trivial to deploy and that configures application environments in lightweight containers. Separate from those containers, you need a client node with Ansible installed that you will use to send commands to a Docker daemon. This setup is shown in Figure 1.

Figure 1. Components required to provision containers with Ansible

Running Ansible from a container minimizes the dependencies that you must manage in this configuration. This architecture reduces the host's role to a communication bridge between containers and commands.

Many options are available to install Docker on your server:

  • Use docker-machine to install it on remote hosts.
  • Install it locally. (As a side note, you probably don't want to manage a serious container-based infrastructure by yourself; in that case, consider the next option.)
  • Rely on external providers.
  • Use boot2docker, which is a lightweight Linux distribution that runs Docker containers, on Windows and Mac.

Whatever solution you choose, make sure that it deploys Docker version 1.3 or later (version 1.3 introduced process injection with docker exec). You also need to run an SSH server to securely process Ansible commands.
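
You can quickly confirm the daemon version before you go further (the exact output format varies across Docker releases):

#docker exec requires a 1.3+ daemon
docker version | grep -i "server version"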

The commands in Listing 1 set up a convenient and robust authentication method by using public keys.

Listing 1. Commands to set up authentication by using public keys

#install dependencies
sudo apt-get install -y openssh-server libssl-dev
#generate private and public keys
ssh-keygen -t rsa -f ansible_id_rsa
#allow future clients with this public key to connect to this server
cat ansible_id_rsa.pub >> ~/.ssh/authorized_keys
#set up proper permissions
chmod 0700 ~/.ssh/
chmod 0600 ~/.ssh/authorized_keys
#make sure the daemon is running
sudo service ssh restart

Configuring SSH and security concerns are beyond the scope of this article. The curious reader can explore the /etc/ssh/sshd_config file to learn more about the available options to configure SSH.
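
For reference, a few options that are commonly tightened in /etc/ssh/sshd_config look like this; the values shown are illustrative, not prescriptive:

PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes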

The next step is to load the public key on the client container that runs Ansible and to provision the builder container. Use a Dockerfile to provision the builder. See Listing 2.

Listing 2. Dockerfile that provisions the builder

FROM python:2.7

#Install Ansible from source (master)
RUN apt-get -y update && \
    apt-get install -y python-httplib2 python-keyczar python-setuptools python-pkg-resources git python-pip && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN pip install paramiko jinja2 PyYAML setuptools "pycrypto>=2.6" six \
    requests docker-py  #docker inventory plugin
RUN git clone http://github.com/ansible/ansible.git /opt/ansible && \
    cd /opt/ansible && \
    git reset --hard fbec8bfb90df1d2e8a0a4df7ac1d9879ca8f4dde && \
    git submodule update --init

ENV PATH /opt/ansible/bin:$PATH
ENV PYTHONPATH $PYTHONPATH:/opt/ansible/lib
ENV ANSIBLE_LIBRARY /opt/ansible/library

#setup ssh
RUN mkdir /root/.ssh
ADD ansible_id_rsa /root/.ssh/id_rsa
ADD ansible_id_rsa.pub /root/.ssh/id_rsa.pub

#extend Ansible
#use an inventory directory for multiple inventories support
RUN mkdir -p /etc/ansible/inventory && \
    cp /opt/ansible/plugins/inventory/docker.py /etc/ansible/inventory/
ADD ansible.cfg /etc/ansible/ansible.cfg
ADD hosts /etc/ansible/inventory/hosts

These instructions are adapted from the official build and automate a working installation from commit fbec8bfb90df1d2e8a0a4df7ac1d9879ca8f4dde on the Ansible master branch.

The hosts and ansible.cfg configuration files (see Listing 3a and Listing 3b) are packed into the image. By using a container, you guarantee that everyone shares the same environment. For this example, the Dockerfile installs Python version 2.7.10 and Ansible 2.0.0.
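
Once the image is built, a quick sanity check from inside the container confirms the toolchain (the exact output depends on the commit you checked out):

python --version     #expect Python 2.7.10
ansible --version    #expect ansible 2.0.0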

Listing 3a. Hosts configuration file

#hosts
#this file is an inventory that Ansible uses to address remote servers.
#Make sure to replace the information with your specific setup and variables
#that you don't want to provide for every command.

[docker]  #host properties where the docker daemon is running
192.168.0.12 ansible_ssh_user=xavier

Listing 3b. Ansible configuration file


#ansible.cfg

[defaults]
#use the inventory path created from the Dockerfile
inventory = /etc/ansible/inventory

#not really secure, but convenient in a non-interactive environment
host_key_checking = False
#free you from typing the --private-key parameter
private_key_file = ~/.ssh/id_rsa

#tell Ansible where to load the plugins from
callback_plugins   = /opt/ansible-plugins/callbacks
connection_plugins = /opt/ansible-plugins/connections

Before you can build the Ansible container, you must export the DOCKER_HOST environment variable, because Ansible uses it to connect to the remote Docker daemon. To use an HTTP endpoint, you need to modify /etc/default/docker (see Listing 5).

Listing 5. Modifying /etc/default/docker

#make docker listen on HTTP and on the default socket
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock"

Enter the command sudo service docker restart to restart the Docker daemon so that it picks up the changes to its configuration file.
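
To verify that the daemon now answers on the TCP endpoint, point the client at it explicitly (the address below assumes you run it on the Docker host itself; adjust it otherwise):

#should print the same server information as the local socket
docker -H tcp://127.0.0.1:2375 version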

The following commands build and validate the Ansible container from which you will enter commands (see Listing 4).

Listing 4. Commands to build and validate the Ansible container

#you need the DOCKER_HOST variable to point to a reachable docker daemon
#pick the method that suits your installation

#for boot2docker users
eval "$(boot2docker shellinit)"
#for docker-machine users, assuming the running VM was named "dev"
eval "$(docker-machine env dev)"
#for users running the daemon locally
export DOCKER_HOST=tcp://$(hostname -I | cut -d" " -f1):2375
#finally, users relying on a remote daemon should provide the server's public ip
export DOCKER_HOST=tcp://1.2.3.4:2375

#build the container from the Dockerfile
docker build -t article/ansible .

#provide the server API version, as returned by docker version | grep -i "server api"
#it should be greater than or equal to 1.8
export DOCKER_API_VERSION=1.19

#create and enter the workspace:
#   make the docker client and socket available inside,
#   forward DOCKER_HOST and the API version,
#   and mount the working space
docker run -it --name builder \
    -v /usr/bin/docker:/usr/bin/docker \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -e DOCKER_HOST=$DOCKER_HOST \
    -e DEFAULT_DOCKER_API_VERSION=$DOCKER_API_VERSION \
    -v $PWD:/app -w /app \
    article/ansible bash

#challenge the setup from inside the builder container
ansible docker -m ping
192.168.0.12 | SUCCESS => {
    "invocation": {
        "module_name": "ping",
        "module_args": {}
    },
    "changed": false,
    "ping": "pong"
}

So far, so good. You’re able to enter commands from a container. In the next section, you will use the Docker-specific extensions to Ansible.

Extending your Ansible environment with playbooks and plugins

At its core, Ansible automates its execution through playbooks, which are YAML files that specify every task to perform and their properties (see Listing 6).

Ansible also consults inventories to map user-provided hosts to concrete endpoints in the infrastructure. Unlike the static hosts file used in the previous section, Ansible also supports dynamic content. The built-in options include a Docker plugin that can query the Docker daemon and expose a significant amount of information to Ansible playbooks.
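
Dynamic inventory scripts follow a simple contract: when called with --list, they print a JSON description of hosts and groups. You can inspect what the Docker plugin reports (the path below assumes this article's Dockerfile):

#dump the JSON inventory generated from the running containers
python /etc/ansible/inventory/docker.py --list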

Listing 6. An Ansible playbook

#provision.yml

- name: debug docker host
  hosts: docker
  tasks:
    - name: debug infrastructure
      #access container data: print the state
      debug: var=hostvars["builder"]["docker_state"]

#you can target individual containers by name
- name: configure the container
  hosts: builder
  tasks:
    - name: run dummy command
      command: /bin/echo hello world

The command in Listing 7 queries the Docker host, imports facts, prints some of them, and uses them to perform the second task against the builder container (shown in Listing 6).

Listing 7. Command to query the Docker host

ansible-playbook provision.yml -i /etc/ansible/inventory
#...
TASK [setup] **
fatal: [builder]: FAILED! => {"msg": "ERROR! SSH encountered an unknown error during the
connection. Re-run the command using -vvvv, which enables SSH debugging
output to help diagnose the issue", "failed": true}
#...
#...

Ansible can’t reach the container because it doesn’t run an SSH server. An SSH server would be an additional process to manage, completely unrelated to the actual application. In the next section, you will remove this need by using a connection plugin.

Connection plugins are classes that implement commands for a transport, such as SSH or local execution. Docker 1.3 came with docker exec and the ability to run tasks inside the container namespace. Because you learned earlier how to target specific containers, you can use this ability to process the playbook.
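
Before wiring it into Ansible, you can reproduce by hand what the plugin will automate (the container name comes from the earlier setup):

#this is essentially what the connection plugin does for every task
docker exec builder /bin/sh -c "echo hello from the container"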

Like other plugin types, connection hooks (see Listing 8) inherit an abstract class and are automatically available when you place them in the expected directory (/opt/ansible-plugins/connections, as configured in the ansible.cfg configuration file).

Listing 8. Connection plugin

#saved as ./connection_plugins/docker.py

import subprocess

from ansible.plugins.connections import ConnectionBase


class Connection(ConnectionBase):

    @property
    def transport(self):
        """ Distinguish this connection plugin. """
        return 'docker'

    def _connect(self):
        """ Connect to the container. Nothing to do. """
        return self

    def exec_command(self, cmd, tmp_path, sudo_user=None, sudoable=False,
                     executable='/bin/sh', in_data=None, su=None,
                     su_user=None):
        """ Run a command within the container namespace. """
        if executable:
            local_cmd = ["docker", "exec", self._connection_info.remote_addr,
                         executable, '-c', cmd]
        else:
            local_cmd = '%s exec "%s" %s' % ("docker",
                                             self._connection_info.remote_addr,
                                             cmd)

        self._display.vvv("EXEC %s" % (local_cmd),
                          host=self._connection_info.remote_addr)
        p = subprocess.Popen(local_cmd,
                             shell=isinstance(local_cmd, basestring),
                             stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        stdout, stderr = p.communicate()
        return (p.returncode, '', stdout, stderr)

    def put_file(self, in_path, out_path):
        """ Transfer a file from local to the container. """
        pass

    def fetch_file(self, in_path, out_path):
        """ Fetch a file from the container to local. """
        pass

    def close(self):
        """ Terminate the connection. Nothing to do for Docker. """
        pass

This code hooks into Ansible's execution to run commands through a local docker exec instead of the default SSH transport. You'll need to rearrange a few setup steps to instruct Ansible to use this plugin (see Listing 9).

Listing 9. Connection plugin for docker exec

#modify the builder Dockerfile to upload the plugin code
#where Ansible expects connection plugins
echo "ADD connection_plugins/docker.py /opt/ansible-plugins/connections/docker.py" >> Dockerfile

#then, explicitly tell Ansible which connection hook to use when executing playbooks.
#you can achieve this by inserting the 'connection' property at the top
#of the provision tasks in provision.yml

- name: configure the container
  connection: docker
  hosts: builder

#you are ready to redeploy the builder container
#(provided DOCKER_HOST and DOCKER_API_VERSION are still set as before)

#rebuild the image
docker build -t article/ansible .

#restart the builder environment (same flags as in Listing 4)
docker run -it --name builder \
    -v /usr/bin/docker:/usr/bin/docker \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -e DOCKER_HOST=$DOCKER_HOST \
    -e DEFAULT_DOCKER_API_VERSION=$DOCKER_API_VERSION \
    -v $PWD:/app -w /app \
    article/ansible bash

#rerun provisioning from inside
ansible-playbook -i /etc/ansible/inventory provision.yml
#... Hurrah, full green output ...

So far, you have run Ansible tasks within containers without imposing many requirements on the containers or on the host. While this implementation meets the initial requirements, some rough edges still need to be addressed.

The previous code ran a task on the same node. A more realistic workflow would spin up a new base image, provision it, and finally commit, push, and shut down the resulting artifact. Thanks to the built-in Docker module in Ansible, those steps can be achieved without additional code (see Listing 10).

Listing 10. Docker module in Ansible that spins up a new base image

---
- name: initialize provisioning
  hosts: docker
  tasks:
    - name: start up target container
      docker:
        image: python:2.7
        name: lab
        pull: missing
        detach: yes
        tty: yes
        command: sleep infinity
        state: started
    #dynamically update the inventory to make the container available down the playbook
    - name: register new container hostname
      add_host: name=lab

- name: provision container
  connection: docker
  hosts: lab
  tasks:
      #...

- name: finalize build
  hosts: docker
  tasks:
    - name: stop container
      docker:
        name: lab
        image: python:2.7
        state: stopped

As mentioned, it would be convenient to automatically name and store the image that was built on successful provisioning. Unfortunately, the Docker module in Ansible does not implement methods to tag and push images. You can overcome this limitation with plain shell commands (see Listing 13).

Listing 13. Shell commands to name and store images

#name the resulting artifact under a human-readable image tag
docker tag lab article/lab:experimental

#push this image to the official docker hub
#make sure to replace 'article' with your own Docker Hub login (https://hub.docker.com)
#(this step is optional and only makes the image available from any docker host;
#you can skip it or even use your own registry)
docker push article/lab:experimental
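
If you prefer to keep these steps inside the playbook itself, a sketch that uses Ansible's shell module can stand in until the Docker module supports tagging and pushing (the image names mirror Listing 13):

- name: publish artifact
  hosts: docker
  tasks:
    - name: tag the provisioned image
      shell: docker tag lab article/lab:experimental
    - name: push the image to the registry
      shell: docker push article/lab:experimental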

Our tool is taking shape, but it still lacks an essential feature: layer caching.

Building containers with Dockerfiles often involves many iterations to get it right. To significantly speed up the process, successful steps are cached and reused in subsequent runs.

To replicate this behavior, our tool commits the container state after each successful task. If build errors occur, the tool restarts the provisioning process from the last snapshot. Ansible promises idempotent tasks, so previously successful ones won’t be processed twice.
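
The snapshot operation itself is nothing more than a docker commit keyed on the task; done by hand, it would look like this (the container and tag names are illustrative):

#freeze the current state of the 'builder' container as a cache layer
docker commit --author "ansible" builder factory:step-1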

With Ansible, you can hook into task events with callback plugins (see Listing 11). These classes are expected to implement specific callbacks, which are triggered at various steps of the playbook lifecycle.

Listing 11. Callback plugin that hooks on task events

#save as callback_plugins/docker-cache.py
import hashlib
import os
import socket

#hacky fix for: ImportError: cannot import name display
#pylint: disable=unused-import
import ansible.utils
import requests
import docker


class DockerDriver(object):
    """ Provide snapshot feature through 'docker commit'. """

    def __init__(self, author='ansible'):
        self._author = author
        self._hostname = socket.gethostname()
        try:
            err = self._connect()
        except (requests.exceptions.ConnectionError, docker.errors.APIError), error:
            ansible.utils.warning('Failed to contact docker daemon: {}'.format(error))
            #deactivate the plugin on error
            self.disabled = True
            return

        self._container = self.target_container()
        self.disabled = True if self._container is None else False

    def _connect(self):
        #use the same environment variable as other docker plugins
        docker_host = os.getenv('DOCKER_HOST', 'unix:///var/run/docker.sock')
        #default version is current stable docker release (10/07/2015)
        #if provided, DOCKER_VERSION should match docker server api version
        docker_server_version = os.getenv('DOCKER_VERSION', '1.19')
        self._client = docker.Client(base_url=docker_host,
                                     version=docker_server_version)
        return self._client.ping()

    def target_container(self):
        """ Retrieve data on the container you want to provision. """
        def _match_container(metadatas):
            return metadatas['Id'][:len(self._hostname)] == self._hostname

        matchs = filter(_match_container, self._client.containers())
        return matchs[0] if len(matchs) == 1 else None

    def snapshot(self, host, task):
        tag = hashlib.md5(repr(task)).hexdigest()
        try:
            feedback = self._client.commit(container=self._container['Id'],
                                           repository='factory',
                                           tag=tag,
                                           author=self._author)
        except docker.errors.APIError, error:
            ansible.utils.warning('Failed to commit container: {}'.format(error))
            self.disabled = True


#pylint: disable=E1101
class CallbackModule(object):
    """Emulate docker cache.
    Commit the current container for each task.

    This plugin makes use of the following environment variables:
        - DOCKER_HOST (optional): How to reach the docker daemon.
          Default: unix:///var/run/docker.sock
        - DOCKER_VERSION (optional): Docker daemon version. Default: 1.19
        - DOCKER_AUTHOR (optional): Used when committing the image.
          Default: ansible

    Requires:
        - docker-py >= v0.5.3

    Resources:
        - http://docker-py.readthedocs.org/en/latest/api/
    """

    _current_task = None

    def playbook_on_setup(self):
        """ Initialize the docker client. """
        #plain callback classes carry no configuration object, so read the
        #author from the environment, as documented in the docstring above
        self.controller = DockerDriver(os.getenv('DOCKER_AUTHOR', 'ansible'))

    def playbook_on_task_start(self, name, is_conditional):
        self._current_task = name

    def runner_on_ok(self, host, res):
        if self._current_task is None:
            #No task performed yet, don't commit
            return
        self.controller.snapshot(host, self._current_task)

You register this plugin just as you did the docker exec connection plugin: upload the code to the expected location and rebuild the builder container (see Listing 14).

Listing 14. Command to register the callbacks plugin

#modify the builder Dockerfile to upload the code where Ansible expects callback plugins
echo "ADD callback_plugins/docker-cache.py /opt/ansible-plugins/callbacks/docker-cache.py" >> Dockerfile

After you rebuild the builder container and rerun the Ansible playbook, the plugin is automatically loaded, and you can see how intermediate images were created (see Listing 12).

Listing 12. Docker images

REPOSITORY          TAG                     IMAGE ID            CREATED             VIRTUAL SIZE
factory             bc0fb8843e88566c    bbdfab2bd904        32 seconds ago      829.8 MB
factory             d19d39e0f0e5c133    e82743310d8c        55 seconds ago      785.2 MB
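
When a build is final, you will probably want to reclaim the disk space used by the cache. One possible cleanup, which removes every image in the factory repository, so use it with care:

#list the cache images by ID and delete them
docker images -q factory | xargs docker rmi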

Conclusion

Provisioning is a complex process, and the implementation in this tutorial lays the foundations for further development. The code itself was kept simple, and some steps still require human intervention. The cache implementation deserves more attention, for example with more specific commit naming or cleanup capabilities.

Still, you crafted a tool that runs Ansible playbooks to manage the configuration of containers. With this implementation, you can use the full power of Ansible by combining, reusing, and setting up declarative build files for the microservices of an infrastructure. This solution also helps to avoid lock-in issues: the plugins that you developed wrap playbooks that you can reuse against different targets, and the minimal requirements make the project compatible with most providers.