Microsurvival Part 3: Hello Docker

Note: This blog post is part of a series.

Hello, Developer! I am so glad you could join me today. I’m extremely happy that your child Appy told you about what we discussed, but even happier that you suggested meeting today. I know you want the best for Appy, to look good and get along with others. We’ve been talking about an appropriate and safe environment for Appy, especially as Appy continues to grow and mature.

So, you’re trying to work with Docker and need some tips, correct? Well, I am more than happy to help. It’s pretty easy. I see you have a Windows laptop like me, so you can follow along just fine!

First, let’s start by installing Docker. You will need to follow the instructions specified for your operating system. Make sure your version of Windows is compatible.

Now that you have Docker, you can run Docker commands using the Docker executable.

Because it’s common to start developing with a “Hello World!” program, let’s run a “Hello World!” container. Try running the command docker run busybox echo "Hello world" and you should get output similar to the following:

> docker run busybox echo "Hello world"
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
90e01955edcd: Pull complete
Digest: sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812
Status: Downloaded newer image for busybox:latest
Hello world

Allow me to explain what running this command did for us. Docker first searched your local machine for the image but couldn’t find it, so the image was pulled from Docker Hub instead, Docker’s public registry of ready-made container images. The image we pulled is a BusyBox image, which combines tiny UNIX tools into a single executable. Then, Docker created an isolated container based on the image. Optionally, we specified which command to execute when running the container. We downloaded and ran a full application without installing it or any of its dependencies, all with a single command. Fascinating, don’t you agree?

What’s a docker run command?

Now, let me elaborate a bit more on the docker run command. This command runs existing images or pulls images from the Docker Hub registry. These images are software packages that get updated often, so there is more than one version of each image. Docker allows multiple versions of an image with the same name, but each version must have a unique tag. If you run the docker run <image> command without a tag, Docker assumes you are looking for the latest version of the image, which is the one with the latest tag. To specify the version of the image you are looking for, simply add the tag: docker run <image>:<tag>.
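For example, here is a quick sketch of pinning a specific version (the tag shown is just an example; check Docker Hub for the tags that actually exist for an image):

# Pull and run a specific BusyBox version instead of relying on latest
docker run busybox:1.31 echo "Hello from a pinned tag"

# The same image name with a different tag is treated as a separate image
docker run busybox:latest echo "Hello from latest"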

You might want to list the images using docker images to check the images created, their tags (or versions), creation dates, and sizes. After you run it, you should get an output similar to the following example:

> docker images
REPOSITORY    TAG      IMAGE ID       CREATED       SIZE
busybox       latest   59788edf1f3e   8 weeks ago   1.15MB

You can also use the docker container list command to list the running containers. If you run it right now, you probably won’t get any containers listed because the container is no longer running. But if you add the -a or --all flag, both running and stopped containers are displayed in output similar to this example:

>docker container list -a
CONTAINER ID   IMAGE     COMMAND                CREATED             ...
47130c55f730   busybox   "echo 'Hello world'"   About an hour ago   ...

(Some of the details are omitted and replaced by ...)

Do you find the command docker container list a bit long? If so, there is an alternative command, docker ps, with the same function. You can optionally add the -a flag to show the stopped containers as well.

Since the container shows up as a stopped container, you can start it up again by using the docker start <container ID> command. And, you can stop a running container by using the command docker stop <container ID>.
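As a small illustration, here is roughly how that lifecycle looks with the BusyBox container we ran earlier (the container ID is the one from the earlier listing, so yours will differ):

docker ps -a                  # find the ID of the stopped container
docker start 47130c55f730     # start it again (it runs echo and exits almost immediately)
docker logs 47130c55f730      # show the output it produced
docker stop 47130c55f730      # stop it, if it were still running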

Create a Docker image

Now that you know how to run a new container using an image from the Docker Hub registry, let’s make our own Docker image. To build the image, we mainly need two things: the application you want to run and a Dockerfile that Docker reads to automatically build an image for that application. The Dockerfile is a document that contains all the commands a user could call on the command line to assemble an image. Let’s first start with a simple Node.js application, and name it app.js. Feel free to customize the name if you’d like.

const http = require('http');
const os = require('os');

// Respond to every request with the host name of the machine the server runs on
const server = http.createServer(function(req, res) {
  res.end("Hostname is " + os.hostname() + "\n");
});

server.listen(3000);

As you can see in this code sample, we are just starting an HTTP server on port 3000, which will respond with “Hostname is (the hostname of the server host)” to every request. Make a directory and name it as you like, then save the app code inside of it. Make sure no other files are present in that directory.
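Before containerizing it, you can sanity-check the app directly on your machine, assuming you have Node.js installed; the curl output is just roughly what you should expect:

node app.js &                # start the server in the background
curl http://localhost:3000   # prints something like: Hostname is <your machine's hostname>
kill $!                      # stop the background server again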

Now that we’ve created an application, it’s time to create our Dockerfile. Create a file called Dockerfile, copy and paste the content from the following code sample into that file, and then save it in the same directory as your app code.

FROM node:8
COPY app.js /app.js
CMD ["node", "app.js"]

Each statement in the Dockerfile has a meaning. FROM designates which parent image you are using as a base for the image you are building. It is always better to choose a proper base image. We could have written FROM ubuntu, but using a general-purpose image for running a Node.js application is unnecessary, because it increases the image overhead. In general, the smaller the image, the better.

Instead, we used the specialized official Node runtime environment as a parent image. Another thing to note is that we specified the version with the tag FROM node:8 instead of using the default latest tag. Why? The latest tag results in a different base image being used when a new version is released, and your build may break. I prefer to take this precaution.

We also used COPY <src> <dest> to copy new files or directories from <src> and add them to the file system of the container at the path <dest>. Another Dockerfile instruction with a similar function to COPY is ADD. However, COPY is preferred because it is simpler. You can use ADD for some unique functions, like downloading external resources or extracting .tar files into the image. You can explore that option further by checking out the Docker documentation.

Lastly, you can use the CMD instruction to run the application contained by your image. The command, in this case, would be node app.js.

There are other instructions that can be included in the Dockerfile. Reviewing them briefly now could prove helpful for you later on. The RUN instruction, for example, allows you to run commands to set up your application, and you can use it to install packages; an example is RUN npm install. We can expose a specific port to the world outside the container we are building by using EXPOSE <port>.
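To see what EXPOSE buys you in practice, here is a small sketch using the official nginx image, which declares EXPOSE 80 in its Dockerfile; the -P flag publishes every exposed port to a random host port:

docker run -d -P --name expose-demo nginx   # publish all EXPOSEd ports
docker port expose-demo 80                  # show which host port was mapped to container port 80
docker rm -f expose-demo                    # clean up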

Before writing all these instructions, you should know a few essential things about Dockerfiles. Every instruction you write in the Dockerfile creates a layer, and each layer is cached and reused. Invalidating the cache of a single layer invalidates all the layers that follow it; for example, changing an instruction invalidates the cache from that point onward. Something to note is that Docker keeps layers immutable. So, if you add a file in one layer and remove it in the next one, the image still contains that file in the first layer; it’s just that the container no longer has access to it.

Two things to keep in mind: first, the fewer layers in a Dockerfile, the better. To change the inner layers in Docker images, Docker must remove all the layers above them first. Think about it like this: you’ll cry less if you have fewer layers to peel off an onion. Second, the most general and longest-running steps should come first in your Dockerfile (the inner layers), while the more specific ones should come later (the outer layers).

Build an image from a Dockerfile

Now that you have a better understanding of the contents of the Dockerfile, let’s go ahead and build an image. First, make sure your path is inside the new directory that you made; running the ls command should show only two files: app.js and Dockerfile. You can build the image by using the docker build -t medium . command. We tag it medium by using the -t flag, and we target the current directory. (Note the dot at the end of the following command.)

>docker build -t medium .
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM node:8
8: Pulling from library/node
54f7e8ac135a: Pull complete
d6341e30912f: Pull complete
087a57faf949: Pull complete
5d71636fb824: Pull complete
0c1db9598990: Pull complete
89669bc2deb2: Pull complete
3b96ee2ed0b3: Pull complete
df3df33f8e3c: Pull complete
Digest: sha256:dd2381fe1f68df03a058094097886cd96b24a47724ff5a588b90921f13e875b7
Status: Downloaded newer image for node:8
---> 3b7ecd51ffe5
Step 2/3 : COPY app.js /app.js
---> 63633b2cf6e7
Step 3/3 : CMD ["node", "app.js"]
---> Running in 9ced576fdb46
Removing intermediate container 9ced576fdb46
---> 91c37fa82fe5
Successfully built 91c37fa82fe5
Successfully tagged medium:latest
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.

You can use the run command again to run the built image, or you can use docker push to push it to a registry and pull it again on another computer from the registry using docker pull.
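A minimal sketch of that round trip might look like the following, where <your-dockerhub-id> is a placeholder for your own Docker Hub account:

docker run -d -p 3000:3000 --name medium-app medium   # run the image we just built, publishing port 3000
curl http://localhost:3000                            # prints: Hostname is <container ID>

docker tag medium <your-dockerhub-id>/medium          # re-tag the image for the registry
docker push <your-dockerhub-id>/medium                # push it (requires docker login first)

# On another computer:
docker pull <your-dockerhub-id>/medium
docker run -d -p 3000:3000 <your-dockerhub-id>/medium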

Congratulations! Now you know how to make a proper Dockerfile, build an image from that Dockerfile, and run it. These skills will help you send young Appy out into the world. If you’re feeling adventurous, you can check out the Docker docs and keep going. Good luck!



A previous version of this post was published on Medium.

I shellcheck and you should too

This post is about ShellCheck and the power that can come from using it to find bugs in shell scripts. Over the years, as a DevOps engineer, production engineer, and developer advocate, I have written a lot of code that glues things together. Since I spend most of my time on Linux and Mac, bash is the obvious choice for me. However, I started talking to people at different conferences and found out that the idea of linting code is still considered a developer practice, instead of an operator practice, so I’m here to try to set the record straight.

Why

First, let me answer why I think linting code is an operator practice and you should use ShellCheck to do it. Standardizing on coding practices (yes, you have to code) will save you from insane overhead costs and tech debt in the future. When your team’s bash scripts start to look the same, they are easier to read, allowing people to start paying attention to what’s happening instead of where the do command or semicolon is. Running shellcheck enforces this. I should say that you can override ShellCheck suggestions, but that’s a team decision and out of the scope of this article. (Read up about ignoring a ShellCheck error if you need to go down this path.)

How

Now, let’s talk about how to use ShellCheck. There are a handful of ways to get it: apt, yum, homebrew, and even docker can be used to run this application. It’s also easy to add it to your continuous integration and continuous delivery (CI/CD) pipeline by pulling out every .sh file and running shellcheck against it.
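For reference, the usual installation routes look something like this (package names can vary slightly between distributions, so treat these as a sketch):

sudo apt-get install shellcheck        # Debian/Ubuntu
sudo yum install ShellCheck            # RHEL/CentOS (via EPEL)
brew install shellcheck                # macOS
docker pull koalaman/shellcheck:latest # or run it from the official container image

# In a CI/CD job, lint every shell script in the repository:
find . -name '*.sh' -exec shellcheck {} +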

Even adding it to your Makefile is simple:

check-scripts:
    # Fail if any of these files have warnings
    shellcheck myscripts/*.sh

Even Travis CI has ShellCheck built-in. Just add the following to your .travis.yml:

script:
  # Fail if any of these files have warnings
  - shellcheck myscripts/*.sh

Personally, I use a Docker container on my local laptop. So, I wrote a simple script named shellcheck.sh, and any time I save a script ending in .sh, I run shellcheck.sh script.sh against it. As you can see, it’s very straightforward:

#!/bin/bash

if [ $# -eq 0 ]
  then
    echo "You need to add a script to this check, shellcheck.sh script.sh"
    exit 1
else
  # Quick connectivity check before pulling the image
  if [[ "$(ping -c 1 www.google.com | grep '100% packet loss')" != "" ]]
  then
    echo "You can't seem to connect to the internet, you should probably fix that."
    exit 1
  fi
  docker pull koalaman/shellcheck:latest
  # Mount the current directory into the container and lint the given script
  docker run -v "$PWD:/mnt" koalaman/shellcheck "$1"
fi

Knowing my code is standardized and works as expected gives me peace of mind. The output of shellcheck points to exactly where your offending code is, and each warning has a dedicated wiki page explaining why you should not code that way. I applaud the main developer, Vidar Holen, and his team for helping us become better developers.
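As a tiny illustration of the kind of thing it catches, here is a hypothetical bad.sh with an unquoted variable; ShellCheck flags the line, cites the rule number, and links to the wiki page with the fix:

cat > bad.sh <<'EOF'
#!/bin/bash
file="my file.txt"
rm $file        # unquoted: splits into "my" and "file.txt"
EOF

shellcheck bad.sh   # reports SC2086: Double quote to prevent globbing and word splitting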

Summary

Honestly, I’m just here to convince you, dear reader, to give this linter a shot. The overhead expense is minimal, and the benefits are immense. This is one of those things that you shouldn’t waste time debating, since it will make your life, and your team’s life, easier. You can integrate it into CI, run it locally, and make sure everyone writes their scripts with a unified pattern, all for only a tad bit of work.

Scaffold and deploy a scalable web application in an enterprise Kubernetes environment

Deploying your application to a container, or multiple containers, is just the first step. When a cloud-native system becomes more established, it’s even more important to manage, track, redeploy, and repair the software and architecture.

You can choose from various techniques to help platforms provision, test, deploy, scale, and run your containers efficiently across multiple hosts and operating environments, to perform automatic health checks, and to ensure high availability. Eventually, these approaches transform an app idea into an enterprise solution.

The code patterns, tutorials, videos, and articles on IBM Developer about Red Hat OpenShift on IBM Cloud™ are a good place to start considering ways to use an enterprise Kubernetes environment with worker nodes that come installed with the Red Hat OpenShift on IBM Cloud Container Platform orchestration software. With Red Hat OpenShift on IBM Cloud, you can use IBM Cloud Kubernetes Service for your cluster infrastructure environment and the OpenShift platform tools and catalog that run on Red Hat Enterprise Linux for deploying your apps.

As you move forward in exploring how to work with combined Red Hat OpenShift on IBM Cloud capabilities, you will want to know how to scaffold a web application (both Node.js and Express), run it locally in a Docker container, push the scaffolded code to a private Git repository, and then deploy it. You can follow the details in the Scalable web application on OpenShift tutorial in the Red Hat OpenShift on IBM Cloud documentation.

Consider a few tips: You can expose the app on an OpenShift route, which directs ingress traffic to applications deployed on the cluster, a simplified approach. You can bind a custom domain in OpenShift with one command, instead of defining a Kubernetes Ingress resource in YAML and applying it. Also, you can monitor the health of the environment and scale the application. For example, if your production app is experiencing an unexpected spike in traffic, the container platform automatically scales to handle the new workload.
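For instance, on a cluster where you are logged in with the oc CLI, exposing a deployed service on a route and attaching a custom host name is roughly this short (the service name and domain here are placeholders):

oc expose service/my-web-app                       # create a route with a generated host name
oc create route edge my-web-app-custom \
  --service=my-web-app --hostname=www.example.com  # or bind your own domain on an edge-terminated route
oc get routes                                      # list the resulting routes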

You can check out the architecture diagram at the Scalable web application on OpenShift tutorial and then try it for yourself.

A brief history of Kubernetes, OpenShift, and IBM

The recent introduction of Red Hat® OpenShift® as a choice on IBM Cloud sparked my curiosity about its origins, and why it is so popular with developers. Many of the developers I sat beside at talks, or bumped into at lunch, at a recent KubeCon Conference, mentioned how they used OpenShift. I heard from developers with financial institutions running analytics on transactions and with retailers creating new experiences for their customers.

OpenShift is a hybrid-cloud, enterprise Kubernetes application platform. IBM Cloud now offers it as a hosted solution or an on-premises platform as a service (PaaS). It is built around containers, orchestrated and managed by Kubernetes, on a foundation of Red Hat Enterprise Linux.

With the growth of cloud computing, OpenShift became one of the most popular development and deployment platforms, earning respect based on merit. As cloud development becomes more “normal” for us, it is interesting to consider where OpenShift fits, as another tool from the toolbox for creating the right solution. It might mix with legacy on-premises software, cloud functions, Cloud Foundry, or bare metal options.

In this blog post, my colleague Olaph Wagoner and I step back in time to understand where OpenShift came from, and we look forward to where it might be going in the world of enterprise application development with Kubernetes.

The following graphic shows a timeline of OpenShift, IBM, and Kubernetes:

OpenShift, IBM, and Kubernetes timeline

Early OpenShift: 2011-2013

OpenShift was first launched in 2011 and relied on Linux containers to deploy and run user applications, as Joe Fernandes describes in Why Red Hat Chose Kubernetes for OpenShift.

When OpenShift was born in 2011, it relied on Linux containers to deploy and run user applications. OpenShift V1 and V2 used Red Hat’s own platform-specific container runtime environment and container orchestration engine as the foundation.

However, the story of OpenShift began sometime before its launch. Some of the origins of OpenShift come from the acquisition of Makara, announced in November of 2010. That acquisition provided software as an abstraction layer on top of systems and included runtime environments for PHP and Java applications, Tomcat or JBoss application servers, and Apache web servers.

Early OpenShift used “gears”, a proprietary type of container technology, to isolate applications on OpenShift nodes. The gear metaphor captured what was contained: something capable of producing work without tearing down the entire mechanism. An individual gear was associated with a user. To make templates out of those gears, OpenShift used cartridges, technology that came from the Makara acquisition.

OpenShift itself was not open source until 2012. In June 2013, V2 went public, with changes to the cartridge format.

Docker changes everything

Docker was started as a project by a company called dotCloud and made available as open source in March 2013. It popularized containers with elegant tools that let people build containers and transfer their existing skills to the platform.

Red Hat was an early adopter of Docker, announcing a collaboration in September 2013. IBM forged its own strategic partnership with Docker in December 2014. Docker is one of the essential container technologies that multiple IBM engineers have been contributing code to since the early days of the project.

Kubernetes

Kubernetes surfaced from work at Google in 2014, and became the standard way of managing containers.

Although originally designed by Google, it is now an open source project maintained by the Cloud Native Computing Foundation (CNCF), with significant open source contributions from Red Hat and IBM.

According to kubernetes.io, Kubernetes aims to provide “a system for automating deployment, scaling, and operations of application containers” across clusters of hosts. It works with a range of container tools, including Docker.

With containers, you can move into modular application design where a database is independent, and you can scale applications without scaling your machines.

Kubernetes is another open source project that IBM was an early contributor to. In the following graphic you can see the percentage of IBM’s contributions to Docker, Kubernetes, and Istio in the context of the top 5 organizations contributing to each of those container-related projects. It highlights the importance of container technology for IBM, as well as some of the volume of open source work.

Some of IBM's contributions to open source container technology

OpenShift V3.0: open and standard

Red Hat announced an intent to use Docker in OpenShift V3 in August 2014. Under the covers, the jump from V2 to V3 was quite substantial. OpenShift went from using gears and cartridges to containers and images. To orchestrate those images, V3 introduced using Kubernetes.

The developer world was warming to the attraction of Kubernetes too, for some of the following reasons:

  • Kubernetes pods allow you to deploy one or multiple containers as a single atomic unit.

  • Services can access a group of pods at a fixed address and can link those services together using integrated IP and DNS-based service discovery.

  • Replication controllers ensure that the desired number of pods is always running and use labels to identify pods and other Kubernetes objects.

  • A powerful networking model enables managing containers across multiple hosts.

  • The ability to orchestrate storage allows you to run both stateless and stateful services in containers.

  • Simplified orchestration models quickly allow applications to get running without the need for complex two-tier schedulers.

  • An architecture that understood the needs of developers and operators were different and took both of those requirements into consideration, eliminating the need to compromise either of these important functions.

OpenShift introduced powerful user interfaces for rapidly creating and deploying apps with Source-To-Image and pipelines technologies. These layers on top of Kubernetes simplify and draw in new developer audiences.

IBM was already committing code to the key open source components OpenShift is built on. The following graphic shows a timeline of OpenShift with Kubernetes:

OpenShift and Kubernetes timeline

OpenShift V4.0 and the future

Red Hat clearly proved to be at the forefront of container technology, second only to Google in contributions to CNCF projects. Another recent accomplishment of Red Hat I want to mention is the acquisition of CoreOS in January of 2018. The CoreOS flagship product was a lightweight Linux operating system designed to run containerized applications, and Red Hat is making it available in V4 of OpenShift as “Red Hat Enterprise Linux CoreOS”.

And that’s just one of many exciting developments coming in V4. As shown in the previous timeline graphic, OpenShift Service Mesh will combine the monitoring capability of Istio with the display power of Jaeger and Kiali. Knative serverless capabilities are included, as well as Kubernetes operators to facilitate the automation of application management.

The paths join up here, also. IBM is a big contributor of open source code to Istio, Knative, and Tekton. These technologies are the pathways of container-based, enterprise development in the coming decade.

OpenShift V4.0 has only recently been announced. And Red Hat OpenShift on IBM Cloud™ is a new collaboration that combines Red Hat OpenShift and IBM Cloud Kubernetes Service. For other highlights, review the previous timeline graphic.

Some conclusions

Researching the origins and history of OpenShift was interesting. Using OpenShift as a lens makes it clear that, in terms of software development, this decade really is the decade of the container.

It is impressive how much energy, focus, and drive Red Hat put into creating a compelling container platform, progressively layering it on the same technologies that IBM has shown interest in, and dedicated engineering resources to, over the past decade.

We’re looking forward to learning and building with all of these cloud technologies in the years ahead.

An overview of the Kubernetes Container Runtime Interface (CRI)

A long time ago, in a GitHub repo far away, the Kubernetes development team…

Wait! Before we go in depth into the Kubernetes Container Runtime Interface, let me explain a few things first. Kubernetes includes a daemon called kubelet for managing pods. Kubernetes introduced pods, which specify the resources used by a group of application containers. Docker made these application containers popular just five years ago, and now they are even more popular due to the immense ecosystem surrounding Kubernetes. At run time, a pod’s containers are instances of container images, packaged and distributed through container image registries.

The following architecture diagram shows where kubelet and Docker fit in the overall design:

Kubernetes architecture diagram showing kubelet and Docker

Arguably the most important and most prominent controller in Kubernetes, kubelet runs on each worker node of a Kubernetes enabled cluster. Acting as the primary node agent, kubelet is the primary implementer of the pod and node application programming interfaces (APIs) that drive the container execution layer. Without these APIs, Kubernetes would, mostly, be a CRUD-oriented REST application framework backed by a key-value store.

Kubelet processes pod specs, which identify the configuration for the pod and its application containers. Kubernetes pods can host multiple application containers and storage volumes. Pods are the fundamental execution primitive of Kubernetes. Kubernetes pods facilitate the packaging of a single application per container and decouple deployment-time concerns from build-time concerns. Kubelet executes isolated application containers as its default, native mode of running containers in a pod, as opposed to processes and traditional operating-system packages. After kubelet gets the configuration of a pod through its pod spec, it ensures that the specified containers for the pod are up and running.

To create a pod, kubelet needs a container runtime environment. For a long time, Kubernetes used Docker as its default container runtime. However, along the path to each release, it became clear that the Docker interface would continue to progress and change — and thus occasionally break Kubernetes.

In addition, other container runtime environments came along, each vying to be the container runtime environment used by Kubernetes. After trying to support multiple versions of kubelet for different container runtime environments, and trying to keep up with the Docker interface changes, it became clear that the specific needs of a Kubernetes container runtime environment needed to be set in stone. Now any container runtime environment under kubelet needs to meet a specified interface, allowing for separation in the kubelet codebase, and quelling the need to support n-different versions of kubelet. This situation begat the Kubernetes Container Runtime Interface (CRI). The current version of CRI is v1alpha2.

To implement a CRI integration with Kubernetes for running containers, a container runtime environment must be compliant with the Open Container Initiative (OCI). OCI includes a set of specifications that container runtime engines must implement and a seed container runtime engine called runc. Most container runtime environments use runc, and it can also be used as a measure of compatibility against other non-runc container runtime engines. The OCI Runtime Specification defines “the configuration, execution environment, and lifecycle of a container.” The OCI Image Format Specification defines “an OCI Image, consisting of a manifest, an image index (optional), a set of filesystem layers, and a configuration.” Additionally, the CRI container runtime environment should successfully run all CRI tools validation tests and Kubernetes end-to-end tests on Kubernetes test-infra.
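If you are curious what the runc layer looks like on its own, here is a rough sketch of running a container directly with runc from an OCI bundle, using a root filesystem exported from the BusyBox image (the directory and container name are arbitrary; the final command needs root):

mkdir -p mybundle/rootfs && cd mybundle
docker export "$(docker create busybox)" | tar -C rootfs -xf -   # unpack an image into a root filesystem
runc spec                                                        # generate a default OCI config.json for the bundle
sudo runc run demo                                               # create and start a container named "demo"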

For more information, check out this brief discussion around OCI image support in the ecosystem.

To go a bit deeper, look at an architecture diagram for a container runtime environment called containerd:

Architecture diagram of containerd with the CRI plugin

Since containerd V1.1, CRI support has been built into containerd as a plugin. It is enabled by default, but optional. The CRI plugin interacts with containerd through direct function calls that use containerd’s client interface. This plugin-based architecture proved to be both stable and efficient. The plugin handles all CRI service requests from kubelet and manages the pod lifecycle through operating system services, CNI, and containerd client APIs that in turn use containerd services (more plugins). These containerd services pull container images, create runtime images for the containers (snapshots), and use container runtime environments, such as runc, to subsequently create, start, stop, and monitor the containers.

Other CRI integrations include cri-o and dockershim (which is currently built into kubelet). Dockershim uses Docker, which in turn uses containerd.

Figuring out which CRI you should use is beyond the scope of this blog post, and there are many opinions on this topic. Phil Estes of IBM recently presented “Let’s Try Every CRI Runtime Available for Kubernetes” at KubeCon Barcelona, which gives some perspective.

In follow-up blog posts, I will compare and contrast some of the more popular CRI integrations and the runtime environments they can be configured to use, both virtual machines and runc types. I will dig into the CRI APIs themselves and a few commonly used CRI and OCI tools.

For now I leave you with a few links to reference for configuring CRI integrations and configuring pod specs to select the runtimes used by these CRI integrations.

In closing, I’d like to give a special shout out to all the maintainers of kubelet, CRI runtimes, and OCI, for which there are seriously too many to list.

May the CRI be with you.

Developer relations down the stack

IBM recently closed the acquisition of Red Hat for $34 billion, underscoring the huge and growing importance of hybrid cloud infrastructure. My colleague Marek Sadowski has become a subject matter expert in containers, Kubernetes, and server-side Swift, although he started out as a full stack developer advocate, a robotics startup founder, and an entrepreneur.

Marek Sadowski presenting

Marek has 20 years of enterprise consulting experience throughout the United States, Europe, Japan, Middle East, and Africa. During his time at NASA, he pioneered research on virtual reality goggles for a system to control robots on Mars. After founding a robotics startup, Marek came to work at IBM. I talked to him about his experience in DevOps advocacy.

Marek Sadowski presenting in a classroom

Q: One of your focus areas in developer relations (DevRel) is containers. How is advocating for a DevOps technology different than advocating for an API or application?

Good question. When working with containers, engineers think more in terms of the plumbing and ideas of DevOps and the ease of expanding your infrastructure footprint. In contrast, when you talk about APIs, you try to make application development the center of gravity for the discussion.

When discussing APIs with developers, you talk about how one could consume the API in a robust way. Let’s take the IBM Watson API as an example. Our team will talk about how you can create and run SDKs for developers to consume APIs in their own language, such as Swift (for mobile) or Java (for enterprise). You look at the consumer of your API and discuss how you can produce the API, protect yourself, and do the billing.

Getting back to containers, you speak more about plumbing of the cloud when discussing container technology. How do you manage containers? Expand them? Manage their workloads? Deliver and test new versions?

It quickly becomes apparent that these are two separate concepts. Containerization deals with how your back end is working and proper maintenance of your application, which attracts people from a DevOps background. When you talk about APIs, that’s a completely different story. Your thought paradigm changes to be the point of view of the consumer. How does the consumer find the API? How can developers consume the API?

I speak at conferences on both subject areas. I’ve found that people who develop applications are more interested in the look, feel, and developer experience of the application. Whereas, with containers, it’s more about back end, load balancing, and seeing issues from a system administrator’s perspective.

Q: Many people are familiar with DevRel with a focus on software engineers, but DevOps is a different community entirely. How do you focus on that community?

There is a division. Everybody is interested in new things like Kubernetes and Docker, but not too many want to perfect their skills to the point that it’s their daily job. So many developers want to know how to spin up a container and a service inside the container, put it in their resume, and be done with it. Developers may be interested because it’s fashionable or it’s a buzzword. However, you can find a lot of people who are running services in containers and have specific questions: sysadmins who want to monitor containers and assure security, load balancing, and other aspects of administration. It’s a completely different audience from developers who consume APIs and create a cool web application. They are two different communities and you have to give each community different content.

For example, in a hackathon, it’s very difficult to create large deployments in containers. It’s about an optimization of development and operations more than application coding.

Marek Sadowski with other IBMers

Q: How have you had to change your approach to DevRel when moving to DevOps advocacy?

Previously, when I ran workshops focused on application developers, they usually had a few goals: understand our API, consume data from API endpoints, and create a simple “Hello World!” type of application. Developers in these workshops ask questions about high-level ways of architecting applications, for example with Watson, in mobile applications or web applications, or a chain of processes.

On the contrary, when I speak about DevOps and containers, developers in the audience want to spin up the services, see how they scale up and scale down, investigate how the services behave when something is failing, and how to ameliorate security issues. It’s a completely different approach. They are not interested in building something new; they want to perfect their approach to deployment.

Here’s an analogy I can give to people new to this field. It’s like inviting a painter and a plumber to a party. They both do similar things, yet the painter wants to make a painting that you can hang on the wall, and the plumber will rarely speak about the type of piping he’s using inside your walls. Both are doing something in your house, but the painter is thinking about the people they will attract and the paint (our APIs) to ensure a pleasant viewing experience. Whereas, the plumber just wants to get the job done and never touch it again. The plumber wants to make changes as rarely as possible and focus on stability, while the painter wants to create more new paintings. They have different approaches based on their different goals.

Q: You also give talks on Swift, specifically on the server side. Most people know Swift from the iOS development side, so why is it useful on the server? How do you get developers to think of it as a server language?

Server-side Swift is a relatively new development. I compare the current state of server-side Swift to where Java was 24 years ago. In 1996, I started writing a server-side application using Java. It was a novel concept at that time! The same thing is happening now with Swift, as developers are moving the Swift language to the server. There are a lot of reasons why. One of the simplest is that you write in the same language on the server as you do for your mobile app, and in that way you can use the same data constructs, thought processes, and personnel resources on both systems. You don’t need different systems or frameworks to talk to the database or the cloud.

Every mobile app nowadays asks you to connect to the internet for AI, messaging, and social media. Even simple games allow you to exchange information or have a conversation with people all over the world. If your app and back end are written in one language like Swift, it makes these data exchanges simple and transparent.

Some people say Swift is a fashionable language to learn. Since you have the option to write apps in Java or JavaScript, you can also write them in Swift. Apple made Swift open source, similar to the way Sun Microsystems opened up Java. You can now write applications in the cloud or on any platform. For example, OpenWhisk allows you to write event-based Swift functions in the cloud without any DevOps code.

With Swift, developers are attracted to the beauty of the language and the ability to write one language from mobile to cloud to make your application better and easier to maintain. You can enjoy writing in your language of choice and expand the capabilities of the environment you love. If you are an iOS developer, maybe you can become a full-stack developer. Developers love the story that they can become something more and participate in the full stack development process.

Marek Sadowski at a meetup

Q: How did you get into developer relations?

I had just come to the United States from Poland as the founder of a startup, and the purpose of the move was to expand my company. They say that 99% of startups don’t succeed right away, and founders often need to bootstrap while in an existing job. I was told that working in the cloud is the key factor in a lot of industries, but I had little exposure to those technologies. On the other hand, I had built up skills talking to investors, and as an entrepreneur, I was able to understand what was important to startups. I also had a robust background in Java development and different IT technologies; I had a career as an architect supporting banks and other EMEA enterprises as a Java professional, demonstrating systems to customers.

There was an opening for a mobile-first developer advocate, and despite having no mobile or cloud experience, I convinced the interviewer that I was the perfect candidate due to my ease of speaking with developers and presenting technical subjects in an accessible manner. I enjoy explaining complex topics in a simple way through demos and example projects.

My hiring manager asked me to build a small mobile app as an employment test, which connected to IBM Cloud to exchange information between the user and a back end. I enjoyed the task and found I was good at it! After two years, I migrated to more cloud technologies and more IBM APIs. Eventually, I started to find interest in Kubernetes and containers, and realized containers are a field with amazing growth potential.

I must say, the thing that attracted me the most to DevRel was the opportunity to learn and convey new technologies to developers out there, and use my talent for explaining complex things in a straightforward manner.

Marek Sadowski snowboarding

To get in touch with Marek, feel free to reach out on any of the channels listed on his IBM Developer profile page or see him speak at an upcoming IBM Developer SF Meetup.

Microsurvival Part 2: Divide and containerize

Note: If you didn’t already read part one, go there first for the beginning of young Appy’s story.

You know Appy, I was always fascinated by the term “Divide and Conquer” (or divide et impera if you like fancy talk). It is such a great idea that it is used in politics and in computer science. You don’t see these two fields mentioned in the same sentence too often, do you? Well, the concept of breaking up big headaches into smaller headaches can apply to a lot of things, whether it be armies, factions, algorithms, or Hawaiian pizza.

Last time we talked, I mentioned how you were monolithic while growing up, and you then were divided into processes that were put in containers and became easier to manage. Anyway, do you remember where we stopped? How container isolation is possible?

Mechanisms to make isolation happen

I see… Well, container isolation is easy, but you need to pay attention. Container isolation of processes is possible due to two mechanisms: Linux namespaces and Linux control groups (cgroups). I want to start with Linux namespaces but before that, I want to be sure about something. You do have knowledge about Linux systems, right? Wow, that is a lot of coughing… I guess that’s a “no” then. I will just mention the relevant information.

Typically, a Linux system has one single namespace, and all the resources belong to that namespace, including file systems, network interfaces, process IDs, and user IDs. Now if we run one of your processes, we run it inside one of these namespaces. The process is only able to see the resources inside the same namespace. Easy, right?

Now it can get a bit complex, because we have different kinds of namespaces like:

  • Mount (mnt)
  • Process ID (pid)
  • Network (net)
  • Inter-process communication (ipc)
  • UTS
  • User ID (user)

Each of these namespaces isolates a specific group of resources, so a process belongs to one namespace of each kind. Your parents will probably tell you more about what kind of resources they isolate and how, but I’ll give you a small example: if we give each of your processes a different UTS namespace, it will be as if those processes are running on different machines, because they see different local host names!
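If you have a Linux machine handy, you can see this with the unshare tool, which starts a process in new namespaces; here the process gets its own UTS namespace, so changing the hostname inside it does not touch the host (a rough sketch, run as root):

sudo unshare --uts sh -c 'hostname appy-container; hostname'   # prints: appy-container
hostname                                                       # back on the host: unchanged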

How cool is that? Yeah, I know you want to learn more about them, but for now, I think this is enough to give you an idea of how they would isolate processes running in containers.

Okay, Appy, now to complete the container isolation, we need to limit the amount of system resources that each container can consume. This is where cgroups, a Linux kernel feature, comes into play. It limits a process’s resource usage, whether CPU, memory, or network bandwidth. A process can’t use more than the configured amount, so it cannot hog other processes’ resources.
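Docker wires this up for you through flags on docker run; for example, here is a small sketch that caps a container’s memory and CPU share (the limits themselves are arbitrary):

docker run -d --name limited --memory=256m --cpus=0.5 nginx   # at most 256 MB of RAM and half a CPU
docker stats --no-stream limited                              # show live usage against those limits
docker rm -f limited                                          # clean up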

How about it? I told you that it’s going to be easy to understand container technologies. They have been around for some time now.

Enter Docker

Containers are not new, but they became more famous when Docker was introduced. Docker simplified the whole process of packaging the application with all its libraries, dependencies, and a whole OS file system that the application runs on. All that in a small package that can be moved to any machine running Docker to provision the application.

Well, not any machine. There are some limitations. For example, if we containerize one of your applications built for x86 architecture, we can’t expect a machine with an ARM architecture to run that application just because it also runs Docker. We might need a virtual machine to solve that problem.

Hmm… We still have some time before I head out to work, but I will keep it short and tell you about the main concepts of Docker for now. We have images, registries, and containers. Images are where we can package one of your applications with its environment and other metadata. We build the image and run it on the same computer, or we can push — upload — it to a registry. Registries are like repositories that allow us to store our Docker images and easily share them with other people or computers. We can also pull — download — the image from the registry on another computer. Docker containers are just normal containers based on a Docker image, and they run on a host running Docker. Of course, a container is isolated from the other containers — or processes — and from the host machine.

Here’s a picture I made that shows the Docker image, the registry, and the container: Docker image, registry, and container architecture diagram

Until next time

Okay, I really need to leave now, but let me know what your parents think of this. I will talk to you about Kubernetes and an example that your parents can try out on the IBM Cloud Kubernetes Service sometime later. Until then, stay stable!


A previous version of this post was published on Medium.

Game of cloud technologies: Kubernetes vs. Cloud Foundry

Kubernetes and Cloud Foundry are both technologies that allow you to deploy and run applications on the cloud. In this blog, I talk about my experiences in using both technologies on IBM Cloud™ with the same code pattern, “Create a health records system with modern cloud technology and legacy mainframe code.”

Architecture

Kubernetes

Kubernetes architecture diagram

Cloud Foundry

Cloud Foundry architecture diagram

Comparison

In the code pattern, both architectures followed the following steps:

  1. The Data Service API acts as a data pipeline and is triggered to update the data lake with updated health records data by calling the API Connect APIs associated with the zOS Mainframe.
  2. The API Connect APIs process relevant health records data from the zOS Mainframe data warehouse and send the data through the data pipeline.
  3. The Data Service data pipeline processes the zOS Mainframe data warehouse data and updates the MongoDB data lake.
  4. User interacts with the UI to view and analyze analytics.
  5. The functionality of the App UI that the User interacts with is handled by Node.js. Node.js is where the API calls are initialized.
  6. The API calls are processed in the Node.js data service and are handled accordingly.
  7. The data is gathered from the MongoDB data lake from API calls.
  8. The responses from the API calls are handled accordingly by the App UI.

Score

Since both follow the same architecture model, there is no favorite and both get a point.

  • Kubernetes: 1
  • Cloud Foundry: 1

Deployment

Kubernetes

The IBM Cloud Kubernetes Service uses containers to run the application. This means that I needed to containerize the code pattern; to do this, I used Docker. I had to find compatible containers for the different parts of the code pattern. This included two Node.js containers: one for the front end UI and the other for the data service and APIs. In addition, I also included a MongoDB container for the data lake database. Once all of the containers were configured and running successfully on my local machine, I pushed the containers to Docker Hub.

With the containers ready to deploy to the cloud, I then needed to provision a cluster on Kubernetes. Deploying this code pattern to IBM Cloud can be achieved by using either the Lite or Standard cluster. Once the cluster was created, I had to apply YAML files that configured and deployed each container from Docker Hub. For the standard cluster, additional YAML files were applied to configure the ingress of the two Node.js containers.
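The deployment step itself boils down to pointing kubectl at the cluster and applying the YAML files; the cluster and file names below are placeholders standing in for whatever you used, so treat this as a sketch of the flow rather than the exact files from the code pattern:

ibmcloud ks cluster config --cluster my-cluster   # point kubectl at the IBM Cloud cluster (older CLIs use: ibmcloud ks cluster-config my-cluster)
kubectl apply -f ui-deployment.yaml               # front-end UI container from Docker Hub
kubectl apply -f data-service-deployment.yaml     # data service and APIs container
kubectl apply -f mongo-deployment.yaml            # MongoDB data lake container
kubectl get pods                                  # wait until everything reports Running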

Once all of the containers were deployed, I was able to successfully run and interact with the code pattern on IBM Cloud.

Cloud Foundry

Cloud Foundry on IBM Cloud includes an SDK for Node.js for running Node.js applications on Cloud Foundry. For this code pattern, two instances of the SDK needed to be provisioned: one for the front end UI and the other for the data service and APIs. In addition, I provisioned an instance of Compose for MongoDB for the data lake database.

Once all three instances were running, I had to configure a manifest YAML file for the two Node.js parts of the code pattern. Once I configured the YAML file, I pushed it to IBM Cloud to deploy the code from my local machine to the cloud.
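In practice, that deployment is just a push per manifest; a rough sketch looks like this (the application names and routes come from whatever you put in the manifest):

ibmcloud target --cf               # select your Cloud Foundry org and space
ibmcloud cf push -f manifest.yml   # deploy the Node.js apps described in the manifest
ibmcloud cf apps                   # confirm they are running and note their URLs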

Once the code was successfully deployed, I was able to successfully run and interact with the code pattern on IBM Cloud.

Comparison

Deploying to Kubernetes used more steps and required me to have a Docker Hub account, but in the end all parts of the code pattern were running in the same cluster on IBM Cloud. This means that all parts of the code pattern can be accessed from the same base URL.

Even though there were fewer steps to deploy to Cloud Foundry, the different parts of the code pattern were deployed separately and therefore had to be accessed from different URLs.

The containers for Kubernetes require a Docker image to create the container. As a result, the correct language and version needed to be specified. In addition, the install and initialize commands needed to be specified for the container to build and run. In contrast, Cloud Foundry did not require any of this. The only thing that needed to be known was the language, which was required when provisioning the application. Everything else was automatically detected.

Unlike deploying to Cloud Foundry, deploying to Kubernetes on IBM Cloud has a free option. Using the free option means that you can’t configure the ingress and therefore must access the code pattern through the IP address and ports. In addition, it uses http rather than https.

Score

Due to the simplicity in deploying, I would give the advantage here to Cloud Foundry.

  • Kubernetes: 1
  • Cloud Foundry: 2

Updating

Kubernetes

With the code pattern running on Kubernetes on IBM Cloud, I could then focus on making updates. After these updates were running successfully on my local machine, I could deploy the updated code to the cloud. This process works similarly to the setting up process. First, the updated containers need to be pushed to Docker Hub. Next, the YAML files that were used for deploying the containers that are associated with the updated containers need to be deleted and reapplied. What this means is that if I updated code that was in only one of the Node.js containers, I would only need to redeploy that container.
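Concretely, an update to one of the Node.js containers looks roughly like this (the image and file names are placeholders matching whatever you used originally):

docker build -t <your-dockerhub-id>/data-service .   # rebuild the updated container image
docker push <your-dockerhub-id>/data-service         # push the new image to Docker Hub

kubectl delete -f data-service-deployment.yaml       # remove the old deployment
kubectl apply -f data-service-deployment.yaml        # redeploy it, pulling the updated image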

Cloud Foundry

With the code pattern running on Cloud Foundry on IBM Cloud, I could then focus on making updates. After these updates were running successfully on my local machine, I could then deploy the updated code to the cloud. This process works similarly to setting up. The same manifest YAML file gets pushed to IBM Cloud to re-deploy the code from the local machine to the cloud.

Comparison

Similar to deploying, it is much easier to update on Cloud Foundry than Kubernetes. It only requires one command, whereas Kubernetes requires the deletion and redeployment of each container that needs updating. I also noticed that at times the container/service would not delete completely on Kubernetes and would require me to delete multiple times.

A key difference between the two is portability and versioning of applications to the cloud. Docker makes it easy to create different versions of a container. For deploying different versions of an application or even multiple instances of just one version, Kubernetes makes this process simple. For a different cluster provisioned, the same YAML files can be used. However, if a different version of a container is used, the image in the file must be changed to where it is located in Docker Hub. In fact, you don’t even need the code on your local machine to deploy, you just need the necessary YAML files. On the other hand, using Cloud Foundry is more complicated. For each additional application you want to deploy, the necessary components need to be provisioned on IBM Cloud, or any other cloud really. In addition, the code must be on your local machine. For versioning, one option that you can use is to create different branches on GitHub that are associated with a version and push the code for the branch you want to deploy.

Score

Because of the difference in portability and versioning, I give the advantage here to Kubernetes.

  • Kubernetes: 2
  • Cloud Foundry: 2

Debugging

Kubernetes

Just because an application runs successfully locally does not mean that it will also run successfully on the cloud. I learned this when deploying the code pattern to the cloud. Debugging this code pattern locally, through the browser and through logs in the terminal console, is pretty similar to how debugging works on Kubernetes on IBM Cloud. Browser debugging works the same way as when running locally; however, the logs are found by running kubectl logs <podname>, where <podname> is the pod whose container log you want to read. Unfortunately, sometimes debugging requires pushing multiple updated containers to the cloud, which slows down the process.
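In practice the log check is a couple of commands (the pod name is whatever kubectl reports for your deployment):

kubectl get pods            # list pods and find the one you care about
kubectl logs <podname>      # dump that pod's container log
kubectl logs -f <podname>   # or stream the log while you reproduce the problem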

Cloud Foundry

The process of debugging on Cloud Foundry is similar to Kubernetes, where you can use the browser to debug and also check the logs. The logs can be viewed by running a command: ibmcloud cf logs <appname>, where <appname> is the name of the application whose logs you want to read.

Comparison

I found reading the logs easier to follow on Kubernetes. In Cloud Foundry, the logs wrapped what was shown in the console into CF’s logs, which can be confusing to understand at first.

Sometimes when I was debugging, I needed to redeploy the code pattern multiple times in order to find the bug. While this did slow down the debugging process for both Kubernetes and Cloud Foundry, it was more significant when using Kubernetes.

Score

I prefer debugging on Kubernetes than Cloud Foundry and as a result, Kubernetes gets the advantage.

  • Kubernetes: 3
  • Cloud Foundry: 2

Handling large inputs

Kubernetes

Initially when working on this code pattern, I was working with a small data set (< 1MB) for sending to the Node.js API responsible for populating the MongoDB database. When I decided to use a larger data set (~ 6MB), I started to run into some issues. First when running locally, I had to increase the body parser limit in the Node.js application. After I fixed that and had the data successfully populating the database locally, I tried it on Kubernetes. Unfortunately, I was getting the following error when trying to send the larger data set:

<html>
<head><title>413 Request Entity Too Large</title></head>
<body bgcolor="white">
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>

I found that I needed to set the ingress.bluemix.net/client-max-body-size annotation in the ingress file associated with the container service that exposes the APIs.

Cloud Foundry

Fortunately, Cloud Foundry has auto-scaling implemented, so sending a larger data set to my API was not a concern and worked without issues.

Comparison

Cloud Foundry’s auto-scaling feature works great for this code pattern, because the scaling is something that does not have to be worried about since it is done automatically. With Kubernetes, a limit on the size of the data has to be set.

Score

Cloud Foundry’s auto-scaling feature gives Cloud Foundry the advantage here.

  • Kubernetes: 3
  • Cloud Foundry: 3

Running

Kubernetes

The way I configured this code pattern for Kubernetes meant that all parts of the code pattern were running in the same Kubernetes cluster. This means that all parts of the code pattern can be accessed from the same base URL/IP address. If ingress files are used, you also have the ability to control what gets exposed and how it is accessed. For example, with this code pattern, I have exposed the UI on https://some-url and the APIs on https://api.some-url.

Cloud Foundry

The way I configured this code pattern using Cloud Foundry meant that each part was running separately on the cloud. Unfortunately, with the way this code pattern is structured, it cannot run all together in one application. This means that the code pattern is exposed on different URLs for each part.

Comparison

As I mentioned in the deploying section, unlike deploying to Cloud Foundry, deploying to Kubernetes on IBM Cloud has a free option. Using the free option means that you can’t configure the ingress and therefore have to access the code pattern through the IP address and ports. In addition, it uses http rather than https. Cloud Foundry automatically uses https.

Because the parts of the code pattern are scattered on Cloud Foundry on IBM Cloud, I noticed that running this code pattern was slower than running it on Kubernetes.

Score

I am giving the advantage to Kubernetes here due to the speed at which the application was able to run on Kubernetes compared to Cloud Foundry.

  • Kubernetes: 4
  • Cloud Foundry: 3

Conclusion

From the final tally of the score, Kubernetes beats out Cloud Foundry 4-3. With the score being so close, it shows that Kubernetes and Cloud Foundry each have advantages over the other. When it comes to deciding which to use for your application, it is important to consider which aspects are the priority during the application’s lifecycle. For this application, if I had to choose one, I would go with Kubernetes. With the potential for a large amount of data to be handled by the application, the speed of the system and the user experience should be the priority.

Next steps

It depends on your own personal preferences on whether you decide to use Kubernetes or Cloud Foundry. But if you’re still unsure or want to see what other code patterns or tutorials we have, check out Cloud Foundry on IBM Developer or Kubernetes on IBM Developer. Again, check out the code pattern that I wrote, which inspired this blog post.