Overview

Skill Level: Beginner

Some basic understanding of shell scripting and Kubernetes

In this recipe, I demonstrate how to connect to IBM Cloud Private and configure a Chaos Monkey script. This script randomly kills pods in your cluster so you can introduce concepts such as chaos testing early in your DevOps process.

Ingredients


To get started, you should have an elementary understanding of Kubernetes and have installed IBM Cloud Private, Docker, and jq locally. In this tutorial, I demonstrate some basic security concepts by authenticating with ICP, interacting with secured Kubernetes APIs, and running this script in your development environment. To wrap up, I also show how to deploy this script as a Kubernetes resource.

IBM Cloud Private Installation Guide: https://www.ibm.com/support/knowledgecenter/en/SSBS6K_2.1.0/installing/install_containers_CE.html
Docker Install: https://docs.docker.com/install/
jq Install: https://stedolan.github.io/jq/download/

 

Step-by-step

  1. Obtaining a security token from ICP

    ICP’s security model is based on OpenID Connect (OIDC). ICP provides a REST API that issues security tokens after successfully authenticating the user with the appropriate user ID and password, which makes identity tokens easy to consume. Client apps receive the user’s identity encoded in a secure JSON Web Token (JWT), called an ID token. JWTs are appreciated for their elegance and portability, and for their ready support for a wide range of signature and encryption algorithms, which makes them well suited to the ID token job. In the code below, we authenticate and receive a JSON payload as the result. Using jq makes parsing the response simple because it uses dot notation to traverse the JSON response body. In this example, id_token is one of the root elements and is accessed directly using the .id_token filter.

    ## grab id token 
    TOKEN=$(curl -s -X POST $HOST:8443/idprovider/v1/auth/identitytoken \
    -H "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" \
    -d "grant_type=password&username=$USER&password=$PASSWORD&scope=openid%20email%20profile" --insecure \
    | jq --raw-output .id_token)
    ##  display the security token to the console
    echo "token: $TOKEN"

    In the code snippet above, I set the variable TOKEN to the ID token issued by the ICP deployment identified by $HOST, using the credentials of the user $USER with the password $PASSWORD. To verify that the call completed successfully, I echo the token to the console.
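    Because the ID token is a JWT, its claims can be inspected locally. The following is a minimal sketch, assuming the token has the usual three dot-separated base64url segments and that a base64 command accepting -d is available. The function name decode_jwt_payload is hypothetical and not part of ICP, and the signature is not verified.

```shell
# Print the decoded payload (the second dot-separated segment) of a JWT.
# No signature verification is performed; this is for inspection only.
decode_jwt_payload() {
  # base64url uses '-' and '_' where standard base64 uses '+' and '/'
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # restore the '=' padding that base64url encoding strips
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}
```

    With the token from the previous snippet, decode_jwt_payload "$TOKEN" | jq . would pretty-print the claims.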

  2. Retrieving a list of pods for a namespace

    Now that I have the token, I can query Kubernetes to retrieve various types of resources, including pods. In ICP, API calls to Kubernetes resources are protected by Role-Based Access Control (RBAC), so each request carries the token as a bearer credential. In the code snippet below, I retrieve the list of pods for the namespace in $NAMESPACE (default in my case), where I have deployed a set of test resources.

    echo "========Get list of pods for namespace $NAMESPACE============="
    ## quote the URL so the shell never treats the '?' as a glob character
    podNames=$(curl -s -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" \
    "$HOST:8001/api/v1/namespaces/$NAMESPACE/pods?pretty=true" \
    | jq --raw-output '[ .items | to_entries[] | .value.metadata.name ]')
    podLength=$(echo "$podNames" | jq length)
    echo "Names: $podNames"
    echo "# of pods $podLength"

    The API returns a JSON payload that contains a list of items. Using jq, I convert the list of items into a set of entries and, again via dot notation, drill down to the name defined in each item's metadata. Once I have the list of podNames, I calculate the length of the result set, which is the number of pods in the namespace. This will be used in the next section.
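    To see what the jq filter does without a cluster, it can be run against a canned payload. This is a small sketch: the sample JSON is a hypothetical, heavily trimmed stand-in for a real pod list, and the function names pod_names and pod_names_short are made up for illustration.

```shell
# Hypothetical, trimmed-down response; a real pod list carries many more fields.
sample='{"items":[{"metadata":{"name":"web-1"}},{"metadata":{"name":"web-2"}}]}'

# The filter from the script: wrap each item as {key, value}, then read the name.
pod_names() {
  jq --compact-output '[ .items | to_entries[] | .value.metadata.name ]'
}

# An equivalent, shorter filter that iterates over the items directly.
pod_names_short() {
  jq --compact-output '[ .items[].metadata.name ]'
}
```

    Running printf '%s' "$sample" | pod_names prints ["web-1","web-2"], and piping that result into jq length prints 2. Both filters produce the same array, so the shorter one may be preferable if the entry keys are not needed.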

  3. Deleting random pods chaos style

    With the size and list of pods stored in my script variables, we can start to play a little with the pods by randomly deleting them. Using this chaos-style testing, we generate a random number less than the number of available pods and target the pod stored at that index for deletion. Using the same security token that retrieved the pods, we can now make an HTTP DELETE REST API call to delete the selected pod in the given namespace.

    ## random index in the range 0..podLength-1
    podIndex=$(( $RANDOM % $podLength ))
    echo "Index into array $podIndex"
    selectedPod=$(echo "$podNames" | jq -r ".[$podIndex]")
    echo "selectedPod $selectedPod"
    deletePod=$(curl -s -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" \
    $HOST:8001/api/v1/namespaces/$NAMESPACE/pods/$selectedPod -XDELETE)
    echo "deletedPod $deletePod"

    Once this code finishes running, the API returns the pod that is marked for deletion, and the script logs it to the console. If you are monitoring the ICP dashboard during this process, you will see the pod get marked for deletion and, provided the pod is managed by a controller such as a Deployment, a replacement pod created to take its place.
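    One edge case in the random selection is an empty namespace: a pod count of zero makes the modulo operation fail. The sketch below guards against that. The helper name pick_index is hypothetical, and the awk fallback is an assumption for shells that do not provide $RANDOM.

```shell
# Print a random index in the range 0..count-1, or fail if count is not a
# positive integer (for example when the namespace has no pods).
pick_index() {
  count=$1
  # refuse empty or non-numeric counts instead of dividing by zero
  case $count in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$count" -gt 0 ] || return 1
  # $RANDOM is provided by bash and BusyBox ash; fall back to awk elsewhere
  r=${RANDOM:-$(awk 'BEGIN { srand(); print int(rand() * 32768) }')}
  echo $(( r % count ))
}
```

    In the script, podIndex=$(pick_index "$podLength") || exit 1 would stop before the DELETE call when there is nothing to delete.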

  4. Deploying Chaos Monkey on ICP

    Now that I have the basics in place, I can modify this script to run on ICP. As a first step, I want to move from a one-time execution of this script to a forever-style script that constantly runs chaos against a specified namespace. To keep the script simple, I wrapped a while loop around the various calls and named the file chaos.sh.

    ## seconds to wait between rounds; defaults to 60 when DELAY is unset
    : ${DELAY:=60}

    while true; do
      ## grab id token (uses $USER to match the earlier snippet and deployment.yaml)
      TOKEN=$(curl -s -X POST "$HOST:8443/idprovider/v1/auth/identitytoken" \
        -H "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" \
        -d "grant_type=password&username=$USER&password=$PASSWORD&scope=openid%20email%20profile" \
        --insecure \
        | jq --raw-output .id_token)
      echo "token: $TOKEN"

      echo "========Get list of pods for namespace $NAMESPACE============="
      podNames=$(curl -s -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" \
        "$HOST:8001/api/v1/namespaces/$NAMESPACE/pods?pretty=true" \
        | jq --raw-output '[ .items | to_entries[] | .value.metadata.name ]')

      podLength=$(echo "$podNames" | jq length)
      echo "Names: $podNames"
      echo "# of pods $podLength"

      ## random index in the range 0..podLength-1
      podIndex=$(( $RANDOM % $podLength ))

      echo "Index into array $podIndex"

      selectedPod=$(echo "$podNames" | jq -r ".[$podIndex]")
      echo "selectedPod $selectedPod"

      deletePod=$(curl -s -k -H "Authorization: bearer $TOKEN" \
        -H "Accept: application/json" \
        "$HOST:8001/api/v1/namespaces/$NAMESPACE/pods/$selectedPod" -XDELETE)
      echo "deletedPod $deletePod"

      sleep "${DELAY}"
    done
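
    The loop relies on HOST, USER, PASSWORD, and NAMESPACE being set in the environment. A small pre-flight check, sketched below, can fail fast when one is missing instead of sending malformed requests. The helper name require_env is hypothetical, and it assumes the variable names passed to it are trusted literals.

```shell
# Return non-zero (and report on stderr) if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # indirect lookup of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

    Placed at the top of chaos.sh, a line such as require_env HOST USER PASSWORD NAMESPACE || exit 1 would stop the script before the loop starts.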

    I then created a Dockerfile that installs the script's dependencies (curl and jq) and copies the script into the Docker image.

    FROM alpine:3.5
    RUN apk add --no-cache curl jq

    WORKDIR /usr/src/app
    COPY chaos.sh ./
    CMD ["sh", "/usr/src/app/chaos.sh"]

    For testing purposes, I used a simple deployAppDocker.sh script. You will want to update the image references in the next two sections to point to your own Docker registry.

    #!/bin/sh
    set -e
    export DOCKER_ID_USER="todkap"
    docker login
    docker build --no-cache=true -t todkap/chaos:v1 .
    docker push todkap/chaos:v1

    Once I built the image and pushed it to my Docker registry, I could refer to it from my deployment.yaml. Note that the user ID and password are hardcoded in the deployment.yaml. In a real-world application, these would be defined as Kubernetes Secrets instead. However, to keep the article simple, I put default values in the deployment resource to stay compatible with the original script.

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: chaos
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: chaos
            version: v1
        spec:
          containers:
          - name: chaos
            image: todkap/chaos:v1
            imagePullPolicy: Always
            env:
            - name: HOST
              value: "https://9.37.39.161"
            - name: NAMESPACE
              value: "default"
            - name: USER
              value: "admin"
            - name: PASSWORD
              value: "admin"
    ---
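
    As a rough illustration of the Kubernetes Secrets alternative mentioned earlier, the sketch below prints a Secret manifest with base64-encoded credentials. The function name secret_manifest, the Secret name chaos-credentials, and the key names username and password are all assumptions chosen for this example, not part of the original script.

```shell
# Print a Secret manifest holding the given credentials.
# The Secret name and key names are illustrative assumptions.
secret_manifest() {
  # Secret data values must be base64-encoded; strip any line wrapping
  user_b64=$(printf '%s' "$1" | base64 | tr -d '\n')
  pass_b64=$(printf '%s' "$2" | base64 | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: chaos-credentials
type: Opaque
data:
  username: $user_b64
  password: $pass_b64
EOF
}
```

    The output could be piped to kubectl apply -f - in the same namespace as the deployment. The USER and PASSWORD entries in the deployment would then use valueFrom with a secretKeyRef pointing at the Secret name and key instead of a literal value.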

    In the deployment.yaml, I specified that I want to randomly kill pods deployed in the namespace default. To avoid killing the chaos deployment itself, I created a new namespace named chaos and deploy the chaos monkey code into that namespace. This provides the requisite isolation between the code doing the chaos testing and the resources that are being tested.

    kubectl config set-context $(kubectl config current-context) --namespace=chaos

    Once I set the default namespace for my command line, I deployed my resource by running the kubectl apply -f deployment.yaml command. We now have the same script that was running locally running inside of ICP!

  5. Conclusion

    To verify whether a container in a pod is healthy and ready to serve traffic, Kubernetes provides a range of health-checking mechanisms. Health checks, or probes as they are called in Kubernetes, are carried out by the kubelet to determine when to restart a container (livenessProbe) and are used by services to determine whether a pod should receive traffic (readinessProbe). While these are great for checking the overall health and responsiveness of resources, what happens to the running system when one of the resources randomly fails? A chaos script like the one in this recipe is one way to find out.
