Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications. The open source project is hosted by the Cloud Native Computing Foundation; in this tutorial, the cluster itself is hosted on the IBM Cloud Kubernetes Service.

Learning objectives

This tutorial is for developers who want to understand more about their Kubernetes cluster and learn how to debug applications and retrieve their logs.

Prerequisites

Before beginning this tutorial, you’ll need:

  • A free IBM Cloud account: If you do not have an IBM Cloud account, you can create one here.
  • To create a Kubernetes cluster on IBM Cloud.
  • To deploy a sample application and connect kubectl to the Kubernetes cluster. You can follow Lab 0 and Lab 1 for instructions.

Estimated time

Completing this tutorial should take approximately 20 minutes.

Steps

Container logs

Inspect and debug your pods

Let’s talk about application deployment. Sometimes it goes smoothly and you’ll see the message deployment <your-app-name> created! But what happens if your application deployment fails? There are a few steps that you can take: gather information, plan how to fix the problem, then test and execute the fix. Let’s look at gathering information here, as there are many tools and techniques for doing so.

  1. To get basic information about your pods you can use this simple command:

    $ kubectl get pods
    NAME                         READY     STATUS    RESTARTS   AGE
    guestbook-75786d799f-fg72k   1/1       Running   0          7m
    
  2. But you can get much more information if you describe a specific pod, like this:

    $ kubectl describe pod <your-pod-name>
    Name:           guestbook-75786d799f-fg72k
    Namespace:      default
    Node:           10.47.84.98/10.47.84.98
    Start Time:     Sun, 19 Aug 2018 12:56:23 +0300
    Labels:         pod-template-hash=3134283559
                       run=guestbook
    Annotations:    kubernetes.io/psp=ibm-privileged-psp
    Status:         Running
    ...
    

From the above, you can see the configuration information about the container(s) and the pod (labels, resource requirements, etc.), as well as status information about the container(s) and pod (state, readiness, restart count, events, etc.).

Sometimes using these basic commands is enough. For example, kubectl describe might show an image pull error because you simply forgot to supply some information (like a secret for an image pull). Or maybe you used the latest image version that wasn’t working well and you want to switch back to the previous version.
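
For that last case, here’s a minimal sketch of rolling back a deployment with the standard kubectl rollout commands (the deployment and container names follow the guestbook example; the v1 tag is an assumption for your known-good version):

    # Inspect the deployment's revision history
    $ kubectl rollout history deployment/guestbook

    # Roll back to the previous revision
    $ kubectl rollout undo deployment/guestbook

    # Or pin the image back to a known-good tag explicitly
    $ kubectl set image deployment/guestbook guestbook=ibmcom/guestbook:v1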

But what happens if we don’t find any errors and need to do a deep dive into our logs? What many developers don’t know is that an event (kubectl get events) is actually a resource type in Kubernetes. So when you run this command, it lists events like it would list any other resource and gives you a summarized view. Keep in mind that the events resource is namespaced: when you’re trying to get specific events, specify the namespace you want events for, or add --all-namespaces to see events from the whole cluster.

  1. To get the events list of your pod, use the command:

         $ kubectl get events [--namespace=default]
         LASTSEEN   FIRSTSEEN   COUNT     NAME                         KIND         SUBOBJECT                    TYPE      REASON                  SOURCE                  MESSAGE
         3m         3m          1         guestbook-75786d799f-r6mxl   Pod                                       Normal    Scheduled               default-scheduler       Successfully assigned guestbook-75786d799f-r6mxl to 10.77.155.84
         3m         3m          1         guestbook-75786d799f-r6mxl   Pod                                       Normal    SuccessfulMountVolume   kubelet, 10.77.155.84   MountVolume.SetUp succeeded for volume "default-token-5rlxc"
         3m         3m          1         guestbook-75786d799f-r6mxl   Pod          spec.containers{guestbook}   Normal    Pulled                  kubelet, 10.77.155.84   Container image "ibmcom/guestbook:v1" already present on machine
         3m         3m          1         guestbook-75786d799f-r6mxl   Pod          spec.containers{guestbook}   Normal    Created                 kubelet, 10.77.155.84   Created container
         3m         3m          1         guestbook-75786d799f-r6mxl   Pod          spec.containers{guestbook}   Normal    Started                 kubelet, 10.77.155.84   Started container
         3m         3m          1         guestbook-75786d799f-xvpvv   Pod          spec.containers{guestbook}   Normal    Killing                 kubelet, 10.77.155.84   Killing container with id docker://guestbook:Need to kill Pod
         3m         3m          1         guestbook-75786d799f         ReplicaSet                                Normal    SuccessfulDelete        replicaset-controller   Deleted pod: guestbook-75786d799f-xvpvv
         3m         3m          1         guestbook-75786d799f         ReplicaSet                                Normal    SuccessfulCreate        replicaset-controller   Created pod: guestbook-75786d799f-r6mxl
         3m         3m          1         guestbook                    Deployment                                Normal    ScalingReplicaSet       deployment-controller   Scaled down replica set guestbook-75786d799f to 0
         3m         3m          1         guestbook                    Deployment                                Normal    ScalingReplicaSet       deployment-controller   Scaled up replica set guestbook-75786d799f to 1
    

One common scenario that you can detect with events is a pod that won’t fit on any node. It will be stuck in a “Pending” state, which can happen due to something like a lack of resources, and you’ll see the problem in the “Events” section. Let’s take a look at the important event fields; you can see that each event record contains the following:

  • KIND indicates the type of resource the event is about (Pod/ReplicaSet/Deployment/etc.).
  • TYPE is the status of the event. This can be “Normal” or “Warning,” and new types could be added in the future.
  • REASON is the reason for the transition into the object’s current status.
  • SOURCE is the component reporting this event.
  • MESSAGE is a human-readable description of the event.

There are more fields that we don’t see in the list above that might be useful for you, like METADATA (the standard object metadata), EVENTTIME (when the event was first observed), and ACTION (the specific action that was taken or failed regarding the object).
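
If the summarized view is too noisy, you can narrow it down with a field selector (supported by reasonably recent kubectl versions; the pod name below is the one from the sample output above):

    # Show only Warning events in the current namespace
    $ kubectl get events --field-selector type=Warning

    # Show events for one specific pod
    $ kubectl get events --field-selector involvedObject.name=guestbook-75786d799f-r6mxl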

Get application logs

Application and system logs can help you gain a better understanding of what happened inside your cluster. You can get logs for a specific pod and if the pod has multiple containers, you can specify which container you want.

See your logs

To see the logs, you can run this simple command:

  $ kubectl logs <your-pod-name>
  [negroni] listening on :3000
  [negroni] 2018-08-19T11:55:39Z | 200 |   332.277µs | 173.193.106.55:32412 | GET /
  [negroni] 2018-08-19T11:55:39Z | 200 |   140.407µs | 173.193.106.55:32412 | GET /style.css
  [negroni] 2018-08-19T11:55:39Z | 200 |   123.595µs | 173.193.106.55:32412 | GET /script.js
  [negroni] 2018-08-19T11:55:39Z | 200 |   87.508µs | 173.193.106.55:32412 | GET /lrange/guestbook
  [negroni] 2018-08-19T11:55:39Z | 404 |   74.307µs | 173.193.106.55:32412 | GET /favicon.ico
  [negroni] 2018-08-19T11:57:30Z | 304 |   89.418µs | 173.193.106.55:32412 | GET /
  [negroni] 2018-08-19T11:57:30Z | 200 |   60.671µs | 173.193.106.55:32412 | GET /lrange/guestbook
  [negroni] 2018-08-19T12:06:23Z | 304 |   152.557µs | 173.193.106.55:32412 | GET /
  [negroni] 2018-08-19T12:06:23Z | 200 |   94.091µs | 173.193.106.55:32412 | GET /lrange/guestbook

Note: To print to the logs, write to stdout/stderr from your application. Another thing to note is that there is no get in the logs command, which means that logs, unlike events, are not a Kubernetes resource.
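
A few standard kubectl logs flags are worth knowing; the container name guestbook below is an assumption for the multi-container case:

    # Stream new log lines as they arrive
    $ kubectl logs -f <your-pod-name>

    # Pick a specific container in a multi-container pod
    $ kubectl logs <your-pod-name> -c guestbook

    # Limit the output to the last 20 lines, or to the last hour
    $ kubectl logs <your-pod-name> --tail=20
    $ kubectl logs <your-pod-name> --since=1h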

To help you get better results from logs, you can use “kubetail”, an open source project that aggregates logs from multiple pods and prints each pod’s log in a different color. Here’s the link: https://github.com/johanhaleby/kubetail.
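
A quick usage sketch, assuming kubetail is installed and on your PATH:

    # Tail the logs of every pod whose name matches "guestbook", one color per pod
    $ kubetail guestbook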

Previous logs

You can always ask for the previous container’s logs with the --previous flag, but there are two more logging levels you should know about: the node level and the cluster level. They’re quite different from each other. Let’s look at how they differ:

  • Node level: A containerized application writes to stdout and stderr, and those two streams are handled by the container engine, which redirects them to a logging driver. In Kubernetes, the driver is configured to write to a file in JSON format. The JSON logging driver treats each line as a separate message, so there is no direct support for multi-line messages; you need to handle those at the logging-agent level or higher. By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
  • Cluster level: While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches that you can consider. Here are some options:
    • Use a node-level logging agent that runs on every node.
    • Include a dedicated sidecar container for logging in an application pod (see the sketch after this list).
    • Push logs directly to a backend from within an application.
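
Here’s a minimal sketch of the sidecar pattern, along the lines of the example in the Kubernetes documentation. The pod name, log path, and busybox images are assumptions; the point is that the sidecar re-exposes a log file on its own stdout so kubectl logs can read it:

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-with-log-sidecar   # hypothetical name
    spec:
      containers:
      - name: app
        image: busybox
        # Stand-in for an app that writes to a file instead of stdout
        command: ["/bin/sh", "-c", "while true; do date >> /var/log/app.log; sleep 1; done"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      - name: log-sidecar
        image: busybox
        # Stream the file to stdout so `kubectl logs <pod> -c log-sidecar` works
        command: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app.log"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        emptyDir: {}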

For more information about these approaches, you can read this guide.

Use a shell inside a running container

Most of the time, your container logs are your pod logs, especially if your pod has only one container. But if something has gone really wrong on your cluster and you can’t get the logs from the pod with kubectl, you may have to get into the container itself and gather the logs from there, which gives you full control over what is going on inside the container. You can use kubectl exec to access a shell running in the container, figure out what the process is doing, and debug the container more comfortably.

Remember that you can only do this if the container has a shell inside it. If the image you built the container from doesn’t include one, you won’t be able to use the exec command this way. A shell takes up some space and makes your container heavier, so you might ask, “Why should I put it inside my container? Isn’t low weight the biggest advantage of using containers?” In a production environment, you will probably want to maximize performance, and there’s no reason to put a shell into the image. In a test environment, however, you’d probably be happy to have shell access for running tests while the container is running.

  1. To try this out, create a new pod from an image that includes /bin/bash:

    $ kubectl create -f https://kubernetes.io/examples/application/shell-demo.yaml
    
  2. Get a shell to the container:

    $ kubectl exec -it shell-demo -- /bin/bash
    
  3. Now list the root directory:

    root@shell-demo:/# ls /
    bin   dev  home  lib64  mnt  proc  run   srv  tmp
    boot  etc  lib   media  opt  root  sbin  sys  usr
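
If the image includes individual binaries but no interactive shell, you can still run single commands non-interactively; a couple of standard examples (the log path is an assumption for the nginx image used by shell-demo):

    # Run one command without opening a shell
    $ kubectl exec shell-demo -- env

    # Copy a file out of the container for offline inspection
    $ kubectl cp shell-demo:/var/log/nginx/error.log ./error.log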
    

Cluster networking issues

Networking issues are among the most common issues, and there’s no simple checklist you can run through to fix everything; you have to understand what’s wrong inside your cluster network. To help you find those problems, I’ve listed the critical areas to look at if you suspect a network issue. Further reading: For some guidance on Kubernetes networking, check out Kubernetes Networking: A lab on basic networking concepts.

Debug your service

An issue that comes up frequently in new Kubernetes installations is that a service isn’t working properly: you run your deployment and create a service but still don’t get any response. In this section, we’ll go over some commands that might help you figure out what’s not working.

Sometimes we simply forget to create a service for our pod. Without one, the pod won’t be reachable and we’ll get errors.

  1. To see all your services, you can use a simple command like this one:

    $ kubectl get svc
    NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    guestbook   172.21.30.218   <nodes>       3000:32412/TCP   45m
    
  2. If the service you are looking for doesn’t exist, you can create it by using this command:

    $ kubectl expose deployment <your-deployment-name> --type="NodePort" --port=3000
    service "guestbook" exposed
    

If you already have a service, you should look at its configuration and see if the problem is there.

  1. Let’s try to get more information about this service:

    $ kubectl describe service <your-service-name>
    Name:                   guestbook
    Namespace:              default
    Labels:                 run=guestbook
    Annotations:            <none>
    Selector:               run=guestbook
    Type:                   NodePort
    IP:                     172.21.138.209
    Port:                   <unset> 3000/TCP
    NodePort:               <unset> 31235/TCP
    Endpoints:              172.30.87.71:3000
    Session Affinity:       None
    Events:                 <none>
    

Check that you used the right NodePort to access the container and that the service has endpoints. If something looks wrong (like missing endpoints), try to re-create the service and double-check your kubectl expose command for possible mistakes.
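
A quick way to inspect the endpoints directly is the standard kubectl get endpoints command; if the ENDPOINTS column is empty, the service’s selector probably doesn’t match any pod labels (the output below is illustrative, reusing values from the describe above):

    $ kubectl get endpoints <your-service-name>
    NAME        ENDPOINTS           AGE
    guestbook   172.30.87.71:3000   45m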

  1. You can also get your information in JSON format:

         $ kubectl get service <your-service-name> -o json
         {
             "apiVersion": "v1",
             "kind": "Service",
             "metadata": {
                 "creationTimestamp": "2018-08-19T11:55:12Z",
                 "labels": {
                     "run": "guestbook"
                 },
                 "name": "guestbook",
                 "namespace": "default",
                 "resourceVersion": "1118",
                 "selfLink": "/api/v1/namespaces/default/services/guestbook",
                 "uid": "baa2e98d-a3a6-11e8-a994-f20601bb534c"
             },
             "spec": {
                 "clusterIP": "172.21.30.218",
                 "externalTrafficPolicy": "Cluster",
                 "ports": [
                     {
                         "nodePort": 32412,
                         "port": 3000,
                         "protocol": "TCP",
                         "targetPort": 3000
                     }
                 ],
                 "selector": {
                     "run": "guestbook"
                 },
                 "sessionAffinity": "None",
                 "type": "NodePort"
             },
             "status": {
                 "loadBalancer": {}
             }
         }
    

You should always double-check configurations that might cause the problem. For example, is spec.ports.targetPort the port your pod is actually listening on? Are the service and the pod using the same protocol?
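
As a minimal sketch, here is what those fields look like in a service manifest, with the values to double-check called out (the names and ports follow the guestbook example above):

    apiVersion: v1
    kind: Service
    metadata:
      name: guestbook
    spec:
      type: NodePort
      selector:
        run: guestbook        # must match the labels on your pods
      ports:
      - port: 3000            # port the service exposes inside the cluster
        targetPort: 3000      # must equal the port your container listens on
        protocol: TCP         # must match the protocol your pod serves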

  1. There are a few more ways to inspect your service. You can read more here.

Check DNS

Some network problems are caused by DNS configuration issues or errors, so first you’ll need to check whether DNS works correctly.

  1. Use the get pods command to get your pod name:

    $ kubectl get pods
    NAME                         READY     STATUS    RESTARTS   AGE
    guestbook-75786d799f-brrhf   1/1       Running   0          47m
    
  2. Check the pod DNS with the following command:

    $ kubectl exec -ti <your-pod-name> -- nslookup kubernetes.default
    Server:    172.21.0.10
    Address 1: 172.21.0.10 kube-dns.kube-system.svc.cluster.local
    
    Name:      kubernetes.default
    Address 1: 172.21.0.1 kubernetes.default.svc.cluster.local
    

Note that if you don’t use the “default” namespace, you should adjust the names according to your cluster’s namespaces. If your pod and service are in different namespaces, try a namespace-qualified name (e.g., guestbook.default); note that you will need to adjust your app to use the cross-namespace name, or run your app and service in the same namespace. If it still fails, try a fully qualified name (e.g., guestbook.default.svc.cluster.local). If the nslookup command fails, you should check your configuration for errors. An error like nslookup: can't resolve 'kubernetes.default' might indicate that the problem is in the coredns/kube-dns add-on or its associated services.

If you are able to do a fully-qualified name lookup but not a relative one, you need to check that your /etc/resolv.conf file is correct.
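
Before digging into resolv.conf, it’s also worth confirming that the DNS add-on itself is up. These are the standard checks from the Kubernetes debugging docs (the k8s-app=kube-dns label is the usual convention for both kube-dns and CoreDNS, but it can differ per cluster):

    # Is the DNS service present?
    $ kubectl get svc --namespace=kube-system kube-dns

    # Are the DNS pods running?
    $ kubectl get pods --namespace=kube-system -l k8s-app=kube-dns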

  1. Go inside the resolv.conf file to see if the parameters are ok:

    $ kubectl exec <your-pod-name> -- cat /etc/resolv.conf
    nameserver 172.21.0.10
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5
    

Make sure that the nameserver line indicates your cluster’s DNS service; this is passed into the kubelet with the --cluster-dns flag. The search line must include an appropriate suffix for the service name to resolve. In this case, it looks for services in the local namespace (default.svc.cluster.local), in all namespaces (svc.cluster.local), and in the cluster (cluster.local). Depending on your installation, you might have additional records. The cluster suffix is passed into the kubelet with the --cluster-domain flag. The options line must set ndots high enough that your DNS client library considers search paths at all; Kubernetes sets it to 5 by default, which is high enough to cover all of the DNS names it generates.

If all of the above still fail, you might need to check your kube-proxy.
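
A couple of starting points for that check (the label below is the kubeadm convention and may differ on your cluster; on managed clusters you may need node access):

    # See whether the kube-proxy pods are running
    $ kubectl get pods --namespace=kube-system -l k8s-app=kube-proxy

    # Or, on a node itself, check for the process directly
    $ ps auxw | grep kube-proxy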

Network policies

A NetworkPolicy defines how pods are allowed to communicate with each other and with other network endpoints. NetworkPolicy uses labels to manage the traffic between pods. If you are not able to communicate with a pod, you might want to check your network policies to see whether that pod is allowed to receive requests. By default, pods are not isolated and accept all traffic. But once a NetworkPolicy selects a specific pod, that pod rejects any connections that are not allowed by a NetworkPolicy.

Let’s take a look at a NetworkPolicy example:

        kind: NetworkPolicy
        apiVersion: networking.k8s.io/v1
        metadata:
          name: access-nginx
        spec:
          podSelector:
            matchLabels:
              run: nginx
          policyTypes:
          - Ingress
          - Egress
          ingress:
          - from:
            - podSelector:
                matchLabels:
                  access: "true"
          egress:
          - to:
            - ipBlock:
                cidr: 10.0.0.0/24
            ports:
            - protocol: TCP
              port: 5978

Like every other Kubernetes config, NetworkPolicy has the kind, apiVersion, and metadata parameters for general information. You can see in the above example that we have the podSelector inside the spec, which selects the pods we want to include in this NetworkPolicy. In this example, all pods with the run=nginx label will be included. Note: If your podSelector is empty, this NetworkPolicy will affect all the pods in its namespace!

Other important parameters that you should check are ingress and egress; they control what inbound and outbound traffic is allowed to and from the pods selected by the podSelector. Double-check those parameters and make sure you’ve entered the right labels, IPs, and ports so that your pods can communicate and aren’t blocked.
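
To test an ingress rule like the one above, a standard trick from the Kubernetes docs is to start a throwaway pod with and without the required label (busybox and an nginx service exposing the selected pods are assumptions; depending on your kubectl version you may also need --restart=Never):

    # This pod carries access=true, so the policy should let it reach nginx
    $ kubectl run test-allowed --rm -ti --labels="access=true" --image=busybox -- /bin/sh
    / # wget --spider --timeout=1 nginx

    # The same pod without the label should be blocked
    $ kubectl run test-blocked --rm -ti --image=busybox -- /bin/sh
    / # wget --spider --timeout=1 nginx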

Using Weave Scope

Weave Scope is an open-source visualization and monitoring tool for Docker and Kubernetes. It provides a top-down view into your app as well as your entire infrastructure, and allows you to diagnose problems with your distributed containerized app in real time as it is deployed to a cloud provider. To install Weave Scope on your Kubernetes cluster, find directions here. If not Weave Scope, I highly encourage you to use a similar monitoring tool to easily display what your containers are doing and why.

Using Grafana

Grafana is an open-source, general-purpose dashboard and graph composer that runs as a web application. You can use Grafana to get CPU, memory, and load metrics for your clusters and pods. Every IBM Cloud user automatically has access to Grafana with their account. To start using Grafana, read the tutorial here. To understand how to work with Grafana and create a dashboard for your Kubernetes cluster, read more here.

Summary

Now that you have learned the basics of logging and debugging your Kubernetes application, you can try out and explore some more tools: