Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Autoscaling is a key feature of Kubernetes clusters that scales applications automatically based on demand.

Learning objectives

This guide will step you through the process of enabling autoscaling in your Kubernetes cluster and configuring an application to automatically scale up or down based on its CPU utilization.

Prerequisites

  • A provisioned multi-node Kubernetes (version 1.7 or later) cluster.
  • The kubectl command-line tool, configured to communicate with the cluster.

Estimated time

It should take about one hour to complete this how-to.

Steps

1. Set up Heapster to collect pod metrics

Heapster monitoring needs to be deployed on the Kubernetes cluster so that the autoscaler has pod metrics, such as CPU and memory utilization, to act on. In this guide, we set up Heapster with an InfluxDB backend and a Grafana interface.

First, download the following YAML files:

curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/grafana.yaml > grafana.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml > heapster.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml > influxdb.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml > heapster-rbac.yaml

Create service instances of Grafana, Heapster, and InfluxDB, with the corresponding Kubernetes service account and role binding:

$ kubectl create -f grafana.yaml
deployment "monitoring-grafana" created
service "monitoring-grafana" created

$ kubectl create -f heapster.yaml
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created

$ kubectl create -f influxdb.yaml
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created

$ kubectl create -f heapster-rbac.yaml
clusterrolebinding "heapster" created
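
All of the above objects are created in the kube-system namespace. Before moving on, verify that the heapster, monitoring-influxdb, and monitoring-grafana pods reach the Running state:

$ kubectl get pods --namespace=kube-system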

2. Create a deployment

For demonstration purposes, we create a test deployment that uses the ubuntu image to run the sleep command. You can replace this with your own application. Note that the --requests=cpu flag has to be set: the autoscaler computes CPU utilization as a percentage of the pod's CPU request, so it cannot scale on CPU without one.

$ kubectl run autoscale-test --image=ubuntu:16.04 --requests=cpu=1000m --command -- sleep 1800
deployment "autoscale-test" created

3. Set up a Horizontal Pod Autoscaler

Now, set up a Horizontal Pod Autoscaler to monitor and autoscale the deployment created in the previous step. You specify the target CPU utilization percentage, along with the minimum and maximum number of pods to maintain. The autoscaler adds pods to the deployment (up to the maximum) when the average CPU utilization of the existing pods exceeds the specified target, and removes pods (down to the minimum) when it drops below the target. The desired replica count is calculated as ceil(currentReplicas × averageUtilization / targetUtilization).

$ kubectl autoscale deployment autoscale-test --cpu-percent=25 --min=1 --max=5
deployment "autoscale-test" autoscaled
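
The kubectl autoscale command is shorthand for creating a HorizontalPodAutoscaler object. If you would rather manage it declaratively, a minimal equivalent manifest (the file name hpa.yaml is arbitrary) looks like this:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: autoscale-test
spec:
  scaleTargetRef:
    kind: Deployment
    name: autoscale-test
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 25

Create it with kubectl create -f hpa.yaml.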

Check the current status of the autoscaler:

$ kubectl get hpa
NAME             REFERENCE                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0% / 25%   1         5         1          1m

You can repeat this step to enable autoscaling on other deployments in the Kubernetes cluster. If the TARGETS column shows <unknown> instead of a percentage, give Heapster a minute or two to start reporting metrics, then check again.

4. Validate autoscaler operation

To validate that autoscaling is functioning properly, we use the “stress” utility to put artificial CPU load on the pod.

1. Get the pod name

$ kubectl get pod
NAME                              READY     STATUS    RESTARTS   AGE
autoscale-test-59d66dcbf7-9fqr8   1/1       Running   0          9m
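
Your pod name will differ, since the suffix is generated. Because kubectl run labels the pods it creates with run=<deployment name>, you can also capture the name in a shell variable instead of copying it by hand (the variable name POD is just an example):

$ POD=$(kubectl get pod -l run=autoscale-test -o jsonpath='{.items[0].metadata.name}')

The remaining steps use the literal pod name; substitute your own.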

2. Install the “stress” utility in the pod

Pass -y to apt-get so the installation does not stall at the interactive confirmation prompt (kubectl exec provides no terminal to answer it):

kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get update
kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get install -y stress

3. Run a CPU workload in the pod for 10 minutes

$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- stress --cpu 2 --timeout 600s &
stress: info: [227] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
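
While the load runs, you can also inspect per-pod usage directly with kubectl top, which is served by the Heapster deployment from step 1:

$ kubectl top pod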

4. Check the status of the autoscaler

Checking the status of the autoscaler (hpa) repeatedly, you should see the number of replicas increase (up to the maximum) while the stress is running, and decrease (down to the minimum) after the stress completes. Expect the scale-down to lag: by default, the controller waits five minutes after the load subsides before removing pods, to avoid thrashing.
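
Instead of rerunning the command by hand, you can leave a watch open to stream status changes as they happen:

$ kubectl get hpa autoscale-test --watch

Sampled over time, the status looks like this: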

$ kubectl get hpa
NAME             REFERENCE                   TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   199% / 25%   1         5         1          13m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   49% / 25%   1         5         4          16m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   39% / 25%   1         5         5          20m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0% / 25%   1         5         1          25m

5. Cleanup

Delete the deployment created for testing and its corresponding Horizontal Pod Autoscaler:

$ kubectl delete hpa autoscale-test
horizontalpodautoscaler "autoscale-test" deleted

$ kubectl delete deploy autoscale-test
deployment "autoscale-test" deleted
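
Alternatively, kubectl can delete both objects in a single command:

$ kubectl delete hpa,deployment autoscale-test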

If you don’t intend to continue using autoscaling on your Kubernetes cluster, delete the Heapster, Grafana, and InfluxDB services:

$ kubectl delete -f heapster-rbac.yaml
clusterrolebinding "heapster" deleted

$ kubectl delete -f grafana.yaml
deployment "monitoring-grafana" deleted
service "monitoring-grafana" deleted

$ kubectl delete -f heapster.yaml
serviceaccount "heapster" deleted
deployment "heapster" deleted
service "heapster" deleted

$ kubectl delete -f influxdb.yaml
deployment "monitoring-influxdb" deleted
service "monitoring-influxdb" deleted

Summary

This guide illustrated how to set up resource monitoring on Kubernetes with Heapster, using InfluxDB as the storage backend and Grafana as the user interface. It also described how to configure a Horizontal Pod Autoscaler to automatically scale the number of pods in a deployment based on CPU utilization.