Autoscale an application on a Kubernetes cluster
Enable autoscaling in your Kubernetes cluster and configure an app to automatically scale up or down based on CPU utilization
Kubernetes is an open-source container orchestration platform that allows users to automate deployment, scaling, and management of their containerized applications with ease. Autoscaling is a key feature in Kubernetes clusters that allows users to scale their applications automatically based on demand.
This guide will step you through the process of enabling autoscaling in your Kubernetes cluster and configuring an application to automatically scale up or down based on its CPU utilization.
- A provisioned multi-node Kubernetes (version 1.7 or later) cluster.
It should take about one hour to complete this how-to.
1. Set up Heapster to collect pod metrics
Heapster monitoring needs to be deployed on the Kubernetes cluster so the autoscaler can collect metrics such as CPU and memory utilization of the pods. In this guide, we will set up Heapster with an InfluxDB backend and a Grafana interface.
First, download the following YAML files:
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/grafana.yaml > grafana.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml > heapster.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml > influxdb.yaml
curl https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml > heapster-rbac.yaml
Create service instances of Grafana, Heapster, and InfluxDB, with the corresponding Kubernetes service account and role binding:
$ kubectl create -f grafana.yaml
deployment "monitoring-grafana" created
service "monitoring-grafana" created

$ kubectl create -f heapster.yaml
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created

$ kubectl create -f influxdb.yaml
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created

$ kubectl create -f heapster-rbac.yaml
clusterrolebinding "heapster" created
2. Create a deployment
For demonstration purposes, we create a test deployment using the ubuntu image running the sleep command. You can replace this with your own application. Note that the --requests=cpu flag must be set: the autoscaler needs a CPU request on the pods in order to compute CPU utilization.
$ kubectl run autoscale-test --image=ubuntu:16.04 --requests=cpu=1000m --command sleep 1800
deployment "autoscale-test" created
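The kubectl run invocation above can also be expressed as a manifest, which is easier to version-control. This is a sketch, assuming a current apps/v1 API (1.7-era clusters used older API groups such as extensions/v1beta1 for Deployments):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: autoscale-test
spec:
  replicas: 1
  selector:
    matchLabels:
      run: autoscale-test
  template:
    metadata:
      labels:
        run: autoscale-test
    spec:
      containers:
      - name: autoscale-test
        image: ubuntu:16.04
        command: ["sleep", "1800"]
        resources:
          requests:
            # Required for CPU-based autoscaling: utilization is
            # measured relative to this request.
            cpu: 1000m
```

Apply it with kubectl create -f (or kubectl apply -f) instead of the kubectl run command.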
3. Set up a Horizontal Pod Autoscaler
Now, set up a Horizontal Pod Autoscaler to monitor and autoscale the deployment created in the previous step. You have to specify the target CPU utilization percentage, along with the minimum and maximum number of pods to be maintained. The autoscaler will add pods to the deployment (up to the maximum) if the average CPU utilization of the existing pods exceeds the specified target, and will remove pods (down to the minimum) if the average CPU utilization drops below the target.
$ kubectl autoscale deployment autoscale-test --cpu-percent=25 --min=1 --max=5
deployment "autoscale-test" autoscaled
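Equivalently, the autoscaler can be declared as a manifest using the stable autoscaling/v1 API, which maps one-to-one onto the flags above:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: autoscale-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autoscale-test
  minReplicas: 1                       # --min=1
  maxReplicas: 5                       # --max=5
  targetCPUUtilizationPercentage: 25   # --cpu-percent=25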
Check the current status of the autoscaler:
$ kubectl get hpa
NAME             REFERENCE                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0% / 25%   1         5         1          1m
You can repeat this step to enable autoscaling on other deployments in the Kubernetes cluster.
4. Validate autoscaler operation
To validate that autoscaling is functioning properly, we use the “stress” utility to put some artificial load on the pod.
1. Get the pod name
$ kubectl get pod
NAME                              READY     STATUS    RESTARTS   AGE
autoscale-test-59d66dcbf7-9fqr8   1/1       Running   0          9m
2. Install the “stress” utility on the pod
$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get update
$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get install -y stress
3. Run a CPU workload on the pod for 10 minutes
$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- stress --cpu 2 --timeout 600s &
stress: info: dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
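A quick back-of-the-envelope check, assuming each stress worker saturates roughly one full core: two workers against the pod's 1000m CPU request should report about 200% utilization, which matches the readings in the next step.

```python
# Expected HPA utilization for `stress --cpu 2` on a pod requesting 1000m.
# Assumption: each stress worker busy-loops and consumes ~1 full core (1000m).
workers = 2
millicores_per_worker = 1000   # one saturated core
requested_millicores = 1000    # from --requests=cpu=1000m

# Utilization is measured as a percentage of the pod's CPU request.
utilization_pct = workers * millicores_per_worker / requested_millicores * 100
print(utilization_pct)  # 200.0
```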
4. Check the status of the autoscaler
If you repeatedly check the status of the autoscaler (hpa), you should see the number of pods increase (up to the maximum) while the stress run is active, and decrease (down to the minimum) after it finishes.
$ kubectl get hpa
NAME             REFERENCE                   TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   199% / 25%   1         5         1          13m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   49% / 25%   1         5         4          16m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   39% / 25%   1         5         5          20m

$ kubectl get hpa
NAME             REFERENCE                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0% / 25%   1         5         1          25m
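The replica counts above follow the HPA's proportional scaling rule: scale the replica count by the ratio of observed to target utilization, then clamp to the configured bounds. A simplified sketch (the real controller also applies a tolerance band and rate limits, omitted here):

```python
import math

def desired_replicas(current_replicas, current_utilization_pct,
                     target_utilization_pct, min_replicas, max_replicas):
    """Simplified HPA rule: scale proportionally to observed/target
    utilization, rounded up, clamped to [min_replicas, max_replicas]."""
    ratio = current_utilization_pct / target_utilization_pct
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(desired, max_replicas))

# 1 pod at 199% vs a 25% target wants ceil(1 * 199/25) = 8, clamped to 5.
print(desired_replicas(1, 199, 25, 1, 5))  # 5
# At 0% utilization the controller scales back down to the minimum.
print(desired_replicas(5, 0, 25, 1, 5))    # 1
```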
5. Clean up
Delete the deployment created for testing and its corresponding Horizontal Pod Autoscaler:
$ kubectl delete hpa autoscale-test
horizontalpodautoscaler "autoscale-test" deleted

$ kubectl delete deploy autoscale-test
deployment "autoscale-test" deleted
If you don’t intend to continue using autoscaling on your Kubernetes cluster, delete the Heapster, Grafana, and InfluxDB services:
$ kubectl delete -f heapster-rbac.yaml
clusterrolebinding "heapster" deleted

$ kubectl delete -f grafana.yaml
deployment "monitoring-grafana" deleted
service "monitoring-grafana" deleted

$ kubectl delete -f heapster.yaml
serviceaccount "heapster" deleted
deployment "heapster" deleted
service "heapster" deleted

$ kubectl delete -f influxdb.yaml
deployment "monitoring-influxdb" deleted
service "monitoring-influxdb" deleted
This guide has illustrated how to set up monitoring with Heapster, using InfluxDB as the backend and Grafana as the user interface, to collect resource metrics on Kubernetes. It also described the steps to configure a Horizontal Pod Autoscaler that automatically scales the number of pods in a deployment based on CPU utilization.