Introduction to KEDA
Automatically scale your containers on Kubernetes
Although one of the benefits of event-driven applications is that they are easy to scale in nature, the solutions offered on Kubernetes are limited in terms of overhead or practicality. If you’re looking to scale your event-driven applications more elegantly, KEDA is a lightweight, open source solution that automatically scales your containers on Kubernetes.
In this article, I introduce you to the basics of KEDA using a Kafka trigger, explain what the properties mean, and how KEDA can affect the scaling of your pods.
All the sample code used in this article is also available in this GitHub repository.
So, what exactly is KEDA? KEDA (Kubernetes-based Event Driven Autoscaler) is an Apache 2.0-licensed open source project that was created by Microsoft and Red Hat, but has since become a Cloud Native Computing Foundation (CNCF) sandbox project. The aim of KEDA is to provide better scaling options for your event-driven applications on Kubernetes.
Let’s take a look at what this means.
Currently on Kubernetes, the Horizontal Pod Autoscaler (HPA) only reacts to resource-based metrics such as CPU or memory usage or custom metrics. For event-driven applications where there could suddenly be a stream of data, this could be quite slow to scale up. Plus, it has to scale back down once the data stream is slower and remove the extra pods.
Paying for those unneeded resources all the time isn’t fun!
KEDA is more proactive. It monitors your event source and feeds this data back to the HPA resource as a custom metric for you. This way, KEDA can scale any container based on the number of events that need to be processed, before the CPU or memory usage goes up. You can also explicitly set which deployments KEDA should scale for you. So, you can tell it to only scale a specific application, for example, the consumer.
You can add KEDA to your existing clusters, so there is a lot of flexibility for how you want to use it. You don’t need to do a code change and you don’t need to change your other containers. It only needs to be able to look at your event source and the deployments you are interested in scaling.
Now that we’re past the long introduction, let’s look at a diagram for a high-level view of what KEDA does.
KEDA monitors your event source and regularly checks if there are any events. When needed, KEDA activates or deactivates your pod depending on whether there are any events by setting the deployment’s replica count to 1 or 0, depending on your minimum replica count. KEDA also exposes metrics data to the HPA which handles the scaling to and from 1.
Now that I’ve covered the basics, let’s look at how to deploy and use KEDA.
The instructions for deploying KEDA are very simple and can be found on KEDA’s deploy page.
There are three ways to deploy KEDA into your Kubernetes cluster:
- Operator Hub
- Deploy YAMLs
This article focuses on the third option, although the contents should be the same.
So, what gets deployed in a KEDA deployment? The deployment contains the KEDA operator, roles, and role bindings, as well as these custom resources:
ScaledObject: The ScaledObject maps an event source to the deployment that you want to scale.
TriggerAuthentication: If required, this resource contains the authentication configuration needed for monitoring the event source.
The scaled object controller also creates the HPA for you.
Let’s take a closer look at the
This is a code snippet of the one I use in my sample repository.
apiVersion: keda.k8s.io/v1alpha1 kind: ScaledObject metadata: name: consumer-scaler labels: deploymentName: consumer-service namespace: keda-sample spec: scaleTargetRef: deploymentName: consumer-service pollingInterval: 1 cooldownPeriod: 30 minReplicaCount: 0 maxReplicaCount: 10 triggers: - type: kafka metadata: topic: messages brokerList: my-cluster-kafka-bootstrap.kafka:9092 consumerGroup: testSample lagThreshold: '5'
ScaledObject and the deployment referenced in
deploymentName need to be in the same namespace.
So let’s look at each property in the spec section and see what they are used for.
scaleTargetRef is the reference to the deployment that you want to scale. In this example, I have a
consumer-service app that I want to scale depending on the number of events coming through to Kafka.
scaleTargetRef: deploymentName: consumer-service
The polling interval is in seconds. This is the interval in which KEDA checks the triggers for the queue length or the stream lag.
pollingInterval: 1 # Default is 30
The cooldown period is also in seconds, and it is the period of time to wait after the last trigger activated before scaling back down to 0.
cooldownPeriod: 30 # Default is 300
But what does activated mean and when is a trigger activated? Having a look at the code and the documentation, activated is the time at which KEDA last checked the event source and found that there were events. At that point, the trigger is set to active.
The next time KEDA looks at the event source and finds it empty, the trigger is set to inactive and kicks off the cooldown period before scaling down to 0.
This timer is cancelled if any events are detected again in the event source.
This could be interesting to balance with the polling interval to make sure it doesn’t scale down too fast before the events are done being consumed!
The following code sets the minimum number of replicas that KEDA will scale a deployment down to.
minReplicaCount: 0 # Default is 0
maxReplicaCount sets the maximum number of replicas that KEDA will scale up to, as you can see here:
maxReplicaCount: 10 # Default is 100
This is the list of triggers to use to activate the scaling. In this example, I use Kafka as my event source.
triggers: - type: kafka
Although KEDA supports multiple types of event sources, this article uses the Kafka scaler. You can see the YAML for this below:
triggers: - type: kafka metadata: topic: messages brokerList: my-cluster-kafka-bootstrap.kafka:9092 consumerGroup: testSample lagThreshold: '5'
Kafka trigger properties
The following code snippets show you the Kafka trigger properties to use with KEDA:
The following code is the name of the topic that you want to check the events in.
Here you can list the brokers that KEDA should monitor on as a comma-separated list.
This is the name of the consumer group and should be the same one as the one that is consuming the events from the topic so that KEDA knows which offsets to look at.
The following code took me a while to figure out, but that is probably down to my inexperience in this area. In the documentation, the
lagThreshold property is described as how much the event stream is lagging. So, I thought it was something with time.
In reality, the lag refers to the number of records that haven’t been read yet by the consumer.
KEDA checks against the total number of records in each of the partitions and the last consumed record. After some calculations, this is used to identify how much it should scale the deployments.
lagThreshold: '3' # Default is 10
For Kafka, the number of partitions in your topic affects how KEDA handles the scaling as it will not scale beyond the number of partitions you requested for your topic.
See KEDA in practice
So, what does this look like in practice? In the sample repository, you can find a very simple consumer service using Kafka as the event source. Grab the code from the repo and I’ll walk you through some experiments with KEDA. You can find the instructions to start up the services in the README.
The repository contains a basic consumer service that simply outputs the messages from the Kafka topic and our KEDA scaler.
Here is how the keda-sample namespace looks like before KEDA is started:
$ kubectl get all -n keda-sample NAME READY STATUS RESTARTS AGE pod/consumer-service-7d4bd5df95-l474d 1/1 Running 0 2m56s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/consumer-service ClusterIP 10.107.101.221 <none> 8090/TCP 2m56s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/consumer-service 1/1 1 1 2m56s NAME DESIRED CURRENT READY AGE replicaset.apps/consumer-service-7d4bd5df95 1 1 1 2m56s
You can see that there is one pod for the consumer-service currently active.
So, what happens after you start up the KEDA scaler?
$ kubectl get all -n keda-sample NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/consumer-service ClusterIP 10.107.101.221 <none> 8090/TCP 5m23s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/consumer-service 0/0 0 0 5m23s NAME DESIRED CURRENT READY AGE replicaset.apps/consumer-service-7d4bd5df95 0 0 0 5m23s NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/keda-hpa-consumer-service Deployment/consumer-service <unknown>/5 (avg) 1 10 0 3s
You can see that the HPA is created and the consumer-service pod disappeared.
Let’s try and send a message to the Kafka topic.
$ kubectl -n kafka run kafka-producer -ti --image=strimzi/kafka:0.18.0-kafka-2.5.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list my-cluster-kafka-bootstrap:9092 --topic messages If you don't see a command prompt, try pressing enter. >Hello World
$ kubectl get pods -n keda-sample NAME READY STATUS RESTARTS AGE consumer-service-7d4bd5df95-j96w8 1/1 Running 0 33s
A new consumer-service pod is back up! Once the cooldown period has passed, you can see that the pod is removed again as there were no more events in the topic.
$ kubectl get pods -n keda-sample No resources found in keda-sample namespace.
What happens if I send many messages at once to Kafka? Let’s see!
$ kubectl get pods -n keda-sample NAME READY STATUS RESTARTS AGE consumer-service-7d4bd5df95-5pp78 1/1 Running 0 17s consumer-service-7d4bd5df95-hs8qh 1/1 Running 0 8s
There are two pods up!
So, I see what happens when I manually send messages here and there, but what happens in a more real situation when there is a stream of messages? When that happens, you need to ensure only a specific number of messages get through for a specified amount of time.
The following command sends 100 messages to the topic, throttled at 3 per second.
kubectl -n kafka run kafka-producer -ti --image=strimzi/kafka:0.18.0-kafka-2.5.0 --rm=true --restart=Never -- bin/kafka-producer-perf-test.sh --topic messages --throughput 3 --num-records 100 --record-size 4 --producer-props bootstrap.servers=my-cluster-kafka-bootstrap:9092
Let’s give the command a go!
You can see more pods getting created over time to help handle the events that are coming in.
$ kubectl get pods -n keda-sample NAME READY STATUS RESTARTS AGE consumer-service-7d4bd5df95-j2s5s 0/1 ContainerCreating 0 1s consumer-service-7d4bd5df95-sq2cb 1/1 Running 0 15s consumer-service-7d4bd5df95-w6j7g 0/1 ContainerCreating 0 1s consumer-service-7d4bd5df95-whghl 0/1 ContainerCreating 0 1s
Now there are five pods up. It won’t create more than five as I specified five partitions for my Kafka topic.
$ kubectl get pods -n keda-sample NAME READY STATUS RESTARTS AGE consumer-service-7d4bd5df95-j2s5s 1/1 Running 0 86s consumer-service-7d4bd5df95-sq2cb 1/1 Running 0 100s consumer-service-7d4bd5df95-vfc9k 1/1 Running 0 71s consumer-service-7d4bd5df95-w6j7g 1/1 Running 0 86s consumer-service-7d4bd5df95-whghl 1/1 Running 0 86s
And once no more events are found in the topic, the deployments get scaled back down.
$ kubectl get pods -n keda-sample NAME READY STATUS RESTARTS AGE consumer-service-7d4bd5df95-j2s5s 0/1 Terminating 0 3m12s consumer-service-7d4bd5df95-sq2cb 0/1 Terminating 0 3m26s consumer-service-7d4bd5df95-vfc9k 0/1 Terminating 0 2m57s consumer-service-7d4bd5df95-w6j7g 0/1 Terminating 0 3m12s
Scaling Kubernetes jobs
KEDA doesn’t just scale deployments, but it can also scale your Kubernetes jobs. Instead of having many events processed in your deployment and scaling up and down based on the number of messages needing to be consumed, KEDA can spin up a job for each message in the event source.
Once a job completes processing its single message, it terminates.
You can configure how many parallel jobs should be run at a time as well, similar to the maximum number of replicas you want in a deployment.
KEDA offers this as a solution to handling long-running executions as the job only terminates once the message processing has completed as opposed to deployments which terminate based on a timer.
You can also join the KEDA community on their dedicated Slack channel or participate in their community meetings. The information on how exactly to get involved in the community can be found on KEDA’s community page.