Monitoring a Kubernetes cluster

Use this tutorial to learn how to configure an IBM Cloud® Kubernetes Service cluster to forward metrics to the IBM Cloud® Monitoring service. You can monitor clusters in IBM Cloud, on-prem, and in other clouds.

To configure a cluster to forward metrics, you must install a monitoring agent onto each worker node in your Kubernetes cluster by using a DaemonSet. The monitoring agent uses an access key (token) to authenticate with the IBM Cloud Monitoring instance. The monitoring agent acts as a data collector. It automatically collects metrics such as worker node CPU and worker node memory usage, HTTP traffic into and out of your containers, and data about several infrastructure components. In addition, the agent can collect custom application metrics by using either a Prometheus-compatible scraper or a StatsD facade.

Figure 1. Components overview on the IBM Cloud

For example, to configure your Kubernetes cluster to forward metrics to your IBM Cloud Monitoring instance, you can deploy the agent by using Helm or a script:

Helm
Script

The Monitoring agent automatically collects the following types of system metrics per host:

System hosts metrics provide information about CPU, memory, and storage usage that you can use to analyze the performance and resource utilization of all your processes.
File and File System metrics provide information about files and file systems that you can use to analyze file interactions that occur in your system. For example, you can find information about your open files, bytes going in and out, or the percentage of usage of a given file system.
Process metrics provide information about the processes that run on your servers. For example, you can use these metrics to explore the number of processes, or get client or server information.
Network metrics provide information about the network. They offer insight into the connections that are established between your applications, containers, and servers. For example, you can find information about the bytes that are being sent or received, or the number of HTTP requests, connections, and latency. In addition, for SQL or MongoDB, the agent collects additional information when it is configured in troubleshooting mode.

The Monitoring agent automatically collects the following types of metrics per Kubernetes cluster:

State metrics: Kube state metrics report on the health and state of the various objects that run inside Kubernetes components, such as deployments, nodes and pods. To see the list of metrics that are collected by default, see Kubernetes State.
Resource usage metrics: Resource usage metrics report on the health and state of CPU and memory for workers (nodes) and pods that are running in the cluster. The data can be analyzed by namespace, by worker, by pod, or by workload object such as deployments and DaemonSets.

For a list of collected metrics, see Metrics Available for orchestrated environments.

Through the Monitoring UI, you can analyze data in the Advisor tab, the Explore tab, and in the Dashboard tab. You monitor the data through metric views and dashboards.

Consider the following information when monitoring your data:

In the Explore tab, you can monitor individual metrics.
In the Advisor tab, you can monitor Kubernetes or host level metrics.

This tab is available only to users who belong to a team that has access to monitor Kubernetes or host level metrics.
In the Dashboard tab, you can monitor data through the panels of predefined or custom dashboards and get specialized insight into network data, application data, topology, services, hosts, and containers. A panel displays a metric or group of metrics in a dashboard.

For each metric view and dashboard, you can define the scope of the data, how to aggregate data, and what time and group filters to apply to the data. For more information, see Managing panels.

You can configure a dashboard as the default entry point for a team. A default dashboard unifies the team's experience and lets users focus immediately on the information that is most relevant to them.

For more information, see Viewing metrics.

Objectives

In this tutorial, you configure metrics for your IBM Cloud® Kubernetes Service cluster. In particular, you:

Provision an IBM Cloud Monitoring instance.
Configure the monitoring agent in your cluster to send metrics.
Use the monitoring UI to analyze your cluster metrics.

Before you begin

Read about Monitoring.
Have a user ID that is a member or an owner of an IBM Cloud account. To get an IBM Cloud user ID, go to: Registration.
Get information about Kubernetes monitoring agent images.
Check the Kubernetes resource requirements.
Install the IBM Cloud CLI and plug-ins:
- IBM Cloud CLI (ibmcloud)
- IBM Cloud Kubernetes Service plug-in (ibmcloud ks)
- IBM Cloud Container Registry plug-in (ibmcloud cr)
- IBM Cloud Kubernetes Service observability plug-in (ibmcloud ob)
Install the Kubernetes CLI (kubectl)

Make sure that the kubectl version is compatible with your cluster version. If the kubectl version is not compatible, you can get an error such as kubectl create clusterrolebinding failed!. You can use kubectl version --short to check the versions of your cluster and your kubectl client. Recent kubectl releases no longer accept the --short flag; with those releases, run kubectl version without it.
Create a cluster or use an existing IBM Cloud Kubernetes Service cluster.
Make sure that your user ID is assigned the following IBM Cloud® Identity and Access Management policies:

Table 1. List of IAM policies required to complete the tutorial
Resource	Scope of the access policy	Role	Region	Information
Default resource group	Resource group	Viewer	us-south	This policy is required to allow the user to see service instances in the Default resource group.
IBM Cloud Monitoring service	Resource group	Editor	us-south	This policy is required to allow the user to provision and administer the IBM Cloud Monitoring service in the Default resource group.
Kubernetes cluster instance	Resource	Editor	us-south	This policy is required to configure the secret and the monitoring agent in the Kubernetes cluster.

For more information about the IBM Cloud® Kubernetes Service IAM roles, see User access permissions.
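
The kubectl compatibility check from the prerequisites can be scripted. The following is a minimal sketch, assuming the usual Kubernetes rule that kubectl is supported within one minor version of the cluster's API server. The helper names `minor_of` and `kubectl_skew_ok` are hypothetical, and the version strings are typed in by hand rather than read from a cluster.

```shell
# Extract the minor number from a version string such as "v1.27.3" or "1.28.1+IKS".
minor_of() {
  v=${1#v}         # drop a leading "v" if present
  v=${v#*.}        # drop the major number and the first dot
  echo "${v%%.*}"  # keep everything up to the next dot
}

# Succeed when the client and server minor versions differ by at most one.
# Assumption: a skew of one minor version in either direction is acceptable.
kubectl_skew_ok() {
  client=$(minor_of "$1")
  server=$(minor_of "$2")
  diff=$((client - server))
  [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}

# Example with hand-typed versions (hypothetical values):
if kubectl_skew_ok "v1.27.3" "v1.28.1+IKS"; then
  echo "kubectl and cluster versions are compatible"
else
  echo "kubectl and cluster versions differ by more than one minor version"
fi
```

In practice, the two version strings would be copied from the output of kubectl version.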

Provision an IBM Cloud Monitoring instance

In this tutorial, instructions are provided to provision an instance of IBM Cloud Monitoring in the us-south region. For more information about supported regions, see Regions.

To provision an instance of IBM Cloud Monitoring through the IBM Cloud UI, complete the following steps:

Log in to your IBM Cloud account.

After you log in with your user ID and password, the IBM Cloud UI opens.
Click Catalog. The list of the services that are available in IBM Cloud opens.
To filter the list of services that is displayed, select the Logging and monitoring category.
Click the IBM Cloud Monitoring tile.
Select a location and a service plan.

By default, the Lite plan is set.

For more information about other service plans, see Pricing plans.
Configure the resource.

Enter a name for the service instance.

Select a resource group.

Optionally add tags.
Click Create.

After you provision an instance, the Observability dashboard opens and shows details for your Monitoring instances.

To provision an instance through the CLI, see Provisioning an instance through the IBM Cloud CLI.
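
As a rough sketch of the CLI route, the following helper only prints a provisioning command so that it can be reviewed before it is run. It assumes the general `ibmcloud resource service-instance-create NAME SERVICE PLAN LOCATION` form, and it assumes that the Monitoring service is registered under the catalog name `sysdig-monitor` with a `lite` plan; verify both against the linked CLI topic. The instance name `my-monitoring` and the function name `provision_cmd` are hypothetical.

```shell
# Print (do not run) the command that would provision a Monitoring instance.
# Assumptions: service name "sysdig-monitor" and plan "lite"; check the CLI topic.
provision_cmd() {
  name=$1
  region=${2:-us-south}  # region used in this tutorial
  plan=${3:-lite}        # the Lite plan is the default in the UI
  echo "ibmcloud resource service-instance-create $name sysdig-monitor $plan $region"
}

provision_cmd my-monitoring
```

Printing the command instead of running it keeps the sketch safe to try without an IBM Cloud login.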

Configure your Kubernetes cluster to send metrics to your instance

To configure your Kubernetes cluster to send metrics to your IBM Cloud Monitoring instance, you must install a monitoring agent pod on each node of your cluster. The monitoring agent is installed through a DaemonSet, which ensures that an instance of the agent runs on every worker node. Each agent collects metrics from the worker node where it runs and forwards the data to your instance.

In order to provide the full suite of system metrics, the monitoring agent needs to have a privileged status.

Complete the following steps from the command line to deploy the agent by using a script:

Open a terminal. Then, log in to the IBM Cloud. Run the following command and follow the prompts:
```
ibmcloud login -a cloud.ibm.com
```
Select the account where the cluster is available.
Set up the cluster environment. Run the following commands:

First, get the command to set the environment variable and download the Kubernetes configuration files.
```
ibmcloud ks cluster config --cluster <cluster_name_or_ID>
```
When the download of the configuration files is finished, a command is displayed that you can use to set the path to the local Kubernetes configuration file as an environment variable. Copy and paste the command that is displayed in your terminal to set the KUBECONFIG environment variable.

Every time you log in to the IBM Cloud® Kubernetes Service CLI to work with clusters, you must run these commands to set the path to the cluster's configuration file as a session variable. The Kubernetes CLI uses this variable to find a local configuration file and certificates that are necessary to connect with the cluster in IBM Cloud.
Obtain the access key. For more information, see Getting the access key through the IBM Cloud UI.
Obtain the ingestion URL from the collector endpoints.
Deploy the monitoring agent. Run the following command:
```
curl -sL https://raw.githubusercontent.com/draios/sysdig-cloud-scripts/master/agent_deploy/IBMCloud-Kubernetes-Service/install-agent-k8s.sh | bash -s -- -a ACCESS_KEY -c COLLECTOR_ENDPOINT -t TAG_DATA -ac 'sysdig_capture_enabled: false'
```
Where
- ACCESS_KEY is the ingestion key for the instance that you previously retrieved.
- COLLECTOR_ENDPOINT is the ingestion URL for the region where the monitoring instance is available that you previously retrieved.
- TAG_DATA are comma-separated tags that are formatted as TAG_NAME:TAG_VALUE. You can associate one or more tags to your monitoring agent. For example: role:serviceX,location:us-south. Later on, you can use these tags to identify metrics from the environment where the agent is running.
- Set sysdig_capture_enabled to false to disable the capture feature. By default, it is set to true. For more information, see Working with captures.
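
The TAG_DATA value can be assembled and checked before it is passed to the install script. The sketch below is one way to do that. `build_tag_data` is a hypothetical helper, and the only rule it enforces is the TAG_NAME:TAG_VALUE shape described above.

```shell
# Join TAG_NAME:TAG_VALUE arguments into one comma-separated TAG_DATA string.
# Fails if an argument lacks a non-empty name and value around a colon.
build_tag_data() {
  out=""
  for tag in "$@"; do
    case $tag in
      *,*) echo "tag must not contain a comma: $tag" >&2; return 1 ;;
      ?*:?*) ;;  # looks like NAME:VALUE
      *) echo "invalid tag: $tag" >&2; return 1 ;;
    esac
    out=${out:+$out,}$tag  # add a comma only between tags
  done
  echo "$out"
}

# Prints: role:serviceX,location:us-south
TAG_DATA=$(build_tag_data role:serviceX location:us-south) && echo "$TAG_DATA"
```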
Verify that the monitoring agent is created successfully and check its status. Run the following command:
```
kubectl get pods -n ibm-observe
```
The deployment is successful when you see one or more sysdig-agent pods. The number of sysdig-agent pods equals the number of worker nodes in your cluster. All pods must be in a Running state.
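
The success condition above (one Running agent pod per worker node) can be checked mechanically. This is a sketch only: `agents_ready` is a hypothetical helper that reads pod listing text on standard input, assumes the default kubectl column layout in which STATUS is the third column, and takes the expected number of worker nodes as its argument. The sample pod names are made up.

```shell
# Read `kubectl get pods --no-headers` style output on stdin and succeed only if
# exactly $1 sysdig-agent pods are listed and every one of them is Running.
agents_ready() {
  expected=$1
  awk -v want="$expected" '
    $1 ~ /^sysdig-agent/ { total++; if ($3 == "Running") running++ }
    END { exit !(total + 0 == want + 0 && running + 0 == total + 0) }
  '
}

# Example with canned output for a two-node cluster (hypothetical pod names):
sample='sysdig-agent-4kx2p   1/1   Running   0   5m
sysdig-agent-9zq7w   1/1   Running   0   5m'

if printf '%s\n' "$sample" | agents_ready 2; then
  echo "all agent pods are running"
else
  echo "agent pods are missing or not yet running"
fi
```

Against a live cluster, the input would come from kubectl get pods -n ibm-observe --no-headers, and the expected count could be taken from kubectl get nodes --no-headers | wc -l.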

Launch the monitoring UI

To launch the monitoring UI through the IBM Cloud console, complete the following steps.

Log in to your IBM Cloud account.

After you log in with your user ID and password, the IBM Cloud Dashboard opens.
From the menu, select Observability.
Select Monitoring. The list of instances that are available on IBM Cloud is displayed.
Find your instance and click Open dashboard. The web UI opens.

It may take some time before you see the cluster entry while the information is initially collected and processed by the monitoring agent.

You can monitor only one instance per browser. You can have multiple tabs open for the same instance.

Monitor your cluster

In the Advisor tab, you can monitor and troubleshoot the health, risk, and capacity of hosts and Kubernetes clusters.

Data is refreshed every 10 minutes.
Metrics are prioritized by event count and severity.
For more information, see Advisor.

In the Advisor section, you can choose to monitor your Kubernetes clusters by cluster, by node, by namespace, or by workload. Each option offers a set of predefined dashboards that you can use to monitor the health of your resources. You can also select to monitor by host.

Monitoring Kubernetes clusters by cluster

When you choose to monitor your Kubernetes clusters by cluster, you can select more filters to display data by node or by namespace, or you can choose any of the following dashboards:

Workload Status & Performance
Node Status & Performance
Pod Rightsizing & Workload Capacity Optimization
Cluster Capacity Planning
Cluster / Namespace Available Resources
Cluster Overview
CPU Allocation Optimization
Memory Allocation Optimization

Advisor predefined dashboards by cluster

For more information on how to interpret this view, see About Clusters Overview.

Monitoring Kubernetes clusters by node

When you choose to monitor your Kubernetes clusters by node, you can choose any of the following dashboards:

Node Status & Performance
Pod Scheduling Troubleshooting
Node Overview
CPU Allocation Optimization
Memory Allocation Optimization

For more information on how to interpret this view, see About Nodes Overview.

Monitoring Kubernetes clusters by namespace

When you choose to monitor your Kubernetes clusters by namespace, you can select more filters to display data by workload, or you can choose any of the following dashboards:

Workload Status & Performance
Pod Status & Performance
Pod Rightsizing & Workload Capacity Optimization
Namespace Overview
Workloads CPU Usage and Allocation
Workloads Memory Usage and Allocation

For more information on how to interpret this view, see About Namespaces Overview.

Monitoring Kubernetes clusters by workloads

When you choose to monitor your Kubernetes clusters by workloads, you can choose any of the following dashboards:

Container Resource Usage & Troubleshooting
Pod Status & Performance
Pod Rightsizing & Workload Capacity Optimization
Workload Status & Performance
Deployment Overview
Pod Overview
Workloads CPU Usage and Allocation
Workloads Memory Usage and Allocation

For more information on how to interpret this view, see About Workloads Overview.

Next steps

Create a custom dashboard. For more information, see Working with dashboards.
Learn about alerts. For more information, see Working with alerts.
Learn how to manage logs from your cluster. See Logging with Kubernetes clusters.
Learn about the IBM Cloud Monitoring Workload Protection functionality to find and prioritize software vulnerabilities, detect and respond to threats, and manage configurations, permissions and compliance from source to run. See Getting started with IBM Cloud® Security and Compliance Center Workload Protection.