Monitor application availability using Prometheus, Blackbox Exporter, and Grafana

Red Hat OpenShift Container Platform includes a preconfigured, preinstalled, and self-updating monitoring stack that provides monitoring for core platform components. Default dashboards in the OpenShift Container Platform web console include visual representations of cluster metrics to help you to quickly understand the state of your cluster. However, this stack does not monitor user-defined applications out of the box nor does it provide a way to scrape endpoints and monitor responses for service availability.

To be able to view the health and availability of user-defined applications and probe endpoints, we developed an open source monitoring stack that captures baseline metrics using Prometheus from apps installed on the cluster and that determines service availability by probing endpoints with Blackbox Exporter. Finally, we can display service availability on a Grafana dashboard.

Blackbox Exporter is a probing exporter that monitors network endpoints over HTTP, HTTPS, DNS, ICMP, or TCP. It generates multiple metrics for each configured target, such as general endpoint status, response time, redirect information, and certificate expiration dates. Application developers can use Blackbox Exporter to measure response times, check the availability and uptime of their services, analyze the latency of specific targets and paths of services running in the same cluster, and assess overall network health.
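
To make the probing model more concrete: Blackbox Exporter serves a /probe HTTP endpoint that takes target and module query parameters and returns the metrics for that single probe, including probe_success. The following sketch only builds such a URL. The exporter address (localhost:9115, where 9115 is the exporter's default port), the http_2xx module name, and the helper name probe_url are assumptions for illustration.

```shell
#!/bin/sh
# Build the URL for a one-off Blackbox Exporter probe.
# $1: exporter base URL (assumed, for example http://localhost:9115)
# $2: target to probe; targets that contain '&' or '?' would need URL encoding
# $3: probe module (assumed default: http_2xx)
probe_url() {
  printf '%s/probe?target=%s&module=%s\n' "$1" "$2" "${3:-http_2xx}"
}

# Example call, commented out because it needs a reachable exporter:
# curl -s "$(probe_url http://localhost:9115 https://example.com)" | grep '^probe_success'
```

Prometheus does the same thing on every scrape: it requests /probe for each configured target and stores the returned metrics.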

Prerequisites

A Red Hat OpenShift cluster
OpenShift CLI client for your version of OpenShift
Helm v3
Git CLI

Your cluster must have a default route configured. Use the following command in your OpenShift cluster to check the default route:

oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}'

If the default route is not specified, you can configure it by using the following command:

oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge
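
The check and the patch can be combined into one helper. This is a minimal sketch that assumes you are already logged in with sufficient privileges; the function name ensure_default_route is hypothetical.

```shell
#!/bin/sh
# Print the registry default route host, enabling the default route first
# if it does not exist yet.
ensure_default_route() {
  if ! oc get route default-route -n openshift-image-registry >/dev/null 2>&1; then
    oc patch configs.imageregistry.operator.openshift.io/cluster \
      --patch '{"spec":{"defaultRoute":true}}' --type=merge
  fi
  # Right after a patch the route can take a moment to appear, so this
  # lookup may need to be repeated.
  oc get route default-route -n openshift-image-registry \
    --template='{{ .spec.host }}'
}
```

Calling ensure_default_route prints the route host when the route exists and returns a non-zero status otherwise.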

Steps

Log in to your OpenShift cluster.
Install Prometheus and Grafana.
Install Prometheus Blackbox Exporter.
Deploy a demo application.
Configure Blackbox Exporter for the demo app.
Configure Grafana.
Add a Prometheus data source to Grafana.
Add a new Grafana dashboard.

Log in to your OpenShift cluster with the oc CLI

Navigate to your OpenShift console and select your identity at the upper right of the cluster. It should start with IAM#.
In the drop-down menu that appears, select Copy login command.
On the next page, click Display Token.
Copy the line that starts with oc login ...
Paste that oc login command into your terminal to log in to your cluster.
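
To confirm that the login worked before continuing, you can ask oc which user and API server the session points at. The sketch below wraps the two oc whoami calls; the function name check_login is hypothetical.

```shell
#!/bin/sh
# Report the user and API server of the current oc session.
# Returns non-zero if there is no active login.
check_login() {
  user=$(oc whoami) || {
    echo "Not logged in; run the copied oc login command first." >&2
    return 1
  }
  server=$(oc whoami --show-server) || return 1
  echo "Logged in as $user on $server"
}
```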

Install Prometheus and Grafana

Get the code.

 git clone https://github.com/IBM/openshift-observability-with-blackbox-exporter.git

Set the namespace for the demo application and observability stack.

 export NAMESPACE="<name of the namespace that you want to create>"
 echo $NAMESPACE

Run the script to create the namespace and install the monitoring stack.
```
 ./observability.sh
```
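
One caveat about the NAMESPACE value: Kubernetes namespace names must be valid RFC 1123 labels, so a name with uppercase letters or underscores is rejected when the namespace is created. The following sketch checks the value up front; the function name valid_namespace is hypothetical and is not part of the repo's script.

```shell
#!/bin/sh
# Succeed if $1 is a valid namespace name (an RFC 1123 label):
# 1-63 characters, only lowercase letters, digits, or '-', and it must
# start and end with a letter or digit.
# The body runs in a subshell so that LC_ALL stays local to the function.
valid_namespace() (
  LC_ALL=C   # make [a-z] match ASCII lowercase letters only
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 63 ] || exit 1
  case "$1" in
    *[!a-z0-9-]*) exit 1 ;;   # contains a disallowed character
    -*|*-)        exit 1 ;;   # starts or ends with '-'
  esac
  exit 0
)
```

For example, valid_namespace "$NAMESPACE" && ./observability.sh runs the script only when the name passes the check.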

Install Prometheus Blackbox Exporter

Add the Blackbox Exporter Helm repo.

 helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Update the Helm repo by using the following command:
```
 helm repo update
```
Install the Blackbox Exporter Helm chart into your cluster by using the following command:
```
 helm install blackbox-exporter prometheus-community/prometheus-blackbox-exporter
```
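
Helm installs a release into the namespace of the current context unless --namespace is given, so it is worth checking where the exporter ended up. The sketch below shows the release status and the related pods and services. The function name show_blackbox_release is hypothetical, and the label selector assumes that the chart sets the conventional app.kubernetes.io/instance label to the release name.

```shell
#!/bin/sh
# Show the Helm release status and the exporter's pods and services.
# $1: release name (the tutorial uses blackbox-exporter)
# $2: namespace that the chart was installed into
show_blackbox_release() {
  helm status "$1" --namespace "$2" || return 1
  # Assumption: the chart labels its resources with the release name.
  oc get pods,svc -n "$2" -l "app.kubernetes.io/instance=$1"
}
```

The service name and port in the output are what Prometheus needs later to reach the exporter.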

Deploy a demo application

From the OpenShift console, click the Administrator drop-down in the upper left, and then select the Developer perspective. Then, click Add in the left navigation pane, and then in the Samples screen, select the Basic Python sample.
Click Create on the Import from Git screen.
Click the Developer drop-down at the upper left, and then select the Administrator perspective. Then, select the Networking tab in the left navigation pane, and click Routes. Finally, select the route for the Basic Python sample app you just created.
From the overview page of the application route, copy the Location value that contains the address to your demo application.
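
The same address can be read from the command line, and a quick request shows whether the app answers before you point a probe at it. In this sketch the function names are hypothetical, and the route name is an assumption that depends on what the sample import created.

```shell
#!/bin/sh
# Print the host of a route, using the same --template form as the
# default-route check in the prerequisites.
# $1: route name (hypothetical; use the name shown in the console)
# $2: namespace
route_host() {
  oc get route "$1" -n "$2" --template='{{ .spec.host }}'
}

# Print the HTTP status code returned by a URL (curl reports 000 when
# no response was received).
# $1: full URL, including the scheme shown in the Location column
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}
```

A 200 from http_status means the demo app is reachable from where you run the command, which is the same condition the probe checks from inside the cluster.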

Configure Blackbox Exporter for the demo app

In your code editor, open the prometheus-job-blackbox-configuration.yaml file from the repo that you cloned earlier. For the targets entry under static_configs, specify the route URL that you copied in the previous step.
Create a secret that contains the Blackbox Exporter scrape configuration. Run the following command:
```
 oc create secret generic blackbox-secret --from-file=prometheus-job-blackbox-configuration.yaml
```
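
For orientation, a Blackbox Exporter scrape job usually has the shape printed by the sketch below: the probed URL is listed as a target, and relabeling redirects the actual scrape to the exporter's /probe endpoint. This is not the repo's file, which may differ. The job name, the http_2xx module, the exporter address blackbox-exporter:9115, and the function name blackbox_job are assumptions; the name label matches the Python App label used later in this tutorial.

```shell
#!/bin/sh
# Print a Prometheus scrape job that probes one URL through Blackbox Exporter.
# $1: URL to probe (the route Location copied earlier)
# $2: exporter address as seen from Prometheus (assumed: blackbox-exporter:9115)
blackbox_job() {
  cat <<EOF
- job_name: blackbox
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
    - targets:
        - $1
      labels:
        name: Python App
  relabel_configs:
    # Pass the original target to the exporter as ?target=...
    - source_labels: [__address__]
      target_label: __param_target
    # Keep the probed URL as the instance label
    - source_labels: [__param_target]
      target_label: instance
    # Send the scrape itself to the exporter
    - target_label: __address__
      replacement: ${2:-blackbox-exporter:9115}
EOF
}
```

Running blackbox_job with your route URL prints a job that you can compare against the file in the repo.
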
Update the Prometheus custom resource in OpenShift by doing the following steps:
1. From the OpenShift Console, in the left navigation panel, click Operators and select Installed Operators.
2. Verify that you are in the namespace that you set at the beginning.
3. Select the Prometheus operator. Select the tab that says Prometheus and then click on the prometheus resource that appears.
4. On the next page, select the YAML tab. Then, add the following lines to the spec section of the YAML.
```
  additionalScrapeConfigs:
    key: prometheus-job-blackbox-configuration.yaml
    name: blackbox-secret
```
5. Click Save when done.
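
If you prefer the command line, the same change can be applied as a merge patch. This is a sketch, not the method the repo prescribes: the function names are hypothetical, and the name of the Prometheus resource is an assumption that you can check with oc get prometheus in your namespace.

```shell
#!/bin/sh
# Build the merge patch that points a Prometheus custom resource at the
# secret key that holds the extra scrape configuration.
# $1: secret name, $2: key inside the secret
scrape_config_patch() {
  printf '{"spec":{"additionalScrapeConfigs":{"name":"%s","key":"%s"}}}\n' "$1" "$2"
}

# Apply the patch to a Prometheus resource.
# $1: name of the Prometheus resource (hypothetical; verify it first)
# $2: namespace
patch_prometheus() {
  oc patch prometheus "$1" -n "$2" --type=merge \
    --patch "$(scrape_config_patch blackbox-secret prometheus-job-blackbox-configuration.yaml)"
}
```

The patch sets the same two fields as the console steps: the secret name and the key, which is the file name used when the secret was created.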

Configure Grafana

In the left navigation panel, click Workloads and then select Secrets.
Find the secret called grafana-admin-credentials.
Scroll down and copy the value for GF_SECURITY_ADMIN_PASSWORD.
In the left navigation panel, click Networking and then select Routes.
Find the entry for grafana-route and click the URL in the Location column.
On the Grafana login page, log in using the Grafana admin credentials taken from the secret in the previous steps.
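
The password can also be read with oc instead of the console. Secret values are stored base64-encoded, so the sketch below decodes the value after reading it. The function name grafana_admin_password is hypothetical; the secret and key names are the ones used in the steps above.

```shell
#!/bin/sh
# Print the Grafana admin password stored in the secret.
# $1: namespace that contains the grafana-admin-credentials secret
grafana_admin_password() {
  oc get secret grafana-admin-credentials -n "$1" \
    -o jsonpath='{.data.GF_SECURITY_ADMIN_PASSWORD}' | base64 --decode
}
```

For example, grafana_admin_password "$NAMESPACE" prints the password for the namespace that you set at the beginning.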

Add a Prometheus data source to Grafana

In the Grafana dashboard, in the left navigation panel, click the Configuration icon (the cog wheel). Ensure that you are on the Data Sources tab.
Click the Add data source button.
Select the option for a Prometheus data source.
In your browser, go back to the OpenShift console tab, and in the left navigation bar, click Networking and then select Routes. Select the prometheus-operated route.
Copy the value in the Location section.
Go back to the browser tab that has the Grafana dashboard open. (You should be on the configuration page for a Prometheus data source.)
In the URL field, enter the value of the Prometheus route that you copied in the previous step.
Click the Save & Test button. If the test is successful, a green message appears, as shown in the following screen capture.

Add a new Grafana dashboard

In Grafana, in the left navigation bar, click the + to create a new dashboard.
Click the box for Add an empty panel.
Go back to your browser tab for the OpenShift console where you copied the Prometheus route. Then, click the link in the Location section to open the Prometheus dashboard.
In the Prometheus dashboard, find the search/query field, enter probe_success, and click the Execute button to the right of the field.
Copy the first result that is returned that ends with name=Python App. Your result should look similar to the following image; however, you will see only one result, and the instance value will be different.
Go back to your browser tab that has Grafana. (You should be on the New dashboard / Edit Panel page.)
On the right side of the page under the Panel settings, enter a name for the panel. In our example, we used the name Observability.
Click to expand the Visualization section in the panel settings, and select the option for Gauge.
Find the text box with the Metrics drop-down by it in the lower-middle of the page. Enter the result from the Prometheus query executed in a previous step.
Click Field in the upper right of the page to switch to the Field settings.
Find the field for Display name and enter Python App.
Click the Save button in the upper right.
When you are prompted for a dashboard name, enter Observability and click Save.

Your dashboard should have one gauge that shows whether the Python application is up and running.

When you are done, your dashboard should look like the following image. This gauge shows whether the Blackbox Exporter probe against the Python app's endpoint was successful, with 1 being a success and 0 being a failure.

Grafana final dashboard
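
The value behind the gauge can also be fetched directly from the Prometheus HTTP API. Because probe_success is a 0/1 series, averaging it over a time window gives the fraction of successful probes in that window. The sketch below builds both expressions and sends an instant query. The function names are hypothetical, and the base URL is assumed to be the Location of the prometheus-operated route.

```shell
#!/bin/sh
# Build a PromQL selector for the probe result of one named target.
# $1: value of the name label, for example "Python App"
probe_selector() {
  printf 'probe_success{name="%s"}\n' "$1"
}

# Build a PromQL expression for availability over a window, as a percentage.
# $1: value of the name label, $2: window such as 1h
availability_expr() {
  printf 'avg_over_time(%s[%s]) * 100\n' "$(probe_selector "$1")" "$2"
}

# Run an instant query against the Prometheus HTTP API.
# $1: Prometheus base URL, $2: PromQL expression
prom_query() {
  curl -s -G "$1/api/v1/query" --data-urlencode "query=$2"
}
```

The expression printed by availability_expr "Python App" 1h can also be pasted into a Grafana panel to show availability over the last hour instead of only the latest probe result.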

Optional: Testing

Now that your dashboard is live, you will be able to see when one of your applications goes down.

To test this out, do the following:

Navigate to your Python application Deployment in OpenShift.
Scale the Deployment down to 0 pods.
Navigate back to the Grafana dashboard and, if necessary, click the refresh icon in the upper right. The availability of the application should drop to 0 to indicate that it is not reachable. You can also click the drop-down next to the refresh icon to have the dashboard refresh periodically.
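
The scaling step can be done from the command line as well. In this sketch the function name scale_app is hypothetical, and the Deployment name is an assumption that you can look up with oc get deployments in your namespace.

```shell
#!/bin/sh
# Scale a Deployment to a given replica count.
# $1: Deployment name (hypothetical; look it up first)
# $2: namespace
# $3: replica count
scale_app() {
  oc scale "deployment/$1" --replicas="$3" -n "$2"
}
```

Scaling to 0 should make the gauge drop to 0 after the next probe, and scaling back to 1 should restore it once the pod is ready again.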

Summary

In this tutorial, you deployed a basic observability stack to monitor application availability using Prometheus, Blackbox Exporter, and Grafana.

However, this approach has limitations. It does not scale well, because it is a manual process: each endpoint that you want to monitor must be configured individually.
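
Some of the manual work can be scripted, although that does not remove the limitation. As a rough sketch, the following helpers list the route hosts in a namespace and format them as a targets list for a static_configs entry. The function names are hypothetical, and the https:// prefix assumes that every route is served over TLS, which may not hold in your cluster.

```shell
#!/bin/sh
# Read route hosts (one per line) on stdin and print them as a YAML
# targets list for a static_configs entry.
hosts_to_targets() {
  echo "- targets:"
  while IFS= read -r host; do
    # Skip empty lines; the https scheme is an assumption.
    [ -n "$host" ] && echo "    - https://$host"
  done
  return 0
}

# List the hosts of all routes in a namespace, one per line.
# $1: namespace
route_hosts() {
  oc get routes -n "$1" \
    -o jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}'
}
```

For example, route_hosts "$NAMESPACE" | hosts_to_targets prints a list that you could paste into the scrape configuration. The secret still has to be recreated after every change, so the process remains manual.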

If you are looking for an enterprise-ready monitoring solution that handles endpoint monitoring at scale, check out Instana. Instana is an enterprise-ready tool for observability, application performance monitoring, and more. One of the features of Instana is automatic discovery where Instana can automatically detect application requests to figure out what needs to be monitored. Learn more about how Instana supports observability-driven development.