Tutorial

Monitor application availability using Prometheus, BlackBox Exporter, and Grafana

Go beyond the basic monitoring stack in Red Hat OpenShift Container Platform

By Bhavesh Sadhu, Brijesh Doshi, Oliver Rodriguez

Red Hat OpenShift Container Platform includes a preconfigured, preinstalled, and self-updating monitoring stack that provides monitoring for core platform components. Default dashboards in the OpenShift Container Platform web console include visual representations of cluster metrics to help you quickly understand the state of your cluster. However, this stack does not monitor user-defined applications out of the box, nor does it provide a way to probe endpoints and monitor their responses for service availability.

To be able to view the health and availability of user-defined applications and probe endpoints, we developed an open source monitoring stack that captures baseline metrics using Prometheus from apps installed on the cluster and that determines service availability by probing endpoints with Blackbox Exporter. Finally, we can display service availability on a Grafana dashboard.

Blackbox Exporter is a probing exporter that monitors network endpoints over HTTP, HTTPS, DNS, ICMP, or TCP. It generates multiple metrics for your configured targets, such as general endpoint status, response time, redirect information, and certificate expiration dates. Application developers can use the Blackbox Exporter to measure response times, check the availability and uptime of their services, analyze the latency of specific targets and paths of services running in the same cluster, and assess overall network health.

Prerequisites

Your cluster must have a default route configured. Use the following command in your OpenShift cluster to check the default route:

oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}'

If the default route is not specified, you can configure it by using the following command:

oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge

Steps

  1. Log in to your OpenShift cluster.
  2. Install Prometheus and Grafana.
  3. Install Prometheus Blackbox Exporter.
  4. Deploy a demo application.
  5. Configure Blackbox Exporter for the demo app.
  6. Configure Grafana.
  7. Add a Prometheus data source to Grafana.
  8. Add a new Grafana dashboard.

Log in to your OpenShift cluster with the oc CLI

  1. Navigate to your OpenShift console and select your identity in the upper right of the page. It should start with IAM#.

  2. In the drop-down menu that appears, select Copy login command.

    Copy Login command

  3. On the next page, click Display Token.

  4. Copy the line that starts with oc login ...

    oc login command

  5. Paste that oc login command into your terminal to log in to your cluster.

Install Prometheus and Grafana

  1. Get the code.

     git clone https://github.com/IBM/openshift-observability-with-blackbox-exporter.git
    
  2. Set the namespace for the demo application and observability stack.

     export NAMESPACE=<name of the namespace that you want to create>
     echo $NAMESPACE
    
  3. Run the script to create the namespace and install the monitoring stack.

     ./observability.sh
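
Kubernetes namespace names must be valid DNS labels: lowercase letters, digits, and hyphens only, at most 63 characters, starting and ending with an alphanumeric character. It can be worth checking the value of NAMESPACE before you run the script. The following is a minimal sketch; the helper name is_valid_namespace is our own and is not part of the cloned repository.

```shell
# Return 0 if the argument is a valid Kubernetes namespace name
# (an RFC 1123 DNS label), and 1 otherwise.
# is_valid_namespace is a hypothetical helper, not part of the cloned repo.
is_valid_namespace() {
  name=$1
  # The name must be 1 to 63 characters long.
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 63 ] || return 1
  # Lowercase alphanumerics and hyphens only, alphanumeric at both ends.
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Example usage (after exporting NAMESPACE):
#   is_valid_namespace "$NAMESPACE" && ./observability.sh
```

A name such as observability-demo passes the check, while names that contain uppercase letters or underscores are rejected.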
    

Install Prometheus Blackbox Exporter

  1. Add the Helm repo that hosts the Blackbox Exporter chart.

     helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    
  2. Update the Helm repo using the following command:

     helm repo update
    
  3. Install the Blackbox Exporter Helm chart into your cluster using the following command:

     helm install blackbox-exporter prometheus-community/prometheus-blackbox-exporter
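
Before wiring the exporter into Prometheus, you can query its /probe endpoint directly to confirm that it responds. The sketch below assumes that the exporter listens on its default port 9115 and that the http_2xx module from its default configuration is available. The Service name in the usage comment is what the chart typically generates for a release called blackbox-exporter, so treat it as an assumption and confirm it with oc get svc. The helper names are our own.

```shell
# Build a Blackbox Exporter probe URL.
# $1 = exporter host:port, $2 = module name, $3 = target URL to probe.
# Note: a target that contains "&" or "?" would need URL-encoding first.
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

# Read the exporter's metrics output on stdin and print the
# probe_success value (1 for success, 0 for failure).
probe_success_value() {
  awk '$1 == "probe_success" { print $2 }'
}

# Example usage (assumed Service name; run the port-forward in a separate terminal):
#   oc port-forward svc/blackbox-exporter-prometheus-blackbox-exporter 9115:9115
#   curl -s "$(probe_url localhost:9115 http_2xx https://example.com)" | probe_success_value
```

The same probe_success metric is what the Grafana gauge displays later in this tutorial.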
    

Deploy a demo application

  1. From the OpenShift console, click the Administrator drop-down in the upper left, and then select the Developer perspective. Then, click Add in the left navigation pane, and then in the Samples screen, select the Basic Python sample.

    Python app install, add basic python sample

  2. Click Create on the Import from Git screen.

    Python app install, create

  3. Click the Developer drop-down at the upper left, and then select the Administrator perspective. Then, select the Networking tab in the left navigation pane, and click Routes. Finally, select the route for the Basic Python sample app you just created.

    Python app install, select route

  4. From the overview page of the application route, copy the Location value that contains the address to your demo application.

    Python app install, copy location

Configure Blackbox Exporter for the demo app

  1. In your code editor, open the prometheus-job-blackbox-configuration.yaml file from the repo you cloned earlier. For the targets entry under static_configs, specify the route URL that you copied in the previous step.

    Blackbox config with Prometheus

  2. Create a secret that contains our blackbox configuration. Run the following command:

     oc create secret generic blackbox-secret --from-file=prometheus-job-blackbox-configuration.yaml
    
  3. Update the Prometheus custom resource in OpenShift by doing the following steps:

    1. From the OpenShift Console, in the left navigation panel, click Operators and select Installed Operators.
    2. Verify that you are in the namespace that you set at the beginning.
    3. Select the Prometheus operator. Select the Prometheus tab, and then click the prometheus resource that appears.
    4. On the next page, select the YAML tab. Then, add the following lines to the spec section of the YAML.

       additionalScrapeConfigs:
         key: prometheus-job-blackbox-configuration.yaml
         name: blackbox-secret
      

      Blackbox config in yaml file

    5. Click Save when done.
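
For orientation, a Blackbox scrape job usually follows the pattern sketched below: Prometheus hands each target to the exporter as a URL parameter, and relabeling redirects the actual scrape to the exporter itself. This is an illustration, not the file shipped in the repository. The output file name, job name, target URL, and exporter Service address are placeholders or assumptions, so compare the result with the real prometheus-job-blackbox-configuration.yaml rather than replacing that file.

```shell
# Write an illustrative scrape job to a separate file so that the
# repository's prometheus-job-blackbox-configuration.yaml is left untouched.
cat > blackbox-job-example.yaml <<'EOF'
- job_name: blackbox                 # assumed job name
  metrics_path: /probe
  params:
    module: [http_2xx]               # module that expects an HTTP 2xx response
  static_configs:
    - targets:
        - https://my-python-app.example.com   # placeholder: your route URL
      labels:
        name: Python App
  relabel_configs:
    # Pass the target URL to the exporter as ?target=...
    - source_labels: [__address__]
      target_label: __param_target
    # Keep the probed URL as the instance label.
    - source_labels: [__param_target]
      target_label: instance
    # Scrape the exporter itself (assumed Service name and default port).
    - target_label: __address__
      replacement: blackbox-exporter-prometheus-blackbox-exporter:9115
EOF
```

The name label set here is what later appears as name="Python App" on the probe_success series in Prometheus.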

Configure Grafana

  1. In the left navigation panel, click Workloads and then select Secrets.

  2. Find the secret called grafana-admin-credentials.

    Grafana admin credentials

  3. Scroll down and copy the value for GF_SECURITY_ADMIN_PASSWORD.

  4. In the left navigation panel, click Networking and then select Routes.

  5. Find the entry for grafana-route and click the URL in the Location column.

  6. On the Grafana page that opens, log in using the Grafana admin credentials taken from the secret in the previous steps.

    Grafana basic dashboard
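
If you prefer the command line, the same password can be read from the secret and decoded, because secret values are stored base64-encoded. This sketch assumes that the secret lives in the namespace held in NAMESPACE and uses the key named in the steps above; the helper name is our own.

```shell
# Print the Grafana admin password stored in the grafana-admin-credentials secret.
# Assumes $NAMESPACE is the namespace that you set earlier in this tutorial.
# On older macOS releases, the decode flag may be -D instead of -d.
grafana_admin_password() {
  oc get secret grafana-admin-credentials -n "$NAMESPACE" \
    -o jsonpath='{.data.GF_SECURITY_ADMIN_PASSWORD}' | base64 -d
}

# Example usage:
#   grafana_admin_password
```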

Add a Prometheus data source to Grafana

  1. In the Grafana dashboard, in the left navigation panel, click the Configuration icon (the cog wheel). Ensure that you are on the Data Sources tab.

  2. Click the Add data source button.

    Grafana add data source

  3. Select the option for a Prometheus data source.

    Select Grafana data source

  4. In your browser, go back to the OpenShift console, and in the left navigation bar, click Networking and then select Routes. Select the prometheus-operated route.

  5. Copy the value in the Location section.

    Copy Prometheus route

  6. Go back to the browser tab that has the Grafana dashboard open. (You should be on the configuration page for a Prometheus data source.)

  7. In the URL field, enter the value of the Prometheus route that you copied in the previous step.

    Grafana add data source

  8. Click the Save & Test button. If the test succeeds, a green message appears, as shown in the following screen capture.

    Grafana added data source

Add a new Grafana dashboard

  1. In Grafana, in the left navigation bar, click the + to create a new dashboard.

  2. Click the box for Add an empty panel.

    Grafana add panel

  3. Go back to your browser tab for the OpenShift console where you copied the Prometheus route. Then, click the link in the Location section to open the Prometheus dashboard.

    Prometheus route

  4. In the Prometheus dashboard, find the search/query field, enter probe_success, and click the Execute button to the right of the field.

    Search for probe_success

  5. Copy the first result that is returned that ends with name=Python App. Your result should look similar to the following image; however, you will see only one result, and the instance value will be different.

    Get probe

  6. Go back to your browser tab that has Grafana. (You should be on the New dashboard / Edit Panel page.)

  7. On the right side of the page under the Panel settings, enter a name for the panel. In our example, we used the name Observability.

  8. Click to expand the Visualization section in the panel settings, and select the option for Gauge.

  9. Find the text box with the Metrics drop-down by it in the lower-middle of the page. Enter the result from the Prometheus query executed in a previous step.

    Add Prometheus query

  10. Click Field in the upper right of the page to switch to the Field settings.

  11. Find the field for Display name and enter Python App.

  12. Click the Save button in the upper right.

  13. When you are prompted for a dashboard name, enter Observability and click Save.

    Dashboard name

    Your dashboard should now have one gauge that shows whether the Python application is up and running.

When you are done, your dashboard should look like the following image. The gauge shows whether the Blackbox Exporter probe against the Python app's endpoint succeeded: 1 indicates success and 0 indicates failure.

Grafana final dashboard
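
To double-check the gauge value outside Grafana, you can send the same expression to the Prometheus HTTP API at /api/v1/query. The sketch below builds the selector from the name label used in this tutorial. PROM_URL is a placeholder for your prometheus-operated route, and the helper names are our own.

```shell
# Build a PromQL selector for the probe_success series with a given name label.
probe_success_query() {
  printf 'probe_success{name="%s"}\n' "$1"
}

# Run an instant query against the Prometheus HTTP API.
# $1 = Prometheus base URL (placeholder), $2 = PromQL expression.
prom_query() {
  curl -sG "$1/api/v1/query" --data-urlencode "query=$2"
}

# Example usage:
#   PROM_URL=http://prometheus.example.com   # placeholder: your Prometheus route
#   prom_query "$PROM_URL" "$(probe_success_query 'Python App')"
```

Wrapping the selector as avg_over_time(probe_success{name="Python App"}[1h]) would return the fraction of successful probes over the last hour, which may be a more useful availability figure than the instantaneous value.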

Optional: Testing

Now that your dashboard is live, you will be able to see when one of your applications goes down.

To test this out, do the following:

  1. Navigate to your Python application Deployment in OpenShift.

    Navigate to python app deployment in OpenShift dashboard

  2. Scale the Deployment down to 0 pods.

    Scale python application deployment to 0 pods

  3. Navigate back to the Grafana dashboard and, if necessary, click the refresh icon in the upper right. You should see the availability of the application drop to 0, which indicates that the application is not reachable. You can also click the drop-down next to the refresh icon to have the dashboard refresh periodically.

    Dashboard showing simulated outage
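
The same outage can be simulated from the command line with oc scale. In this sketch, the Deployment and namespace names in the usage comment are placeholders (you can look up the real Deployment name with oc get deployments), and the helper name is our own.

```shell
# Scale a Deployment to the given number of replicas.
# $1 = Deployment name, $2 = namespace, $3 = replica count.
scale_app() {
  oc scale "deployment/$1" --replicas="$3" -n "$2"
}

# Example usage (placeholder names):
#   scale_app my-python-app my-namespace 0   # simulate an outage
#   scale_app my-python-app my-namespace 1   # restore the application
```

The gauge changes only after Prometheus performs its next scrape, so allow at least one scrape interval before you expect the dashboard to show 0.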

Summary

In this tutorial, you deployed a basic observability stack to monitor application availability using Prometheus, BlackBox Exporter, and Grafana.

However, this approach has limitations. For example, it does not scale well because it is a manual process: each endpoint that you want to monitor must be configured individually.

If you are looking for a monitoring solution that handles endpoint monitoring at scale, check out Instana. Instana is an enterprise-ready tool for observability, application performance monitoring, and more. One of its features is automatic discovery: Instana can detect application requests automatically to work out what needs to be monitored. Learn more about how Instana supports observability-driven development.