Overview

Skill Level: Any Skill Level

This recipe covers the following:

1. Incident details - Web UI Responding Slow
2. Configuring Threshold
3. Configuring Synthetic Test
4. Configuring Runbook
5. Configuring Event Policies
6. Configuring Incident Policies

Ingredients

IBM Cloud Pak for Multicloud Management 1.3.0 (MCM Hub)
Red Hat OpenShift Container Platform 4.3 (Managed Clusters)

Step-by-step

  1. Introduction

    MCM leverages the following objects for Application Monitoring and Incident management.

    • Thresholds
    • Synthetic Tests
    • Runbooks
    • Event Policies
    • Incident Policies

    This document explains how to create and configure those objects for the use case Web UI Responding Slow.

    Here is the use case and incident handling flow.

    026-response-flow

     

     

    A detailed explanation of the use case, and of how an SRE analyzes and resolves the incident, is given in the following recipe:

    https://developer.ibm.com/recipes/tutorials/mcm-monitoring-use-case-web-ui-is-becoming-slow/

     

  2. Incident - Web UI Responding Slow

    We are going to create and configure the following.

    • Threshold for Web UI service Response time high
    • Synthetic test for Web UI service
    • Runbook for How to Increase POD replica
    • Event Policies for Web UI Response time high
    • Incident Policies for Web UI Response time high

    Here is the sample incident with events and runbook.

     

    Incident Summary

    017-incident-response-inbox

    Incident Details – Events list

    018-incident-response-events

     

    Incident Details – Synthetic Test Event

    019-incident-response-events-synthetic

     

    Incident Details – Web UI Response time high Event

    020-incident-response-events-response

    Incident Details – Runbook associated

    021-incident-response-events-runbook

  3. Go to the Administration Page in the MCM Console

    Click Infrastructure Monitoring.

    001-menu

  4. Configuring Threshold

    Go to the Threshold Page

    Click on the Threshold card

    002-card-threshold

    Create Threshold for Web UI service Response time high

    Here is the list of thresholds already created. Click the Create button to create a new one.

    003-threshold-home

    Here is the threshold configuration for Web UI service response time high.

    Enter the parameters as highlighted.

    Resource Type = Kubernetes Service
    Latency > 2000 milliseconds
    POD Name contains wealthcare-web

    006-threshold-response-1

    006-threshold-response-2

    006-threshold-response-3
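
    The three threshold parameters above act together as a single condition. The following is a minimal sketch of that condition in Python, not MCM code: the function name and the sample keys (resource_type, latency_ms, pod_name) are hypothetical and only illustrate how the highlighted values combine.

```python
# Illustrative only: mirrors the threshold parameters from this step.
# The sample keys used here are hypothetical, not an MCM data model.
LATENCY_LIMIT_MS = 2000
POD_NAME_FRAGMENT = "wealthcare-web"


def threshold_breached(sample: dict) -> bool:
    """Return True when a sample satisfies all three threshold conditions."""
    return (
        sample.get("resource_type") == "Kubernetes Service"
        and sample.get("latency_ms", 0) > LATENCY_LIMIT_MS
        and POD_NAME_FRAGMENT in sample.get("pod_name", "")
    )


slow = {
    "resource_type": "Kubernetes Service",
    "latency_ms": 2500,
    "pod_name": "wealthcare-web-7d9f",  # hypothetical pod name
}
fast = dict(slow, latency_ms=800)

print(threshold_breached(slow))  # True
print(threshold_breached(fast))  # False
```

    Note that the comparison is strict, so a latency of exactly 2000 milliseconds does not breach the threshold.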

  5. Configuring Synthetic Test

    Go to the Synthetic Test Page

    Click on the Synthetic card

    031-card-synthetic

     

    Create Synthetic test for Web UI service

     

    Here is the list of Synthetic Tests already created. Click the Create button to create a new one.

    007-synthetic-1

    Here is the Synthetic Test configuration for Web UI service.

    Enter the parameters as highlighted.

    GET URL : <application url>
    Response Time > 1 second as Warning
    Response Time > 2 seconds as Critical
    Interval : 10 seconds

     

    007-synthetic-2

    007-synthetic-3

    007-synthetic-4
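
    To make the Warning and Critical limits concrete, here is a minimal Python sketch of what such a check does: time a GET request and classify the response time. It is an illustration under stated assumptions, not how MCM implements synthetic tests. The function names and the "OK" label are hypothetical, and the URL must be supplied by you.

```python
import time
import urllib.request

# Values taken from the synthetic test parameters in this step.
WARNING_SECONDS = 1.0
CRITICAL_SECONDS = 2.0
INTERVAL_SECONDS = 10


def classify(response_seconds: float) -> str:
    """Map a response time to a severity; "OK" is a hypothetical label."""
    if response_seconds > CRITICAL_SECONDS:
        return "Critical"
    if response_seconds > WARNING_SECONDS:
        return "Warning"
    return "OK"


def timed_get(url: str, timeout: float = 30.0) -> float:
    """Issue a GET request and return the elapsed time in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def run_checks(url: str, count: int) -> None:
    """Run the check `count` times, waiting INTERVAL_SECONDS between runs."""
    for _ in range(count):
        print(classify(timed_get(url)))
        time.sleep(INTERVAL_SECONDS)


print(classify(0.4))  # OK
print(classify(1.5))  # Warning
print(classify(2.5))  # Critical
```

    Calling run_checks with your application URL would repeat the check every 10 seconds, matching the Interval parameter.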

  6. Configuring Runbook

    Go to the Runbook Page

    Click on the Runbook card

    009-runbook-replica-1

    Create Runbook for How to Increase POD replica

    Here is the list of Runbooks already created. Click the Create button to create a new one.

    009-runbook-replica-1

    Here is the Runbook configuration for How to Increase POD replica.

    Enter the parameters as highlighted.

     

    009-runbook-replica-2

    009-runbook-replica-3

    009-runbook-replica-4

    009-runbook-replica-5

    009-runbook-replica-6
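
    The runbook contents are shown only in the screenshots. As an illustration of what a step to increase pod replicas could look like, the Python sketch below builds a kubectl scale command without running it. The deployment name, namespace, and replica count are assumptions chosen for the example, not values from the runbook.

```python
import shlex
from typing import List


def scale_command(deployment: str, replicas: int, namespace: str) -> List[str]:
    """Build (but do not run) a kubectl command that sets the replica count."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        "kubectl",
        "scale",
        f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace",
        namespace,
    ]


# Hypothetical deployment and namespace names, for illustration only.
cmd = scale_command("wealthcare-web", 3, "wealthcare")
print(shlex.join(cmd))
```

    Building the command as a list keeps each argument separate, so it could be passed to subprocess.run unchanged if you choose to execute it.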

     

  7. Configuring Event Policies

    Go to the Event Policies Page

    Click on the Policies card

    030-card-policies

    Create Event Policies for Web UI Response time high

    Here is the list of event policies already created. Click the Create button to create a new one.

    010-event-home

    Here is the Event Policy configuration for Web UI Response time high.

    Enter the parameters as highlighted.

    • The sender name attribute value refers to the synthetic test created in step 5.
    • The summary attribute value refers to the threshold created in step 4.

    011-event-response-1

    011-event-response-2

    011-event-response-3

    011-event-response-5

    011-event-response-6

    Enrich
    Here we enrich the event with Application = Wealthcare UI Responding Slow.

    This enrichment ensures that all events with Application = Wealthcare UI Responding Slow are correlated into one incident named Wealthcare UI Responding Slow.

    Runbook
    Assign the previously created runbook, How to Increase POD replica.
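
    The effect of the event policy (match, enrich, attach a runbook, then correlate by Application) can be sketched in Python as follows. This is a conceptual illustration, not MCM code. The event keys, the sender name and summary values, and the assumption that either condition is enough to match are all hypothetical; the real values are the ones highlighted in the screenshots.

```python
from collections import defaultdict

APPLICATION = "Wealthcare UI Responding Slow"
RUNBOOK = "How to Increase POD replica"

# Hypothetical match values standing in for those shown in the screenshots.
SENDER_NAME = "wealthcare-web-synthetic"
SUMMARY_TEXT = "Web UI service Response time high"


def matches_policy(event: dict) -> bool:
    """Assumed rule: match on the synthetic test sender OR the threshold summary."""
    return (
        event.get("sender_name") == SENDER_NAME
        or SUMMARY_TEXT in event.get("summary", "")
    )


def enrich(event: dict) -> dict:
    """Return a copy of the event with the Application and runbook attached."""
    enriched = dict(event)
    enriched["Application"] = APPLICATION
    enriched["runbook"] = RUNBOOK
    return enriched


def correlate(events: list) -> dict:
    """Group events into incidents keyed by their Application attribute."""
    incidents = defaultdict(list)
    for event in events:
        incidents[event.get("Application", "uncorrelated")].append(event)
    return dict(incidents)


events = [
    {"sender_name": SENDER_NAME, "summary": "Synthetic test response slow"},
    {"sender_name": "monitoring", "summary": SUMMARY_TEXT},
    {"sender_name": "other", "summary": "Disk full"},
]
processed = [enrich(e) if matches_policy(e) else e for e in events]
incidents = correlate(processed)
print(len(incidents[APPLICATION]))  # 2
```

    Both the synthetic test event and the threshold event end up in the same incident because they share the enriched Application value.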

  8. Configuring Incident Policies

    Go to the Incident Policies Page

    Click on the Policies card

    030-card-policies

    Create Incident Policies for Web UI Response time high

    Here is the list of Incident policies already created. Click the Create button to create a new one.

    014-incident-home

    Here is the Incident Policy configuration for Web UI Response time high.

    Enter the parameters as highlighted.

    Here the Application attribute value refers to the Wealthcare UI Responding Slow enrichment that was done in the previous step (event policy).

    016-incident-response-1

    016-incident-response-2

    016-incident-response-3

    Group
    Assign the incident to the wealthcare group.

    Priority
    Set the priority to 1 for this incident.
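
    Conceptually, the incident policy is a rule that assigns a group and a priority to any incident whose Application attribute matches. A minimal Python sketch of that rule follows; the function name and the incident keys are hypothetical and do not reflect the MCM data model.

```python
APPLICATION = "Wealthcare UI Responding Slow"


def apply_incident_policy(incident: dict) -> dict:
    """Assign group and priority when the incident matches the Application."""
    if incident.get("Application") != APPLICATION:
        return incident  # policy does not apply; leave the incident unchanged
    updated = dict(incident)
    updated["group"] = "wealthcare"
    updated["priority"] = 1
    return updated


matched = apply_incident_policy({"Application": APPLICATION})
other = apply_incident_policy({"Application": "Some other application"})
print(matched["group"], matched["priority"])  # wealthcare 1
print("group" in other)  # False
```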
