Monitoring critical tasks

This example shows how you can use RTA to monitor your CICS regions to make sure that a certain transaction ID is always running, and to issue a message for the operations staff when the transaction is found to be no longer running.

In this scenario, you have an application transaction called ABCD that must always be running in your back-end application-owning regions (AORs). Your CICS topology might look like Figure 1.

Figure 1. Example topology of a CICSplex
To accomplish this monitoring, you define three main resources in CICSPlex® SM, plus two ‘container’ type resources:
EVALDEF (Evaluation Definition)
Describes what resource you are monitoring, the criteria to be used to determine whether the condition is true, and how frequently CICSPlex SM needs to perform its checking in the CICS regions. See Create the EVALDEF for instructions on how to do this.
ACTION (Action Definition)
Describes what action to take when the above Evaluation Definitions change status from true to false, and from false to true. See Create the ACTION definition for instructions on how to do this.
RTADEF (Real Time Analysis Definition)
Ties the EVALDEF and ACTION together. It also determines how frequently CICSPlex SM looks at the status (true or false) of the EVALDEF. See Create the RTADEF and add it to the RTAGROUP for instructions on how to do this.
RTAGROUP (RTA Group)
RTAGROUP is a container type group that serves only to hold the RTADEFs that you want to combine. It is still required even if it contains only one RTADEF. See Create the RTAGROUP for instructions on how to do this.
RTASPEC (RTA Specification)
RTASPEC is the other container type group. It points to the RTAGROUP on one side, and is associated with the target of where you want to implement the monitoring on the other. See Create the RTASPEC for instructions on how to do this.
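
The relationship between the five definitions can be sketched as a small data model. The following Python is only an illustration of the containment described above, not a CICSPlex SM interface: the class and field names are hypothetical, each RTADEF is simplified to a single EVALDEF, and the resource names (MYRTASPC, MYRTAGRP, TRANGONE, AOR001) are the ones this example creates in the later steps.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvalDef:
    name: str                      # what to check, and how often to sample
    resource_table: str
    sample_interval_s: int = 300


@dataclass
class Action:
    name: str                      # what to do when the result changes


@dataclass
class RtaDef:
    name: str                      # ties one EVALDEF to one ACTION (simplified)
    evaldef: EvalDef
    action: Action
    analysis_interval_s: int = 300


@dataclass
class RtaGroup:
    name: str                      # container for RTADEFs
    rtadefs: List[RtaDef] = field(default_factory=list)


@dataclass
class RtaSpec:
    name: str                      # container that links groups to targets
    groups: List[RtaGroup] = field(default_factory=list)
    targets: List[str] = field(default_factory=list)


def describe(spec: RtaSpec) -> List[str]:
    """Return an indented outline of the definitions under a specification."""
    lines = [f"RTASPEC {spec.name} -> targets {', '.join(spec.targets)}"]
    for group in spec.groups:
        lines.append(f"  RTAGROUP {group.name}")
        for rtadef in group.rtadefs:
            ev = rtadef.evaldef
            lines.append(f"    RTADEF {rtadef.name}")
            lines.append(f"      EVALDEF {ev.name} ({ev.resource_table})")
            lines.append(f"      ACTION {rtadef.action.name}")
    return lines


# Names taken from this example; the 10-second intervals match the test setup.
evaldef = EvalDef("TRANGONE", "TASK", sample_interval_s=10)
rtadef = RtaDef("TRANGONE", evaldef, Action("TRANGONE"), analysis_interval_s=10)
spec = RtaSpec("MYRTASPC", [RtaGroup("MYRTAGRP", [rtadef])], ["AOR001"])
print("\n".join(describe(spec)))
```

The outline that this prints mirrors the hierarchy that the definitions form: specification, group, definition, then the evaluation and action that the definition ties together.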

You can create these definitions online through the CICSPlex SM Web User Interface (WUI), or through a batch job. This example shows the WUI definitions and you must be logged on to the WUI to take the actions that are described.

This example illustrates the use of RTA to monitor a certain transaction ID. CICSPlex SM EVALDEFs can be set up to monitor a wide variety of resources. In this example, the EVALDEF Resource table field is set to the TASK table. Find a full list of these tables, the fields in them, and what these fields represent in CICSPlex SM resource tables.

CICSPlex SM RTA can take action as well as issue messages, but the actions it takes are directed to the table that is being monitored in the EVALDEF. The action to be performed is specified in the Modification String field. For example, you might monitor the TASK table for an active task and, if the task uses more than a preset amount of CPU, have CICSPlex SM issue a SET TASK PURGE. You cannot perform an action on a CICSPlex SM table other than the one you are monitoring; for example, you cannot quiesce the region as a target region. To do that, you either need to write your own CICSPlex SM API program or use an automation tool. The specific actions that can be taken on each table are listed at the beginning of each table description in CICSPlex SM resource tables. For example, the TASK table that is used in this example supports PURGE, FORCEPURGE, and KILL actions.

Create the RTASPEC

  1. Expand +Administration and select RTA MAS resource monitoring.
    CICSPlex SM Web User Interface view
  2. Create the RTA specification. From the RTA MAS resource monitoring menu, select Specifications:
    CICSPlex SM Web User Interface view
  3. Select Create. On the create screen, you need to complete only two fields and they are right at the top:
    RTA specification name
    An 8-character specification name of your choice. The example shows MYRTASPC.
    Description
    This field is for your own documentation only.
  4. Select Yes at the bottom of the screen to complete the action. You see the created RTASPEC:
    CICSPlex SM Web User Interface view

Create the RTAGROUP

  1. Select Go Back To Last Menu at the top:
    CICSPlex SM Web User Interface return option
  2. Select Groups.
    CICSPlex SM Web User Interface view
  3. On the RTA Groups screen, select Create.
    CICSPlex SM Web User Interface view
    Complete the fields as follows:
    RTA group name
    An 8-character group name of your choice. The example shows MYRTAGRP.
    Description
    This field is for your own documentation only.
  4. Select Yes to complete the create action. You see confirmation, and the new entry listed like this:
    CICSPlex SM Web User Interface view

Tie the RTAGROUP to the RTASPEC

  1. With the RTAGROUP listed, put a check mark next to it and select Add to RTA specification.
    CICSPlex SM Web User Interface view
  2. RTA specname is the only field for you to complete. Type in the name of the RTA specification you created. This example uses MYRTASPC.
  3. When you have completed the RTA specname field, select Yes to complete the action.

Create the EVALDEF

  1. Go back to the RTA MAS resource monitoring menu and select Evaluations:
    CICSPlex SM Web User Interface view
  2. Select Create. You see the EYUSTARTEVALDEF.CREATE view (the view or menu name is always shown at the lower right of the browser).
    CICSPlex SM Web User Interface view
    Complete the following fields:
    Name
    Pick your own EVALDEF name. The example shows TRANGONE.
    Description
    This field is for your own documentation only.
    Sample interval
    This field is important. It determines how often CICSPlex SM sends a probe down to the MAS and performs the checking that is requested by this EVALDEF. The default is 300 seconds, or 5 minutes. If you leave it at this value for this example, every 5 minutes CICSPlex SM sends a probe to the requested regions and checks whether transaction ABCD is running. If the task ended 1 second after CICSPlex SM checked, it would be another 4 minutes 59 seconds before CICSPlex SM came back and found that it was gone. If you are testing a new setup, you might set the interval to 10 seconds so that you do not have to wait so long to see results. Remember, though, that the lower this number, the more frequently CICSPlex SM performs checking, and the more overhead there is.
    Resource table
    This example is to watch active tasks and ensure that a specific transaction ID is running, so select the TASK table. You do not have to remember all the various names of the tables. In the WUI, anytime that CICSPlex SM has a list of choices for you, select the pencil selector to the right of the input field to get a list of all possible valid values. For descriptions of all the resource tables, see CICSPlex SM resource tables.
    Instance identifier of evaluated resource
    You need to specify something here, but use an asterisk (*) to say that you want any instance. This field applies to the primary key of whatever Resource Table you selected above. In the TASK table, the primary key is the task number. Since it doesn't matter what task number the ABCD transaction is, leave this set to ‘*’.
    Method of evaluating results in result set
    There are a number of options to choose from here (for example: All, Any, Sum, Min, and Max), but the scope of this example is simply to make sure that at least one ABCD transaction is running. Therefore, select Cnt for Count.
    Separate task indicator
    This field defaults to ‘No’. The probe sent down to do the checking then runs under one of the existing CICSPlex SM long running tasks.
    Field being evaluated
    Leave this field blank. This example checks only that an ABCD transaction is running so this field is not applicable.
    Evaluation type
    Select Value.
    Evaluation logical operator
    Set this field to Eq for Equal.
    Evaluation data value
    Set this field to 0 (zero).
    Severity assigned when result meets criteria
    The allocation of severity is your choice. The available values range from VLS (Very low severe) to VHS (Very high severe). The default that this example takes is Lw for Low warning.
    Threshold evaluation type parameters
    You can leave all of these parameters blank because you chose Evaluation type of Value.
    View that may provide extra information
    Leave this field blank.
    Filter string
    Finally, this field is where you specify the transaction code. In this field, type TRANID=ABCD., including the terminating period. You can change ABCD to be whatever transaction is important to you.
  3. Select Yes to create the EVALDEF. Your screen looks like this:
    CICSPlex SM Web User Interface view
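
The field settings above combine into one simple test: filter the TASK records by TRANID, count them, and compare the count with 0. The following Python sketch illustrates that logic only. It is not how CICSPlex SM implements the evaluation, the function names are hypothetical, and the task records are stand-in dictionaries rather than real TASK table data.

```python
from typing import Dict, List


def evaldef_is_true(tasks: List[Dict[str, str]], tranid: str = "ABCD") -> bool:
    """Model of the TRANGONE evaluation: true when no matching task exists.

    Filter string ... TRANID=ABCD.
    Method .......... Cnt (count the records that pass the filter)
    Operator/value .. Eq 0
    """
    matching = [task for task in tasks if task.get("TRANID") == tranid]
    return len(matching) == 0


def seconds_until_next_sample(sample_interval_s: int, since_last_sample_s: int) -> int:
    """Time that a change can go unnoticed before the next sample is taken."""
    return sample_interval_s - since_last_sample_s


# Stand-in task lists for one region (hypothetical data).
running = [{"TRANID": "ABCD"}, {"TRANID": "CEMT"}]
gone = [{"TRANID": "CEMT"}]

print(evaldef_is_true(running))           # condition false: ABCD is present
print(evaldef_is_true(gone))              # condition true: ABCD is missing
print(seconds_until_next_sample(300, 1))  # task ended 1 second after a sample
```

Note that the condition is true when the transaction is missing, which is why the severity and the "not running" message are attached to the true state.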

Create the ACTION definition

  1. Go back and select Actions:
    CICSPlex SM Web User Interface view
  2. Select Create. You see the EYUSTARTACTION.CREATE view:
    CICSPlex SM Web User Interface view
    Complete the following fields:
    Action
    The example uses TRANGONE. It is the same name as the EVALDEF. It can be different.
    Description
    This field is for your own documentation only.
    Generate event
    Select Yes.
    Action Priority
    This field defaults to 1, and can range from 1 to 255. It is used to determine the sort order on the Outstanding Events view (WUI Main Menu > Real Time Analysis (RTA) > Outstanding events).
    External message sent when event occurs and External message sent when event is cleared
    These fields give you 30 characters that can show up in the external message itself, when the Event is raised and when it is cleared. The example fields aim to be as meaningful as possible within the 30 characters restriction. You can see the results later in this example.
    Generate SNA generic alert
    Select No. CICSPlex SM can generate alerts that go into Tivoli NetView for z/OS. If you are using Tivoli NetView for z/OS and want to use this feature, see Enabling a CMAS to send generic alerts to NetView.
    CMAS to which NetView attached, Message text when alert is raised, Message text when alert is cleared
    Leave these three fields blank because this example does not use Tivoli NetView for z/OS.
    MVS automatic restart
    This field defaults to No, and you probably want to leave it that way. If you choose Yes, and the event occurs, CICSPlex SM immediately cancels the region and initiates a restart through the MVS automatic restart manager (ARM). There are a number of requirements for this to occur. For details, see Implementing MVS automatic restart management.
  3. Select Yes to create the ACTION definition. Your screen looks like this:
    CICSPlex SM Web User Interface view

Create the RTADEF and add it to the RTAGROUP

  1. Select Go Back To Last Menu at the top, then select Definitions.
    CICSPlex SM Web User Interface view
  2. Select Create. You see the EYUSTARTRTADEF.CREATE view:
    CICSPlex SM Web User Interface view
    Complete the following fields:
    Name
    The example uses TRANGONE again. It is the same name as the EVALDEF and ACTION definitions, but it can be different.
    Description
    This field is for your own documentation only.
    Execute evaluation modification string
    Select No. CICSPlex SM can take action, such as purging a task, but for this example, the scope is limited to simply putting out a message.
    Analysis interval
    The default is 300 seconds, or 5 minutes. This field is similar to the EVALDEF field Sample interval, which determines how frequently CICSPlex SM performs its checking in the regions. The Analysis interval determines how frequently the RTADEF checks the status (true or false) of the EVALDEF that is named in the Evaluation expression field. Pay close attention to these two polling fields (Analysis interval and Sample interval). They interact with each other and need to make sense together. For example, it makes no sense to set the EVALDEF Sample interval to 300 seconds and the RTADEF Analysis interval to 10 seconds. CICSPlex SM would then check the state (true or false) of the EVALDEF every 10 seconds, although that state can change only once every 300 seconds. Usually, keep these two fields the same. This example uses 10 seconds for each to simplify and speed up the testing. Choose values that alert you to a condition in sufficient time but minimize the overhead of polling and checking too frequently.
    Action definition name
    Type TRANGONE here, or whatever name you chose for the ACTION definition you created in the last step.
    COUNT fields
    By default, these are all set to 1. You can change them if needed. For example, if your Analysis interval is 10 seconds, and Count of true evaluations before LW raised is set to 3, the condition in the EVALDEF must be found true in three consecutive 10-second Analysis intervals before an alert is raised. This might be useful if you want to monitor a task that runs continuously but periodically terminates itself and restarts a few seconds later. A gap of 10 to 20 seconds might be acceptable, but if it goes on longer than that, you would want to raise an alert. Alternatively, you might simply change the Analysis interval to take that into account. There are many options here.
    Evaluation expression
    Type TRANGONE, or whatever name you chose for the EVALDEF you created in the first step above.
  3. Select Yes to create the RTADEF definition. You see the following screen:
    CICSPlex SM Web User Interface view
  4. Add the RTADEF to the RTAGROUP that you created earlier. Put a check mark next to the RTADEF, and select Add to RTA Group:
    CICSPlex SM Web User Interface view
  5. On the resulting view, type the RTA group name that you created earlier:
    CICSPlex SM Web User Interface view
  6. Select Yes to complete the action.
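
The effect of the COUNT fields described above can be sketched as a small simulation. This is a simplified model and not the CICSPlex SM algorithm: the function name is hypothetical, only the "count of true evaluations before raised" side is modeled, and the alert is assumed to clear on the first false evaluation.

```python
from typing import Iterable, List


def raise_points(evaluations: Iterable[bool], count_before_raise: int = 1) -> List[int]:
    """Return the indexes of the analysis intervals at which an alert is raised.

    evaluations ......... EVALDEF state seen at each analysis interval
    count_before_raise .. consecutive true results needed to raise the alert
    """
    raised = False
    streak = 0
    points = []
    for index, is_true in enumerate(evaluations):
        if is_true:
            streak += 1
            if not raised and streak >= count_before_raise:
                raised = True
                points.append(index)
        else:
            # Simplification: a single false result resets the streak
            # and clears any outstanding alert.
            streak = 0
            raised = False
    return points


# A task that restarts itself: missing for two intervals, back, then missing
# for three intervals in a row.
history = [True, True, False, True, True, True]

print(raise_points(history, count_before_raise=1))  # raised at intervals 0 and 3
print(raise_points(history, count_before_raise=3))  # raised only at interval 5
```

With a 10-second Analysis interval, a count of 3 in this model means that a short restart gap of one or two intervals produces no alert, while a gap that lasts three intervals does.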

See a diagram of the resources

With the resources defined, you might want to see a graphical representation of how they are tied together. The WUI Map feature can help. Return to the RTA Specifications screen, put a check in the box next to the MYRTASPC RTA specification, and then select Map.

CICSPlex SM Web User Interface view

The resulting display looks like this:

CICSPlex SM Web User Interface view

This example is a simple scenario, so the display in this case simply allows you to verify that the pieces are tied together properly. If you use other parts of CICSPlex SM, such as Workload Manager (WLM), you can use the Map feature on those definitions as well.

Test the configuration

  1. Install the RTASPEC. On the RTA Specifications screen, put a check in the box next to the RTASPEC MYRTASPC, then select Associate CICS System.
    CICSPlex SM Web User Interface view
    You see the EYUSTARTRTASPEC.ADDSYSDEF view:
    CICSPlex SM Web User Interface view
    Complete the following fields:
    CICS system
    This example uses AOR001 as the single region name to be associated with this RTASPEC. You might have also associated the RTA specification with a group of CICS regions by selecting Associate CICS Group instead of Associate CICS System on the RTA Specifications screen.
  2. Select Yes to complete the association. You receive the following confirmation:
    CICSPlex SM Web User Interface view
At this point, all the definitions are complete. If AOR001 is not running, start it. Then, look in the EYULOG of CMAS001, the CMAS to which AOR001 connects, for confirmation that the RTASPEC is installed:
EYUCL0012I CMAS001 Connection of CMAS001 to AOR001 complete.
EYUTS0003I CMAS001 Topology Connect for AOR001 Complete - APPLID(AOR001) CICSplex(PRODPLEX).
EYUPM0003I CMAS001 RTA Specification (MYRTASPC) successfully installed for Context(PRODPLEX) Scope(AOR001).
So what happens when transaction ABCD is found not to be running in region AOR001? To test that, cancel all ABCD transactions in the region. Within a few seconds (determined by the Sample interval and Analysis interval from the definitions), you see the following message in the EYULOG for the CMAS:
03/23/2016 17:12:06 EYUPN0007W CMAS001 Notify created for RTADEF TRANGONE by MRM, Context=PRODPLEX, Target=AOR001, Sev=LW,
03/23/2016 17:12:06 EYUPN0007W CMAS001  Resource=TASK, Key=*, Text=ABCD not running.
In the MVS SYSLOG and CMAS job log, you can see similar messages:
+EYUPN0007W CMAS001 Notify created for RTADEF TRANGONE by MRM,  061
 Context=PRODPLEX, Target=AOR001, Sev=LW, Resource=TASK, Key=*,
 Text=ABCD not running.

Your z/OS automation or monitoring package can pick up on this message and alert your operations staff. An RTA Alert is also raised that can be viewed in the WUI itself. From the main WUI entry pane, select Real Time Analysis (RTA), then Outstanding Events. The RTA outstanding events view is displayed:

CICSPlex SM Web User Interface view

Select an event name to get more details, such as when the event happened:

CICSPlex SM Web User Interface view
After you are alerted to the event, you can get transaction ABCD running again. Shortly after the transaction is running again (again determined by the Sample interval and Analysis interval from the definitions), the alert is cleared. The EYULOG shows:
03/23/2016 17:19:56 EYUPN0013W CMAS001 Notify resolved for RTADEF TRANGONE by MRM, Context=PRODPLEX, Target=AOR001, Sev=LW,
03/23/2016 17:19:56 EYUPN0013W CMAS001  Resource=TASK, Key=*, Text=ABCD running again.

A similar message is in the job log and SYSLOG. The alert also is cleared from the RTA outstanding events view in the WUI.
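
If you want your own tooling to react to these messages, the EYULOG lines shown above can be parsed with a short script. The following Python is a minimal sketch that assumes the two-line layout and field names seen in this example's EYUPN0007W and EYUPN0013W output. The function name is hypothetical, and grouping continuation lines by timestamp, message ID, and CMAS name is a simplification that would merge two notifications written in the same second.

```python
import re
from typing import Dict, List

# "MM/DD/YYYY HH:MM:SS EYUPNnnnnX <CMAS> <text>"
PREFIX = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (EYUPN\d{4}[A-Z]) (\S+)\s+(.*)$"
)
HEAD = re.compile(r"Notify (created|resolved) for RTADEF (\S+) by")
FIELD = re.compile(r"(\w+)=([^,\s]+)")


def parse_notifications(lines: List[str]) -> List[Dict[str, str]]:
    """Join continuation lines and extract the fields of each notification."""
    grouped: Dict[tuple, List[str]] = {}
    for line in lines:
        match = PREFIX.match(line.strip())
        if match:
            ts, msgid, cmas, text = match.groups()
            grouped.setdefault((ts, msgid, cmas), []).append(text.strip())

    events = []
    for (ts, msgid, cmas), parts in grouped.items():
        body = " ".join(parts)
        head = HEAD.search(body)
        if not head:
            continue
        # Text= is free-form, so split it off before reading key=value pairs.
        before, _, text = body.partition("Text=")
        event = {
            "timestamp": ts,
            "msgid": msgid,
            "cmas": cmas,
            "state": head.group(1),
            "rtadef": head.group(2),
            "text": text.rstrip("."),
        }
        event.update(FIELD.findall(before))
        events.append(event)
    return events


SAMPLE = [
    "03/23/2016 17:12:06 EYUPN0007W CMAS001 Notify created for RTADEF TRANGONE by MRM, Context=PRODPLEX, Target=AOR001, Sev=LW,",
    "03/23/2016 17:12:06 EYUPN0007W CMAS001  Resource=TASK, Key=*, Text=ABCD not running.",
    "03/23/2016 17:19:56 EYUPN0013W CMAS001 Notify resolved for RTADEF TRANGONE by MRM, Context=PRODPLEX, Target=AOR001, Sev=LW,",
    "03/23/2016 17:19:56 EYUPN0013W CMAS001  Resource=TASK, Key=*, Text=ABCD running again.",
]

events = parse_notifications(SAMPLE)
for event in events:
    print(event["timestamp"], event["state"], event["Target"], event["text"])
```

A script like this is only a convenience for ad hoc checks. For production alerting, a z/OS automation or monitoring package that traps the message ID is the approach that this example describes.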