To integrate Predictive Insights with various data sources, you must mediate data into the system.

Mediation packs and community-contributed examples can be found in the Predictive Insights Resources area.


Mediation packs

Mediation packs help you to integrate the KPI data streams of various performance managers. In each mediation pack, you’ll find a pre-built Predictive Insights model that you can modify to suit your needs, along with some helpful guides. All are available for download by entitled customers from IBM Passport Advantage.

For information about individual mediation packs, go to the Predictive Insights Resources area. For community examples, see the Additional sample mediation packs section below.




Additional sample mediation packs

The following sample mediation packs can be downloaded and modified as needed to address your IBM Operations Analytics Predictive Insights needs.

Sample Mediation Pack | Support
Predictive Insights for IBM Performance Management | “As-Is”
Predictive Insights for IBM Integration Bus | “As-Is”
Predictive Insights – Juniper Networks Cloud Analytics Engine | “As-Is”

Reference the Sample Mediation Pack and Integration area for ideas on handling additional content in Operations Analytics Predictive Insights.

Use the Operations Analytics Community forum for questions and information about Operations Analytics mediation packs.

Mediation packs can be found in the Operations Analytics Community Resources area.

Using Logstash to mediate data

You can use Logstash as an ETL (Extract, Transform, Load) tool to transform data into the required format before mediating it into Predictive Insights.
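To make the idea concrete, here is a minimal sketch of such a transform, written in plain Python for clarity rather than as a Logstash pipeline (the actual mediation would use Logstash and the custom plugins below). The input layout, field names, and output columns are all hypothetical; match them to your own data source and Predictive Insights model.

import csv
import json

# Extract one JSON record per input line, then transform and load it
# as a flat CSV row in the layout the Predictive Insights model expects.
def transform(in_path, out_path):
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["Timestamp", "ServerName", "User_CPU", "System_CPU"])
        for line in src:
            record = json.loads(line)
            writer.writerow([
                record["ts"],
                record["host"],
                record["cpu"]["user"],
                record["cpu"]["system"],
            ])

transform("raw_metrics.json", "linux_cpu.csv")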

Read the Logstash for PI Mediation presentation for an overview and some usage examples.

Download scaLogstash121214.tar to access some custom Logstash plugin code.

Download logstashSCAPIdocs.tar to access reference pages for the custom plugins.

Tips for selecting metric data

Before you build your Predictive Insights (PI) Key Performance Indicator (KPI) model, it is important to understand the basic building blocks of the model. In this topic, we answer these questions:

  • What is a KPI model?
  • How is a KPI identity formed?
  • How do you determine the number of KPIs in your model?

KPI model

A KPI is simply a time series, and a time series is just a series of values assumed to have been collected at a constant interval.
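For example, a single KPI can be pictured as an ordered list of (timestamp, value) samples. A minimal sketch in Python, assuming a hypothetical 5-minute collection interval and illustrative values:

# One KPI: values assumed to arrive at a constant interval
# (every 5 minutes here; the interval and values are illustrative).
kpi = [
    ("12:00", 12.0),
    ("12:05", 14.5),
    ("12:10", 11.2),
    ("12:15", 13.8),
]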

All incoming metrics are mapped to PI’s notion of a KPI. The number of KPIs arriving, as well as their rate of arrival, are basic dimensions of PI scalability. KPI data moves through the system, is subject to analytics, and the resulting anomaly events presented on the Active Event List (AEL) contain KPI-derived information. For these and many more reasons, the KPI is a fundamental notion.

KPI Identity

A key aspect of the KPI is its identity. The identity is formed at the point of entry into the system using rules defined by the user in the Mediation tool.

A KPI identity has 3 parts:

  • Resource: The thing that we are measuring. Typical examples are server1, host5, and AppServer5.
  • Group: A unit of organization by which the KPI data arrives, for example, a database table or a file type.
  • Metric: The traditional statistic that is being gathered, for example, CPU utilization, TCP error rate, hit rate, or response time.

As data arrives at PI, the mediator applies rules to extract and set values for each of these fields, giving each KPI a unique identity. The simple concatenation of these three values forms that unique identity:

KPI Identity = Resource + Group + Metric

Example:

PI is extracting data from a Linux_CPU table with the following columns and values:

Timestamp, ServerName, User_CPU, System_CPU
12:23, serverABC, 12, 22
12:23, serverXYZ, 58, 5

Assuming we had configured mediation to indicate that the ‘ServerName’ column was the resource ID, that User_CPU and System_CPU were the metrics, and that ‘Linux_CPU_Group’ was our group, we would end up with:

KPI1 : serverABC_Linux_CPU_Group_User_CPU
KPI2 : serverABC_Linux_CPU_Group_System_CPU
KPI3 : serverXYZ_Linux_CPU_Group_User_CPU
KPI4 : serverXYZ_Linux_CPU_Group_System_CPU

The result is 4 distinct KPIs.
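A minimal sketch of this identity formation, in Python for illustration (the concatenation rule and column names follow the example above; real identities are formed by whatever rules you define in the Mediation tool):

import csv

GROUP = "Linux_CPU_Group"
METRIC_COLUMNS = ["User_CPU", "System_CPU"]

# Form one KPI identity per row and metric as Resource + Group + Metric.
# For a multi-column resource (for example, ServerName + EthernetID), build
# the resource string as f"{row['ServerName']}_{row['EthernetID']}" instead.
kpis = set()
with open("linux_cpu.csv") as f:
    for row in csv.DictReader(f, skipinitialspace=True):
        resource = row["ServerName"].strip()
        for metric in METRIC_COLUMNS:
            kpis.add(f"{resource}_{GROUP}_{metric}")

print(len(kpis), "distinct KPIs")   # 4 for the two sample rows above
print(sorted(kpis))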

With the Mediation tool, it is possible to specify that multiple columns of data be used to form the resource identity. For example, if we had server123 with four Ethernet devices (eth0, eth1, eth2, eth3), we might decide to form the resource identity from

ServerName + EthernetID, that is, server123_eth0, server123_eth1, and so on, and have metrics (for example, Packets_Received) for each of them.

The choice of mapping strategy, in other words, how you form KPI identity, is one of the key decisions and milestones in a trial. It has implications all the way through the system, and in the information presented to the user on the output side (the AEL and the KPI viewer).

Sizing considerations

With a sense of how KPI identity is formed, it’s a small step to estimate how many KPIs your mapping implies.

The memory and CPU requirements are proportional to the square of the number of KPIs; in other words, doubling the number of KPIs implies four times the memory.
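A quick back-of-the-envelope sketch in Python (the reference point of 10,000 KPIs needing 8 GB is purely illustrative, not a product sizing figure):

def scaled_memory_gb(base_kpis, base_mem_gb, new_kpis):
    # Memory scales with the square of the KPI ratio, per the rule above.
    return base_mem_gb * (new_kpis / base_kpis) ** 2

# Hypothetical reference: 10,000 KPIs at 8 GB. Doubling the KPI count
# implies roughly four times the memory.
print(scaled_memory_gb(10_000, 8, 20_000))  # 32.0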

Sometimes, when faced with exceeding allowable capacities, you should consider alternative mapping strategies. For example, if we had a business service that was being measured by a number of external probes distributed across various geographies, we might have something like the following:

Timestamp, ServiceName, ProbeGeo, responseTime
12:23, MortgageProcessing, NewYork, 12
12:24, MortgageProcessing, Texas, 32
12:24, MortgageProcessing, California, 20
12:25, MortgageProcessing, Florida, 18

With one mapping strategy, Resource = ServiceName + ProbeGeo, we’d end up with 4 KPIs. Now suppose, as often happens, the probe is probing a finer-grained aspect of the service, such as specific web pages:

Timestamp, ServiceName, URL, ProbeGeo, responseTime
12:23, MortgageProcessing, /accountStatus.php, NewYork, 14
12:23, MortgageProcessing, /accountLogin.php, NewYork, 8

Then we’d have another multiplier. Imagine 50 different URLs for the service: with Resource = ServiceName + ProbeGeo + URL, we’d have 4 x 50 = 200 KPIs for this one service.

Sometimes it will make sense to aggregate the metrics: summarize by service and geography alone, ignoring the individual URLs or where the probes run from, or even aggregate up to the service itself:

Resource = MortgageProcessing
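A minimal sketch of that kind of aggregation in Python (rolling per-URL samples up to ServiceName + ProbeGeo; the averaging choice and column names are assumptions for illustration):

import csv
from collections import defaultdict

# Roll fine-grained per-URL probe samples up to one resource per
# ServiceName + ProbeGeo, averaging response times within each timestamp.
# Averaging is one reasonable summary; other metrics may call for sums or maxima.
totals = defaultdict(lambda: [0.0, 0])
with open("probe_data.csv") as f:
    for row in csv.DictReader(f, skipinitialspace=True):
        key = (row["Timestamp"], f"{row['ServiceName']}_{row['ProbeGeo']}")
        totals[key][0] += float(row["responseTime"])
        totals[key][1] += 1

for (ts, resource), (total, count) in sorted(totals.items()):
    print(ts, resource, round(total / count, 1))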

This consideration occurs in other places too: hosts with multiple CPUs, JVM clusters (keep individual data, or aggregate to identifiable clusters?), and servers with multiple disk drives.

It is important to watch out for such opportunities for aggregation as a means to limit the number of KPIs.

Metrics known to generate alerts

The goal of this section is not to dictate what metrics you need to put through Predictive Insights, but rather to share the metrics that have proved most useful to alert on in the past.

It is restricted to IBM products and is a starting point rather than a comprehensive list. This information is not maintained.

ITM_Linux

Linux CPU
User_CPU, System_CPU, Busy_CPU, Wait_IO_CPU

Linux Network
Bytes_Received_per_sec, Bytes_Transmitted_per_sec

Linux System Statistics
Ctxt_Switches_per_sec, System_Load_15min, Pages_paged_out_per_sec, Pages_Swapped_in, Pages_Swapped_out, Page_Faults_per_sec, Total_Number_Processes

Linux VM Stats
Swap_Space_Used, Memory_Used, Memory_in_Buffers, Memory_Cached

ITM_Windows

NT Server
Bytes_Received/sec, Bytes_Transmitted/sec, Context_Blocks_Queued/sec, Server_Sessions, Bytes_Received/sec_64, Bytes_Transmitted/sec_64, Total_Ended_Sessions_64

NT System
%_Total_Privileged_Time, %_Total_Processor_Time, %_Total_User_Time, Context_Switches/Sec, File_Control_Operations/Sec, File_Data_Operations/Sec, File_Read_Operations/Sec, File_Write_Operations/Sec, Processor_Queue_Length, System_Calls/Sec, Processor_Queue_Length_Excess, File_Control_Bytes/Sec_64, File_Read_Bytes/Sec_64, File_Write_Bytes/Sec_64

NT Memory 64
Committed_kBytes, Page_Faults/Sec, Page_Reads/Sec, Page_Writes/Sec, Pages_Input/Sec, Pages_Output/Sec, Pages/sec, Pool_Nonpaged_Allocs, Pool_Nonpaged_kBytes, Pool_Paged_Allocs, Pool_Paged_kBytes, Pool_Paged_Resident_Bytes, Total_Working_Set_kBytes

NT Network Interface
Packets_Received/sec, Packets_Received_Discarded, Packets_Received_Errors, Packets_Received_Unicast/sec, Packets_Received_Unknown, Packets_Sent/sec

ITCAM for Web Response Time:

WRT Application Status
Percent_Good, Total_Requests, Maximum_Response_Time, Average_Client_Time, Average_Load_Time, Average_Network_Time

ITCAM for WebSphere Application Servers:

Application Server
JVM_Memory_Used, CPU_Used, System_Paging_Rate, Platform_CPU_Used

Datasources
Total_Wait_Time, Connection_Rate, Connection_Max_Wait_Time, Query_Rate, Average_Query_Processing_Time, Update_Rate, Average_Update_Processing_Time, Average_Processing_Time

EJB_Containers
Method_Average_Response_Time, Method_Invocation_Count, Method_Invocation_Rate, Create_Count, Remove_Count, Creation_Rate, Removal_Rate, Total_Method_Invocation_Time, Request_Count

Enterprise_Java_Beans
Method_Average_Response_Time, Method_Invocations, Method_Invocation_Rate

Garbage_Collection_Analysis
Times_Run, Objects_Freed, Objects_Moved, Kbytes_Total_Freed_by_GC, Kbytes_Used, Kbytes_Used_Delta, Kbytes_Free, Real_Time, GC_Rate, Heap_Used_Percent

J2C_Connection_Pools
Connection_Creation_Rate, Connection_Destruction_Rate, Average_Free_Connections, Average_Pool_Size, Average_Usage_Time

Thread_Pools
Average_Pool_Size

Web_Applications
Request_Count, Average_Response_Time, Request_Rate

ITCAM for Robotic Response Time:

RRT_Application_Status
Percent_Failed, Percent_Slow, Percent_Good, Percent_Available, Average_Response_Time, Failed_Requests, Total_Requests, Slow_Requests, Good_Requests, Minimum_Response_Time, Maximum_Response_Time, Total_Server_Response_Time, Total_Connect_Time, Total_DNS_Time, Total_Resolve_Time, Average_Server_Response_Time, Average_Connect_Time, Average_DNS_TIME, Average_Resolve_Time, Client_Time, Network_Time, Server_Time

RRT_Robotic_Playback_Status
Last_Run_Duration

RRT_Transaction_Status
Percent_Failed, Percent_Slow, Percent_Good, Percent_Available, Average_Response_Time, Failed_Requests, Total_Requests, Slow_Requests, Good_Requests, Minimum_Response_Time, Maximum_Response_Time, Total_Server_Response_Time, Total_Connect_Time, Total_DNS_Time, Total_Resolve_Time, Average_Server_Response_Time, Average_Connect_Time, Average_DNS_TIME, Average_Resolve_Time, Client_Time, Network_Time, Server_Time

Off-the-shelf customer models

Note: this information is not maintained.

In general, for any data source, it is important to choose metrics for which you wish to see anomalies. Off-the-shelf (OTS) models are created based on metric groups and metrics used previously in customer trials. These models can easily be imported into projects.

The models contain table names taken from a customer’s Tivoli Data Warehouse (TDW); the names may differ in another database.

The set of available prepared model files

The following are the available .pamodel files (“save link as” to download):

ITCAM.pamodel: A model containing a subset of ITCAM KPIs.
ITM.pamodel: A model containing a subset of ITM KPIs.
WAREHOUS.pamodel: A model containing a subset of IBM ITM and ITCAM tables.
omegamon.pamodel: A model containing a subset of z/OS OMEGAMON KPIs.
robotics.pamodel: A model containing a subset of ITCAM for Transactions KPIs.

Importing a prepared model file

Steps to import OTS models:

1. Save the WAREHOUS.pamodel file to the server running the Mediation tool.

2. Launch the Mediation tool and click OK to create a workspace.

3. Create a new Predictive Insights project.

4. Choose File > Import > General > File System > Next > Browse, browse to the location where the WAREHOUS.pamodel file is saved, and click OK.

5. Tick WAREHOUS.pamodel and click Finish.

6. Open the imported model in the editor.

7. Configure the model for your database (update the Driver, Time Zone, Host Name, Database, port in URL, Schema, Username, and Password on the Connection Details tab) and click Test Connection.
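For example, a connection to a DB2-based Tivoli Data Warehouse might look like the following (all values are placeholders for your own environment; the schema name and host are hypothetical, and 50000 is the DB2 default port):

Driver: com.ibm.db2.jcc.DB2Driver
URL: jdbc:db2://your-tdw-host:50000/WAREHOUS
Schema: ITMUSER
Username/Password: your TDW credentials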

Next, it is necessary to synchronize the schema from the Model Design tab. This checks the database to ensure that the tables specified in the model exist in your database. If they don’t exist, you must select the “Remove Unreferenced Elements” check box. You can also choose to add tables at this point so that you can create additional metric groups.

8. Click the Model Design tab.

9. Right-click the data source and choose Synchronize Schema.

10. Tick the check box to remove unreferenced elements and click OK.

Next, you must remove the associated metric groups for tables that do not exist in your database; these show up as problems in the Problems view.

11. Delete the metric groups that are now empty because their source element was removed.

Verify:

12. Run a metric group preview on the populated metric groups to ensure they are correct.

13. It is advisable to include some filtering in your model at this point to reduce the number of resources and metrics that will be extracted and analyzed.

14. If the remaining groups preview successfully, you can deploy.

The WAREHOUS.pamodel file

You should verify that the metric groups are correct by using Preview Extracted Data. When the extraction preview succeeds for each metric group, the model is ready to be deployed.

Table 1.1 describes the WAREHOUS.pamodel file. It contains the list of tables from IBM ITM and ITCAM. The metric groups contained in WAREHOUS.pamodel were created using best-practice metrics that were proven in customer trials. One metric group has been created for each data source element (table) in the model.

Table 1.1

WAREHOUS.pamodel
Unix_Memory
System
Network
KLZ_CPU
KLZ_Network
KLZ_System_Statistics
KLZ_VM_Stats
Linux_CPU
Linux_Network
Linux_System_Statistics
Linux_VM_Stats
NT_Memory_64
NT_Network_Interface
NT_Server
NT_System
SMP_CPU
Application_Server
DB_Connection_Pools
EJB_Containers
Enterprise_Java_Beans
Garbage_Collection_Analysis
J2C_Connection_Pools
Thread_Pools
Web_Applications

Glossary

unreferenced elements => Elements that are contained in the imported .pamodel file but do not exist in your database. This can be a table, or a metric group created from such a nonexistent table.

Read more

  • Read more about configuring traditional data mediation (that is, using a CSV file or a database) from scratch in IBM Knowledge Center.
  • Read more about configuring data mediation using the REST mediation service in IBM Knowledge Center.
