In my last blog post, I introduced Netcool scope-based event grouping, and explained how to get off to a fast start with it. In this blog post I will describe how to take event grouping to the next level, by applying Netcool Operations Insights Event Analytics on top of scope-based event groups to create super groups.

The story so far…

Scope-based event grouping works on the theory that, if you have a number of events that come from the “same place” at the “same time”, then they are probably related to the same problem, and so we should group them together. The notion of “same place” might be a geographic one, such as cell site ID, or it might be a logical grouping, such as events from a common node. Scope-based grouping yields significant results: most customers see upwards of 70% event reduction and correlation. This results in quicker problem triage, reduced ticket counts, and ultimately significant savings to operations.
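To make the “same place, same time” idea concrete, here is a minimal Python sketch. It is not the Netcool automation itself, only an illustration of the principle: events are grouped when they share a scope and arrive close together in time. The function name, the tuple layout, and the `window_seconds` value are all assumptions made for the example.

```python
def group_by_scope(events, window_seconds=300):
    """Group events that share a scope and arrive close together in time.

    events: iterable of (timestamp_seconds, scope_id, summary) tuples.
    window_seconds: hypothetical gap; an event joins the current group for
    its scope if it arrives within this many seconds of that group's
    most recent event.
    """
    open_groups = {}  # scope_id -> index of that scope's most recent group
    groups = []       # each group: {"scope", "last_seen", "events"}
    for ts, scope, summary in sorted(events):
        idx = open_groups.get(scope)
        if idx is not None and ts - groups[idx]["last_seen"] <= window_seconds:
            # Same scope, close in time: treat as part of the same problem.
            groups[idx]["events"].append(summary)
            groups[idx]["last_seen"] = ts
        else:
            # New scope, or the scope has been quiet for too long: new group.
            groups.append({"scope": scope, "last_seen": ts, "events": [summary]})
            open_groups[scope] = len(groups) - 1
    return groups


# Illustrative data only.
events = [
    (0, "SITE-A", "Link down"),
    (20, "SITE-A", "Power alarm"),
    (45, "SITE-B", "High temperature"),
    (60, "SITE-A", "Cell unavailable"),
    (4000, "SITE-A", "Link down"),
]
groups = group_by_scope(events)
for g in groups:
    print(g["scope"], g["events"])
```

In this toy data set, five raw events collapse into three groups: the last SITE-A event arrives long after the first burst, so it starts a group of its own.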

Scope based event groups

Which grouping mechanism should I use?

With Netcool Operations Insights, another event grouping capability is available: Event Analytics based event grouping (also known as “Related Events”). Event Analytics based event grouping looks at the event history to determine which events historically occur together, and then uses these insights to group related events together if they occur again in future. With both event grouping options available, customers have understandably wondered which one they should use.

Answer: use both

Recent work done by IBM with customers has delivered enhancements in Netcool that support an approach to event grouping that leverages both mechanisms at once. First, tribal domain knowledge is applied to the event estate to define what scope makes sense in the environment and, based on this scope, scope-based event grouping is implemented. This requires a certain investment of effort; however, as my last blog post describes, the return on investment is usually large for relatively little effort. The system is then allowed time to run, so that event history involving scope-based grouping synthetic parent events can be captured in the REPORTER event history database. Next, Netcool Operations Insights Event Analytics is run against the scope-based grouping synthetic parent events.

How to configure Event Analytics to leverage scope-based grouping

Use the following configuration steps in an Event Analytics configuration to apply Event Analytics to the synthetic parent events generated by scope-based event grouping.

General tab

Set the following filter in the General tab:

  • Filter: AlertGroup = 'ScopeIDParent'

Configure filter

Related Events tab

Set the following relationship profile in the Related Events tab:

  • Relationship profile: STRONG (recommended)

Configure relationship profile

Advanced tab

Set the following event identity in the Advanced tab:

  • Event identity: SCOPEID

Configure Event Identity

How the settings work

These settings cause Event Analytics to look only at the scope-based event grouping synthetic parent events, by virtue of the filter. Event Analytics looks at the event history to consider which of the scope-based event groups always and only alarm together.

Next, using a Relationship Profile of STRONG will ensure that anything found is more or less conclusively correct. There are two features of a STRONG Relationship Profile setting that mean it finds only conclusively correct groupings. Consider a scenario where the analytics has found a group of events that always seem to occur together. Before the analytics will confirm it as a valid grouping, it must pass the following conditions:

  • Each member of the group must be present every time the grouping is seen in the event history;
  • None of the members ever occur in isolation of each other; only ever together with the others.

Hence a STRONG Relationship Profile will ensure only confident groupings, which is why it is recommended as a first step.
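The two conditions above can be expressed as a simple check. The following Python sketch is only an illustration of the logic as described here, not the actual Event Analytics algorithm; the function name and the representation of the event history as a list of sets (one set of identities per time window) are assumptions.

```python
def is_strong_group(candidate, occurrences):
    """Return True if the candidate members always and only occur together.

    candidate: set of event identities (here, ScopeID values).
    occurrences: list of sets, each holding the identities seen together
    in one time window of the event history (hypothetical representation).
    """
    candidate = set(candidate)
    seen_at_least_once = False
    for window in occurrences:
        present = candidate & window
        if not present:
            continue          # none of the members fired in this window
        if present != candidate:
            return False      # some members fired without the others
        seen_at_least_once = True
    return seen_at_least_once


# Illustrative history: each set is one time window.
history = [
    {"SITE-1", "SITE-2", "SITE-3"},
    {"SITE-9"},
    {"SITE-1", "SITE-2", "SITE-3", "SITE-9"},
    {"SITE-4"},
]
print(is_strong_group({"SITE-1", "SITE-2", "SITE-3"}, history))  # True
print(is_strong_group({"SITE-1", "SITE-9"}, history))            # False
```

SITE-1, SITE-2, and SITE-3 pass because they appear together in every window where any of them appears. SITE-1 and SITE-9 fail because each is seen at least once without the other.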

Finally, because each scope-based grouping synthetic parent event has its own unique Identifier field, the Identifier cannot be used as the Event Identity for the analysis. If it were, the analytics would not recognise all the instances of the synthetic parent for a given ScopeID as being the same event. Hence, the Event Identity is changed to SCOPEID instead. This ensures that each synthetic parent event for the same ScopeID is recognised as a different instance of the same event, which is what we want.
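A small Python sketch shows why the choice of Event Identity matters. The Identifier values and the dictionary layout below are made up for illustration; the point is only that counting by Identifier sees every synthetic parent as a one-off, while counting by ScopeID sees recurring instances of the same event.

```python
from collections import Counter

# Hypothetical synthetic parent events from the event history.
# Every parent has a unique Identifier, but parents for the same
# place share a ScopeID.
synthetic_parents = [
    {"Identifier": "parent-0001", "ScopeID": "SITE-A"},
    {"Identifier": "parent-0002", "ScopeID": "SITE-B"},
    {"Identifier": "parent-0003", "ScopeID": "SITE-A"},
    {"Identifier": "parent-0004", "ScopeID": "SITE-A"},
]


def count_instances(events, identity_field):
    """Count how many times each distinct identity appears in the history."""
    return Counter(e[identity_field] for e in events)


by_identifier = count_instances(synthetic_parents, "Identifier")
by_scope = count_instances(synthetic_parents, "ScopeID")

print(by_identifier)  # every identity occurs exactly once
print(by_scope)       # SITE-A is recognised as recurring
```

With Identifier as the identity there is no repetition for the analytics to learn from. With ScopeID as the identity, SITE-A is seen three times, which gives the analysis a pattern to work with.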

Super groups

The term “super groups” has been coined to describe groups of events that are themselves grouped together under a single synthetic parent event. After an Event Analytics grouping has been deployed, Netcool Operations Insights will automatically group together the relevant scope-based event grouping synthetic parents under a top-level synthetic parent event, should they occur again in the future.

Although both grouping mechanisms use the ParentIdentifier field to link parent events to child events, they will not clash if configured in the manner described. This is because the scope-based event grouping synthetic parent event is created with a blank ParentIdentifier field, so the Event Analytics mechanism is free to set this field value and thereby link the event to a higher-level synthetic parent event.

super groups

How the grouping automations now service super groups

The scope-based event grouping automation code base has been extended in Netcool/OMNIbus 8.1 Fix Pack 17 to support this notion of “super groups”. It has been extended to do the following additional tasks:

  • The event with the highest Severity in the entire sub-tree of the super parent event will automatically propagate up to the top level super parent event.
  • The highest CauseWeight and ImpactWeight of the entire sub-tree of the super parent event will automatically propagate up to the top level super parent event.
  • The CustomText of the priority child event of the entire sub-tree of the super parent event will automatically propagate up to the top level super parent event.
  • The trouble ticket number will automatically propagate down to the entire sub-tree of unticketed events, under the super parent event.
  • The Acknowledged, OwnerUID, and OwnerGID field values of the super parent event will automatically propagate down to the entire sub-tree of unticketed events, under the super parent event.
  • For each ScopeIDParent event, it will check to see if it is itself a child event. If it is, it will gather the event information under it and roll it up into journal entries in the “super parent” event, using the same mechanism that is used to roll its child event detail into its own journals. This enables a trouble ticket to be cut from the top-level super parent event, and it will already contain all the underlying child event information. Two new properties have been introduced to enable this feature: SEGJournalToSuperParent (default: 0) to enable journalling to super-parents, and SEGMaxSuperParentJournals (default: 100) to set the maximum number of child events to journal to the super-parent.
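The roll-up and push-down behaviour in the list above can be pictured with a short Python sketch. This is not the OMNIbus automation code, only a simplified model of it: the function names are made up, the ticket field name `TTNumber` is an assumption, and the CustomText and journal roll-ups are left out for brevity.

```python
def descendants(events, parent_id):
    """Yield every event in the sub-tree under parent_id, via ParentIdentifier."""
    for e in events:
        if e["ParentIdentifier"] == parent_id:
            yield e
            yield from descendants(events, e["Identifier"])


def service_super_group(events, super_parent_id):
    """Simplified model of the super group servicing described above."""
    by_id = {e["Identifier"]: e for e in events}
    top = by_id[super_parent_id]
    subtree = list(descendants(events, super_parent_id))

    # Roll up: the highest values in the sub-tree go to the super parent.
    for field in ("Severity", "CauseWeight", "ImpactWeight"):
        top[field] = max((e[field] for e in subtree), default=top[field])

    # Push down: ticket number and ownership go to unticketed events only.
    unticketed = [e for e in subtree if not e["TTNumber"]]
    for e in unticketed:
        e["TTNumber"] = top["TTNumber"]  # "TTNumber" is an assumed field name
        for field in ("Acknowledged", "OwnerUID", "OwnerGID"):
            e[field] = top[field]
    return top


def make_event(identifier, parent="", severity=0, ticket=""):
    """Build a minimal event row for the example (illustrative values)."""
    return {"Identifier": identifier, "ParentIdentifier": parent,
            "Severity": severity, "CauseWeight": 0, "ImpactWeight": 0,
            "TTNumber": ticket, "Acknowledged": 0,
            "OwnerUID": 65534, "OwnerGID": 0}


events = [
    make_event("SUPER", severity=1, ticket="INC001"),
    make_event("SCOPE-A", parent="SUPER", severity=3),
    make_event("SCOPE-B", parent="SUPER", severity=4),
    make_event("raw-1", parent="SCOPE-A", severity=5),
    make_event("raw-2", parent="SCOPE-B", severity=2, ticket="INC000"),
]
events[0]["Acknowledged"] = 1
events[0]["OwnerUID"] = 42

top = service_super_group(events, "SUPER")
print(top["Severity"])                                     # highest in sub-tree
print([(e["Identifier"], e["TTNumber"]) for e in events])  # ticket pushed down
```

In this example the super parent takes on the Severity of raw-1, the most severe event two levels down, while its ticket number and ownership are copied to every event beneath it except raw-2, which already has a ticket.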

Super group journal

What are the benefits?

As discussed in the previous blog, it is advantageous to set your ScopeID to be a big enough “net” to catch all events that relate to a single incident. Of course there is a limit where, if you set the ScopeID to be too broadly encompassing, the scope-based event grouping mechanism will start to gather together events that are unrelated. Scope-based event grouping therefore, however tremendously valuable in terms of event and ticket reduction, has its limitations.

Further event and ticket reduction

Applying Event Analytics based event grouping to the ScopeIDParent events enables the automatic creation of super groups. Not only does this further compress the events on the Event Viewer, making it easier for operators to work from, it also potentially reduces the number of tickets being opened. Instead of opening a ticket for each scope-based event group, a single ticket can be opened for the super group, since the underlying events all relate to the same incident.

Exposing of unknown relationships

Applying Event Analytics to the ScopeIDParent events provides insights into how ScopeID groupings alarm in relation to each other. This can expose previously unknown relationships between the event groups. Many customers do not have a reliable CMDB or stored representation of event dependencies, and so this information is not typically available for correlation purposes. Event Analytics will discover these patterns of relationship, based on the event history alone.

EXAMPLE: Imagine you are setting ScopeID across the board, based on cell site location. However, there are three smaller cell sites that are very close to each other and rely on the same underlying sub-systems. In this case, when there is a problem felt by one, it is felt by all three, and alarms are generated from all. Event Analytics will detect this relationship based on the event patterns and hence create one group instead of three, each time there is a problem at these sites.

Ease of management

Netcool Operations Insights Event Analytics typically finds a great number of event groupings in the event history, as well as a large number of Seasonal events. Recently a customer discovered over 18,000 Seasonal events within 6 weeks' worth of data. Even though only around a third of them were “highly seasonal”, the comment was, understandably, “so now I have to go through all of these?” Applying scope-based event grouping first, and then applying Event Analytics based event grouping second, means that you are only having to work with groups of events, rather than individual events. The same analysis on the groups instead yielded around 200 ScopeID groups, and a number of those were also related. Reviewing and creating handling rules for that number of results is a lot more manageable from a user point of view, because it leverages the work that scope-based event grouping has already done.

Field example

At a customer site, this super grouping technique was applied and exposed a relationship between 12 ScopeID groups. Event Analytics found that these 12 groupings always and only ever occurred together in the event history. The analysis was run over 6 weeks' worth of data and it was found that this particular scenario occurred 66 times, which equates to more than once per day. In each occurrence, there were around 72 raw events, and around 36 tickets opened as a result. By applying scope-based event grouping first, this was reduced to 12 groupings of events, and hence 12 tickets for each occurrence. By deploying the discovered Event Analytics grouping with the click of a mouse, it was possible to reduce these 12 groups to just a single group, and hence just a single ticket. That’s essentially a ticket reduction from around 36 to 1, for this example.
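The arithmetic behind this example is easy to work through. The short Python snippet below uses only the figures quoted above; since the per-occurrence ticket counts are approximate ("around 36"), the totals are estimates rather than measured values.

```python
occurrences = 66          # times the scenario was seen in the history
days = 6 * 7              # six weeks of data

# Approximate tickets opened per occurrence at each stage.
tickets_per_occurrence = {
    "no grouping": 36,
    "scope-based grouping": 12,
    "super groups": 1,
}

occurrences_per_day = occurrences / days
totals = {stage: occurrences * n for stage, n in tickets_per_occurrence.items()}

print(f"{occurrences_per_day:.2f} occurrences per day")
for stage, total in totals.items():
    print(f"{stage}: about {total} tickets over {days} days")
```

Over the six weeks, that is roughly 2,376 tickets without grouping, 792 with scope-based grouping alone, and 66 with super groups, for this one scenario.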

Super groups example

Summary

This technique of leveraging both event grouping capabilities makes for a very elegant and compelling story. Further, it appears that the process of defining ScopeID for an event estate normalises the event data. Raw event data is notoriously “dirty”, and the process of setting ScopeID for each event seems to cleanse it, making it well suited for consumption by Event Analytics. Using the two event grouping capabilities together in this way is highly complementary and yields great results. If the work of setting up ScopeID has already been done, then applying Netcool Operations Insights Event Analytics to the results takes minutes. The results are highly insightful, exposing previously unknown relationships. The results are also highly valuable, further reducing the event rows presented to operators and the number of tickets opened. This means direct savings to operations, both financially through reduced tickets and in terms of reduced Mean Time To Repair (MTTR).
