We’ve recently released IBM Netcool Operations Insight v1.6 (NOI 1.6) with a host of great new capabilities. They’re designed to help you manage any environment, including on-premises and hybrid cloud.

One of the capabilities of NOI is Agile Service Manager. ASM gives you near-real-time and historical topological visibility of your environment. With ASM you gain even richer insight into events and metrics. Plus you can see how the structure of your environment changes over time.

ASM can help with each of the steps in the following figure. In this article we’ll focus on getting relevant context.

ITOps to AIOps in 5 steps

Digital transformation

Businesses today find themselves in a climate of pervasive digital transformation where:

  • Development teams are shifting applications to cloud platforms
  • Development processes are evolving to be more responsive
  • Development and Operations are joining forces to form DevOps teams

Cloud and container platform usage is increasing. Complexity is increasing. The traditional build, test, deploy, and monitor cycle must evolve. There is a demand for greater speed, better quality and more control.

AIOps is an evolution of traditional operations in support of this digital transformation. And Netcool itself is transforming, embracing both traditional ‘ITOps’ and ‘AIOps’.

Improved tooling

What are the benefits of improved Operations tooling?

  1. Improved response time. Virtualised or cloud-based infrastructures can be very unpredictable and ephemeral. Modern workloads have to scale, heal and move in response to demands and problems. This dynamism dilutes the Operations team’s tribal knowledge. They have to manage infrastructure and services they’ve never seen before. Up-to-the-minute information is essential. AIOps helps catch things that traditional tools might miss.
  2. Richer insights. The more data available, the more AIOps can provide valuable insights. Machine learning algorithms can find patterns in events that traditional systems might discard.

To illustrate, here are a couple of scenarios:

  1. Managing new environments such as Kubernetes. When this is done alongside traditional methods, true end-to-end visibility is possible. For example, you can determine which servers are running your Kube-based apps. You can see the network connectivity of those servers. You can discover the systems of record used. You can view the dependencies between other cloud-based systems.
  2. Managing DevOps pipelines. In-depth visibility of build pipelines such as Jenkins is now possible. These pipelines deliver business-critical artefacts into the environment, so managing them is now essential. You can see the build artefacts that comprise the live application. You can view build history, analysing changes over time. You can determine who initiated the changes and when, constructing an audit trail.

Everything is connected

We live in a highly-connected world. Relationships between ‘things’ can reveal a great deal of insight into how our world works. The best way to model the world around us is to depict it as a set of related entities, just as you’d draw it on a whiteboard.

Such topologies are beneficial in an enormous number of use-cases such as:

  • Understanding how diseases spread
  • Finding criminals
  • Mapping our social relationships
  • Helping us navigate from A to B
  • Visualising internet connectivity

Topologies that span knowledge domains can provide even greater insight. For example, combining a road network with live traffic data to improve navigation.

Identifying weak spots

A key problem in many industries is understanding weak spots or vulnerabilities. What is a weak spot? It’s a ‘cut vertex’, or a single-point-of-failure. If removed, the topology would split into two or more disconnected components. Some examples are:

  • An organisation relying on a highly-skilled employee
  • A road network depending on only one bridge to cross a river
  • A communications network relying on a single network device
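The definition above translates almost directly into code. Here’s a minimal sketch (not how ASM does it) that removes each node in turn and checks whether the rest of the topology is still connected. The function names and the little road network are made up for illustration, and the sketch assumes the topology is connected to begin with.

```javascript
// Illustrative sketch only: brute-force cut-vertex detection.
// The graph data and function names are hypothetical, not ASM data or APIs.

// Build an undirected adjacency map from a list of [a, b] pairs.
function buildAdjacency(edges) {
  const adj = new Map();
  for (const [a, b] of edges) {
    if (!adj.has(a)) adj.set(a, new Set());
    if (!adj.has(b)) adj.set(b, new Set());
    adj.get(a).add(b);
    adj.get(b).add(a);
  }
  return adj;
}

// Count the nodes reachable from `start` when `removed` is ignored.
function reachableCount(adj, start, removed) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of adj.get(node)) {
      if (next !== removed && !seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen.size;
}

// A node is a cut vertex if removing it leaves the others disconnected.
// Assumes the original graph is connected.
function findCutVertices(edges) {
  const adj = buildAdjacency(edges);
  const nodes = [...adj.keys()];
  const cut = [];
  for (const candidate of nodes) {
    const others = nodes.filter((n) => n !== candidate);
    if (others.length < 2) continue;
    if (reachableCount(adj, others[0], candidate) < others.length) {
      cut.push(candidate);
    }
  }
  return cut;
}

// A made-up road network: North Town can only be reached over the bridge.
const roadNetwork = [
  ['North Town', 'Bridge'],
  ['Bridge', 'South Town'],
  ['South Town', 'Harbour'],
  ['Harbour', 'Bridge'],
];

console.log(findCutVertices(roadNetwork)); // [ 'Bridge' ]
```

This approach re-walks the graph once per node, so it’s fine for a whiteboard-sized example but a linear-time algorithm is a better fit for large topologies.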

To address such weak spots, you might build another bridge, deploy another device or hire another person. But this may not always be possible. And there’s always the risk of change — what if the bridge becomes congested, or that network device fails?

Now imagine the topology represents an application. You might want to identify weak spots to aid troubleshooting. You can also ask ‘what if?’ questions for planning purposes. For example: what if I remove this device?

How ASM can find single points of failure and help remove them

To illustrate, let’s have some fun. We’ll use IBM Agile Service Manager and a model of the London Underground Network. We’ll use ASM to:

  • Find a single point of failure
  • Build a new railway line
  • Verify we’ve lost the single point of failure

Here’s the topology before we build the new line. Let’s assume that this section of the network needs more resilience and capacity. Note how fragile this fragment of the network appears to be.

Finding a topology is easy to do in ASM. In this case we will make use of ASM’s query APIs. They can identify vulnerabilities if the _markVulnerabilities parameter is set to ‘true’.

Next we will create a new right-click tool. This will let us re-launch the current view with an extra parameter specified.

Finally we will constrain which resource types this tool is available from. In this case we want only underground stations.

The JavaScript for the right-click tool is as follows:

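// Re-launch the current view with vulnerability marking switched on
var url = window.location.href + '&markVulnerabilities=true';
url = url.replace('contentRender.do', 'topology.jsp');
// Open the result in a window named 'asmspof'
window.open(url, 'asmspof');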
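The snippet above relies on the current URL already having a query string and no trailing hash fragment. If that can’t be guaranteed, a slightly more defensive variant could use the browser’s standard URL API. This is a sketch under assumptions: the function name withVulnerabilityMarking and the example.com address are hypothetical, while the parameter and page names are taken from the tool above.

```javascript
// Sketch only: withVulnerabilityMarking is a hypothetical helper name.
// It performs the same page swap as the tool above, then lets the standard
// URL API decide whether the parameter needs '?' or '&'.
function withVulnerabilityMarking(href) {
  const url = new URL(href.replace('contentRender.do', 'topology.jsp'));
  url.searchParams.set('markVulnerabilities', 'true');
  return url.toString();
}

// In the right-click tool this might be used as:
//   window.open(withVulnerabilityMarking(window.location.href), 'asmspof');

// Made-up address, purely to show the behaviour.
console.log(
  withVulnerabilityMarking('https://asm.example.com/ui/contentRender.do?view=tube#top')
);
```

Note that searchParams re-serialises the query string, so the encoding of existing parameters may be normalised along the way.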

Now we’ve got our right-click tool defined, we need to tell ASM to mark vulnerable resources in an obvious way. We’ll set the background colour of the resource to orange if the resource has the _asmCutVertex property present. This is because we know that property is set only when we launch our right-click tool. We could also adaptively alter the border colour, its pattern and the size of the resource if we wanted.

The JavaScript for the resource background colour styling is as follows. Note that relationship styling can also be configured based on the _asmBridgeEdge property being present.

if (asmProperties.hasOwnProperty('_asmCutVertex')) { return '#FF6600'; } else { return ''; }

Now we’ve got our styling and tool defined, let’s take a look at where the vulnerabilities are within that network. The affected nodes are shown in orange.

This fragment of the network has a large number of weak spots. For example, I can’t go from Redbridge to Epping without risk of disruption if there’s a problem with any station in between. Let’s ask a what-if question: what if we build a new line from Redbridge to Epping?

We’ve now got far fewer single points of failure in this fragment of the topology. Our new line increases the resilience of the network. This is because there’s now a choice of routes between Redbridge and Epping. Also note that this choice of route extends to other stations that were previously weak spots.
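The same what-if question can be sketched outside ASM. The example below uses the classic depth-first, low-link approach to finding cut vertices (often attributed to Tarjan). I’m not claiming this is what ASM does internally. The line is deliberately simplified: ‘Station A’ and ‘Station B’ are placeholders, not the real stations between Redbridge and Epping.

```javascript
// Illustrative sketch: cut vertices via depth-first search and low-link values.
// The station data is simplified and hypothetical.
function cutVertices(edges) {
  const adj = new Map();
  const link = (a, b) => {
    if (!adj.has(a)) adj.set(a, []);
    adj.get(a).push(b);
  };
  for (const [a, b] of edges) {
    link(a, b);
    link(b, a);
  }

  const disc = new Map(); // order in which each node was first visited
  const low = new Map();  // earliest visit order reachable from the node's subtree
  const result = new Set();
  let counter = 0;

  function visit(node, parent) {
    disc.set(node, counter);
    low.set(node, counter);
    counter += 1;
    let children = 0;
    for (const next of adj.get(node)) {
      if (next === parent) continue;
      if (disc.has(next)) {
        // Back edge: there is another way round to an earlier node.
        low.set(node, Math.min(low.get(node), disc.get(next)));
      } else {
        children += 1;
        visit(next, node);
        low.set(node, Math.min(low.get(node), low.get(next)));
        // The child's subtree cannot get above `node` without passing through it.
        if (parent !== null && low.get(next) >= disc.get(node)) result.add(node);
      }
    }
    // The DFS root is a cut vertex only if it has more than one DFS child.
    if (parent === null && children > 1) result.add(node);
  }

  for (const node of adj.keys()) {
    if (!disc.has(node)) visit(node, null);
  }
  return [...result].sort();
}

const before = [
  ['Redbridge', 'Station A'],
  ['Station A', 'Station B'],
  ['Station B', 'Epping'],
];
// What if we build a direct line from Redbridge to Epping?
const after = [...before, ['Redbridge', 'Epping']];

console.log(cutVertices(before)); // [ 'Station A', 'Station B' ]
console.log(cutVertices(after));  // []
```

With the extra line in place the four stations form a loop, so every station can be bypassed and no cut vertices remain.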

How does this relate to operations management?

Well, the ability to question the integrity of a topology is very useful. First, it helps identify vulnerable resources. If there is a problem, these are the resources you are likely to want to investigate first. Second, it helps in planning and risk mitigation. You can ask what-if questions to reveal the impact of changes.

Understanding how an IT environment changes over time is also very important. Imagine an application running on Kubernetes. Let’s say it’s using a traditional database cluster as its system of record. At 3pm the cluster is running fine. At 4pm two of the three cluster instances fail. With ASM you have a history of how the topology has changed over time. This allows you to identify the cluster as being vulnerable (or not) and supports post-mortem and root-cause analysis.
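To make the 3pm versus 4pm comparison concrete, here’s a small sketch that compares two snapshots of a topology. It doesn’t use ASM’s history APIs. The snapshot format, resource names and the diffSnapshots function are all made up for illustration.

```javascript
// Illustrative sketch only: hypothetical snapshots of relationships,
// each written as a [from, to] pair.
const at3pm = [
  ['app', 'db-1'],
  ['app', 'db-2'],
  ['app', 'db-3'],
];
const at4pm = [['app', 'db-1']];

// Report which relationships disappeared or appeared between two snapshots.
function diffSnapshots(before, after) {
  const key = ([from, to]) => `${from}->${to}`;
  const beforeKeys = new Set(before.map(key));
  const afterKeys = new Set(after.map(key));
  return {
    removed: [...beforeKeys].filter((k) => !afterKeys.has(k)),
    added: [...afterKeys].filter((k) => !beforeKeys.has(k)),
  };
}

console.log(diffSnapshots(at3pm, at4pm));
// { removed: [ 'app->db-2', 'app->db-3' ], added: [] }
```

In this made-up case the application is left depending on a single database instance at 4pm, which is exactly the kind of new weak spot you’d want to spot in a post-mortem.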

I hope that you’ve found this useful. Next time we’ll take a look at how ASM’s APIs can help find groups of related resources and what they mean to your operations team.

Find out more

If you would like to know more about how NOI can help you reduce noise and incidents, and help you work more efficiently, please feel free to contact us! 

  • Discover how to modernize IT operations management (ITOM) with artificial intelligence operations (AIOps) and hybrid deployment options at the IBM ITOM site.
  • For details and pricing information about IBM Netcool Operations Insight (NOI), visit the IBM NOI site.
  • NOI 1.6 was released in June 2019. Read more about the new features of NOI 1.6.
  • Explore NOI 1.6 further in this series of blogs from the technical team who created this release.
