IBM Operations Analytics - Log Analysis supports long-term storage of data, spanning many months or years and several hundred terabytes, on the Apache Hadoop platform. This matters if you need to retain large volumes of IT operational data for several years, whether for compliance or for historical analysis.
Out-of-the-box support is provided for Hadoop distributions including IBM BigInsights Hadoop 3.x and Cloudera Hadoop 5.3.x.
To enable this feature, first install the Log Analysis server using the regular installer. Next, install the Hadoop distribution, ideally on a separate server or VM. Finally, run the Log Analysis scripts that integrate Log Analysis with Hadoop. For more information, see the Log Analysis 1.3.1 product documentation.
Before going further, it helps to understand how Log Analysis stores data across tiers. There are three data tiers: Hot, Cold, and Frozen.
Hot Tier - This tier holds the most recently indexed data. A larger fraction of the data is kept in memory. Interactive search is supported, and allocating more memory and processor capacity results in faster searches. The indexed data is stored in Apache Solr.
Cold Tier - This tier holds a few weeks to a couple of months of indexed data. It supports disk-based access with lower memory utilization than the Hot tier. Incremental searches are fast with moderate memory and processor allocation. The indexed data is stored in Apache Solr.
Frozen Tier - This tier enables long-term storage of highly compressed data on the Hadoop Distributed File System (HDFS). It has low storage and memory requirements. You can search, report on, model, and mine historical data. Searches are scan-based and slower than in the Cold tier. Data on HDFS is partitioned by time and data source.
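To make the idea of partitioning by time and data source concrete, here is a minimal Python sketch. It is an illustration only, not the layout Log Analysis actually uses: the directory scheme (`datasource=.../date=...`), the daily granularity, the `/frozen` root, and the function names `partition_path` and `partitions_for_range` are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath


def partition_path(root, data_source, ts):
    """Return a hypothetical HDFS directory for one record.

    The datasource=/date= layout is an assumption for illustration,
    not the directory scheme Log Analysis writes.
    """
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    ts_utc = ts.astimezone(timezone.utc)
    return str(
        PurePosixPath(root)
        / f"datasource={data_source}"
        / f"date={ts_utc:%Y-%m-%d}"
    )


def partitions_for_range(root, data_source, start, end):
    """List the daily partitions a scan over [start, end] would touch."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    paths = []
    while day <= last:
        midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
        paths.append(partition_path(root, data_source, midnight))
        day += timedelta(days=1)
    return paths
```

With a layout like this, a scan-based search over a given time window and data source only needs to read the matching directories rather than the whole archive, which is the usual motivation for partitioning data this way.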
Once Log Analysis is configured to work with Hadoop, all ingested data is compressed and stored on Hadoop in the Frozen tier. End users can use the web user interface or the REST APIs to search data stored on Hadoop. You can save search results or create dashboards for data stored on HDFS. The search interface combines and de-duplicates data from the Hot, Cold, and Frozen tiers before presenting the results.
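Because the same record can exist in more than one tier (for example, recent data is indexed in Solr and also archived on Hadoop), search results have to be merged and de-duplicated. The Python sketch below shows one simple way this could be done. It is not the product's implementation: the function name `merge_tier_results`, the hit fields `id` and `timestamp`, and the rule of keeping the first copy seen are assumptions chosen for illustration.

```python
def merge_tier_results(*tier_results):
    """Merge per-tier hit lists into one de-duplicated list, newest first.

    Each hit is a dict with a unique "id" and an ISO 8601 "timestamp"
    (hypothetical field names). Tiers are passed fastest first, so when
    a record appears in several tiers the copy from the earliest list wins.
    """
    seen = set()
    merged = []
    for hits in tier_results:
        for hit in hits:
            if hit["id"] not in seen:
                seen.add(hit["id"])
                merged.append(hit)
    # ISO 8601 timestamps in the same format and zone sort
    # chronologically as plain strings.
    merged.sort(key=lambda h: h["timestamp"], reverse=True)
    return merged
```

Calling `merge_tier_results(hot_hits, cold_hits, frozen_hits)` returns each record once, preferring the copy from the faster tier when duplicates exist.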