After creating an Insight Pack project, you need to write a splitter that identifies the logical boundaries of log records. This article describes how to create a splitter using Java.

Prerequisites

Understand the log record boundary by looking at a sample log file.

Log records can be single-line, as shown below. Such records can be split on the newline character.


2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found
2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found
2015-04-18 08:10:53,157 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 ::

Log records can also be multi-line, as shown below. Such records can be split on the timestamp that starts each record.


9/11/2014 13:15:08 - Process(4852.3) User(mqmadmin) Program(runmqdnm.exe)
Host(host1) Installation(Installation1)
VRMF(7.5.0.3) QMgr(qmgr1)
AMQ8377: Unexpected error 2354 was received by the application.
EXPLANATION:
The error 2354 was returned unexpectedly to the application.
ACTION:
Save any generated output files and use either the MQ Support site:
http://www.ibm.com/software/integration/wmq/support/...
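To make the timestamp-based boundary concrete, here is a minimal sketch in plain Java of grouping lines into multi-line records: a line that starts with a timestamp opens a new record, and every other line is appended to the record in progress. The class name, the method name, and the pattern M/d/yyyy HH:mm:ss are assumptions made for illustration (the sample does not say whether the month or the day comes first), and this is not the product's splitter API.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

final class MultiLineGroupingSketch {

    // Assumed pattern for timestamps such as "9/11/2014 13:15:08".
    static final String ASSUMED_PATTERN = "M/d/yyyy HH:mm:ss";

    /** Groups the lines of the text into records that each start with a timestamp. */
    static List<String> group(String text) {
        SimpleDateFormat format = new SimpleDateFormat(ASSUMED_PATTERN);
        format.setLenient(false);

        List<String> records = new ArrayList<>();
        StringBuilder current = null;

        for (String line : text.split("\\r?\\n")) {
            // parse(String, ParsePosition) returns null instead of throwing
            // when the line does not begin with a timestamp.
            boolean startsRecord = format.parse(line, new ParsePosition(0)) != null;
            if (startsRecord) {
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null) {
                // Continuation line: it belongs to the record in progress.
                current.append('\n').append(line);
            }
            // Lines before the first timestamp are skipped in this sketch.
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "9/11/2014 13:15:08 - Process(4852.3) User(mqmadmin)\n"
                + "Host(host1) Installation(Installation1)\n"
                + "AMQ8377: Unexpected error 2354 was received by the application.\n"
                + "9/11/2014 13:15:09 - Process(4852.3) User(mqmadmin)\n"
                + "Host(host1) Installation(Installation1)\n";
        for (String record : group(text)) {
            System.out.println("---- record ----");
            System.out.println(record);
        }
    }
}
```

A real splitter would also have to decide what to do with lines that precede the first timestamp and with a record that is cut off at the end of a batch.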

Store the sample log file under the logSamples folder, as shown below.

Create Splitter Figure 1

Create a new Java package for the splitter

Right-click the src folder and select New -> Package, as shown below. Provide the name of the package and click Finish.

Create Splitter Figure 2

Create a new class to implement the splitter interface provided by the IBM Operations Analytics - Log Analysis product.

Right-click the package created in the previous step and select New -> Class. Provide the class name and click Finish.

Create Splitter Figure 3

Splitter interface

The Java splitter interface is defined as follows:


package com.ibm.tivoli.unity.splitterannotator.splitter;

import java.util.ArrayList;

import com.ibm.json.java.JSONObject;

/************************************************************************
 * This interface defines the APIs for Java based Splitters and is used
 * by third party custom Java Splitter developers
 *
 ***********************************************************************/
public interface IJavaSplitter
{
	/******************************************************************
	 * Split a batch of log records packaged in the input JSON
	 *
	 * @param batch
	 * @return
	 * @throws JavaSplitterException
	 ******************************************************************/
	public ArrayList<JSONObject> split( JSONObject batch ) throws Exception ;
}

Implement the splitter interface by copying the following code into the MyAppLogSplitter class created earlier:

package com.ibm.la.insightpack.myAppLogSplitter;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.commons.lang.StringUtils;

import com.ibm.json.java.JSONArray;
import com.ibm.json.java.JSONObject;
import com.ibm.tivoli.unity.common.logging.LoggerConstants;
import com.ibm.tivoli.unity.common.logging.UnityLogger;
import com.ibm.tivoli.unity.splitterannotator.splitter.IJavaSplitter;

public class MyAppLogSplitter implements IJavaSplitter {
	private static ConcurrentHashMap<String, SimpleDateFormat> dateFormats = new ConcurrentHashMap<String, SimpleDateFormat>();

	private static UnityLogger logger = (UnityLogger) UnityLogger
			.getLogger(LoggerConstants.GENERIC_RECEIVER_LOG_APPENDER);

	public ArrayList<JSONObject> split(JSONObject batch) throws Exception {
		if (logger.isDebugEnabled()) {
			logger.debug(getClass(), "--start of MyAppLogSplitter splitter");
			logger.debug(getClass(), batch.toString());
		}
		JSONObject content = (JSONObject) batch.get("content");
		JSONArray timestampFormats = (JSONArray) batch.get("timestampFormats");

		String datasource = (String) batch.get("datasource");
		if (datasource == null) {
			throw new Exception("Missing datasource in splitter input");
		}

		String stream = (String) batch.get("stream");
		String streamIdentifier = datasource;
		if (stream != null && !stream.trim().isEmpty())
			streamIdentifier += "_" + stream.trim();

		SimpleDateFormat timestampSdf = dateFormats.get(streamIdentifier);
		if (timestampSdf == null) {
			if (timestampFormats == null || timestampFormats.size() == 0) {
				throw new Exception("Missing timestamp formats in splitter input");
			}

			String timestampFormat = (String) timestampFormats.get(0);
			timestampSdf = new SimpleDateFormat(timestampFormat);
			timestampSdf.setLenient(false);
			dateFormats.put(streamIdentifier, timestampSdf);
		}

		// get the input text
		String text = (String) content.get("text");
		boolean lastLineComplete = text.endsWith("\n");

		// break input text to individual lines
		String[] lines = StringUtils.split(text, '\n');
		ArrayList<JSONObject> output = new ArrayList<JSONObject>();

		SimpleDateFormat dateFormat = timestampSdf;
		int i = 0;

		// go through each line
		for (; i < lines.length; i++) {
			// Extract the timestamp from the log line.
			// The first 23 characters contain the timestamp, for example:
			// 2015-04-18 08:09:35,721 ERROR
			// contentlibrary.exceptions.CLException - Content Library Exception
			// Caught: ERROR CODE: 45012 ::
			String line = rtrim(lines[i]);
			String timestamp = StringUtils.substring(line, 0, 23);

			// check if the timestamp is in correct format
			try {
				dateFormat.parse(timestamp);

				// create a JSON Object for each line to pass to annotator
				JSONObject object = new JSONObject();
				JSONObject lineContent = new JSONObject();
				object.put("content", lineContent);
				lineContent.put("text", line);
				JSONObject lineMetadata = new JSONObject();
				object.put("metadata", lineMetadata);

				// if this is the last line and the input does not end with a
				// newline, the line is partial: mark it as type B
				if (!lastLineComplete && i == (lines.length - 1))
					lineMetadata.put("type", "B");
				else
					lineMetadata.put("type", "A");
				lineMetadata.put("timestamp", timestamp);
				output.add(object);
			} catch (ParseException e) {
				logger.warn(getClass(), "Ignoring potential invalid timestamp value '" + timestamp + "'"
						+ " for datasource '" + datasource + "'");
				e.printStackTrace();
			}
		}

		// System.out.println(output.toString());
		if (logger.isDebugEnabled()) {
			logger.debug(getClass(), output.toString());
			logger.debug(getClass(), "--end of MyAppLogSplitter splitter");
		}
		return output;
	}

	private String rtrim(String line) {
		if (!StringUtils.endsWith(line, "\r"))
			return line;
		if (line.isEmpty())
			return line;
		int lastPosition = line.length() - 1;
		while (line.charAt(lastPosition) == '\r' && --lastPosition >= 0)
			;
		if (lastPosition < 0)
			return "";
		return line.substring(0, lastPosition + 1);
	}

	public static void main(String[] args) {
		JSONObject object = new JSONObject();
		JSONObject content = new JSONObject();
		JSONObject metadata = new JSONObject();
		String text = "2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught:  ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found \n";
		text = text
				+ "2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught:  ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found\n";
		text = text
				+ "2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught:";
		content.put("text", text);
		JSONArray timestampFormats = new JSONArray();
		timestampFormats.add("yyyy-MM-dd HH:mm:ss,SSS");
		object.put("timestampFormats", timestampFormats);
		metadata.put("timestamp", "2015-04-18 08:09:35,721");
		content.put("metadata", metadata);
		object.put("content", content);
		object.put("datasource", "try_applog");

		MyAppLogSplitter splitter = new MyAppLogSplitter();
		ArrayList<JSONObject> splitLogRecords = null;
		System.out.println("-------------Input text");
		System.out.println(object.toString());
		try {
			splitLogRecords = splitter.split(object);
			System.out.println("-------------Splitter output");
			System.out.println(splitLogRecords.toString());
		} catch (Exception e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}
}

Change the String text in the main method to contain sample text from your log file.

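Change the following lines to match the timestamp format in your log file. The splitter also reads the timestamp from the first 23 characters of each line (the StringUtils.substring(line, 0, 23) call), so adjust that length if your timestamp is longer or shorter.

timestampFormats.add("yyyy-MM-dd HH:mm:ss,SSS");

metadata.put("timestamp", "2015-04-18 08:09:35,721");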

The pattern letters are described in the Java SimpleDateFormat documentation.
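Before wiring a new pattern into the splitter, it can help to check it against a sample timestamp on its own. The sketch below is one way to do that with the standard SimpleDateFormat class; the class and method names are hypothetical and are not part of the Log Analysis product.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

final class TimestampPatternCheck {

    /** Returns true if the whole sample parses strictly with the given pattern. */
    static boolean matches(String pattern, String sample) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        // Strict parsing rejects out-of-range values such as month 13.
        format.setLenient(false);

        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(sample, position);

        // The pattern must match, and it must consume the entire sample.
        return parsed != null && position.getIndex() == sample.length();
    }

    public static void main(String[] args) {
        String sample = "2015-04-18 08:09:35,721";
        System.out.println(matches("yyyy-MM-dd HH:mm:ss,SSS", sample));
        System.out.println(matches("dd/MM/yyyy HH:mm:ss", sample));
        // The sample length is the number of characters the splitter must read.
        System.out.println(sample.length());
    }
}
```

With the sample above, the first pattern is expected to match and the second is not. The printed length (23 for this sample) is the value the splitter's substring call relies on.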

The main method is there only to test the splitter. After testing, it can be commented out.

Test the splitter interface

Right-click in the splitter class and select Run As -> Java Application.

Create Splitter Figure 4

If all is good, you will see the following output.

-------------Input text
{"content":{"text":"2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found \n2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found\n2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught:","metadata":{"timestamp":"2015-04-18 08:09:35,721"}},"timestampFormats":["yyyy-MM-dd HH:mm:ss,SSS"],"datasource":"try_applog"}

-------------Splitter output
[{"content":{"text":"2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found "},"metadata":{"timestamp":"2015-04-18 08:09:35,721","type":"A"}}, {"content":{"text":"2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught: ERROR CODE: 45012 :: BIPUBLISHER_SERVICE_IS_FOLDER_EXIST_ERROR; message: (404)Not Found"},"metadata":{"timestamp":"2015-04-18 08:09:35,721","type":"A"}}, {"content":{"text":"2015-04-18 08:09:35,721 ERROR contentlibrary.exceptions.CLException - Content Library Exception Caught:"},"metadata":{"timestamp":"2015-04-18 08:09:35,721","type":"B"}}]

The output is easier to read if your Eclipse installation has JSON support.
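The type field in the splitter output marks each record as complete (A) or as a trailing partial line (B): only the last line of a batch can be partial, and only when the batch text does not end with a newline. The following dependency-free sketch isolates that rule. The class and method names are hypothetical, and unlike the splitter it does not check timestamps.

```java
import java.util.ArrayList;
import java.util.List;

final class PartialLineSketch {

    /** Returns one marker per non-empty line: "A" for complete, "B" for partial. */
    static List<String> classify(String text) {
        // A batch that ends with a newline has no partial last line.
        boolean lastLineComplete = text.endsWith("\n");

        // Collect non-empty lines only, since the splitter's line splitting
        // also drops empty lines.
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }

        List<String> types = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            boolean isLastLine = i == lines.size() - 1;
            types.add(!lastLineComplete && isLastLine ? "B" : "A");
        }
        return types;
    }

    public static void main(String[] args) {
        // Third line has no trailing newline, so it is treated as partial.
        System.out.println(classify("first line\nsecond line\nthird li"));
        // Every line is terminated, so every line is complete.
        System.out.println(classify("first line\nsecond line\n"));
    }
}
```

This matches the sample run above, where the first two records are type A and the unterminated third record is type B.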
