Making Hadoop run like a scared rabbit on fire


This is the first in a series of three blog posts explaining Adaptive MapReduce, an important feature in IBM’s Enterprise-grade Hadoop offering. In this initial post, I’ll explain what Adaptive MapReduce is and why it may be of interest. In follow-on posts I’ll provide details about performance and explain how to install, configure, and use it. Alternative frameworks like Spark (available in our current BigInsights beta) get much of the attention, but MapReduce is still important for many existing Hadoop applications.

About Adaptive MapReduce

Those who have been following IBM’s work related to Hadoop can be forgiven for being a little confused about Adaptive MapReduce. Some context will help. In 2012, talented folks from Hewlett-Packard Laboratories and IBM’s Almaden Research Center collaborated on a Research Report titled “Adaptive MapReduce using Situation-Aware Mappers”. A key idea advanced in this paper was to enable mappers to communicate through a distributed metadata store to make MapReduce process jobs faster and more efficiently.

Concurrent with this work, IBM was in the midst of acquiring Platform Computing – a leader in high-performance computing. Platform Computing had its own MapReduce offering based on Platform Symphony, a low-latency grid manager designed for high-performance SOA applications. To make a long story short, this low-latency MapReduce implementation in Platform Symphony was absorbed into IBM InfoSphere BigInsights Enterprise Edition, and for IBM the term Adaptive MapReduce came to refer to the Platform Symphony MapReduce implementation rather than the earlier research effort. As of InfoSphere BigInsights 2.1, customers can choose whether they want to use the Apache Hadoop MapReduce implementation (installed by default) or Adaptive MapReduce based on Platform Symphony technology.

IBM Platform Symphony

Platform Symphony is impressive software. Like Hadoop it runs on distributed clusters, but it provides dynamic service orchestration for time-critical, computationally intensive workloads. Symphony’s name refers to its ability to orchestrate services in response to time-critical events.

Traditional HPC workloads are often “capacity-oriented”. Problems such as figuring out how to optimally deplete an oil reservoir, model airflow over an aircraft wing, or design a vehicle suspension system require enormous amounts of computing power. Although important, these problems are seldom urgent.

In some applications (for example, financial risk or fraud analytics), the requirements are different: whoever can simulate market outcomes across large numbers of scenarios, or predict future portfolio values with the highest confidence in the shortest time, wins.

Unlike batch-oriented workload managers that dispatch jobs from queues in tens of seconds or even minutes, Platform Symphony was designed to provide deterministic performance, dispatching tens of thousands of tasks with sub-millisecond latency. Eliminating scheduling overhead maximizes resource usage and gets results computed as quickly as possible.

Platform Symphony was also designed with multitenancy and fast job pre-emption in mind, ensuring that time-critical jobs need not wait and that tenants remain securely isolated from one another.

The limitations of MapReduce

By 2008, investment banks were considering Hadoop as a framework for a variety of data-intensive problems. While Hadoop is well suited to many types of applications, many jobs in this industry were time-critical, making MapReduce a less-than-ideal solution.

Open source MapReduce, which is included in most Hadoop distributions, employs a batch scheduler and uses a simple polling model. Basically, TaskTrackers executing on cluster nodes send periodic method calls to a central JobTracker. As part of this heartbeat, a TaskTracker reports progress and indicates whether it is ready to run a new task. Early implementations of MapReduce set this default polling interval to three seconds to avoid overwhelming the central JobTracker; recognizing that most clusters are small, Hadoop 1.1.0 and later reduced the minimum heartbeat interval (mapreduce.jobtracker.heartbeat.interval.min) to 300 milliseconds (see MAPREDUCE-1906).
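For concreteness, here is a sketch of how that minimum heartbeat interval appears in mapred-site.xml, assuming a Hadoop 1.1.0 or later cluster; the 300-millisecond value shown is simply the post-MAPREDUCE-1906 default, and as discussed below, larger clusters may need a higher value:

    <property>
      <!-- Minimum TaskTracker-to-JobTracker heartbeat interval, in
           milliseconds. 300 is the default as of Hadoop 1.1.0. -->
      <name>mapreduce.jobtracker.heartbeat.interval.min</name>
      <value>300</value>
    </property>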


Figure 1: A polling architecture reduces utilization and extends runtime

Figure 1 illustrates the challenge with a polling mechanism. Data nodes on a cluster running short tasks will frequently sit idle awaiting work. Even 300 milliseconds is a long time when you consider that Hadoop jobs often comprise many thousands of tasks. Also, as clusters get larger, it becomes necessary to tune this figure higher (that is, to slow down the heartbeats) to make sure that the JobTracker is not overwhelmed. As a result, we find ourselves in the situation where the larger the cluster, the less suitable MapReduce is for handling latency-sensitive jobs.
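A quick back-of-envelope calculation (illustrative numbers only, not a benchmark) shows why this matters. Suppose a job of 10,000 one-second tasks runs on 100 task slots whose TaskTrackers heartbeat every 300 milliseconds. The tasks execute in roughly 100 waves, and at each wave boundary a freed slot waits on average about 150 milliseconds before its next heartbeat can request work. That adds roughly 100 × 150 ms ≈ 15 seconds of idle time to a job whose ideal runtime is about 100 seconds – around 15% overhead – and the shorter the tasks, the worse the ratio becomes.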

Deploying MapReduce on a low-latency scheduling framework

An obvious answer to this problem was to deploy open-source Hadoop on low-latency middleware. This approach lets developers write to standard Hadoop APIs, ensuring portability, while also allowing investment banks to maximize existing investments and share cluster resources between Hadoop and non-Hadoop applications.

At its core, Platform Symphony is fast middleware written in C++, although it presents programming interfaces in multiple languages, including Java, C++, C#, and various scripting languages.

Client applications interact with a session manager through a client-side API, and the session manager guarantees the reliable execution of tasks distributed to various service instances. Service instances are orchestrated dynamically based on application demand and resource-sharing policies.

From a Platform Symphony perspective, MapReduce is just another distributed workload pattern. Rather than orchestrating proprietary or commercial compute engines to price securities under specific market scenarios, the framework can just as easily start a Java Virtual Machine (JVM) to host a TaskTracker and run a specific Hadoop job. To maintain compatibility with open-source Hadoop, select foundation classes in Hadoop were re-implemented to interact with the Platform Symphony middleware, but most of Hadoop is untouched in Adaptive MapReduce. The result is that applications written to open-source Hadoop run without modification on Adaptive MapReduce, because the Java classes the applications interact with are unchanged.
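To make that compatibility claim concrete, here is a minimal sketch of an ordinary MapReduce application – the canonical word count, mirroring the standard Apache Hadoop tutorial – written purely against open-source Hadoop APIs. Nothing in it is specific to Adaptive MapReduce, which is exactly the point: the same jar should run on either implementation.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Plain-vanilla word count: it imports only org.apache.hadoop.*
    // classes, with no vendor-specific APIs anywhere.
    public class WordCount {

      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);  // emit (word, 1) per token
          }
        }
      }

      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();  // total the counts for this word
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count");  // Hadoop 1.x style
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Whether the cluster is running Apache MapReduce or Adaptive MapReduce, this job is submitted the same way; the substitution of Platform Symphony scheduling happens beneath the re-implemented foundation classes, invisibly to application code.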

In the next article I’ll look at some of the performance benefits and functionality advantages you might realize by using Adaptive MapReduce in place of Apache MapReduce.
