IBM Support

Improve LOAD Hadoop performance for Big SQL by adjusting YARN memory

Technical Blog Post


Abstract

Improve LOAD Hadoop performance for Big SQL by adjusting YARN memory

Body

Big SQL LOAD HADOOP uses MapReduce jobs. As with any other MapReduce job, configuring the YARN and MapReduce memory properties is key to running LOAD optimally. MapReduce jobs tend to run into Java OutOfMemory errors if the YARN and MapReduce memory settings are too small; if the properties are too large, the number of concurrent map and reduce tasks decreases, which also hurts performance and wastes memory.

Proper YARN and MapReduce memory sizes from lab tests

First, let’s look at which properties impact LOAD. We tested LOAD HADOOP in our labs using the following table definitions derived from TPCH. They are partitioned tables, and the scale factor is 1TB.

Partitioned table definition:

CREATE HADOOP TABLE LINEITEM (
    L_ORDERKEY      BIGINT NOT NULL,
    L_PARTKEY       INTEGER NOT NULL,
    L_SUPPKEY       INTEGER NOT NULL,
    L_LINENUMBER    INTEGER NOT NULL,
    L_QUANTITY      FLOAT NOT NULL,
    L_EXTENDEDPRICE FLOAT NOT NULL,
    L_DISCOUNT      FLOAT NOT NULL,
    L_TAX           FLOAT NOT NULL,
    L_RETURNFLAG    VARCHAR(1) NOT NULL,
    L_LINESTATUS    VARCHAR(1) NOT NULL,
    L_COMMITDATE    VARCHAR(10) NOT NULL,
    L_RECEIPTDATE   VARCHAR(10) NOT NULL,
    L_SHIPINSTRUCT  VARCHAR(25) NOT NULL,
    L_SHIPMODE      VARCHAR(10) NOT NULL,
    L_COMMENT       VARCHAR(44) NOT NULL)
PARTITIONED BY (L_SHIPDATE VARCHAR(10))
STORED AS PARQUETFILE;

CREATE HADOOP TABLE ORDERS (
    O_ORDERKEY      BIGINT NOT NULL,
    O_CUSTKEY       INTEGER NOT NULL,
    O_ORDERSTATUS   VARCHAR(1) NOT NULL,
    O_TOTALPRICE    FLOAT NOT NULL,
    O_ORDERPRIORITY VARCHAR(15) NOT NULL,
    O_CLERK         VARCHAR(15) NOT NULL,
    O_SHIPPRIORITY  INTEGER NOT NULL,
    O_COMMENT       VARCHAR(79) NOT NULL)
PARTITIONED BY (O_ORDERDATE VARCHAR(10))
STORED AS PARQUETFILE;

When Big SQL starts, it occupies some memory exclusively. This memory is not under YARN control; its size is defined by bigsql_resource_percent (default 25% of physical memory). This means YARN and MapReduce can use the remaining 75% of total memory on each node by default. Here is a summary of the best values to use:

Table 1: Recommended YARN and MapReduce memory configuration

Property (TPCH 1TB partitioned tables) | Default Value | Recommended Value | Explanation

1. Big SQL
bigsql_resource_percent | 25% | 25% to 80% | Amount of cluster resources to dedicate to Big SQL

2. Map memory
mapreduce.map.memory.mb | 1024 | 3400 | Memory (MB) allocated for each map task
mapreduce.map.java.opts | -Xmx756m | -Xmx2720m | Maximum Java heap size of the map process (-Xmx); suggest 80% of mapreduce.map.memory.mb

3. Reduce memory
mapreduce.reduce.memory.mb | 1024 | 9800 | Memory (MB) allocated for each reduce task
mapreduce.reduce.java.opts | -Xmx756m | -Xmx7840m | Maximum Java heap size of the reduce process (-Xmx); suggest 80% of mapreduce.reduce.memory.mb

4. Application Master memory
yarn.app.mapreduce.am.resource.mb | 512 | 2000 | Memory (MB) allocated for the Application Master
yarn.app.mapreduce.am.command-opts | -Xmx312m | -Xmx1600m | Maximum Java heap size of the AM process (-Xmx); suggest 80% of yarn.app.mapreduce.am.resource.mb

5. YARN Node Manager memory
yarn.nodemanager.resource.memory-mb | 163840 | 163840 [calculate from physical memory] | Total memory (MB) that YARN can allocate to containers on one node; here 75% of the node's total memory, with the remaining 25% (bigsql_resource_percent) reserved for Big SQL
yarn.scheduler.minimum-allocation-mb | 512 | 512 | Smallest amount of memory (MB) that can be requested for a container
yarn.scheduler.maximum-allocation-mb | 163840 | 163840 [calculate from physical memory] | Largest amount of memory (MB) that can be requested for a container; suggest the same as yarn.nodemanager.resource.memory-mb

LOAD HADOOP result | OOM error | Successful |

Methods used to determine the best values for your workload

The properties listed in the table above can be divided into two types:

Type 1. Calculated. These can be set once. For example:

  • yarn.nodemanager.resource.memory-mb = 163840. This is total physical memory (216GB in this case) × (1 − 25%).
  • yarn.scheduler.maximum-allocation-mb = yarn.nodemanager.resource.memory-mb
  • yarn.scheduler.minimum-allocation-mb = 512 (fixed value)

For properties of type 1, it is straightforward to set them according to the node's physical memory. Note: if the Big SQL memory percentage is changed (higher or lower than 25%), then yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb need to be recalculated accordingly.
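The type 1 calculation above can be sketched as a small shell snippet. The node size (216GB) and the default bigsql_resource_percent of 25 are the lab values from this article. Note that an exact 75% of 216GB is 165888 MB; Table 1 uses a rounded-down 163840 MB (160GB).

```shell
# Derive the type 1 YARN properties from a node's physical memory.
# PHYS_MB and BIGSQL_PCT reflect the lab setup described above
# (216 GB nodes, default bigsql_resource_percent of 25).
PHYS_MB=$((216 * 1024))
BIGSQL_PCT=25

# Memory left for YARN containers after Big SQL takes its share.
# (Exact 75% is 165888 MB; Table 1 rounds this down to 163840 MB.)
NM_RESOURCE_MB=$(( PHYS_MB * (100 - BIGSQL_PCT) / 100 ))

echo "yarn.nodemanager.resource.memory-mb  = $NM_RESOURCE_MB"
echo "yarn.scheduler.maximum-allocation-mb = $NM_RESOURCE_MB"
echo "yarn.scheduler.minimum-allocation-mb = 512"
```

If bigsql_resource_percent changes, only BIGSQL_PCT needs to change; both YARN values follow from it.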

Type 2. Tuned based on workload and data size. They are:

  • mapreduce.map.memory.mb, mapreduce.reduce.memory.mb and yarn.app.mapreduce.am.resource.mb
  • mapreduce.map.java.opts = 80% x mapreduce.map.memory.mb
  • mapreduce.reduce.java.opts = 80% x mapreduce.reduce.memory.mb
  • yarn.app.mapreduce.am.command-opts = 80% x yarn.app.mapreduce.am.resource.mb
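The 80% heap rule above can be computed directly. In this sketch the container sizes are the tuned values from Table 1:

```shell
# Compute the -Xmx Java heap option as 80% of a container size (MB),
# per the 80% rule of thumb above.
heap() { echo "-Xmx$(( $1 * 80 / 100 ))m"; }

# Tuned container sizes from Table 1.
MAP_MB=3400; REDUCE_MB=9800; AM_MB=2000

echo "mapreduce.map.java.opts            = $(heap $MAP_MB)"     # -Xmx2720m
echo "mapreduce.reduce.java.opts         = $(heap $REDUCE_MB)"  # -Xmx7840m
echo "yarn.app.mapreduce.am.command-opts = $(heap $AM_MB)"      # -Xmx1600m
```

The 20% headroom leaves room for the JVM's own non-heap memory inside the YARN container limit.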

Properties of type 2 need several iterations to lock down the most efficient values. Here is the process I used to tune these properties for the TPCH 1TB workload.

1. Run LOAD HADOOP with the default configuration (see Table 1). We encountered OOM (out of memory) errors for map and reduce tasks. You will see an error message like the one below in the MapReduce task logs, which means the configured map memory size (mapreduce.map.memory.mb=1024) is too small.

Error Message: Container [pid=26783,containerID=container_1389136889967_0009_01_000002] is running beyond physical memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 1.2 GB of 5 GB virtual memory used. Killing container.

2. Increase all type 2 memory-related properties to resolve the OOM errors and make LOAD HADOOP run successfully. I recommend doubling the value each time until the OOM error no longer occurs.
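The doubling strategy can be sketched as a simple loop. Here run_load is a hypothetical stand-in for submitting the LOAD HADOOP job with a given map container size and checking its logs for OOM failures; the 3 GB success threshold is purely a simulation for illustration.

```shell
# Hypothetical stand-in for "submit LOAD HADOOP with this map memory
# and check for OOM": it simulates success once the size reaches 3 GB.
# In practice, the check is whether the MapReduce job finishes without
# the "running beyond physical memory limits" error shown above.
run_load() { [ "$1" -ge 3072 ]; }

MAP_MB=1024                  # start from the default
until run_load "$MAP_MB"; do
  MAP_MB=$(( MAP_MB * 2 ))   # double until the OOM disappears
done
echo "first size without OOM: $MAP_MB MB"
```

Doubling converges in a handful of runs; step 3 below then trims any excess back down.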

3. Although the current configuration supports LOAD HADOOP of the TPCH 1TB tables, have we over-configured it? Over-configuring hurts performance because it decreases the number of containers that can run concurrently. To answer this question, we need to check how much memory each kind of container (Map, Reduce, and AM) actually allocates. We can do this by checking the YARN Node Manager log: the Node Manager monitors all container resource allocations and writes the information to its log. You can find the log files at /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-<hostname>.log on the data nodes. The log entry format looks like:

Log Entry: 2015-12-23 07:40:14,262 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(458)) - Memory usage of ProcessTree 471707 for container-id container_1450840321024_0002_01_003626: 7.0 GB of 11 GB physical memory used; 11.2 GB of 55 GB virtual memory used

This log entry is for one reduce container and shows that this container allocated 7GB of a maximum of 11GB. We can grep and sort all reduce container memory allocation entries from the Node Manager log to find the maximum size needed by reduce tasks while the job is running. Here is a command to perform this check:

grep "of 11 GB" yarn-yarn-nodemanager-<hostname>.log | awk '{print $15 $16 }'|grep GB | sort

The output looks like:

…
…
8.7GB
8.7GB
8.7GB
8.7GB
8.7GB
8.8GB
8.9GB
9.0GB

 

The output above shows that the maximum size is 9GB on this particular data node. The same check must be performed on all data nodes; across all data nodes the maximum size was 9.4GB. So setting mapreduce.reduce.memory.mb=9800 (slightly larger than 9.4GB) should be sufficient. Do the same to check the maximum memory allocation for the Map and AM containers and tune the related configuration values. Finally, re-run the LOAD HADOOP commands to verify that the new values work.
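The per-node check can also be scripted end to end. The sketch below writes two sample lines that mimic the Node Manager log format shown above (point the pipeline at the real files under /var/log/hadoop-yarn/yarn/ on each data node) and extracts the peak physical-memory figure, independent of the container size in the "of N GB" clause:

```shell
# Two sample lines mimicking the Node Manager monitor log format shown
# above; in practice, read the real yarn-yarn-nodemanager-<hostname>.log.
cat > nm-sample.log <<'EOF'
2015-12-23 07:40:14,262 INFO monitor.ContainersMonitorImpl - Memory usage of ProcessTree 471707 for container-id container_1450840321024_0002_01_003626: 7.0 GB of 11 GB physical memory used; 11.2 GB of 55 GB virtual memory used
2015-12-23 07:41:02,101 INFO monitor.ContainersMonitorImpl - Memory usage of ProcessTree 471802 for container-id container_1450840321024_0002_01_003627: 9.4 GB of 11 GB physical memory used; 12.0 GB of 55 GB virtual memory used
EOF

# Pull out the "used" figure before "GB of ... GB physical memory" and
# keep the largest value seen on this node.
PEAK=$(sed -n 's/.* \([0-9.][0-9.]*\) GB of [0-9.][0-9.]* GB physical memory used.*/\1/p' \
         nm-sample.log | sort -n | tail -1)
echo "peak physical memory used on this node: ${PEAK} GB"

rm -f nm-sample.log
```

Running this against every data node's log and taking the overall maximum gives the figure (9.4GB in our lab) that the container size should slightly exceed.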

 

Note:
1. LOAD memory requirements differ between partitioned and non-partitioned tables. In general, partitioned tables need more memory for the Reduce and Application Master containers.
2. A new version of ANALYZE (V2.0) is provided as of Big SQL V4.2. It is a completely new framework for running ANALYZE that does not rely on MapReduce, and therefore requires no MapReduce tuning.
 

Summary

Proper YARN and MapReduce memory settings are very important for Big SQL LOAD HADOOP performance. This article has provided a technique for tuning these properties, along with a set of recommendations based on loading 1TB of TPCH data on our lab data nodes.
