Introduction

Recently I have been asked questions by customers who wanted to understand “the impact of NameNode transitions in an HA environment on connected clients”. This topic is sparsely documented, but people are very much intrigued by it.

To test the scenario, I integrated InfoSphere Streams 4.0 with InfoSphere BigInsights 3.0.0.2 using the HDFS2FileSink operator.

For readers who are not familiar with NameNode HA and state transitions, here is the concept in brief:
In a Hadoop HA environment there are two instances of the NameNode running, one active and one standby.
Whenever the active NameNode goes down, the standby NameNode transitions to active and starts processing requests from clients.
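
You can check which instance currently holds which role with the hdfs haadmin utility. A quick check might look like the following (nn1 and nn2 are the logical NameNode aliases configured for this cluster, shown later in this article):

[biadmin@namenode ~]$ hdfs haadmin -getServiceState nn1
active
[biadmin@namenode ~]$ hdfs haadmin -getServiceState nn2
standby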

Detailed Description

Before we analyze the results of the experiment, let's understand the Hadoop configuration properties that are specific to an HA environment.

When HA is enabled, you will notice that the fs.defaultFS property in core-site.xml changes from a physical URL to a logical URI:

<property>
    <!-- The default file system used by Hadoop -->
    <name>fs.defaultFS</name>
    <value>hdfs://BICluster</value>
  </property>

This logical name of the cluster is defined in hdfs-site.xml using the following property:

 <property>
    <name>dfs.nameservices</name>
    <value>BICluster</value>
  </property>

Also note the following property: each NameNode in this cluster is assigned a logical alias (nn1, nn2).

 <property>
    <name>dfs.ha.namenodes.BICluster</name>
    <value>nn1,nn2</value>
  </property>

All properties that reference an individual NameNode are scoped as “property.clusterName.namenodeAlias”, as shown below:

<property>
    <name>dfs.namenode.http-address.BICluster.nn1</name>
    <value>datanode1.ibm.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.BICluster.nn2</name>
    <value>datanode2.ibm.com:50070</value>
  </property>
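
The HTTP addresses above are for the NameNode web UI; the client-facing RPC endpoints are scoped the same way. On this cluster they would look roughly like the following (port 9000 is an assumption, based on the URI that appears later in this article):

<property>
    <name>dfs.namenode.rpc-address.BICluster.nn1</name>
    <value>datanode1.ibm.com:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.BICluster.nn2</name>
    <value>datanode2.ibm.com:9000</value>
  </property>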

From the HDFS client's point of view, it is simply communicating with the Hadoop cluster; the fact that multiple NameNodes are running is transparent to it.
This is achieved using the Hadoop failover proxy provider, as defined by the following property:

<property>
    <name>dfs.client.failover.proxy.provider.BICluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

HDFS clients interact with the proxy, and the proxy's job is to locate the active NameNode and service the client requests.
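
To make this concrete, here is a minimal Java sketch of what an HDFS client does under the covers (this is just an illustration, not the Streams operator's actual implementation; the class name and file path are made up for the example). The client opens the logical URI, and the configured failover proxy provider resolves it to whichever NameNode is currently active:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogicalUriClient {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml / hdfs-site.xml from the classpath (HADOOP_CONF_DIR),
        // including dfs.client.failover.proxy.provider.BICluster
        Configuration conf = new Configuration();

        // Note: the logical nameservice URI, not a physical namenode host:port
        FileSystem fs = FileSystem.get(URI.create("hdfs://BICluster"), conf);

        // The write goes to whichever NameNode the failover proxy resolves as active
        try (FSDataOutputStream out = fs.create(new Path("/user/biadmin/ha-test.txt"))) {
            out.writeBytes("Hello world!\n");
        }
        fs.close();
    }
}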

Now let's test the failover proxy with the Streams and BigInsights integration.
In Streams, I authored the following SPL application using the HDFS2FileSink operator:

use com.ibm.streamsx.hdfs::HDFS2FileSink ;

composite HDFS2FileSinkFormat
{
	graph
		// Generate a "Hello world!" tuple every 2.5 seconds
		stream<rstring lines> linBeacon = Beacon()
		{
			param
				period : 2.5 ;
			output
				linBeacon : lines = "Hello world!" ;
		}
		// Write the incoming lines to /user//pattern0%FILENUM.txt
		// Each file is about 100 bytes in size.
		() as lineSink1 = HDFS2FileSink(linBeacon)
		{
			param
				file : "pattern0%FILENUM.txt" ;
				bytesPerFile : 100l ;
		}

}

This application ingests tuples into HDFS indefinitely; the file name is incremented each time a file reaches the 100-byte limit.

The output of the status.sh hadoop command is as follows:

[INFO] Progress - 30%
[INFO] @datanode1.ibm.com - namenode(active) started, pid 4071250
[INFO] @datanode2.ibm.com - namenode(standby) started, pid 2263798

datanode1 is running the active instance of the NameNode, whereas datanode2 is running the standby instance.

Once I deployed the Streams application, files started appearing on my HDFS file system:

[biadmin@namenode ~]$ hadoop fs -ls
Found 4 items
drwx------   - biadmin biadmin          0 2015-07-01 11:32 .staging
-rw-r--r--   3 biadmin biadmin         91 2015-07-02 16:08 pattern00.txt
-rw-r--r--   3 biadmin biadmin         91 2015-07-02 16:08 pattern01.txt

Then I manually killed the active NameNode to force a transition:

[biadmin@datanode1 ~]$ sudo kill -9 4071250
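
As a side note, kill -9 simulates an abrupt crash. On a healthy cluster a graceful, administrator-initiated failover can also be triggered with the hdfs haadmin utility; I did not use it here, but the command would look something like this (when automatic failover is enabled, manual HA commands may be refused unless forced):

[biadmin@namenode ~]$ hdfs haadmin -failover nn1 nn2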

The updated Hadoop status was:

[INFO] @datanode1.ibm.com - namenode stopped
[INFO] @datanode2.ibm.com - namenode(active) started, pid 2263798

Notice that the NameNode on datanode1 is stopped and datanode2 has transitioned to active.

This transition had no impact on my Streams client; it continued to run and ingest tuples:

[biadmin@namenode ~]$ hadoop fs -ls
Found 4 items
drwx------   - biadmin biadmin          0 2015-07-01 11:32 .staging
-rw-r--r--   3 biadmin biadmin         91 2015-07-02 16:08 pattern00.txt
-rw-r--r--   3 biadmin biadmin         91 2015-07-02 16:08 pattern01.txt
-rw-r--r--   3 biadmin biadmin         91 2015-07-02 16:09 pattern02.txt

Conclusion

From this experiment we can conclude that the Hadoop failover proxy provider handles the NameNode transition transparently; the connected client saw no impact and continued to write files without interruption.

Exception Conditions

There is one caveat to this experiment that I would like to state.
For the entire experiment, I relied on the Hadoop configuration files to fetch the HDFS connection details. This scenario will not work if you connect directly to a single NameNode instance.
For example, if the same SPL application is modified as follows, it will bind to one specific NameNode and cannot handle transitions:

() as lineSink1 = HDFS2FileSink(linBeacon)
		{
			param
				file : "pattern0%FILENUM.txt" ;
				bytesPerFile : 100l ;
				hdfsUri: "hdfs://datanode1.ibm.com:9000";
		}
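
In contrast, letting the operator pick up the cluster definition from the Hadoop configuration files (as in the original application), or pointing hdfsUri at the logical nameservice instead of a physical host, keeps the failover proxy provider in the picture. As an untested sketch, and assuming the HA-enabled core-site.xml and hdfs-site.xml are visible to the operator, the HA-friendly form of the parameter would be:

hdfsUri : "hdfs://BICluster" ; // logical nameservice URI, resolved through the failover proxy provider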
