Why change the ZooKeeper Transaction Log Location?

Pointing ZooKeeper’s transaction log at fast storage can dramatically improve the responsiveness of Streams administrative operations (streamtool, Streams Console, Domain Manager, etc.). The effect on running applications is minimal. Common causes of poor ZooKeeper performance are:

  • The home directory of the Streams user is on a shared filesystem and embedded ZooKeeper is being used. The default embedded ZooKeeper transaction log location is in the user’s home directory, and a slow shared filesystem will hurt performance.
  • At least one of the hosts in an external ZooKeeper ensemble is running on slow storage, for example with its transaction log on a shared filesystem. One slow host can slow down the entire ensemble.
  • Streams application processes are using the same disk where ZooKeeper writes transaction logs. Processes from submitted jobs can keep the disk busy and delay ZooKeeper when it syncs transactions to disk. Collocating ZooKeeper with a small number of management services is less likely to cause a problem.
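
To check whether a candidate transaction log directory sits on a shared filesystem, something like the following sketch may help. The function names and the list of filesystem types are illustrative assumptions, not part of Streams or ZooKeeper, and `df -T` is a GNU extension that may not exist on every platform.

```shell
# Return 0 if the filesystem type name looks like a network/shared filesystem.
# The list below is an assumption for illustration and is not exhaustive.
is_shared_fs_type() {
    case "$1" in
        nfs|nfs3|nfs4|cifs|smbfs|gpfs|lustre|afs|glusterfs|fuse.glusterfs|ceph)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Print the filesystem type that backs a directory.
# Relies on GNU df; the type is the second column of the second output line.
fs_type_of() {
    df -PT "$1" | awk 'NR==2 { print $2 }'
}

# Report whether a directory appears to live on shared storage.
check_log_dir() {
    dir=$1
    fstype=$(fs_type_of "$dir") || return 2
    if is_shared_fs_type "$fstype"; then
        echo "WARNING: $dir is on a $fstype filesystem"
        return 1
    fi
    echo "OK: $dir is on a $fstype filesystem"
}

# Example (hypothetical path):
#   check_log_dir "$HOME"
```

A local filesystem type does not guarantee a fast disk, so this only rules out the shared-filesystem cases listed above.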

Changing the external ZooKeeper Transaction Log Location

1. Stop the ZooKeeper server:

ZooKeeper-installation-directory/bin/zkServer.sh stop

2. Set the dataLogDir parameter in the ZooKeeper-installation-directory/conf/zoo.cfg file to the location of your fast storage.

Here is an example of the line to add to the config file:
dataLogDir=/fast-storage-location/zk/dataLogDir

3. Save the config file.

4. Start the ZooKeeper server to pick up the updated configuration:

ZooKeeper-installation-directory/bin/zkServer.sh start
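
If the config edit in step 2 needs to be repeated across several ensemble hosts, it can be scripted. The following is a minimal sketch, assuming zoo.cfg uses plain `key=value` lines; the function name `set_data_log_dir` is hypothetical. It replaces any existing dataLogDir line instead of adding a duplicate, and keeps a `.bak` copy of the original file.

```shell
# Set dataLogDir in a zoo.cfg-style file, replacing any existing entry.
# Usage: set_data_log_dir <path-to-zoo.cfg> <new-log-directory>
set_data_log_dir() {
    cfg=$1
    new_dir=$2
    [ -f "$cfg" ] || { echo "config file not found: $cfg" >&2; return 1; }

    tmp=$(mktemp) || return 1
    # Copy every line except an active dataLogDir setting.
    # grep exits non-zero when nothing is left to print, which is fine here.
    grep -v '^[[:space:]]*dataLogDir[[:space:]]*=' "$cfg" > "$tmp" || true
    # Append the new setting.
    printf 'dataLogDir=%s\n' "$new_dir" >> "$tmp"

    # Keep a backup, then overwrite the original in place.
    cp -p "$cfg" "$cfg.bak" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example, run between the stop and start commands shown above:
#   set_data_log_dir ZooKeeper-installation-directory/conf/zoo.cfg \
#       /fast-storage-location/zk/dataLogDir
```

Creating the target directory beforehand with `mkdir -p`, owned by the user that runs ZooKeeper, is a reasonable precaution.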

Changing the embedded ZooKeeper Transaction Log Location

1. Stop the embedded ZooKeeper:

streamtool embeddedzk --stop

If that embedded ZooKeeper has any active domains, stop them first:

streamtool stopdomain -d <domain-id> --embeddedzk

2. Set the streams.zookeeper.property.dataLogDir bootstrap property by using the streamtool setbootproperty command. Sample command:

streamtool setbootproperty streams.zookeeper.property.dataLogDir=/fast-storage-location/zk/dataLogDir

3. Start the embedded ZooKeeper or the domain to pick up the updated configuration:

To start embedded ZooKeeper:

streamtool embeddedzk --start

To start the domain:

streamtool startdomain -d <domain-id> --embeddedzk
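
The embedded ZooKeeper steps above can be wrapped in a small helper. This is a sketch, not an official Streams script: the function name `relocate_embedded_zk_log` and the `STREAMTOOL` variable are hypothetical, and it covers only the case with no active domains. It stops at the first command that fails, so ZooKeeper is not restarted with a half-applied change.

```shell
# Command used to invoke streamtool; override it to do a dry run.
STREAMTOOL=${STREAMTOOL:-streamtool}

# Move the embedded ZooKeeper transaction log to a new directory.
# Usage: relocate_embedded_zk_log <new-log-directory>
# Assumes any active domains have already been stopped.
relocate_embedded_zk_log() {
    new_dir=$1
    [ -n "$new_dir" ] || { echo "usage: relocate_embedded_zk_log <dir>" >&2; return 2; }

    # 1. Stop the embedded ZooKeeper.
    "$STREAMTOOL" embeddedzk --stop || return 1

    # 2. Point the bootstrap property at the new location.
    "$STREAMTOOL" setbootproperty \
        "streams.zookeeper.property.dataLogDir=$new_dir" || return 1

    # 3. Start the embedded ZooKeeper again.
    "$STREAMTOOL" embeddedzk --start
}

# Dry run that only prints the streamtool arguments:
#   STREAMTOOL=echo
#   relocate_embedded_zk_log /fast-storage-location/zk/dataLogDir
```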

Keep in mind

Domains created before moving the transaction log location will not be available after making the change.
