In InfoSphere Streams 4.0, you can use Redis replication as part of your HA strategy. Although Redis itself supports master-slave replication (http://redis.io/topics/replication), this is NOT supported in Streams. Streams must manage the replication itself rather than rely on Redis’ own replication. Below is an example of how to set this up.

1. First, you will need a minimum of 3 Redis hosts to set up replication. The number of replication servers must be an odd number greater than or equal to 3.

2. Start each Redis server as a standalone server. Never use Redis’ “slaveof” directive.

I have Redis running as a standalone server on each of 3 hosts, listening on the default port 6379:

d0521b08
d0521b09
d0521b10

Confirm that the Redis server is running on the 3 hosts. Again, make sure you started Redis as a standalone server on each of the 3 hosts, and do not configure Redis master-slave replication.


ivan@d0428b06:~/InfoSphere_Streams > ssh d0521b08 ps -ef|grep redis
ivan   17870     1  0 11:53 ?        00:00:00 ./redis-server *:6379
ivan@d0428b06:~/InfoSphere_Streams > ssh d0521b09 ps -ef|grep redis
ivan     454     1  0 11:54 ?        00:00:00 ./redis-server *:6379
ivan@d0428b06:~/InfoSphere_Streams > ssh d0521b10 ps -ef|grep redis
ivan    8120     1  0 11:55 ?        00:00:00 ./redis-server *:6379
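The process listing shows that redis-server is up, but not whether it was started standalone. As a rough sketch (not part of the original procedure), the Python below asks a server for its `INFO replication` section over a plain socket and checks that it reports `role:master` with no connected slaves, which is what a standalone Redis server reports. The function names are hypothetical, and the sketch assumes the servers accept unauthenticated connections.

```python
import socket


def parse_info(body):
    """Turn the text of a Redis INFO section into a dict of field -> value."""
    fields = {}
    for line in body.splitlines():
        # Skip section headers such as "# Replication" and blank lines.
        if ":" in line and not line.startswith("#"):
            key, value = line.split(":", 1)
            fields[key] = value
    return fields


def fetch_replication_info(host, port=6379, timeout=2.0):
    """Send an inline INFO command and return the parsed replication fields."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"INFO replication\r\n")
        reader = sock.makefile("rb")
        header = reader.readline()  # bulk reply header, e.g. b"$123\r\n"
        if not header.startswith(b"$"):
            raise RuntimeError(f"unexpected reply from {host}:{port}: {header!r}")
        length = int(header[1:].strip())
        body = reader.read(length).decode("utf-8", errors="replace")
    return parse_info(body)


def is_standalone(fields):
    """A standalone server is a master with no replicas attached."""
    return fields.get("role") == "master" and fields.get("connected_slaves") == "0"
```

Calling `is_standalone(fetch_replication_info("d0521b08"))` for each of the three hosts should return True if none of them was started with a replication setting.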

Setting the Streams instance parameters:


 > streamtool setproperty instance.checkpointRepository=redis
CDISC0009I The instance.checkpointRepository property was set to the following value: "redis". The previous property value was "notSpecified". This change affects the StreamsInstance instance in the domain_ivan domain when the instance is restarted.

Next, set the checkpoint repository configuration for the 3 Redis servers. It is probably easier to create a small script for this:


 > cat setredis3
streamtool setproperty instance.checkpointRepositoryConfiguration=" {
  \"replicas\" : 3,
  \"shards\" : 1,
  \"replicaGroups\" : [
    { \"servers\" : [\"d0521b08:6379\"], \"description\" : \"rack1\" },
    { \"servers\" : [\"d0521b09:6379\"], \"description\" : \"rack2\" },
    { \"servers\" : [\"d0521b10:6379\"], \"description\" : \"rack3\" }
  ]
}"

 > ./setredis3
CDISC0009I The instance.checkpointRepositoryConfiguration property was set to the following value: "{
  "replicas" : 3,
  "shards" : 1,
  "replicaGroups" : [
    { "servers" : ["d0521b08:6379"], "description" : "rack1" },
    { "servers" : ["d0521b09:6379"], "description" : "rack2" },
    { "servers" : ["d0521b10:6379"], "description" : "rack3" }
  ]
}". The previous property value was "<undefined>". This change affects the StreamsInstance instance in the domain_ivan domain when the instance is restarted.
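Hand-escaping the JSON is easy to get wrong, so here is a minimal Python sketch that builds the same configuration string from a list of servers and rejects server counts that are not an odd number of at least 3. The helper names are hypothetical. The layout (one server per replica group, `replicas` equal to the number of groups, `shards` set to 1, descriptions `rack1`, `rack2`, ...) simply mirrors the example above and is an assumption, not a documented rule.

```python
import json
import shlex


def build_checkpoint_config(servers, shards=1):
    """Return the checkpointRepositoryConfiguration JSON for the given servers."""
    # The post requires an odd number of replication servers, 3 or more.
    if len(servers) < 3 or len(servers) % 2 == 0:
        raise ValueError("need an odd number of Redis servers, at least 3")
    groups = [
        {"servers": [server], "description": f"rack{index}"}
        for index, server in enumerate(servers, start=1)
    ]
    config = {"replicas": len(servers), "shards": shards, "replicaGroups": groups}
    return json.dumps(config)


def setproperty_command(servers):
    """Return a shell-quoted streamtool command line that sets the property."""
    assignment = "instance.checkpointRepositoryConfiguration=" + build_checkpoint_config(servers)
    return "streamtool setproperty " + shlex.quote(assignment)
```

For example, `print(setproperty_command(["d0521b08:6379", "d0521b09:6379", "d0521b10:6379"]))` prints a command line that can be pasted into a script like setredis3.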

Start or restart your instance, and then check that the values are correct:


> st getproperty -a

instance.checkpointRepository=redis
instance.checkpointRepositoryConfiguration= {
  "replicas" : 3,
  "shards" : 1,
  "replicaGroups" : [
    { "servers" : ["d0521b08:6379"], "description" : "rack1" },
    { "servers" : ["d0521b09:6379"], "description" : "rack2" },
    { "servers" : ["d0521b10:6379"], "description" : "rack3" }
  ]
}

Now let’s submit the MultipleSources sample app to verify everything is OK with the Redis setup.


 > st submitjob output/MultipleSources/sample.MultipleSources.sab
CDISC0079I The following number of applications were submitted to the StreamsInstance instance: 1. The instance is in the domain_ivan domain.
CDISC0080I The 0 job was submitted for the following application: sample.MultipleSources.sab. The job was submitted to the StreamsInstance instance in the domain_ivan domain.
CDISC0020I Submitted job IDs: 0

 > st lsjobs
Instance: StreamsInstance
Id State   Healthy User Date                     Name                      Group
0  Running yes     ivan 2015-02-17T12:13:22-0500 sample::MultipleSources_0 default

 > ls -al ./data
total 516
drwxr-xr-x 2 ivan ccgroup     40 Feb 17 11:21 .
drwxr-xr-x 6 ivan ccgroup   4096 Feb 17 12:10 ..
-rw-r--r-- 1 ivan ccgroup 280271 Feb 17 13:43 multipleSources.dat

Now let’s simulate a failure in one of the Redis servers:


ivan@d0428b06:~ > ssh d0521b09 ps -ef|grep redis
ivan     454     1  0 11:54 ?        00:00:04 ./redis-server *:6379
ivan@d0428b06:~ > ssh d0521b09 kill -9 454
ivan@d0428b06:~ > ssh d0521b09 ps -ef|grep redis
ivan@d0428b06:~ > ssh d0521b09 ps -ef|grep redis|wc -l
0

Let’s check that the application is still running OK:


 > st lsjobs
Instance: StreamsInstance
Id State   Healthy User Date                     Name                      Group
0  Running yes     ivan 2015-02-17T12:13:22-0500 sample::MultipleSources_0 default

 > ls -al ./data
total 516
drwxr-xr-x 2 ivan ccgroup     40 Feb 17 11:21 .
drwxr-xr-x 6 ivan ccgroup   4096 Feb 17 12:10 ..
-rw-r--r-- 1 ivan ccgroup 289680 Feb 17 13:46 multipleSources.dat

So you can see the application continues running fine even though one of the three Redis replicas is down: the job is still Running and healthy, and multipleSources.dat has grown from 280271 to 289680 bytes. If you look at the logs, you can see a message indicating a problem communicating with the Redis server on d0521b09:

f0719b10/instances/StreamsInstance/jobs/0/pe.5.out:17 Feb 2015 13:50:51.989 [20308] DEBUG spl_ckpt M[RedisServerPoolActiveReplica.cpp:get:167]  - get() on key H769 received exception with server (d0521b09:6379): Cannot get key (H769) from Data Store Entry domain_ivan/StreamsInstance/0/5 (Caused by SPL::DataStoreException (Cannot create connection (Caused by SPL::DataStoreException (getConnection() failed: server (d0521b09:6379) replied error: Connection refused) at 'redisContext* SPL::RedisServer::getConnection()' [./src/SPL/Runtime/Operator/State/Adapters/RedisAdapter/RedisServer.cpp:153])) at 'SPL::RedisConnection::RedisConnection(SPL::RedisServer*)' [./src/SPL/Runtime/Operator/State/Adapters/RedisAdapter/RedisConnection.cpp:49])
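The "is the application still producing output" check, done above by comparing the size of multipleSources.dat in two `ls` listings, can also be scripted. The following is a small Python sketch under that same assumption (a growing output file means the job is still working). The function name and its `sleep` hook are hypothetical and only there to make the helper easy to exercise.

```python
import os
import time


def file_grew(path, wait_seconds=5.0, sleep=time.sleep):
    """Compare a file's size before and after a pause.

    Returns (grew, size_before, size_after).
    """
    before = os.path.getsize(path)
    sleep(wait_seconds)  # give the running job time to write more data
    after = os.path.getsize(path)
    return after > before, before, after
```

Running `file_grew("./data/multipleSources.dat", 60)` after killing one Redis server should report growth as long as the job keeps writing.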
