This blog describes how to set up HBase Read HA in IBM Open Platform 4.1.

Overview: HBase Read HA
HBase Read High Availability (Read HA) maintains multiple copies of the data in region replicas: one primary and one or more secondaries. With this feature enabled, data hosted on a failed RegionServer can still be read from another RegionServer, which gives high read availability. The feature is most useful for applications that are read-intensive, require constant read uptime, and can tolerate flexible consistency. HBase Read HA helps provide 99.99% availability.

Since data is replicated in a primary region along with one or more secondary regions, read requests can be served by any of these regions. Note that data can only be written to the primary region, not to the secondaries. Every replica receives updates in the same order.

Pros
Depending on their requirements, users can specify whether stale data is acceptable or strict consistency is needed, by passing the flag Consistency.TIMELINE or Consistency.STRONG respectively with the read request.
No concurrency control across replicas is needed for writes, since writes go to the primary region only. Clients can read data from any replica. With at least one secondary replica and proper rack configuration, data remains available for reads even during failures.

Configuring HBase Read HA
To configure the Read HA feature, add a few server-side and client-side properties to the Advanced hbase-site.xml configuration (see the screenshots below).

Add/Update the following server-side properties in hbase-site.xml via Ambari. Ambari → HBase → Configs

[Table image: server-side HBase Read HA properties]
Note: The values assigned are sample values.
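Since the server-side property table is only available as an image, here is a minimal sketch of how such entries could look in hbase-site.xml. The property names come from the Apache HBase reference guide on timeline-consistent reads, not from the table in this blog, so which of them the table listed is an assumption. The values are samples only.

```xml
<!-- Sketch only: names from the Apache HBase reference guide, sample values -->
<property>
  <!-- Period (ms) for refreshing store files of secondary regions; 0 disables it -->
  <name>hbase.regionserver.storefile.refresh.period</name>
  <value>0</value>
</property>
<property>
  <!-- Enables async WAL replication from the primary to the secondary replicas -->
  <name>hbase.region.replica.replication.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Time (ms) to keep archived store files before the cleaner removes them -->
  <name>hbase.master.hfilecleaner.ttl</name>
  <value>3600000</value>
</property>
<property>
  <!-- Region replication count for the meta table -->
  <name>hbase.meta.replica.count</name>
  <value>3</value>
</property>
```

In Ambari these are entered as key/value pairs rather than raw XML; the fragment shows what ends up in the generated file.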

Add/Update the following client-side properties in hbase-site.xml via Ambari. Ambari → HBase → Configs

[Table image: client-side HBase Read HA properties]
Note: The values assigned are sample values.
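The client-side table is likewise an image, so the following is a sketch of plausible client-side entries. The property names are again taken from the Apache HBase reference guide and are an assumption about what the table listed. The values are samples only.

```xml
<!-- Sketch only: names from the Apache HBase reference guide, sample values -->
<property>
  <!-- Use a dedicated writer thread so interrupted replica calls do not close the connection -->
  <name>hbase.ipc.client.specificThreadForWriting</name>
  <value>true</value>
</property>
<property>
  <!-- Timeout (microseconds) before a TIMELINE get is also sent to the secondary replicas -->
  <name>hbase.client.primaryCallTimeout.get</name>
  <value>10000</value>
</property>
<property>
  <!-- Timeout (microseconds) before a TIMELINE multi-get is also sent to the secondary replicas -->
  <name>hbase.client.primaryCallTimeout.multiget</name>
  <value>10000</value>
</property>
<property>
  <!-- Let the client use meta table replicas for region lookups -->
  <name>hbase.meta.replicas.use</name>
  <value>true</value>
</property>
```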

Below is a screenshot of the HBase config page after the required server-side and client-side properties are added.

[Screenshot: HBase config page with the server-side and client-side properties added]

After updating the configuration, the next step is to restart the HBase service and create the read HA HBase tables.

Create the read HA HBase tables using HBase Shell as follows:
Syntax:
create 'tableOne', 'familyOne', {REGION_REPLICATION => 2}
create 'tableOne', 'familyOne', {REGION_REPLICATION => 3}

Example:
hbase(main):003:0> create 'tableOne', 'familyOne', {REGION_REPLICATION => 3}
0 row(s) in 2.2990 seconds
=> Hbase::Table - tableOne

Here REGION_REPLICATION is a keyword that sets the number of replicas requested for each region, and can have 1, 2 or 3 as the value. The default value is 1, meaning only the primary region exists. A value of 3 means one primary region plus two secondary replicas.

The screenshot below shows a table, "tableOne", with 3 replicas. The replica with ReplicaID 0 is the primary, and the other two are secondaries.

[Screenshot: tableOne with three region replicas]
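As a quick check after creating the table, the descriptor can be inspected and a test row written from the HBase shell. This is a sketch, not part of the original walkthrough: the column qualifier c1 and the value are hypothetical, and the row key '16' is chosen to match the queries in the next section.

```
# Show the table descriptor; REGION_REPLICATION should appear among the table attributes
describe 'tableOne'

# Write one test row (qualifier 'c1' and the value are made up for illustration)
put 'tableOne', '16', 'familyOne:c1', 'sample-value'

# Flush the memstore so the row is persisted to store files
flush 'tableOne'
```

Writes always go through the primary region, so there is no replica option on put.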

Querying Secondary Regions

To specify the desired data consistency for each query, use the HBase shell:
hbase(main):001:0> get 'tableOne', '16', {CONSISTENCY => "TIMELINE"}

[Screenshot: output of the get with TIMELINE consistency]

hbase(main):001:0> get 'tableOne', '16', {CONSISTENCY => "STRONG"}

[Screenshot: output of the get with STRONG consistency]

where '16' is the row key.

Users can also request a specific region replica:
hbase> get 'tableOne', '16', {REGION_REPLICA_ID => 2, CONSISTENCY => 'TIMELINE'}
[Screenshot: output of the get against region replica 2]
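The same consistency option can be passed to a scan. The example below is a small sketch that reuses the table name from above; the LIMIT of 5 is an arbitrary value added only to keep the output short.

```
# Scan that may be served by secondary replicas, so results can be slightly stale
scan 'tableOne', {CONSISTENCY => 'TIMELINE', LIMIT => 5}
```

If CONSISTENCY is omitted, reads default to STRONG and are served by the primary region only.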
