
Containing an Elephant: How We Took Hadoop/HBase into Kubernetes and Public Cloud

Salesforce runs a large footprint of HBase and HDFS clusters in its data centers, spanning thousands of machines, storing multiple petabytes of data, and serving billions of queries per day.

After more than a decade of running its own data centers, Salesforce has started moving into the public cloud. As part of this foray, they made the bold decision to move their HBase clusters from staid bare-metal hosts to the dynamic and immutable world of containers and Kubernetes.

Along the way, they had to overcome HBase issues caused by the changing IP addresses of containers, ensure service availability in the face of zone-level failures in the public cloud, work around the limitations of Kubernetes for stateful applications, and introduce encryption to secure big data traffic without changing the application.

This presentation describes how these issues were overcome and the benefits of this shift.