The IBM BigInsights team recently announced the successful release of IOP 4.1 (IBM Open Platform) and BigInsights 4.1 on Intel and Power platforms simultaneously. This is a good time to reflect on what the team has accomplished and how IBM is taking the Hadoop ecosystem forward along with the Big Data community. According to many analyst reports, IBM now offers one of the top three distributions of the Hadoop ecosystem, in the form of IOP.
I can classify IBM's contributions over the last 15 months or so into the following categories:
• Contributions to strengthen Hadoop's enterprise security readiness, an area where IBM knows how large customers deploy and manage large systems.
• Improvements to Ambari that make it easier to manage HDFS (AMBARI-12349) and other distributed services (AMBARI-10333). IBM has also made Ambari more distribution agnostic, to help all ODP distributions. More contributions from this work are in progress.
• Adding backup and restore for HBase tables (HBASE-10900, HBASE-11085 and more), which makes HBase data portable through full table backup or incremental backup. Adding bulk load (HBASE-10902) and replication (HBASE-9047) between clusters improved performance and availability.
• Improvements to the Hive Web Interface (HIVE-5132) and to Hive usability (HIVE-2957).
• Getting the Hadoop ecosystem onto the cloud, where the security environment (KNOX-565) is quite different from on-premises deployments. These changes help authenticate user interfaces via Knox.
• Broadening Hadoop availability to new platforms. Getting the entire Hadoop ecosystem running on the IBM JDK and on Power and zLinux platforms has not been trivial.
• Contributing to the setup of ODPi infrastructure using IBM SoftLayer clusters, and successfully building and testing a reference implementation to help standardize Hadoop distributions.
To quantify the JIRA enhancements and defect fixes: in this period the IBM team has contributed 81 patches to HBase, 21 patches to Ambari and Hadoop, and 20 patches to Hive and Pig, along with contributions to Sentry, Flume, Knox, ZooKeeper, Parquet, Oozie, Avro and more.
I would like to thank the many IBM committers and contributors for their work. While several dozen developers and QA engineers contributed, I would like to specially recognize the IBM committers for their work in the community: Tanping Wang, Kan Zhang, Jing Chen He, Tuong Truong, Eric Yang, Yan Zhou, Richard Ding, Shai Erera and several other past IBM committers.
We are looking to further increase IBM's contributions, to strengthen the Hadoop ecosystem and enable it to make further inroads into large enterprises.