A few weeks ago, we announced a new initiative to release IBM® Open Platform with Apache Hadoop (BigInsights™) version 4.0, intended to help drive collaboration, innovation, and standardization across Hadoop and big data technologies, along with specific packages targeted at data scientists, analysts, and administrators who work with big data.
Today, we are absolutely thrilled to announce the general availability (GA) release of IBM Open Platform with Apache Hadoop (BigInsights) version 4.0, along with those packages.
IBM Open Platform with Apache Hadoop
IBM Open Platform with Apache Hadoop is composed of 100 percent open source components for use in big data analysis. The offering includes open source components such as Ambari, Hadoop, YARN, Spark, Hive, HBase, Knox, Avro, Flume, Pig, Slider, Sqoop, ZooKeeper, Oozie, Nagios, and more. IBM Open Platform incorporates the most recent releases of all of the components of Enterprise Hadoop, including Hadoop 2.6.
After installing IBM Open Platform, you can choose from the following additional packages provided by IBM to accelerate the conversion of all types of data into business insight and action:
- BigInsights Business Analyst contains Big SQL, IBM’s SQL engine, and BigSheets, IBM’s intuitive spreadsheet and visualization tool for finding data quickly and easily.
- BigInsights Data Scientist contains the analyst module capabilities as well as the following:
  - Web-based Text Analytics tooling that allows users to extract data from unstructured text without writing any code.
  - A new machine-learning engine (SystemML) that automatically tunes its performance based on the size of the data, plus over a dozen highly tuned algorithms such as Decision Trees, PageRank and Clustering.
  - Native support for open source R statistical computing, helping clients leverage their existing R algorithms and run them across data stored in a Hadoop cluster using Big R frames.
- BigInsights Enterprise Management introduces new management tools that help clients get results faster. Designed to help allocate resources and optimize workflows, these tools allow deployments to scale to large numbers of users and clusters and help satisfy high workload demand. They also provide multi-tenancy and multi-instance support in a cluster.
As always, we are happy to hear your feedback. Please send your comments and suggestions to the user group or through our community forums.