IBM and Hortonworks previously announced a partnership to certify Hortonworks Data Platform (HDP) with IBM Spectrum Scale. IBM Spectrum Scale will be certified with HDP running on IBM Power systems by the end of June, and on x86 systems by the end of July.
HDP is the leading Hadoop and Spark distribution that supports enterprises on their analytics journey. This analytics journey typically starts with consolidation of different data silos to form an Active Archive. The Active Archive is then used to get a single view of the customer and derive different business insights. With IBM Spectrum Scale, customers can build scalable and global data lakes for their Active Archives.
Here are the top five benefits of IBM Spectrum Scale with HDP:
1.) Extreme scalability with parallel file system architecture
IBM Spectrum Scale uses a parallel file system architecture, in contrast to alternative solutions that employ a scale-out architecture. With a parallel architecture, there is no single metadata node that can become a bottleneck. Every node in the cluster serves both data and metadata, allowing a single IBM Spectrum Scale filesystem to store billions of files. This enables clients to grow their HDP environments seamlessly as their data grows.
2.) Global namespace that can span geographies
Using the IBM Spectrum Scale global namespace, clients can create active, remote data copies and enable real-time, global collaboration. This allows multinational corporations to form ‘data lakes’ across the globe, and host their distributed data under one namespace.
3.) Reduce datacenter footprint with the industry’s best in-place analytics
IBM Spectrum Scale has the most comprehensive support for data access protocols. It supports data access using NFS, SMB, Object, POSIX and the HDFS API. This eliminates the need to maintain separate copies of the same data for traditional applications and for purposes of analytics.
4.) True software-defined storage – purchase as software OR as a pre-integrated system
Clients can purchase IBM Spectrum Scale as software, install it directly on the commodity hardware running the HDP stack, and create a distributed filesystem in a shared-nothing storage mode. Clients can also purchase IBM Spectrum Scale as part of a pre-integrated system, the IBM Elastic Storage Server (ESS), and connect it as shared storage behind commodity hardware running HDP. The software-only option lets clients start small while still leveraging enterprise storage benefits. With the ESS, clients can control cluster sprawl and grow storage independently of compute infrastructure.
5.) IBM hardware advantage
IBM ESS includes a software RAID function that eliminates the need for the three-way replication that other solutions require for data protection. Instead, IBM ESS requires just 30% extra capacity to offer similar data protection benefits. IBM Power Systems along with the IBM ESS offer the most optimized hardware stack for running analytics workloads. Clients can achieve up to a 3x reduction in storage and compute infrastructure by moving to Power Systems and IBM ESS compared to commodity scale-out x86 systems.
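To make the capacity comparison concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a hypothetical 1,000 TB of usable data, treats three-way replication as two extra copies (200% overhead), and takes the 30% figure above at face value. It covers raw storage capacity only, so it is not a derivation of the combined storage and compute figure.

```python
def raw_capacity_tb(usable_tb: float, overhead: float) -> float:
    """Raw capacity needed to protect `usable_tb` of data.

    `overhead` is the extra capacity as a fraction of usable data,
    e.g. 0.30 for 30% extra, 2.0 for two additional full copies.
    """
    return usable_tb * (1.0 + overhead)


USABLE_TB = 1000.0            # hypothetical data set size, for illustration only
REPLICATION_OVERHEAD = 2.0    # three-way replication: one copy plus two extra
ESS_OVERHEAD = 0.30           # "just 30% extra capacity", as stated above

replicated_tb = raw_capacity_tb(USABLE_TB, REPLICATION_OVERHEAD)
ess_tb = raw_capacity_tb(USABLE_TB, ESS_OVERHEAD)
ratio = replicated_tb / ess_tb

print(f"Three-way replication: {replicated_tb:.0f} TB raw")
print(f"ESS software RAID:     {ess_tb:.0f} TB raw")
print(f"Raw capacity ratio:    {ratio:.2f}x")
```

Under these assumptions, the replicated layout needs roughly 2.3 times the raw capacity of the ESS layout for the same usable data. Actual sizing depends on the RAID code chosen, spare space and workload.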
To support organizations' security and regulatory compliance requirements, IBM Spectrum Scale offers filesystem encryption to secure data at rest, policy-based tiering, compression, replication, secure backup and secure erase. HDP’s Atlas and Ranger components provide additional data governance capabilities and the ability to define and manage security policies.
The forthcoming formal certification from Hortonworks for IBM Spectrum Scale to run with HDP will help our joint clients to deploy this solution with confidence.