Announcing the immediate release of IBM Db2 Big SQL v5.0.2 – maintenance release.

IBM Db2 Big SQL, an advanced SQL engine on Hadoop, has been making strides with the fast-evolving open source ecosystem. The core capabilities of Big SQL focus on federation, SQL compatibility, scalability, performance, and of course enterprise security, making it a desirable query engine for seeking insights from disparate data sources, including Hadoop.

Big SQL v5.0.2 is now being released with new capabilities to provide enterprise readiness and ease of use.

Enterprise Readiness 

The key focus of Big SQL v5.0.2 is to provide capabilities that enable customers to be enterprise ready. Being enterprise ready means being secure, scalable, stable, and easy to run in production. Some of the new capabilities added in 5.0.2 are:

  • Introducing Materialized Query tables (MQTs) for improved query optimization

An MQT is a special type of table that can be created and used to enhance query performance. This is a well-established capability in the RDBMS world. With Big SQL, we bring this advanced capability to Big Data by creating native user-maintained Hadoop materialized query tables (MQTs) over Hadoop tables. Because MQTs are created across worker nodes, just like other Hadoop tables, data does not have to be shipped from the head node to worker nodes during query evaluation. Using Hadoop MQTs can significantly improve performance, especially for complex queries.
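To make the idea of a user-maintained materialized table concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in and does not use Big SQL syntax. The table names `sales` and `sales_summary` and the helper `refresh_sales_summary` are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the SQL engine; this is NOT Big SQL syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 15.0), ("west", 7.5)],
)


def refresh_sales_summary(conn):
    """Rebuild the hypothetical summary table from the base table.

    "User-maintained" means this refresh is the user's responsibility:
    the engine does not keep the summary in sync automatically.
    """
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT region, SUM(amount) AS total, COUNT(*) AS n "
        "FROM sales GROUP BY region"
    )


refresh_sales_summary(conn)

# Read the small precomputed table instead of re-aggregating the base table.
summary = {
    region: total
    for region, total, _ in conn.execute(
        "SELECT region, total, n FROM sales_summary"
    )
}
print(summary)
```

In this sketch the query names the summary table explicitly. A query optimizer that is aware of materialized query tables can instead decide on its own to answer a matching query from the precomputed table, which is where the performance benefit for complex queries comes from.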

  • Big SQL Ranger plugin is now GA

The Big SQL Ranger plugin provides a centralized framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. Previously available as a Technical Preview, the Big SQL Ranger plugin is now GA and provides enhanced security administration in the Big Data ecosystem.

  • Introducing edge node deployment (in the same cluster)

Big SQL 5.0.2 can be installed on dedicated edge nodes (non-datanodes) in the same cluster. This helps separate compute and storage, and suits customers who run compute-intensive workloads. Customers can now pick the best deployment type based on the cluster’s resources, network speed, and the enabled Big SQL features.

  • Enhanced SQL capabilities

The SQL capabilities in Big SQL are constantly updated to provide enhanced SQL compatibility and tolerance. Some notable enhancements made in this release are:

  • Added support for native Boolean data type, which provides improved SQL compatibility with Hive, Netezza, and Oracle, and improved compatibility with the IBM Common SQL Engine.
  • Big SQL now provides data type toleration for the Oracle proprietary data types VARCHAR2 and NUMBER in DDL and PL/SQL when the SQL_COMPAT global variable is set to ‘ORA’. 

 

Ease of Use

The other focus of this release is ease of use, which directly benefits customers’ day-to-day activities. Ease of use means capabilities that are efficient, effective, engaging, error tolerant, and easy to learn. Some capabilities added or enhanced to improve ease of use are:

  • Java reader improvements

The Java reader provides better out-of-the-box performance, improved resource utilization, improved serviceability, and extensibility. As a result of these improvements, and for optimal performance and functionality, the ORC file format has been added to the list of recommended file formats for Big SQL.

  • Improved log usability
    • Big SQL integration with Ambari Log Search enables quick access to logs and log searches from a single location. Cluster administrators can analyze and monitor the main Big SQL logs from a central page.
    • Big SQL’s log4j properties files and log directory are now exposed through Ambari configurations, so logs can be written to non-default paths and captured at different log levels by changing the Big SQL service configuration in Ambari.
  • Improved YARN/Slider usability
    • Big SQL’s Slider application now supports YARN node labels. This provides a way to schedule workloads on specified nodes and achieve resource isolation among workloads and organizations in the cluster.
    • One container per host can now be configured, so you can allocate one large container for Big SQL’s enhanced execution.
    • Resource allocation can be modified to enable Big SQL to handle changing workloads efficiently.
  • Installation and upgrade improvements

The following improvements were made to the Big SQL upgrade process and to the migration from IBM Open Platform (IOP) to the Hortonworks Data Platform (HDP): 

  • An upgrade pre-checker runs as part of an upgrade and can also be run independently of an upgrade.
  • Several steps that you needed to perform manually when upgrading Big SQL are now fully automated.
  • Upgrades of HDP are now independent of Big SQL upgrades, so you can upgrade HDP without upgrading Big SQL.
  • Integration with data analysis and visualization tools

Data-driven and interactive data analytics can be done using web-based notebooks. With Big SQL’s integration, data from disparate sources and ML models can be used in queries. Big SQL is now integrated with:

  • Apache Zeppelin
  • IBM Data Science Experience Local

 

Technical documentation can be found in IBM Knowledge Center.

Big SQL 5.0.2 is available for download from the Passport Advantage and Passport Advantage Express websites.

More information on Hortonworks Data Platform (HDP) 2.6.3: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/index.html

 
