Exploring data modeling options

In this lab, you’ll explore a few options for modeling data exported from a relational DBMS into Big SQL tables managed by HBase. In the previous lab, you saw how you could map data that was originally modeled as a single relational DBMS table into a Big SQL table. You even saw how you could integrate some minor HBase design enhancements into your Big SQL table, such as grouping frequently queried columns together into a single column family and using short names for HBase columns and column families. However, for simplicity, you looked at your dimension table in isolation; you didn’t consider how this one table related to other tables in the original database.

Although many HBase applications involve managing data that’s outside the typical scope of a relational database, some organizations look to HBase as a potential storage mechanism for offloading seldom-queried relational data, including “cold” or “stale” data in a relational data warehouse. Such data is often spread across multiple tables in a normalized (or somewhat normalized) manner. Relationships among these tables are captured through primary key/foreign key constraints. Indeed, star schema or snowflake schema database designs are common in such scenarios. Furthermore, in some relational database designs, primary keys may not be present in all tables or may consist of multiple SQL columns. In addition, many relational tables are densely populated (i.e., they contain relatively few nulls). Finally, queries often involve multi-way joins.

Such characteristics can pose challenges when attempting to port the relational design to Big SQL tables managed by HBase. This lab introduces you to some of these challenges and explores options for addressing them. While a full discussion of Big SQL and HBase data modeling topics is beyond the scope of this introductory lab, you’ll have a chance to explore:

  • Many-to-one mapping of relational tables to an HBase table (de-normalization).
  • Many-to-one mapping of relational table columns to an HBase column.
  • Generation of unique row key values.

Sample data for this lab is based on sample warehouse data available as part of your BigInsights installation and publicly at https://hub.jazz.net/project/jayatheerthan/BigSQLSamples/overview#https://hub.jazz.net/git/jayatheerthan%252FBigSQLSamples/list/master/samples/data.

Allow 30 minutes to 1 hour to complete this lab. Prior to starting this lab, you need to know how to connect to your BigInsights Big SQL database and how to issue a Big SQL query using JSqsh or another supported query tool. Please post questions or comments about this lab or the technologies it describes to the forum on Hadoop Dev at https://developer.ibm.com/hadoop/.

4.1.          De-normalizing relational tables (mapping multiple tables to one HBase table)

Since multi-way join processing isn’t a strength of HBase, you may need to de-normalize a traditional relational database design to implement an efficient HBase design for your Big SQL tables. Of course, Big SQL doesn’t mandate the use of HBase — you can use Hive or simple DFS files — but let’s assume you concluded that you wanted to use HBase as the underlying storage manager.

In this exercise, you will explore one approach for mapping two tables along the PRODUCT dimension of the relational data warehouse into a single Big SQL table. Since the content for the source tables is stored in two separate files that each contain different sets of fields, you won’t be able to directly load these files into your target table. Instead, you’ll need to pre-process or transform the data before using it to populate your Big SQL table. You’ll use Big SQL to help you with that task.

Specifically, you’ll upload the source files into your DFS using standard Hadoop file system commands. Next, you’ll create Big SQL externally managed tables over these files. Doing so simply layers a SQL schema over these files; it does not cause the data to be duplicated or copied into the Hive warehouse. Finally, you’ll select the data you want from these external tables and create a Big SQL HBase table based on that result set.

__1.     If necessary, open a terminal window.

__2.     Check the directory permissions for your DFS.

hdfs dfs -ls /

image55

If the /user directory cannot be written by the public (as shown in the example above), you will need to change these permissions so that you can create the necessary subdirectories for this lab using your standard lab user account.

From the command line, issue this command to switch to the root user ID temporarily:

su root

When prompted, enter the password for this account. Then switch to the hdfs account.

su hdfs

While logged in as user hdfs, issue this command:

hdfs dfs -chmod 777 /user

Next, confirm the effect of your change:

hdfs dfs -ls /

image56

Exit the hdfs user account to return to the root account:

exit

Finally, exit the root user account and return to your standard user account:

exit

__3.     Create directories in your distributed file system for the source data files and ensure public read/write access to these directories.  (If desired, alter the DFS information as appropriate for your environment.)

hdfs dfs -mkdir /user/biadmin
hdfs dfs -mkdir /user/biadmin/hbase_lab
hdfs dfs -mkdir /user/biadmin/hbase_lab/sls_product_dim
hdfs dfs -mkdir /user/biadmin/hbase_lab/sls_product_line_lookup
hdfs dfs -chmod -R 777 /user/biadmin

__4.          Upload the source data files into their respective DFS directories. Change the local and DFS directory information below to match your environment.

hdfs dfs -copyFromLocal /your-dir/data/GOSALESDW.SLS_PRODUCT_DIM.txt /user/biadmin/hbase_lab/sls_product_dim/SLS_PRODUCT_DIM.txt
hdfs dfs -copyFromLocal /your-dir/data/GOSALESDW.SLS_PRODUCT_LINE_LOOKUP.txt /user/biadmin/hbase_lab/sls_product_line_lookup/SLS_PRODUCT_LINE_LOOKUP.txt

__5.          List the contents of the DFS directories into which you copied the files to validate your work.

hdfs dfs -ls /user/biadmin/hbase_lab/sls_product_dim

image57

hdfs dfs -ls /user/biadmin/hbase_lab/sls_product_line_lookup

image58

__6.          If necessary, launch your Big SQL query execution environment.

__7.          Create external Big SQL tables for the sales product dimension (extern.sls_product_dim) and the sales product line lookup (extern.sls_product_line_lookup) tables. Note that the LOCATION clause in each statement references the DFS directory into which you copied the sample data.

-- product dimension table
CREATE EXTERNAL HADOOP TABLE IF NOT EXISTS extern.sls_product_dim
( product_key INT NOT NULL
, product_line_code INT NOT NULL
, product_type_key INT NOT NULL
, product_type_code INT NOT NULL
, product_number INT NOT NULL
, base_product_key INT NOT NULL
, base_product_number INT NOT NULL
, product_color_code INT
, product_size_code INT
, product_brand_key INT NOT NULL
, product_brand_code INT NOT NULL
, product_image VARCHAR(60)
, introduction_date TIMESTAMP
, discontinued_date TIMESTAMP
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
location '/user/biadmin/hbase_lab/sls_product_dim';

-- look up table with product line info in various languages
CREATE EXTERNAL HADOOP TABLE IF NOT EXISTS extern.sls_product_line_lookup
( product_line_code INT NOT NULL
, product_line_en VARCHAR(90) NOT NULL
, product_line_de VARCHAR(90), product_line_fr VARCHAR(90)
, product_line_ja VARCHAR(90), product_line_cs VARCHAR(90)
, product_line_da VARCHAR(90), product_line_el VARCHAR(90)
, product_line_es VARCHAR(90), product_line_fi VARCHAR(90)
, product_line_hu VARCHAR(90), product_line_id VARCHAR(90)
, product_line_it VARCHAR(90), product_line_ko VARCHAR(90)
, product_line_ms VARCHAR(90), product_line_nl VARCHAR(90)
, product_line_no VARCHAR(90), product_line_pl VARCHAR(90)
, product_line_pt VARCHAR(90), product_line_ru VARCHAR(90)
, product_line_sc VARCHAR(90), product_line_sv VARCHAR(90)
, product_line_tc VARCHAR(90), product_line_th VARCHAR(90)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
location '/user/biadmin/hbase_lab/sls_product_line_lookup';

 

If you encounter a SQL -5105 error message such as the one shown below, the DFS directory permissions for your target directory (e.g., /user/biadmin) may be too restrictive.

image59

From an OS terminal window, issue this command:

hdfs dfs -ls /user/biadmin/

Your permissions must include rw settings. Consult the earlier steps in this lab for instructions on how to reset DFS directory permissions.

 

__8.          Verify that you can query each table.

-- total rows in EXTERN.SLS_PRODUCT_DIM = 274
 select count(*) from EXTERN.SLS_PRODUCT_DIM;

 

-- total rows in EXTERN.SLS_PRODUCT_LINE_LOOKUP = 5
 select count(*) from EXTERN.SLS_PRODUCT_LINE_LOOKUP;

__9.          Become familiar with the content of the extern.sls_product_dim table.

select * from extern.sls_product_dim fetch first 3 rows only;

image60

Note that most of the columns contain only numeric codes; these columns typically serve as join keys for other tables in the PRODUCT dimension that contain more descriptive information.
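For example, a quick way to see this is to group on one of these code columns. The sample query below (assuming the external table you just created) shows that PRODUCT_LINE_CODE contains only a handful of distinct values; the descriptive names live in separate lookup tables, such as the one you’ll examine next.

-- count products per product line code (codes act as join keys, not descriptions)
select product_line_code, count(*) as num_products
from extern.sls_product_dim
group by product_line_code
order by product_line_code;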

__10.       Become familiar with the contents of the extern.sls_product_line_lookup table by selecting data in a few of its columns:

select product_line_code, product_line_en, product_line_de, product_line_fr from extern.sls_product_line_lookup;

image61

Note that the PRODUCT_LINE_CODE column in this table is likely to be joined with the PRODUCT_LINE_CODE column of the sales product dimension table you created earlier. This will make it easy for you to de-normalize (or flatten) these tables into a single HBase table that is joined on the product line code and keyed by the unique product key.

__11.       Execute a SELECT statement to join the two tables. In a moment, you’ll use this statement as a basis for creating a new HBase table modeled after the result set of this query. (For simplicity, you will only retrieve a subset of columns in each.)

select product_key, d.product_line_code, product_type_key,
product_type_code, product_line_en, product_line_de
from extern.sls_product_dim d, extern.sls_product_line_lookup l
where d.product_line_code = l.product_line_code
fetch first 3 rows only;

image62

 

__12.       Verify that the following join of these tables will produce a result set with 274 rows, the same number as the extern.sls_product_dim table. After all, you simply want to add more descriptive information (extracted from the extern.sls_product_line_lookup table) to its contents in your new HBase table.

-- this query should return 274
select count(*)
from extern.sls_product_dim d, extern.sls_product_line_lookup l
where d.product_line_code = l.product_line_code;

__13.       Now you’re ready to create a Big SQL HBase table derived from your join query. In effect, you’re using Big SQL to transform source data in 2 files into a format suitable for a single HBase table. Issue this statement:

-- flatten 2 product tables into 1 hbase table
CREATE hbase TABLE IF NOT EXISTS bigsqllab.sls_product_flat
( product_key INT NOT NULL
, product_line_code INT NOT NULL
, product_type_key INT NOT NULL
, product_type_code INT NOT NULL
, product_line_en VARCHAR(90)
, product_line_de VARCHAR(90)
)
column mapping
(
key   mapped by (product_key),
data:c2 mapped by (product_line_code),
data:c3 mapped by (product_type_key),
data:c4 mapped by (product_type_code),
data:c5 mapped by (product_line_en),
data:c6 mapped by (product_line_de)
)
as select product_key, d.product_line_code, product_type_key,
product_type_code, product_line_en, product_line_de
from extern.sls_product_dim d, extern.sls_product_line_lookup l
where d.product_line_code = l.product_line_code;

__14.       Verify that there are 274 rows in the table you just created.

select count(*) from bigsqllab.sls_product_flat;

__15.       Query the bigsqllab.sls_product_flat table.

select product_key, product_line_code, product_line_en
from bigsqllab.sls_product_flat
where product_key > 30270;

image63
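Because the English and German product line descriptions are now stored alongside the dimension data, you can filter on them directly, with no join required. Here is a sample query; the predicate value is illustrative, so substitute any value returned by your earlier query against extern.sls_product_line_lookup.

select product_key, product_line_code, product_line_en
from bigsqllab.sls_product_flat
where product_line_en like '%Equipment%'
fetch first 5 rows only;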

 

4.2.          Generating unique row key values

In this exercise, you’ll consider a situation in which the original relational table didn’t have a single column serving as its primary key. Thus, a one-to-one mapping of this table to a Big SQL table won’t be appropriate unless you take some additional action, such as using the FORCE KEY UNIQUE clause of the Big SQL CREATE HBASE TABLE statement. You’ll explore that approach here.

__1.     Become familiar with the schema for the SLS_SALES_FACT table exported from a relational database. Inspect the details below, and note that the primary key for this table is comprised of several columns.

Table: SLS_SALES_FACT

Columns: ORDER_DAY_KEY, ORGANIZATION_KEY, EMPLOYEE_KEY, RETAILER_KEY, RETAILER_SITE_KEY, PRODUCT_KEY, PROMOTION_KEY, ORDER_METHOD_KEY, SALES_ORDER_KEY, SHIP_DAY_KEY, CLOSE_DAY_KEY, QUANTITY, UNIT_COST, UNIT_PRICE, UNIT_SALE_PRICE, GROSS_MARGIN, SALE_TOTAL, GROSS_PROFIT

Primary Key: ORDER_DAY_KEY, ORGANIZATION_KEY, EMPLOYEE_KEY, RETAILER_KEY, RETAILER_SITE_KEY, PRODUCT_KEY, PROMOTION_KEY, ORDER_METHOD_KEY

Foreign Key: PRODUCT_KEY (parent table: SLS_PRODUCT_DIM)

__2.          If necessary, open a terminal window.

__3.          Change directories to the location of the sample data in your local file system. Alter the directory specification shown below to match your environment.

cd /usr/ibmpacks/bigsql/4.0/bigsql/samples/data

__4.          Count the number of lines (records) in the GOSALESDW.SLS_SALES_FACT.txt file, verifying that 446023 records are present:

wc -l GOSALESDW.SLS_SALES_FACT.txt

__5.     If necessary, launch your Big SQL query execution environment.

__6.     From your query execution environment, create a Big SQL HBase table named sls_sales_fact_unique. Include a FORCE KEY UNIQUE clause with the row key specification to instruct Big SQL to append additional data to the ORDER_DAY_KEY values to ensure that each input record results in a unique value (and therefore a new row in the HBase table). This additional data won’t be visible to users who query the table.

CREATE HBASE TABLE IF NOT EXISTS BIGSQLLAB.SLS_SALES_FACT_UNIQUE
(
ORDER_DAY_KEY     int,
ORGANIZATION_KEY  int,
EMPLOYEE_KEY      int,
RETAILER_KEY      int,
RETAILER_SITE_KEY int,
PRODUCT_KEY       int,
PROMOTION_KEY     int,
ORDER_METHOD_KEY  int,
SALES_ORDER_KEY   int,
SHIP_DAY_KEY      int,
CLOSE_DAY_KEY     int,
QUANTITY          int,
UNIT_COST         decimal(19,2),
UNIT_PRICE        decimal(19,2),
UNIT_SALE_PRICE   decimal(19,2),
GROSS_MARGIN      double,
SALE_TOTAL        decimal(19,2),
GROSS_PROFIT      decimal(19,2)
)
COLUMN MAPPING
(
key         mapped by (ORDER_DAY_KEY) force key unique,
cf_data:cq_ORGANIZATION_KEY   mapped by (ORGANIZATION_KEY),
cf_data:cq_EMPLOYEE_KEY       mapped by (EMPLOYEE_KEY),
cf_data:cq_RETAILER_KEY       mapped by (RETAILER_KEY),
cf_data:cq_RETAILER_SITE_KEY  mapped by (RETAILER_SITE_KEY),
cf_data:cq_PRODUCT_KEY        mapped by (PRODUCT_KEY),
cf_data:cq_PROMOTION_KEY      mapped by (PROMOTION_KEY),
cf_data:cq_ORDER_METHOD_KEY   mapped by (ORDER_METHOD_KEY),
cf_data:cq_SALES_ORDER_KEY    mapped by (SALES_ORDER_KEY),
cf_data:cq_SHIP_DAY_KEY       mapped by (SHIP_DAY_KEY),
cf_data:cq_CLOSE_DAY_KEY      mapped by (CLOSE_DAY_KEY),
cf_data:cq_QUANTITY           mapped by (QUANTITY),
cf_data:cq_UNIT_COST          mapped by (UNIT_COST),
cf_data:cq_UNIT_PRICE         mapped by (UNIT_PRICE),
cf_data:cq_UNIT_SALE_PRICE    mapped by (UNIT_SALE_PRICE),
cf_data:cq_GROSS_MARGIN       mapped by (GROSS_MARGIN),
cf_data:cq_SALE_TOTAL         mapped by (SALE_TOTAL),
cf_data:cq_GROSS_PROFIT       mapped by (GROSS_PROFIT)
);

__7.     Load data into this table. Adjust the user ID, password, and directory information as needed for your environment.

LOAD HADOOP using file url
'sftp://yourID:yourPassword@rvm.svl.ibm.com:22/your-dir/GOSALESDW.SLS_SALES_FACT.txt' with SOURCE PROPERTIES ('field.delimiter'='\t')
INTO TABLE bigsqllab.sls_sales_fact_unique;

image64

__8.      Verify that the table contains the expected number of rows (446023).

select count(*) from bigsqllab.sls_sales_fact_unique;

__9.          Query the table.

select order_day_key, product_key, sale_total from bigsqllab.sls_sales_fact_unique
where order_day_key BETWEEN 20040112 and 20040115
fetch first 5 rows only;

 

image65

 

__10.       Optionally, investigate what happens if you omit the FORCE KEY UNIQUE clause when creating this table.

Create a table named sls_sales_fact_nopk that omits the FORCE KEY UNIQUE clause from the row key definition.

CREATE HBASE TABLE IF NOT EXISTS BIGSQLLAB.SLS_SALES_FACT_NOPK
(
ORDER_DAY_KEY     int,
ORGANIZATION_KEY  int,
EMPLOYEE_KEY      int,
RETAILER_KEY      int,
RETAILER_SITE_KEY int,
PRODUCT_KEY       int,
PROMOTION_KEY     int,
ORDER_METHOD_KEY  int,
SALES_ORDER_KEY   int,
SHIP_DAY_KEY      int,
CLOSE_DAY_KEY     int,
QUANTITY          int,
UNIT_COST         decimal(19,2),
UNIT_PRICE        decimal(19,2),
UNIT_SALE_PRICE   decimal(19,2),
GROSS_MARGIN      double,
SALE_TOTAL        decimal(19,2),
GROSS_PROFIT      decimal(19,2)
)
COLUMN MAPPING
(
key         mapped by (ORDER_DAY_KEY),
cf_data:cq_ORGANIZATION_KEY   mapped by (ORGANIZATION_KEY),
cf_data:cq_EMPLOYEE_KEY       mapped by (EMPLOYEE_KEY),
cf_data:cq_RETAILER_KEY       mapped by (RETAILER_KEY),
cf_data:cq_RETAILER_SITE_KEY  mapped by (RETAILER_SITE_KEY),
cf_data:cq_PRODUCT_KEY        mapped by (PRODUCT_KEY),
cf_data:cq_PROMOTION_KEY      mapped by (PROMOTION_KEY),
cf_data:cq_ORDER_METHOD_KEY   mapped by (ORDER_METHOD_KEY),
cf_data:cq_SALES_ORDER_KEY    mapped by (SALES_ORDER_KEY),
cf_data:cq_SHIP_DAY_KEY       mapped by (SHIP_DAY_KEY),
cf_data:cq_CLOSE_DAY_KEY      mapped by (CLOSE_DAY_KEY),
cf_data:cq_QUANTITY           mapped by (QUANTITY),
cf_data:cq_UNIT_COST          mapped by (UNIT_COST),
cf_data:cq_UNIT_PRICE         mapped by (UNIT_PRICE),
cf_data:cq_UNIT_SALE_PRICE    mapped by (UNIT_SALE_PRICE),
cf_data:cq_GROSS_MARGIN       mapped by (GROSS_MARGIN),
cf_data:cq_SALE_TOTAL         mapped by (SALE_TOTAL),
cf_data:cq_GROSS_PROFIT       mapped by (GROSS_PROFIT)
);

__11.     Load data into this table from your source file. Adjust the file URL specification as needed to match your environment.

LOAD HADOOP using file url
'sftp://yourID:yourPassword@rvm.svl.ibm.com:22/your-dir/GOSALESDW.SLS_SALES_FACT.txt' with SOURCE PROPERTIES ('field.delimiter'='\t')
INTO TABLE bigsqllab.sls_sales_fact_nopk;

__12.     Count the number of rows in your table. Note that there are only 440 rows.

select count(*) from bigsqllab.sls_sales_fact_nopk;

__13.       Consider what just occurred. You loaded a file with 446023 records into your Big SQL HBase table without error, yet only 440 rows are present in your table. That’s because HBase ensures that each row key is unique. If you put 5 different records with the same row key into a native HBase table, your HBase table will contain only 1 current row for that row key. Because you mapped a SQL column with non-unique values to the HBase row key, HBase essentially updated the information for those rows containing duplicate row key values.
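If you’d like to confirm this, count the distinct ORDER_DAY_KEY values in the table you loaded earlier with FORCE KEY UNIQUE; the result should match the 440 rows that remained in the sls_sales_fact_nopk table.

-- should return 440: one row per distinct ORDER_DAY_KEY survived in the _nopk table
select count(distinct order_day_key) from bigsqllab.sls_sales_fact_unique;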

4.3.          Mapping multiple SQL columns to one HBase row key or column

Until now, you’ve mapped each field from a source file (i.e., each SQL column in the source relational table) to a single Big SQL HBase column. Although straightforward to implement, this one-to-one mapping approach has a significant drawback: it can consume considerable disk space. Why? As you learned earlier, HBase stores full key information (row key, column family name, column name, and timestamp) along with each cell value, so a table with many narrow columns repeats that key information many times for every logical row.

In this exercise, you’ll explore two different many-to-one column mapping options. In particular, you’ll define a composite key for the HBase row key; in other words, your row key will be based on multiple SQL columns. In addition, you’ll define dense columns in your HBase table; in other words, one HBase column will be based on multiple SQL columns.

__1.     Consider the relational schema for the SLS_SALES_FACT table shown at the beginning of the previous exercise. Recall that its primary key spanned several SQL columns, which you’ll model as a composite row key in your Big SQL HBase table. In addition, let’s assume that some SQL columns are commonly queried together, such as columns related to pricing and cost. Packing these SQL columns into a single, dense HBase column can reduce the I/O required to read and write this data.

__2.     If necessary, launch your Big SQL query execution environment.

__3.     Create a new Big SQL sales fact table named sls_sales_fact_dense with a composite key and dense columns.

CREATE HBASE TABLE IF NOT EXISTS BIGSQLLAB.SLS_SALES_FACT_DENSE
(
ORDER_DAY_KEY     int,
ORGANIZATION_KEY  int,
EMPLOYEE_KEY      int,
RETAILER_KEY      int,
RETAILER_SITE_KEY int,
PRODUCT_KEY       int,
PROMOTION_KEY     int,
ORDER_METHOD_KEY  int,
SALES_ORDER_KEY   int,
SHIP_DAY_KEY      int,
CLOSE_DAY_KEY     int,
QUANTITY          int,
UNIT_COST         decimal(19,2),
UNIT_PRICE        decimal(19,2),
UNIT_SALE_PRICE   decimal(19,2),
GROSS_MARGIN      double,
SALE_TOTAL        decimal(19,2),
GROSS_PROFIT      decimal(19,2)
)
COLUMN MAPPING
(
key               mapped by
(ORDER_DAY_KEY, ORGANIZATION_KEY, EMPLOYEE_KEY, RETAILER_KEY, RETAILER_SITE_KEY, PRODUCT_KEY, PROMOTION_KEY, ORDER_METHOD_KEY),

cf_data:cq_OTHER_KEYS   mapped by
(SALES_ORDER_KEY, SHIP_DAY_KEY, CLOSE_DAY_KEY),

cf_data:cq_QUANTITY           mapped by (QUANTITY),

cf_data:cq_MONEY        mapped by
(UNIT_COST, UNIT_PRICE, UNIT_SALE_PRICE, GROSS_MARGIN, SALE_TOTAL, GROSS_PROFIT)
);

__4.     Load data into this table. (Adjust the SFTP specification below to match your environment.)

LOAD HADOOP using file url
'sftp://yourID:yourPassword@svl.ibm.com:22/opt/ibm/biginsights/bigsql/samples/data/GOSALESDW.SLS_SALES_FACT.txt' with SOURCE PROPERTIES ('field.delimiter'='\t')
INTO TABLE bigsqllab.sls_sales_fact_dense;

__5.     Verify that the table contains the expected number of rows (446023).

select count(*) from bigsqllab.sls_sales_fact_dense;

__6.          Query the table.

select order_day_key, product_key, sale_total from bigsqllab.sls_sales_fact_dense
where order_day_key BETWEEN 20040112 and 20040115
fetch first 5 rows only;

image66
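Note that even though the monetary columns are packed into a single dense HBase column (cf_data:cq_MONEY), you still reference them as ordinary SQL columns; Big SQL handles the packing and unpacking. As an illustration, this sample query retrieves several values stored in that one dense column:

select order_day_key, quantity, unit_cost, unit_price, sale_total
from bigsqllab.sls_sales_fact_dense
where order_day_key between 20040112 and 20040115
fetch first 5 rows only;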

__7.     If you’re curious about how the storage consumption of the “dense” model of your fact table compares with the original model, which mapped each SQL column to its own HBase row key or column, open a terminal window and execute this command:

hdfs dfs -du /apps/hbase/data/data/default | grep -i bigsqllab | sort

Compare the size of the sls_sales_fact_dense table (shown in the first line of the sample output) with the size of the bigsqllab.sls_sales_fact_unique table (shown in the third line of the sample output). The “dense” table consumes much less space than the original table.

image67

4.4.          Optional: Dropping tables created in this lab

If you don’t plan to complete any subsequent labs, you may want to clean up your environment. This optional exercise contains instructions for doing so.

__1.     Drop the tables you created in this lab.

drop table extern.sls_product_dim;
drop table extern.sls_product_line_lookup;
drop table bigsqllab.sls_product_flat;
drop table bigsqllab.sls_sales_fact_unique;
drop table bigsqllab.sls_sales_fact_nopk;
drop table bigsqllab.sls_sales_fact_dense;

__2.     Optionally, verify that these tables no longer exist. For example, query each table and confirm that you receive an error message indicating that the table name you provided is undefined (SQLCODE -204, SQLSTATE 42704).
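For example, after the drops above, a query such as the following should fail with that error:

-- expected to fail with SQLCODE -204 (name is undefined)
select count(*) from bigsqllab.sls_sales_fact_dense;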

 

Summary

In this lab, you gained hands-on experience using HBase natively as well as using Big SQL with HBase. You learned how to create tables, populate them, and retrieve data from HBase using the HBase shell. In addition, you saw how Big SQL can store its data in HBase tables, thereby affording programmers sophisticated SQL query capabilities. You also explored some data modeling options available through Big SQL.

To expand your skills even further, visit the Hadoop Dev web site (https://developer.ibm.com/hadoop/) for links to free online courses, tutorials, and more.
