Big SQL is the IBM SQL interface in the Apache Hadoop environment. Big SQL provides a way to efficiently query data that is stored on a BigInsights Hadoop cluster. We can use Big SQL to summarize, query, and analyze data.

Oozie is a workflow management application that coordinates dependencies between MapReduce jobs. Oozie can schedule a job to run once its required dependencies are met. Oozie workflow jobs are Directed Acyclic Graphs (DAGs) of actions.

To execute Big SQL actions using Oozie, follow the steps outlined in this article. The article demonstrates an Oozie app that runs Big SQL jobs on the Apache Hadoop service.

The app is a template; modify the files to fit your own use case.

1. Create a scratch folder on the master node, e.g. /tmp/MyOozieSample, and create the sequence of files below in it.
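A minimal sketch of the layout, assuming the files are grouped in a workflow subdirectory to match the HDFS upload in step 3:

 mkdir -p /tmp/MyOozieSample/workflow
 cd /tmp/MyOozieSample/workflow
 # create workflow.xml, run.sh, script.sql, and test.csv (all shown below) here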

2. Create an Oozie workflow.xml with two actions:
“uploadfile”, a filesystem action that creates a directory on HDFS and puts the data file in place there.
“runscript”, a shell action that invokes the jsqsh shell and runs the Big SQL statements.

The sequence of the workflow is: start → uploadfile → runscript → end, with both actions routing to a kill node on error.

Workflow.xml

The body of the uploadfile action below is a sketch (an Oozie fs action can create directories and move files that are already on HDFS, so test.csv is moved out of the workflow directory uploaded in step 3); adjust the names and paths to your environment.

<workflow-app name="BigSQLWorkflow" xmlns="uri:oozie:workflow:0.4">
    <start to="uploadfile"/>
    <action name="uploadfile">
        <fs>
            <!-- stage test.csv where LOAD HADOOP expects it -->
            <mkdir path="${nameNode}/tmp/data"/>
            <move source="${nameNode}/tmp/bigsqltest/workflow/test.csv" target="${nameNode}/tmp/data/test.csv"/>
        </fs>
        <ok to="runscript"/>
        <error to="fail"/>
    </action>
    <action name="runscript">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>run.sh</exec>
            <!-- ship run.sh and script.sql to the action's working directory -->
            <file>run.sh#run.sh</file>
            <file>script.sql#script.sql</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
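Once written, the XML can be checked against the Oozie schema from the command line (depending on the client version, oozie validate may also need OOZIE_URL set as in step 5):

 oozie validate workflow.xml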

test.csv (rows 4-8; rows 1-3 are inserted directly by script.sql)

 4 "four"
 5 "five"
 6 "six"
 7 "seven"
 8 "eight"
run.sh

 #!/bin/sh
 /usr/ibmpacks/common-utils/current/jsqsh/bin/jsqsh -U your_bigsql_user -P your_bigsql_pw -d db2 -p 51000 -D bigsql -S your.bigsql.node.com -i script.sql -n -e

The script invokes jsqsh with the DB2 driver (-d db2) against the bigsql database (-D) on port 51000 of the Big SQL head node (-S), running script.sql (-i) non-interactively (-n) and echoing each statement (-e). Replace the user, password, and host placeholders with your own values.
script.sql (Big SQL statements; adjust to your requirements.)

 USE TEST;
 DROP TABLE IF EXISTS TEST_TABLE;
 CREATE HADOOP TABLE TEST_TABLE (C1_INT INTEGER, C2_VARCHAR VARCHAR(15));
 INSERT INTO TEST_TABLE VALUES (1,'one'),(2,'two'),(3,'three');
 LOAD HADOOP USING FILE URL '/tmp/data/test.csv' INTO TABLE TEST_TABLE APPEND;
 SELECT * FROM TEST_TABLE;
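LOAD HADOOP reads the file from HDFS rather than from the local filesystem, which is why the uploadfile action stages test.csv at /tmp/data/test.csv first. After a run, the staged file can be checked with:

 hadoop fs -cat /tmp/data/test.csv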

3. Create an HDFS directory and copy the workflow directory into it.

hadoop fs -mkdir /tmp/bigsqltest
hadoop fs -put workflow /tmp/bigsqltest
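Run the put from the scratch folder (e.g. /tmp/MyOozieSample) so the relative path workflow resolves, and confirm the upload with:

 hadoop fs -ls /tmp/bigsqltest/workflow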

4. Create a job.properties file for launching the Oozie job. The job.properties file will need to define the nameNode, jobTracker, and oozie.wf.application.path. The ports below (8020 for the NameNode, 8050 for the ResourceManager) are common defaults; adjust them to your cluster. Here is an example:

nameNode=hdfs://your.name.node.com:8020
jobTracker=your.resource.manager.node.com:8050
oozie.wf.application.path=${nameNode}/tmp/bigsqltest/workflow

5. Run these commands to launch the Oozie app.

export OOZIE_URL=http://your.oozie.node.com:11000/oozie
oozie job -config job.properties -run
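A successful submission prints a workflow job ID; substitute it for <job-id> below to track progress and pull the job's logs:

 oozie job -info <job-id>
 oozie job -log <job-id>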
