Big SQL Offline Backup and Restore to the Same Cluster - Hadoop Dev

This blog outlines one procedure for backing up the Big SQL database across the nodes in the cluster. The procedure described here is an offline backup and restore on all Big SQL nodes, meaning the backup or restore is taken when no active users are logged into the cluster. The backup includes data from the worker nodes, such as performance data gathered by event monitors. HA must be disabled while this backup is taken.

Backups of the Big SQL database should be scheduled during off-peak hours. The restore is intended only for emergencies, for example when a table is accidentally deleted. The steps outlined here are for backup and restoration on the same cluster. This is just one of the Big SQL recovery options; more options can be found in Big SQL Best Practice and Guidelines – Recovery. Where no Disaster Recovery or High Availability is implemented, it is recommended to take regular backups of the Big SQL database. The frequency of the backups is up to the database administrator.

Important – Because Big SQL works with other components such as Hive, HDFS, and HBase, backing up those components is also required, and it is recommended to back them up at the same time that the Big SQL database backup is taken. After restoring the Big SQL database, restore the Hive metastore as well. Some suggestions on how to back up and restore the other components are given below, with links for convenience. However, please consult each vendor's documentation for the most up-to-date instructions for components outside of Big SQL.

Starting with Big SQL 5.0.2, the bigsql_bar.py utility can be used to take offline backups of the Big SQL database. The following commands take an offline backup of the Big SQL database and restore it on the same cluster using the utility. Hive backups and HDFS snapshots should also be taken when using the utility. The order of execution is as follows (a scripted sketch of the backup steps follows the list):

    1. Create a backup directory for the Big SQL backup (section 1.1.1):
       mkdir /backup/bigsql_dbbackup
    2. Back up the Big SQL database on all nodes of the cluster:
       bar.py -m backup --offline --onallnodes -d /backup/bigsql_dbbackup/offline
    3. Take Hive backups and HDFS snapshots (sections 1.2 and 1.3 below).
    4. Restore the backup only in case of emergency:
       bar.py -m restore --offline --onallnodes --local -d /backup/bigsql_dbbackup/offline
    5. Restore the Hive backup only in case of emergency (section 2.2 below).
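For repeatability, the backup half of this flow can be wrapped in a small script. This is a minimal sketch, not part of the utility itself: it assumes it runs as the bigsql user on the head node, that bar.py is on the PATH, and that the backup directory already exists on all nodes.

    #!/bin/bash
    # Sketch: offline Big SQL backup via the bigsql_bar.py utility (Big SQL 5.0.2+).
    # Assumes: run as the bigsql user on the head node, bar.py on the PATH,
    # and /backup/bigsql_dbbackup already created on the head node and all workers.
    set -euo pipefail

    BACKUP_DIR=/backup/bigsql_dbbackup/offline

    # Step 2: offline backup of the Big SQL database on all nodes.
    bar.py -m backup --offline --onallnodes -d "$BACKUP_DIR"

    # Step 3 is taken outside this script: back up the Hive metastore and
    # take HDFS/HBase snapshots (sections 1.2 and 1.3).
    echo "Big SQL backup written to $BACKUP_DIR."
    echo "Now back up the Hive metastore and take HDFS/HBase snapshots."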

Alternatively, the commands below can be used to take the offline backup without the utility:

1. Backup

No workloads or clients should be running when this option is chosen, so schedule backups during off-peak hours. If there are many Big SQL local tables in your environment, more frequent backups are recommended.

1.1 Big SQL OFFLINE Backup

1.1.1 Create Backup Directory for Big SQL Backup

When taking a backup, first choose a path that is different from bigsql_db_path and bigsql_data_directories and is not under the Big SQL home directory (/home/bigsql/). Use the mkdir command to create the backup directory on the Big SQL head node and ALL worker nodes. For example:

    mkdir /backup/bigsql_dbbackup    
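On clusters with many workers, creating the directory on every node by hand is error-prone, so the step can be scripted. A minimal sketch, assuming passwordless SSH as root and a hypothetical nodes.txt file listing one Big SQL host per line:

    #!/bin/bash
    # Sketch: create the backup directory on every Big SQL node.
    # nodes.txt is a hypothetical file listing one host name per line.
    set -euo pipefail

    BACKUP_DIR=/backup/bigsql_dbbackup

    while read -r host; do
        # mkdir -p is idempotent; chown lets the bigsql user write the backup images.
        ssh "$host" "mkdir -p '$BACKUP_DIR' && chown bigsql '$BACKUP_DIR'"
    done < nodes.txt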

1.1.2 Turn off log archiving

If LOGARCHMETH1 is set, it must be turned off. LOGARCHMETH1 is set to LOGRETAIN when HA is configured, and turning off LOGARCHMETH1 disables HA as well. First check whether it is configured by executing the following commands as the bigsql user on the head node; note that the default is non-archival (circular) logging. Record the current value of LOGARCHMETH1, because it is needed when the setting is restored in section 1.1.4.

    db2 get db cfg for bigsql | grep LOGARCHMETH1
     First log archive method                 (LOGARCHMETH1) = DISK:/home/bigsql/bigsql_log/archive/

    # If this is set to anything other than OFF, turn it off with the following command:
    db2 update db cfg for bigsql using LOGARCHMETH1 OFF
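Because the original value is needed again in section 1.1.4, it can help to capture it to a file before turning archiving off. A minimal sketch, run as the bigsql user on the head node; the save-file location is an arbitrary assumption:

    #!/bin/bash
    # Sketch: save the current LOGARCHMETH1 value, then disable log archiving.
    set -euo pipefail

    # Arbitrary location for the saved value; choose any path outside the database paths.
    SAVE_FILE=/backup/bigsql_dbbackup/logarchmeth1.saved

    # Extract the value after "= " from the db cfg output, e.g. OFF or DISK:/...
    CURRENT=$(db2 get db cfg for bigsql | grep LOGARCHMETH1 | awk -F'= ' '{print $2}')
    echo "$CURRENT" > "$SAVE_FILE"
    echo "Saved LOGARCHMETH1=$CURRENT to $SAVE_FILE"

    if [ "$CURRENT" != "OFF" ]; then
        db2 update db cfg for bigsql using LOGARCHMETH1 OFF
    fi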

1.1.3 Deactivate and take an OFFLINE Big SQL backup

Issue the following commands, which require DB2 database administrator (DBADM) privileges, as the bigsql user from the Big SQL head node:

    # Turn off ATS prior to taking a backup
    db2set DB2_ATS_ENABLE=NO
    db2set DB2COMM=
    # Restart the Big SQL service, then:
    db2_all "db2 force applications all"
    db2 terminate
    db2 deactivate db bigsql
    db2 "backup db bigsql on all dbpartitionnums to /backup/bigsql_dbbackup/offline"

    # Take a Hive metastore backup and HDFS/HBase snapshots before proceeding.

    db2 activate db bigsql
    # ...and continue with day-to-day activity.

    # Turn ATS back on after taking the backup
    db2set DB2_ATS_ENABLE=YES
    db2set DB2COMM=TCPIP
    # Restart the Big SQL service
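The same sequence can be scripted end to end so that no step is skipped. A minimal sketch under the same assumptions (run as the bigsql user on the head node, backup directory already created on all nodes); the Big SQL service restarts themselves still happen outside the script, for example from Ambari:

    #!/bin/bash
    # Sketch: offline backup of the Big SQL database across all partitions.
    set -euo pipefail

    BACKUP_DIR=/backup/bigsql_dbbackup/offline

    # Quiesce: disable ATS and remote connections, then force everyone off.
    db2set DB2_ATS_ENABLE=NO
    db2set DB2COMM=
    # (Restart the Big SQL service here, e.g. from Ambari, so the db2set takes effect.)
    db2_all "db2 force applications all"
    db2 terminate
    db2 deactivate db bigsql

    # Take the offline backup on all database partitions.
    db2 "backup db bigsql on all dbpartitionnums to $BACKUP_DIR"

    # Reactivate and re-enable connectivity; take the Hive metastore backup and
    # HDFS/HBase snapshots (sections 1.2 and 1.3) around this point.
    db2 activate db bigsql
    db2set DB2_ATS_ENABLE=YES
    db2set DB2COMM=TCPIP
    # (Restart the Big SQL service again.)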

1.1.4 Reset log archiving

If LOGARCHMETH1 was set prior to taking the backup, it can now be reset to the value recorded in section 1.1.2.

    db2 get db cfg for bigsql | grep LOGARCHMETH1
     First log archive method                 (LOGARCHMETH1) = OFF

    db2 update db cfg for bigsql using LOGARCHMETH1 DISK:/home/bigsql/bigsql_log/archive/
    # OR
    db2 update db cfg for bigsql using LOGARCHMETH1 LOGRETAIN

    db2 force applications all
    db2 terminate
    db2 deactivate db bigsql
    db2 activate db bigsql
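If the previous value was saved to a file, as in the sketch in section 1.1.2, this step can be scripted as well; the save-file path is the same assumption as before:

    #!/bin/bash
    # Sketch: restore the saved LOGARCHMETH1 value and recycle the database.
    set -euo pipefail

    SAVE_FILE=/backup/bigsql_dbbackup/logarchmeth1.saved
    SAVED=$(cat "$SAVE_FILE")

    if [ "$SAVED" != "OFF" ]; then
        db2 update db cfg for bigsql using LOGARCHMETH1 "$SAVED"
        db2 force applications all
        db2 terminate
        db2 deactivate db bigsql
        db2 activate db bigsql
    fi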

1.2 Hive metastore Backup

More information on Hive backup and restore can be found in the HDP documentation on backup. For convenience, some commands are given for the default Hive MySQL database.
First, log in to the Ambari UI, go to Hive –> Config –> Advanced, check the property “Database Type” to make sure it is MySQL, and note the database name, e.g. hive. Then log in via a shell to the metastore host as the root user and issue the following commands:

    mkdir /hive_backup
    mysqldump hive > /hive_backup/hive.sql
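To keep more than one generation of the metastore dump, the file can be timestamped and compressed. This is a minimal variation on the commands above, not an HDP requirement:

    #!/bin/bash
    # Sketch: timestamped, compressed dump of the default MySQL hive database.
    set -euo pipefail

    mkdir -p /hive_backup
    TS=$(date +%Y%m%d%H%M%S)   # e.g. 20170927071532, same style as the Big SQL backup timestamp
    mysqldump hive | gzip > "/hive_backup/hive.$TS.sql.gz"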

1.3 HDFS/HBase Snapshots

If a table is created with the CREATE HADOOP TABLE clause, dropping the table also drops the underlying data. If a table is created with the EXTERNAL keyword, dropping the table drops only the metadata and keeps the underlying data intact. Usually there is a mix of EXTERNAL and regular HADOOP tables, so the actual data needs to be backed up.

HDFS or HBase snapshots can be used to back up the HDFS and HBase files. More details can be found in HDFS Snapshots and HBase Snapshots. For convenience, some simple instructions are given here for taking an HDFS snapshot at the table level. Snapshots can be coarser or more granular; that is up to the administrator.

1.3.1 Determine the hdfs location of the table

The HDFS location of a specific table can be determined from Hive or Big SQL. The Hive commands are:

    hive> use bigsql;
    hive> describe extended table_name;

Save the path after the LOCATION keyword. The Big SQL command to retrieve the HDFS path of the table is:

    db2 "VALUES(SYSHADOOP.HCAT_DESCRIBETAB(‘schema_name', ‘table_name'))"    

1.3.2 Take an HDFS snapshot

Log on to the machine referenced in that path as the hdfs user and issue the following commands:

    hdfs dfsadmin -allowSnapshot 'hdfs_path'
    hdfs dfs -createSnapshot 'hdfs_path'
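When several tables need protecting, the two commands can be looped over a list of paths. A minimal sketch; tables.txt is a hypothetical file with one HDFS table directory per line:

    #!/bin/bash
    # Sketch: snapshot every table directory listed in tables.txt.
    # Run as the hdfs user on a node with an HDFS client.
    set -euo pipefail

    while read -r hdfs_path; do
        hdfs dfsadmin -allowSnapshot "$hdfs_path"
        hdfs dfs -createSnapshot "$hdfs_path"
    done < tables.txt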

As an example, the following commands take an HDFS snapshot of the hdfs_path corresponding to a Big SQL table:

    hive> use bigsql;
    hive> describe extended web_sales;

    # The path after the LOCATION keyword is:
    #   hdfs://styes4.fyre.ibm.com:8020/apps/hive/warehouse/bigsql.db/web_sales

    # Log on to styes4.fyre.ibm.com as the hdfs user:
    hdfs dfsadmin -allowSnapshot /apps/hive/warehouse/bigsql.db/web_sales
    hdfs dfs -createSnapshot /apps/hive/warehouse/bigsql.db/web_sales

2. Restore (ONLY in case of emergencies)

A restore operation should be done only in an emergency, for example when a table is accidentally dropped or the database cannot be accessed for some reason. It is best to first open a support ticket to determine whether the restore should indeed be done. When the metadata for the Big SQL database and the Hive metastore is restored, in some cases restoring the HDFS snapshots is not needed. So first restore the Big SQL database and the Hive catalog and determine whether the data is accessible before restoring any HDFS snapshots.

2.1 Big SQL Restore (ONLY in case of emergency)

2.1.1 Determine the timestamp of the backup

When the backup is taken, a file with a timestamp is created on each of the Big SQL nodes. An example of a valid timestamp is 20170927071532. Go to the backup directory specified in the backup step, for example /backup/bigsql_dbbackup, and determine the timestamp of the latest backup.
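The timestamp can also be extracted from the image file names, which for a partitioned DB2 database have the form BIGSQL.0.bigsql.DBPART000.20170927071532.001, with the timestamp as the second-to-last dot-separated field. A minimal sketch based on that naming assumption:

    #!/bin/bash
    # Sketch: find the most recent backup timestamp in the offline backup directory.
    # Assumes image names like BIGSQL.0.bigsql.DBPART000.20170927071532.001.
    set -euo pipefail

    BACKUP_DIR=/backup/bigsql_dbbackup/offline

    # Take the second-to-last dot-separated field of each name and keep the newest.
    LATEST=$(ls "$BACKUP_DIR" | awk -F. '{print $(NF-1)}' | sort | tail -1)
    echo "Latest backup timestamp: $LATEST"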

2.1.2 Restore the Big SQL db

Restore the Big SQL db with the timestamp noted in the previous step using the following commands as the bigsql user on the Big SQL head node:

    # Turn off ATS prior to restoring
    db2set DB2_ATS_ENABLE=NO
    # Restart the Big SQL service, then:
    db2_all "db2 force applications all"
    db2 terminate
    db2 deactivate db bigsql
    db2_all "db2 restore db bigsql from /backup/bigsql_dbbackup/offline taken at 'timestamp' replace existing without prompting"

    # A message such as "SQL2540W Restore is successful, however a warning '2539' was
    # encountered during Database Restore while processing in No Interrupt mode" is normal.

    # Turn ATS back on after restoring
    db2set DB2_ATS_ENABLE=YES
    # Restart the Big SQL service
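After the restore, a quick sanity check before touching any HDFS snapshots can confirm whether the data is already accessible again, as recommended in the introduction to this section. A minimal sketch with example schema and table names:

    #!/bin/bash
    # Sketch: post-restore sanity check against the restored catalog and data.
    # BIGSQL.WEB_SALES is an example table; substitute the one that was lost.
    set -euo pipefail

    db2 activate db bigsql
    db2 connect to bigsql
    # Is the table back in the catalog?
    db2 "SELECT TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE TABSCHEMA='BIGSQL' AND TABNAME='WEB_SALES'"
    # Is the underlying data readable?
    db2 "SELECT COUNT(*) FROM BIGSQL.WEB_SALES"
    db2 connect reset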

2.2 Restore the Hive metastore

Whenever the Big SQL database is restored, it is recommended that the Hive metastore also be restored so that the Big SQL and Hive catalogs stay in sync. Note that the ‘auto hcat-sync’ feature and the HCAT_SYNC_OBJECTS stored procedure update the Big SQL catalog based on the latest metadata from Hive. More information can be found in the HDP documentation on restore. For convenience, some commands are given for restoring the Hive metastore for the default Hive MySQL database. First, log in to the Ambari UI, go to Hive –> Config –> Advanced, check the property “Database Type” to make sure it is MySQL, and note the database name, e.g. hive. Then log in via a shell to the metastore host as the root user and issue the following command:

    mysql hive < /hive_backup/hive.sql    
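A quick way to confirm that the restored metastore is readable is to count its registered tables. In the default metastore schema the TBLS table holds one row per Hive table, though the schema can vary by Hive version, so treat this as a sketch:

    #!/bin/bash
    # Sketch: verify the restored hive database responds to queries.
    # TBLS is the table registry in the default Hive metastore schema.
    set -euo pipefail

    mysql hive -e "SELECT COUNT(*) AS table_count FROM TBLS;"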

Summary

This blog outlined the Big SQL backup and restore procedure using DB2 commands. The backup procedure described is an offline backup of the Big SQL database; frequent backups are recommended after configuration changes have been made or tables have been added. The restore procedure is intended for restoration on the same cluster in case of emergency. When Big SQL database backups are taken, backups of the Hive metastore and HDFS/HBase snapshots should be taken at the same time. Consult Big SQL support before restoring the Big SQL database.

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm16259815