IBM Support

Big SQL Backup and Restore Utility - Hadoop Dev

Technical Blog Post


Abstract

Big SQL Backup and Restore Utility - Hadoop Dev

Body

Introduction
The Big SQL backup and restore utility simplifies the process of taking database backups, transferring backup images, and restoring backups on the existing cluster or to a new cluster.

The utility now consists of two tools (both of which are attached below):

  • bar.py
  • bigsql_bar.py

bigsql_bar.py and bar.py
While bigsql_bar.py was originally developed with a particular use case in mind (Active / Standby Disaster Recovery Solution for Big SQL Using IBM Big Replicate), customers soon recognised the benefits of such a utility for more general usage and for use with their own specific, but non-Big Replicate, use cases. As a result of this feedback, we’ve done a couple of things for Big SQL 5.0.2:

  1. Enhanced the functionality of bigsql_bar.py
    We enhanced the functionality of bigsql_bar.py, while also maintaining its default behavior in terms of the Active / Standby Disaster Recovery Solution for Big SQL Using IBM Big Replicate
  2. Created bar.py.
    We created a new tool called bar.py.
    This is a wrapper script that provides the expected defaults and allows for a more intuitive experience for customers working outside the Big Replicate context.

This blog will mostly focus on bar.py usage. For more information on bigsql_bar.py utility and for the latest options available, take a look at our IBM Knowledge Center Page: bigsql_bar.py – Big SQL backup and restore utility.

Requirements and Prerequisites
The requirements and prerequisites for using bar.py are the same as those for bigsql_bar.py.

  • User
    The bar.py utility must be executed as the configured bigsql service user.
  • Passwordless SSH
    If you wish to automatically transfer your generated backup images from one Big SQL Cluster to another using the Big SQL backup and restore utility, passwordless ssh must be configured between the Big SQL Cluster’s primary Head Nodes. If passwordless ssh is not configured, it will be necessary to use your own preferred mechanism, outside of the Big SQL backup and restore utility, in order to transfer backup images.
  • Where to Run the Big SQL Backup and Restore Utility
    The backup and restore utility must be executed on the Big SQL Head Node. In cases where Big SQL HA is enabled, the utility must be executed on the Big SQL primary Head Node.
  • Backup and Restore Utility Modes
    bar.py (and bigsql_bar.py) can be executed in one of the following modes:

  • Backup: Take a backup of the BIGSQL DB
  • Restore: Restore BIGSQL DB from an existing backup image
  • Transfer: Transfer an existing backup image to a remote machine
  • Backup + Transfer (by specifying the “-r remote machine” option when taking a backup): Take a backup of the BIGSQL DB and transfer the resulting backup image to a remote machine
  • Backupsize: Verify there is sufficient space on the filesystem to store a generated backup image

bar.py Options and Behaviour
There are a number of options available when executing the bar.py utility. All options, except ‘-m’ (mode), are optional and default values will be used if not explicitly specified. Below we highlight the key options available. Run “bar.py --help”, for a full list.

Option Optional/Required? Description/Behaviour
-m mode Required ‘Backup’ or ‘Restore’ or ‘Transfer’ or ‘Backupsize’
-r remote machine Optional Big SQL Head Node on remote cluster to transfer backup image(s) to
-t timestamp Optional Timestamp of backup image to be used for restore or transfer (Default: Timestamp of latest available image)
-c, --hcat Optional Run HCAT_SYNC_OBJECTS after a restore
-o, --offline Optional Force offline option (otherwise default to online)
-l, --local Optional Use local backup images for restore (Taking backups and restoring on the same cluster)
-s, --skipcheck Optional Skip the interactive check(s) during a backup or restore
-p, --prune Optional Prune DB Log Files before taking a backup
-g, --logarchmeth Optional Update the logarchmeth1 db cfg parameter as part of backup/restore operation
-w, Log archive method Optional Specify the primary log archive method to use (Default: LOGRETAIN)
-i, --onallnodes Optional Backup, transfer or restore for all Big SQL nodes
-q, --bigsqlhosts Optional Provide a list of Big SQL host names (local cluster) for backup, transfer or restore
-v, --rollforward Optional Perform a rollforward after an OFFLINE restore
-h, --help Optional Display the help menu

bar.py --onallnodes Option
When the “--onallnodes” option is used, it has the following effect:

  • Backup: Take a backup of ALL Big SQL nodes – Head Node and Worker Nodes
  • Transfer (either standalone transfer or as part of backup operation): Gather the associated backup images from ALL Big SQL worker nodes and store them on the local Big SQL head node so that they can be easily transferred from one Big SQL cluster to another. Transfer the images to the specified Big SQL cluster.
  • Restore: Distribute all relevant worker node backup images to the appropriate worker nodes within the cluster. Restore the BIGSQL database

Logging & Troubleshooting
A complete log of the script execution is located at /var/ibm/bigsql/logs/bigsql_bar.log (as bar.py is a wrapper around bigsql_bar.py) on the Big SQL primary Head Node. All debug, information and error messages are written to this file.

Backup Directory for BIGSQL DB Logs
Any existing database logs are backed-up before a restore is performed. The directory used for this is: <backup directory>/logstream_backup/<generated timestamp>/ (by default: /var/ibm/bigsql/backups/logstream_backup/<generated timestamp>/)

The Big SQL Backup and Restore Utility Location
Both bar.py and bigsql_bar.py are located at /usr/ibmpacks/current/bigsql/bigsql/bin/.

Log File Management
In order to use our Big SQL Backup and Restore solutions for online backups, the database must be made recoverable and database logging must be enabled. That is, the logarchmeth1 database configuration parameter must be updated (from “off”) during the initial offline backup. This can be achieved using the “--logarchmeth” bar.py option.
However, as database activity occurs, log files may accumulate and build up in the directories specified in the relevant database configuration parameters and it may then be necessary to occasionally remove the archived log files that are no longer needed. In order to remove the archived log files not needed for Big SQL disaster recovery, you can specify the “--prune” option when taking backups.

Big SQL Backup & Restore Usage
Here we outline some typical bar.py backup, transfer and restore commands. The appropriate commands could be easily selected from those below and combined in order to achieve any supported Big SQL backup and restore solution.

Note: If any of the backup, transfer or restore modes are executed without specifying a timestamp value, we locate and use the most recently available backup image in the backup directory.

Note: If Big SQL HA is being used, Big SQL HA must be disabled before restoring the database and then re-enabled once the database has been successfully restored.

Big SQL – Offline Backup
Take an offline backup and configure the logarchmeth1 db cfg parameter as necessary (using bar.py)

Big SQL Head Node ONLY
bar.py -m backup --offline --logarchmeth -d <backup directory>

ALL Big SQL Nodes (Head Node and Worker Nodes)
bar.py -m backup --offline -d <backup directory> --onallnodes

Big SQL – Offline Backup and Transfer

Big SQL Head Node ONLY
bar.py -m backup --offline --logarchmeth -d <backup directory> -r <machine.domain.com>

ALL Big SQL Nodes (Head Node and Worker Nodes)
bar.py -m backup --offline -d <backup directory> --onallnodes -r <machine.domain.com>

Big SQL – Online Backup
Perform online backups (using bar.py)

Big SQL Head Node ONLY

bar.py -m backup

Big SQL – Online Backup and Transfer

Big SQL Head Node ONLY
bar.py -m backup -r <machine.domain.com>

Big SQL – Transfer Only

Transfer backup images to another Big SQL Cluster (using bar.py)

Big SQL Head Node ONLY (transfer a single image from Big SQL head node)
bar.py -m transfer -r <machine.domain.com>
OR
bar.py -m transfer -t <timestamp> -r <machine.domain.com>

ALL Big SQL Nodes (transfer relevant images from all Big SQL Nodes – head and worker nodes)
bar.py -m transfer --onallnodes -r <machine.domain.com> -d <backup directory>
OR
bar.py -m transfer -t <timestamp> --onallnodes -r <machine.domain.com>

Big SQL – Offline Restore

Restore offline backup (using bar.py)

Big SQL Head Node ONLY
bar.py -m restore --offline
OR
bar.py -m restore --offline -d <backup directory>
OR
bar.py -m restore --offline -t <timestamp>

ALL Big SQL Nodes (Head Node and Worker Nodes)
bar.py -m restore --offline -d <backup directory> --onallnodes
OR
bar.py -m restore --offline -t <timestamp> --onallnodes

Big SQL – Online Restore

Restore online backups (using bar.py)

Big SQL Head Node ONLY
bar.py -m restore
OR
bar.py -m restore -d <backup directory>
OR
bar.py -m restore -t <timestamp>
OR
bar.py -m restore --hcat

    Summary
    In this blog we introduced the Big SQL Backup and Restore Utility, consisting of both the bar.py and bigsql_bar.py tools. We outlined what these tools are and the differences between the two. We also introduced the necessary requirements and prerequisites, we took a closer look at bar.py’s functionality and outlined the different modes and usage options available. Finally we outlined what these commands would look like when executed as part of a typical backup, transfer and restore scenario for Big SQL customers using bar.py.

    BAR Utility Scripts
    Attached is an archive containing both BAR Utility scripts.
    This is a standalone, version agnostic utility.

    bar_utility.tar_.gz

    Additional Information

    [{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

    UID

    ibm16259791