IBM Support

Big SQL Auto-Analyze Scheduling - Hadoop Dev

Technical Blog Post


Abstract

Big SQL Auto-Analyze Scheduling - Hadoop Dev

Body

This is part 3 of a series on Auto-Analyze.

BigInsights Big SQL Auto-Analyze can use customized schedules for running Analyze. This allows balancing of the Auto-Analyze schedule with other Big SQL workloads.

It is best to run Auto-Analyze on a schedule, with concurrency settings that will not impact the resources needed to run other Big SQL workloads. If you have a window of time when more Big SQL resources can be devoted to running Analyze, then you can set a schedule and higher concurrency value to run more Auto-Analyze tasks during that period of time.

The default schedule checks the Auto-Analyze queue every 10 minutes for tables that need to be Analyzed. There is a default setting of 1 concurrent Auto-Analyze task. These settings together result in only one Auto-Analyze task being run in 10 minute intervals.

The query below will show the defaults set during install. The task to check the Auto-Analyze queue is named BIGSQL_CHECK_ANALYZE and is run with a cron schedule of ‘0,10,20,30,40,50 * * * *’.

  SELECT NAME, SCHEDULE, PROCEDURE_INPUT FROM SYSTOOLS.ADMIN_TASK_LIST WHERE PROCEDURE_SCHEMA='SYSHADOOP' AND PROCEDURE_NAME='BIGSQL_CHECK_ANALYZE';    +----------------------+--------------------------+-----------------+  | NAME                 | SCHEDULE                 | PROCEDURE_INPUT |  +----------------------+--------------------------+-----------------+  | BIGSQL_CHECK_ANALYZE | 0,10,20,30,40,50 * * * * | VALUES(0)       |  +----------------------+--------------------------+-----------------+  

The task’s input value of 0 means to use the value of property biginsights.stats.auto.analyze.concurrent.max as the number of concurrent Auto-Analyze tasks. An input value > 0 will override the biginsights.stats.auto.analyze.concurrent.max property value.

The default task was created using this statement below. Note the parameter values for NAME , SCHEDULE , and PROCEDURE_INPUT. The SCHEDULE is in UNIX cron format.

  CALL SYSPROC.ADMIN_TASK_ADD(    'BIGSQL_CHECK_ANALYZE',                      -- NAME    NULL, NULL, NULL,     '0,10,20,30,40,50 * * * *',                  -- SCHEDULE    'SYSHADOOP','BIGSQL_CHECK_ANALYZE',          -- PROCEDURE_SCHEMA,PROCEDURE_NAME    'VALUES(0)',                                 -- PROCEDURE_INPUT    NULL,NULL);  

The default settings may not be enough to satisfy all the Auto-Analyze that is needed for your Big SQL tables. In this case you will want to identify time periods when more Auto-Analyze can be run and adjust the Auto-Analyze schedule and concurrent Analyze settings. Multiple tasks can be created with different schedules to accomplish this.

For example:

1.Remove the default Auto-Analyze check task

  CALL SYSPROC.ADMIN_TASK_REMOVE('BIGSQL_CHECK_ANALYZE',NULL);  

2. Add these 2 Auto-Analyze check tasks without overlapping schedules. The task names must each be unique.

High Analyze time period: Every day Midnight to 5 am execute every minute. Run up to 5 concurrent Auto-Analyze tasks.

  CALL SYSPROC.ADMIN_TASK_ADD(    'BIGSQL_CHECK_ANALYZE High Time',        -- NAME    NULL, NULL, NULL,     '* 0-4 * * *',                           -- SCHEDULE    'SYSHADOOP','BIGSQL_CHECK_ANALYZE',      -- PROCEDURE_SCHEMA,PROCEDURE_NAME    'VALUES(5)',                             -- PROCEDURE_INPUT    NULL,NULL);  

Low Analyze time period: Every day 5 am to Midnight execute every 10 minutes. Run 1 Auto-Analyze task at a time.

  CALL SYSPROC.ADMIN_TASK_ADD(    'BIGSQL_CHECK_ANALYZE Low Time',         -- NAME    NULL, NULL, NULL,     '0,10,20,30,40,50 5-23 * * *',           -- SCHEDULE    'SYSHADOOP','BIGSQL_CHECK_ANALYZE',      -- PROCEDURE_SCHEMA,PROCEDURE_NAME    'VALUES(1)',                             -- PROCEDURE_INPUT    NULL,NULL);  

Big SQL service does not need to be restarted for the new schedules to take effect.

This is now the new schedule that will be used for running Auto-Analyze.

  SELECT NAME, SCHEDULE, PROCEDURE_INPUT FROM SYSTOOLS.ADMIN_TASK_LIST WHERE PROCEDURE_SCHEMA='SYSHADOOP' AND PROCEDURE_NAME='BIGSQL_CHECK_ANALYZE';    +--------------------------------+-----------------------------+-----------------+  | NAME                           | SCHEDULE                    | PROCEDURE_INPUT |  +--------------------------------+-----------------------------+-----------------+  | BIGSQL_CHECK_ANALYZE High Time | * 0-4 * * *                 | VALUES(5)       |  | BIGSQL_CHECK_ANALYZE Low Time  | 0,10,20,30,40,50 5-23 * * * | VALUES(1)       |  +--------------------------------+-----------------------------+-----------------+  

More information about adding and modifying ATS tasks.

Previous Big SQL Auto-Analyze blogs:

Faster BigInsights Big SQL queries with Auto-Analyze
Big SQL Auto-Analyze Customization

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm16259897