Administrators are often unaware of the statements that the rest of the organization is running on the Big SQL service. When administering Big SQL, analyzing the statements executed on the cluster is needed for proper workload management. Gathering job history details, such as the statements executed, their execution times, and the resources they consumed, can give insight into the cluster's overall health and performance. This blog gives an overview of how to collect and analyze this historical data.
Big SQL DSM and Monitoring Functions
Data Server Manager for Big SQL and DB2 monitoring functions (which have replaced DB2 snapshots) can be used to gather statistics on statements being executed or recently executed on the cluster. Big SQL automatically maintains an in-memory cache of statements run and various execution metrics. When these caches are flushed to make room for new statements (or when Big SQL is restarted), the statistics on these statements are also flushed. The metrics gathered from these monitoring utilities are sufficient to provide insight into statements being executed or recently executed on the system.
Example of using MON_GET_PKG_CACHE_STMT to retrieve some valuable information on queries being executed on the system:
SELECT INSERT_TIMESTAMP AS CREATE_TIME,
       TOTAL_CPU_TIME/NUM_EXEC_WITH_METRICS AS AVG_CPU_TIME,
       STMT_EXEC_TIME,
       QUERY_COST_ESTIMATE,
       ROWS_READ,
       POOL_TEMP_DATA_P_READS,
       TOTAL_SECTION_SORT_TIME,
       EXT_TABLE_RECV_WAIT_TIME,
       SUBSTR(STMT_TEXT, 1, 70) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2)) AS T
WHERE T.NUM_EXEC_WITH_METRICS <> 0
ORDER BY STMT_EXEC_TIME;
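As a variation, the same table function can highlight the heaviest statements by total CPU time. This is a sketch built only from the columns already shown above, not a statement from the original article:

```sql
-- Sketch: top 10 statements in the package cache by total CPU time,
-- using only MON_GET_PKG_CACHE_STMT columns referenced earlier.
SELECT TOTAL_CPU_TIME,
       NUM_EXEC_WITH_METRICS,
       SUBSTR(STMT_TEXT, 1, 70) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2)) AS T
WHERE T.NUM_EXEC_WITH_METRICS <> 0
ORDER BY TOTAL_CPU_TIME DESC
FETCH FIRST 10 ROWS ONLY;
```

Sorting by TOTAL_CPU_TIME rather than STMT_EXEC_TIME surfaces statements that are cheap per execution but expensive in aggregate.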
When the analysis window is larger (for example, when statements executing on an active cluster need to be analyzed for an entire day), event monitors can be used to gather query metrics. Since event monitors log metrics into tables rather than keeping them in memory, the data is more persistent. The storage location of these tables needs to be specified. In releases prior to Big SQL 5.0.3, event monitors could be created across workers, but there were limitations with respect to adding and dropping nodes. As of Big SQL 5.0.3, event monitors are supported ONLY on the head node.
Step 1: Creation of Event Monitors
Statistics and activity event monitors need to be created to monitor the resource utilization of statements executing on the cluster. The following sample script creates them on the Big SQL head node:
CONNECT TO BIGSQL;

-- Create the statistics event monitor
CREATE EVENT MONITOR DB2STATISTICS FOR STATISTICS WRITE TO TABLE
  SCSTATS      (TABLE SCSTATS_DB2STATISTICS IN BIGSQLCATSPACE),
  WCSTATS      (TABLE WCSTATS_DB2STATISTICS IN BIGSQLCATSPACE),
  WLSTATS      (TABLE WLSTATS_DB2STATISTICS IN BIGSQLCATSPACE),
  QSTATS       (TABLE QSTATS_DB2STATISTICS IN BIGSQLCATSPACE),
  SCMETRICS    (TABLE SCMETRICS_DB2STATISTICS IN BIGSQLCATSPACE),
  HISTOGRAMBIN (TABLE HISTOGRAMBIN_DB2STATISTICS IN BIGSQLCATSPACE),
  WLMETRICS    (TABLE WLMETRICS_DB2STATISTICS IN BIGSQLCATSPACE),
  CONTROL      (TABLE CONTROL_DB2STATISTICS IN BIGSQLCATSPACE),
  OSMETRICS    (TABLE OSMETRICS_DB2STATISTICS IN BIGSQLCATSPACE);

-- Create the activity event monitor
CREATE EVENT MONITOR DB2ACTIVITIES FOR ACTIVITIES WRITE TO TABLE
  ACTIVITY        (TABLE ACTIVITY_DB2ACTIVITIES IN BIGSQLCATSPACE),
  ACTIVITYMETRICS (TABLE ACTIVITYMETRICS_DB2ACTIVITIES IN BIGSQLCATSPACE),
  ACTIVITYSTMT    (TABLE ACTIVITYSTMT_DB2ACTIVITIES IN BIGSQLCATSPACE),
  ACTIVITYVALS    (TABLE ACTIVITYVALS_DB2ACTIVITIES IN BIGSQLCATSPACE),
  CONTROL         (TABLE CONTROL_DB2ACTIVITIES IN BIGSQLCATSPACE);
Step 2: Collecting data on workloads and service classes
A default user workload (SYSDEFAULTUSERWORKLOAD) and a default user service class (SYSDEFAULTUSERCLASS) are created for each Big SQL database. If no additional user-defined workloads or service classes are created, all user activities are associated with these defaults. The following commands alter the default service subclasses so that monitoring data can be gathered on them. Execute them as the bigsql user on the Big SQL head node:
db2 connect to bigsql
db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data on coordinator member with details and values"
db2 "alter service class sysdefaultmanagedsubclass under sysdefaultuserclass collect activity data on coordinator member with details and values"
Step 3: Activating the event monitors
Event monitors created with the statements above are not started automatically. They can be made to start automatically by specifying the AUTOSTART keyword at the end of each CREATE EVENT MONITOR statement in Step 1. Keeping event monitors active for a long period can generate a large amount of historical data, so it is recommended to switch them on only when collection of historical data is necessary and to deactivate them after the collection is complete. Activate the event monitors and verify that they are active by issuing the following commands as the bigsql user on the Big SQL head node:
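For reference, the AUTOSTART keyword mentioned above is appended to the CREATE EVENT MONITOR statement from Step 1. A minimal sketch (the full WRITE TO TABLE clause is abbreviated here to a single target table; use the complete list from Step 1 in practice):

```sql
-- Sketch only: AUTOSTART makes the monitor activate when the database starts.
-- The WRITE TO TABLE clause is abbreviated; see Step 1 for the full list.
CREATE EVENT MONITOR DB2ACTIVITIES FOR ACTIVITIES WRITE TO TABLE
  ACTIVITY (TABLE ACTIVITY_DB2ACTIVITIES IN BIGSQLCATSPACE)
  AUTOSTART;
```

As the text notes, autostarting is usually not desirable here, since monitors left running accumulate data continuously.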
db2 connect to bigsql
db2 "set event monitor DB2ACTIVITIES STATE 1"
db2 "set event monitor DB2STATISTICS STATE 1"
db2 "SELECT substr(evmonname, 1, 30) as evmon_name FROM syscat.eventmonitors WHERE event_mon_state(evmonname) = 1"

EVMON_NAME
------------------------------
DB2ACTIVITIES
DB2STATISTICS

  2 record(s) selected.
Step 4: Send statistics to event monitor
Use the WLM_COLLECT_STATS stored procedure to send the statistics for all DB2 workload management objects to the active statistics event monitor. When statistics are collected and sent to the statistics event monitor, the in-memory values are reset. Execute this command as the bigsql user on the Big SQL head node for on-the-fly data collection:
db2 "CALL SYSPROC.WLM_COLLECT_STATS()"
This command is asynchronous, meaning you do not need to wait for it to finish before moving on to another task. You can also automate workload management statistics collection using the WLM_COLLECT_INT database configuration parameter. If this parameter is set to a nonzero value, workload management statistics are collected automatically every wlm_collect_int minutes (as if WLM_COLLECT_STATS were invoked manually at that interval). Issue this command as the bigsql user on the Big SQL head node:
db2 "update db cfg for bigsql using WLM_COLLECT_INT 60"

Then restart Big SQL.
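To confirm the parameter value after the update, one option (assuming the standard DB2 SYSIBMADM.DBCFG administrative view is available in Big SQL, which is DB2-based) is:

```sql
-- Check the current and deferred values of wlm_collect_int
SELECT NAME, VALUE, DEFERRED_VALUE
FROM SYSIBMADM.DBCFG
WHERE NAME = 'wlm_collect_int';
```

A value shown only under DEFERRED_VALUE indicates the change has not yet taken effect.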
Step 5: Run Queries
Execute the daily workload so that metrics can be collected for its statements.
Step 6: Turn off the event monitors
Deactivate the event monitors and verify that none are active by issuing the following commands as the bigsql user on the Big SQL head node:
db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data none"
db2 "alter service class sysdefaultmanagedsubclass under sysdefaultuserclass collect activity data none"
db2 "set event monitor DB2ACTIVITIES STATE 0"
db2 "set event monitor DB2STATISTICS STATE 0"
db2 "SELECT substr(evmonname, 1, 30) as evmon_name FROM syscat.eventmonitors WHERE event_mon_state(evmonname) = 1"

EVMON_NAME
------------------------------

  0 record(s) selected.
Step 7: Analysis of the data generated
Statistical data from event monitors is stored in Big SQL tables. Many metrics are gathered while event monitors are active; a description of all of them can be found in the DB2 Knowledge Center Monitor Element Reference.
Example 1: User-level job history, including which user invoked which statement, when it started, and how long it ran
-- This script gives the details about up to 100 statements that were executed successfully
connect to bigsql;
select ACT_EXEC_TIME, TIME_COMPLETED, TIME_STARTED, ROWS_RETURNED, SESSION_AUTH_ID,
       SUBSTR(STMT_TEXT, 1, 70) AS STMT_TEXT
from ACTIVITYSTMT_DB2ACTIVITIES as ast, ACTIVITY_DB2ACTIVITIES as av
where ast.appl_id = av.appl_id
  and ast.uow_id = av.uow_id
  and ast.activity_id = av.activity_id
  and SQLCODE >= 0
order by act_exec_time desc
fetch first 100 rows only;

Example output:

ACT_EXEC_TIME TIME_COMPLETED             TIME_STARTED               ROWS_RETURNED SESSION_AUTH_ID STMT_TEXT
==============================================================================================================
8679205       2017-12-07-14.01.05.226651 2017-12-07-14.00.56.547446 100           BIGSQL          with cross_items as (select i_item_sk ss_item_sk from item, ...
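The same two tables can also be aggregated per user to see who is driving the workload. This is a sketch using only the columns from the query above, not a statement from the original article:

```sql
-- Sketch: successful statement counts, total rows returned, and total
-- execution time per user, heaviest users first.
select SESSION_AUTH_ID,
       count(*)           as NUM_STMTS,
       sum(ROWS_RETURNED) as TOTAL_ROWS,
       sum(ACT_EXEC_TIME) as TOTAL_EXEC_TIME
from ACTIVITYSTMT_DB2ACTIVITIES as ast, ACTIVITY_DB2ACTIVITIES as av
where ast.appl_id = av.appl_id
  and ast.uow_id = av.uow_id
  and ast.activity_id = av.activity_id
  and SQLCODE >= 0
group by SESSION_AUTH_ID
order by TOTAL_EXEC_TIME desc;
```

Note that ACT_EXEC_TIME in the sample output above is in microseconds: the 8679205 value matches the gap between TIME_STARTED (14.00.56.547446) and TIME_COMPLETED (14.01.05.226651), which is 8.679205 seconds.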
Example 2: Statement failures, including which statements failed and the error code each returned
-- This script gives the details about queries that resulted in errors;
-- to get more details about an error, execute db2 "? SQLCODE"
connect to bigsql;
select act_exec_time, SQLCODE, TIME_COMPLETED, TIME_STARTED, ROWS_RETURNED, SESSION_AUTH_ID,
       SUBSTR(STMT_TEXT, 1, 70) AS STMT_TEXT
from ACTIVITYSTMT_DB2ACTIVITIES as ast, ACTIVITY_DB2ACTIVITIES as av
where ast.appl_id = av.appl_id
  and ast.uow_id = av.uow_id
  and ast.activity_id = av.activity_id
  and SQLCODE < 0
order by act_exec_time desc
fetch first 100 rows only;

Example output:

ACT_EXEC_TIME SQLCODE TIME_COMPLETED             TIME_STARTED               ROWS_RETURNED SESSION_AUTH_ID STMT_TEXT
===============================================================================================================================================
3828252       -5104   2017-12-07-14.02.44.594650 2017-12-07-14.02.40.766398 0             BIGSQL          with inv as (select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy ...
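To see which errors occurred most often over the collection window, the same join can be aggregated by SQLCODE. This is a sketch built from the query above, not a statement from the original article:

```sql
-- Sketch: count failed activities per SQLCODE, most frequent first
select SQLCODE, count(*) as FAILURE_COUNT
from ACTIVITYSTMT_DB2ACTIVITIES as ast, ACTIVITY_DB2ACTIVITIES as av
where ast.appl_id = av.appl_id
  and ast.uow_id = av.uow_id
  and ast.activity_id = av.activity_id
  and SQLCODE < 0
group by SQLCODE
order by FAILURE_COUNT desc;
```

Each distinct SQLCODE in the result can then be looked up with the db2 help facility as described below.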
More information about an error code can be found by issuing, for example, db2 "? sql5104".
Step 8: Drop Event monitors and associated tables
You may want to clean up some of the metrics data; to do this, drop the event monitors and their associated tables, after which the process can be started again. Save the contents below to a file, for example drop_evmon.sql, and issue 'db2 -tvf drop_evmon.sql' as the bigsql user on the head node:
connect to bigsql;

set event monitor DB2STATISTICS STATE 0;
drop event monitor DB2STATISTICS;
drop table CONTROL_DB2STATISTICS;
drop table HISTOGRAMBIN_DB2STATISTICS;
drop table OSMETRICS_DB2STATISTICS;
drop table QSTATS_DB2STATISTICS;
drop table SCMETRICS_DB2STATISTICS;
drop table SCSTATS_DB2STATISTICS;
drop table WCSTATS_DB2STATISTICS;
drop table WLMETRICS_DB2STATISTICS;
drop table WLSTATS_DB2STATISTICS;

set event monitor DB2ACTIVITIES STATE 0;
drop event monitor DB2ACTIVITIES;
drop table ACTIVITYMETRICS_DB2ACTIVITIES;
drop table ACTIVITYSTMT_DB2ACTIVITIES;
drop table ACTIVITYVALS_DB2ACTIVITIES;
drop table ACTIVITY_DB2ACTIVITIES;
drop table CONTROL_DB2ACTIVITIES;
Check that all event monitors have been removed by using the db2look command with the -wlm option. There should be no event monitors in this output. Execute this command as the bigsql user on the Big SQL head node:
db2look -d bigsql -wlm -o evmon.out
You can use this query to check that there are no event monitor tables:
db2 "select tabname,tbspace from syscat.tables where tbspace = 'BIGSQLCATSPACE'"
This blog demonstrated how to use event monitors to gather resource metrics on historical statements. In most cases, DSM and the monitoring functions are sufficient to gather statistics on Big SQL activity. However, those metrics are kept in memory and can be overwritten as new statements enter the system. Event monitors persist the metrics in Big SQL tables created on the Big SQL head node. As a best practice, activate event monitors when statistics need to be collected and disable them afterwards. Event monitors can only give resource utilization metrics for the DB2 component of Big SQL; the EXT* metrics (such as EXT_TABLE_RECV_WAIT_TIME) can give an indication of HDFS activity, such as time taken to read from HDFS.