by Nailah Bissoon
Big SQL Senior Technical Architect
The needs of the business may dictate that certain queries or workloads should be prioritized higher than others. Prioritizing of workloads must be done at the CPU and memory level. In this blog, we will show how to map applications to workloads and how to assign CPU and memory resources to them such that one workload is prioritized higher than the other. Consult an overview of WLM for Big SQL for more information on the capabilities of workload management.
Recall that workloads are mapped to service classes and resources are controlled at the service class level. More details can be found in Big SQL Workload Management Terminologies. There is no direct way to use the ALTER SERVICE CLASS statement to control the percentage of memory that is used by each workload. However, an indirect way is to use the methodology described in Big SQL Workload Management Establishing Gatekeeper Concurrency Controls to ‘distribute’ the available shared sort memory between workloads. It is recommended to create gatekeeper concurrency controls before using CPU prioritization techniques because the most effective way to control work is to control when it can start. If CPU prioritization techniques are used on their own then the system could still have very high run queues (as reported by vmstat).
In Big SQL Workload Management – Stage 1, monitoring was done at the system and workload level. In this stage, long running and resource intensive queries were identified. In Big SQL Workload Management – Stage 2, gatekeeper concurrency controls were applied at the system level. In this stage, gatekeeper concurrency controls will be applied at the workload level to manage the memory resources across the workloads, and CPU limits will be applied to each workload.
Example Use Case – Prioritizing One Workload Ahead of Another
Consider, two workloads running on the system, one workload can be identified by the application name ‘db2jcc_application’ and the other can be identified by the application name ‘GreenTea’. The CREATE WORKLOAD statement can be used to create the two workloads based on their application names. Workloads can also be created based on userids, roles etc, for more information consult the CREATE WORKLOAD section of the DB2 knowledge center. According to the business, the ‘db2jcc_application’ workload should be prioritized slightly higher than the ‘GreenTea’ workload, so it is assigned 60% of the CPU and memory resources. The steps to setting up an advanced workload management configuration for this use case will be shown below.
1. Monitor Workloads
To determine which queries from each workload can execute simultaneously on the system, monitoring and analysis at the workload level needs to be done. Recall the methodology presented in Big SQL Workload Management – Stage 1 to analyze queries on the system, the same sort of analysis needs to be done but this time at a workload level. This is important because if each workload has resource intensive queries, then the number of the resource intensive queries that can execute simultaneously on the system from each workload should be limited. Since the applications are mapped to the workloads, some statistics of these applications can be collected and analyzed based on the application names via queries against Event Monitor Tables such as:
WITH V1 AS (SELECT CASE WHEN QUERY_COST_ESTIMATE < 150000 THEN 'TRIVIAL' WHEN QUERY_COST_ESTIMATE BETWEEN 150000 AND 1000000 THEN 'SIMPLE' WHEN QUERY_COST_ESTIMATE BETWEEN 1000000 AND 6000000 THEN 'MEDIUM' WHEN QUERY_COST_ESTIMATE > 6000000 THEN 'COMPLEX' END AS TARGET_WORKCLASS, QUERY_COST_ESTIMATE, APPL_NAME, SORT_SHRHEAP_TOP, timestampdiff(4,TIME_COMPLETED - TIME_STARTED) AS LIFETIME_DURATION FROM ACTIVITY_DB2ACTIVITIES), V2 AS (SELECT TARGET_WORKCLASS, QUERY_COST_ESTIMATE, APPL_NAME, SORT_SHRHEAP_TOP, LIFETIME_DURATION FROM V1) SELECT TARGET_WORKCLASS, COUNT(*) AS COUNT, SUM(LIFETIME_DURATION) AS TOT_LIFETIME_MINS, AVG(SORT_SHRHEAP_TOP) AS AVG_SORT_MEM4K, MAX(SORT_SHRHEAP_TOP) AS MAX_SORT_MEM4K, MAX(LIFETIME_DURATION) AS MAX_LIFETIME_MINS, INT(SUM(LIFETIME_DURATION)/COUNT(*)) AS AVG_LIFETIME_MINS FROM V2 WHERE APPL_NAME LIKE 'db2jcc_application' GROUP BY TARGET_WORKCLASS,APPL_NAME ORDER BY TARGET_WORKCLASS; TARGET_WORKCLASS COUNT TOT_LIFETIME_MINS AVG_SORT_MEM4K MAX_SORT_MEM4K MAX_LIFETIME_MINS AVG_LIFETIME_MINS ---------------- --------------------------------- ----------------- -------------------- -------------------- ----------------- ----------------- COMPLEX 1296. 9469 1008840 9723719 50 7 MEDIUM 2688. 22666 204519 2443022 446 8 SIMPLE 864. 336 113298 541049 8 0 TRIVIAL 2110. 0 102 11814 0 0 WITH V1 AS … FROM V2 WHERE APPL_NAME LIKE 'GreenTea' GROUP BY TARGET_WORKCLASS,APPL_NAME ORDER BY TARGET_WORKCLASS; TARGET_WORKCLASS COUNT TOT_LIFETIME_MINS AVG_SORT_MEM4K MAX_SORT_MEM4K MAX_LIFETIME_MINS AVG_LIFETIME_MINS ---------------- --------------------------------- ----------------- -------------------- -------------------- ----------------- --------------- -- COMPLEX 4433. 194634 116783 1559199 54 43 MEDIUM 13. 581 99800 175337 45 44 SIMPLE 52. 797 82685 204496 45 15 TRIVIAL 1638. 0 315 121272 0 0
The MAX_SORT_MEM4K values reported for each application and target work class are needed when determining the concurrency threshold values for each of the intended work classes.
2. Define Work Classes For Each Workload
The characteristics of one workload maybe different from another, as a result, the definition of a complex query for one workload maybe different from that of another. It maybe useful to have different work class definitions for each workload. For the ‘db2jcc’ application, examining the output of the query above, since there is a good representation of queries across the target work classes, four work classes will be created for the TRIVIAL, SIMPLE, MEDIUM and COMPLEX queries. For the ‘GreenTea’ workload, since there are only few queries in the MEDIUM and SIMPLE classes, it is not beneficial to create four work classes. One TRIVIAL and one COMPLEX class will suffice, which is very similar to the Default Big SQL Workload Management Configuration.
3. Apply Gatekeeper Concurrency Controls For Each Workload
In Big SQL Workload Management – Stage 2, gatekeeper concurrency controls were applied at the system level, all work entering the system was assigned to the same workload and only a certain number of SIMPLE, MEDIUM or COMPLEX queries were allowed to execute simultaneously on the system. The concurrency threshold values for each work class was determined by dividing the available shared sort memory for each work class by the maximum shared sort memory usage for each work class. When there are multiple workloads, the available shared sort memory needs to be divided among the workloads and then divided among the work classes within each workload. The available shared sort memory on the cluster can be determined by running this command after executing the workloads:
db2 get db cfg for bigsql | grep SHEAPTHRES_SHR Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = AUTOMATIC(12914436)
Since, 60% of the shared sort memory resources will be assigned to the ‘db2jcc’ application and 40% to the ‘GreenTea’ application. This means that
Shared sort memory for ‘db2jcc’ application = 60% x 12914436 4K pages= 7748661 4K pages
It was decided that concurrency thresholds are not needed for the TRIVIAL class since these queries are very light weight. The shared sort memory for this workload needs to be divided among the remaining work classes. In the calculations below we assume that no prioritization is needed across the work classes. If it is decided that within the workload certain work classes should be given higher priority than the others, then instead of evenly distributing the shared sort memory across the work classes, the weights can be increased for work classes with higher priority. The following calculation can be used to determine the concurrency threshold values for each of the work classes, observe the consideration for the 60% (0.6) resource assignment here:
COMPLEX concurrency threshold= 12914436*0.6/3/9723719=1 MEDIUM concurrency threshold = 12914436*0.6/3/2443022=1 SIMPLE concurrency threshold = 12914436*0.6/3/541049=4
Note, that the concurrency threshold values for the COMPLEX and MEDIUM classes are the same so in this case we could eliminate one of the work classes for simplicity and adjust the timeron cost range for the remaining work classes. However, since the initial settings could be gradually increased to improve performance, the SIMPLE, MEDIUM and COMPLEX work classes are kept.
For the ‘GreenTea’ application, based on the number of queries in each work class, we decided that there would be 2 work classes, TRIVIAL and COMPLEX. Since we do not need to create any concurrency control for the TRIVIAL work class, the concurrency value for the COMPLEX work class needs to be determined using the following calculation, observe the consideration for the 40% (0.4) assignment here:
COMPLEX concurrency threshold= 12914436*0.4/3/1559199=1
4. Assign CPU Resources For Each Workload
Assignment of CPU resources to workloads can be applied at the service class level. CPU resources are distributed via CPU SHARES. The default number of CPU SHARES for each service class is 1000. If a workload needs to be prioritized higher than another, the number of CPU SHARES can be increased. For example, if there are two workloads, and one workload should be given 40% of the CPU resources and the other 60%, then 40000 CPU SHARES can be assigned to the first and 60000 CPU shares can be assigned to the second. For the two workloads above, the following statements can be used to alter the service classes for each workload:
ALTER SERVICE CLASS "GREENTEA" SOFT CPU SHARES 40000; ALTER SERVICE CLASS "DB2JCC" SOFT CPU SHARES 60000;
If one workload is not executing then the workload that is running could be given all the CPU resources on the system. This can be done at the service class level with the SOFT CPU SHARES keyword as shown above. If the workload is never allowed to use more than 40% of the CPU resources, then HARD CPU SHARES can be used or the CPU LIMIT 40 keyword can be applied at the service class level.
5. Update Database Manager WLM configurations
In order to enable the DB2 WLM Dispatcher and utilize the Dispatcher CPU shares, the following updates are needed to the database manager:
db2 attach to bigsql; db2 update dbm cfg using wlm_dispatcher yes; db2 update dbm cfg using wlm_disp_cpu_shares yes; db2 detach;
WLM Advanced Configuration for this Use Case
For the use case described above, once the steps above are completed, the commands below can be used to create the workloads, service classes, sub classes, work action set, work class set and thresholds such that the ‘db2jcc’ application is given 60% of the resources and the ‘GreenTea’ application is given 40% of the resources.
SET WORKLOAD TO SYSDEFAULTADMWORKLOAD; CREATE SERVICE CLASS "GREENTEA" SOFT CPU SHARES 40000 DISABLE; CREATE SERVICE CLASS "DB2JCC" SOFT CPU SHARES 60000 DISABLE; CREATE SERVICE CLASS "COMPLEX_GREENTEA " UNDER "GREENTEA" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; CREATE SERVICE CLASS "TRIVIAL_GREENTEA " UNDER "GREENTEA" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; CREATE SERVICE CLASS "COMPLEX_DB2JCC" UNDER "DB2JCC" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; CREATE SERVICE CLASS "MEDIUM_DB2JCC" UNDER "DB2JCC" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; CREATE SERVICE CLASS "SIMPLE_DB2JCC" UNDER "DB2JCC" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; CREATE SERVICE CLASS "TRIVIAL_DB2JCC" UNDER "DB2JCC" COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITHOUT DETAILS COLLECT AGGREGATE ACTIVITY DATA EXTENDED DISABLE; --------------------------------------------- --DDL Statements for Work Class Sets ----------------------------------------------- CREATE WORK CLASS SET "GREENTEA_WCS" ( WORK CLASS "TRIVIAL_COST_GREENTEA" WORK TYPE ALL FOR TIMERONCOST FROM 0 TO 150000, WORK CLASS "COMPLEX_COST_GREENTEA" WORK TYPE ALL FOR TIMERONCOST FROM 150000.0 TO UNBOUNDED ); CREATE WORK CLASS SET "DB2JCC_WCS" ( WORK CLASS "TRIVIAL_COST_DB2JCC" WORK TYPE ALL FOR TIMERONCOST FROM 0 TO 150000, WORK CLASS "SIMPLE_COST_DB2JCC" WORK TYPE ALL FOR TIMERONCOST FROM 150000 TO 1000000.0, WORK CLASS "MEDIUM_COST_DB2JCC" WORK TYPE ALL FOR TIMERONCOST FROM 1000000.0 TO 6000000.0, WORK CLASS "COMPLEX_COST_DB2JCC" WORK TYPE ALL FOR TIMERONCOST FROM 6000000.0 TO UNBOUNDED ); --------------------------------------------- -- DDL Statements for Work Action Sets ----------------------------------------------- CREATE WORK ACTION SET "GREENTEA_WAS" FOR SERVICE CLASS "GREENTEA" USING WORK CLASS SET "GREENTEA_WCS" ( -- Direct trivial cost to the Trivial subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_TRIVIAL_GREENTEA" ON WORK CLASS "TRIVIAL_COST_GREENTEA" MAP ACTIVITY WITHOUT NESTED TO "TRIVIAL_GREENTEA", -- Direct complex cost to the Complex subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_COMPLEX_GREENTEA" ON WORK CLASS "COMPLEX_COST_GREENTEA" MAP ACTIVITY WITHOUT NESTED TO "COMPLEX_GREENTEA" ) DISABLE; CREATE WORK ACTION SET "DB2JCC_WAS" FOR SERVICE CLASS "DB2JCC" USING WORK CLASS SET "DB2JCC_WCS" ( -- Direct trivial cost to the Trivial subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_TRIVIAL_DB2JCC" ON WORK CLASS "TRIVIAL_COST_DB2JCC" MAP ACTIVITY WITHOUT NESTED TO "TRIVIAL_DB2JCC", -- Direct simple cost to the Simple subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_SIMPLE_DB2JCC" ON WORK CLASS "SIMPLE_COST_DB2JCC" MAP ACTIVITY WITHOUT NESTED TO "SIMPLE_DB2JCC", -- Direct medium cost to the Medium subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_MEDIUM_DB2JCC" ON WORK CLASS "MEDIUM_COST_DB2JCC" MAP ACTIVITY WITHOUT NESTED TO "MEDIUM_DB2JCC", -- Direct complex cost to the Complex subclass and allow any nested SQL -- statement to be re-evaluated WORK ACTION "MAP_COMPLEX_DB2JCC" ON WORK CLASS "COMPLEX_COST_DB2JCC" MAP ACTIVITY WITHOUT NESTED TO "COMPLEX_DB2JCC" ) DISABLE; --------------------------------------------- -- DDL Statements for Thresholds ----------------------------------------------- -- Create disabled activity concurrency threshold for complex subclass CREATE THRESHOLD "COMPLEX_GREENTEA_CONCURRENCY" FOR SERVICE CLASS "COMPLEX_GREENTEA" UNDER "GREENTEA" ACTIVITIES ENFORCEMENT DATABASE DISABLE WHEN CONCURRENTDBCOORDACTIVITIES > 1 AND QUEUEDACTIVITIES UNBOUNDED CONTINUE; -- Create disabled activity concurrency threshold for simple subclass CREATE THRESHOLD "SIMPLE_DB2JCC_CONCURRENCY" FOR SERVICE CLASS "SIMPLE_DB2JCC" UNDER "DB2JCC" ACTIVITIES ENFORCEMENT DATABASE DISABLE WHEN CONCURRENTDBCOORDACTIVITIES > 4 AND QUEUEDACTIVITIES UNBOUNDED CONTINUE; -- Create disabled activity concurrency threshold for medium subclass CREATE THRESHOLD "MEDIUM_DB2JCC_CONCURRENCY" FOR SERVICE CLASS "MEDIUM_DB2JCC" UNDER "DB2JCC" ACTIVITIES ENFORCEMENT DATABASE DISABLE WHEN CONCURRENTDBCOORDACTIVITIES > 1 AND QUEUEDACTIVITIES UNBOUNDED CONTINUE; -- Create disabled activity concurrency threshold for complex subclass CREATE THRESHOLD "COMPLEX_DB2JCC_CONCURRENCY" FOR SERVICE CLASS "COMPLEX_DB2JCC" UNDER "DB2JCC" ACTIVITIES ENFORCEMENT DATABASE DISABLE WHEN CONCURRENTDBCOORDACTIVITIES > 1 AND QUEUEDACTIVITIES UNBOUNDED CONTINUE; ALTER SERVICE CLASS "GREENTEA" ENABLE; ALTER SERVICE CLASS "COMPLEX_GREENTEA" UNDER GREENTEA ENABLE; ALTER SERVICE CLASS "DB2JCC" ENABLE; ALTER SERVICE CLASS "COMPLEX_DB2JCC" UNDER "DB2JCC" ENABLE; ALTER SERVICE CLASS "MEDIUM_DB2JCC" UNDER "DB2JCC" ENABLE; ALTER SERVICE CLASS "SIMPLE_DB2JCC" UNDER "DB2JCC" ENABLE; ALTER SERVICE CLASS "TRIVIAL_DB2JCC" UNDER "DB2JCC" ENABLE; ALTER WORK ACTION SET "DB2JCC_WAS" ENABLE; ALTER WORK ACTION SET "GREENTEA_WAS" ENABLE; ALTER THRESHOLD "COMPLEX_GREENTEA_CONCURRENCY" ENABLE; ALTER THRESHOLD "SIMPLE_DB2JCC_CONCURRENCY" ENABLE; ALTER THRESHOLD "MEDIUM_DB2JCC_CONCURRENCY" ENABLE; ALTER THRESHOLD "COMPLEX_DB2JCC_CONCURRENCY" ENABLE; -------------------------------------- --DDL Statements for Workload -------------------------------------- CREATE WORKLOAD "GREENTEA_WORKLOAD" APPLNAME('GreenTea*') SERVICE CLASS "GREENTEA" POSITION AT 1 COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITH DETAILS COLLECT AGGREGATE ACTIVITY DATA BASE COLLECT ACTIVITY METRICS NONE; CREATE WORKLOAD "DB2JCC_WORKLOAD" APPLNAME('db2jcc*') SERVICE CLASS "DB2JCC" POSITION AT 2 COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITH DETAILS COLLECT AGGREGATE ACTIVITY DATA BASE COLLECT ACTIVITY METRICS NONE; SET WORKLOAD TO AUTOMATIC ;
The following diagram, displays the newly created service classes, work class sets, work action sets and thresholds for the db2jcc workload and the GreenTea workload:
Conclusion
This blog demonstrated how to setup an advanced workload management configuration that involved execution of two workloads on the system. One workload was given 40% of the CPU and memory resources and the second workload was given 60% of the resources. After setting up this advanced WLM configuration, continue to monitor the system especially if new work is being added to the workloads.