IBM Support

How to cancel running queries in Big SQL - Hadoop Dev

Technical Blog Post


Abstract

How to cancel running queries in Big SQL - Hadoop Dev

Body

This article applies to BigInsights Big SQL V4.0 or later. The code and ideas were inspired by Big SQL customers and the prior work of Gustavo Arocena and Uday Kale.

Sometimes end users need to cancel their own queries. On Hadoop platforms, data exploration and ad hoc queries are the norm, and runaway or otherwise incorrectly written queries are common. In Big SQL, as in most DBMS systems, cancelling a query is a privileged action by default. This article describes a simple stored procedure that enables you to see the connections that you own and have the authority to interrupt.

First, we’ll create a stored procedure that returns a list of your active connections. Normally, this can be done through the command line processor (CLP) by running the LIST APPLICATIONS command, but this approach returns all connections, which can be messy, and some users do not have access to or are uncomfortable with the CLP. The goal is to get information about your connections with only a JDBC/ODBC connection, which opens up this query cancelling capability to any end-user tool. A second stored procedure to cancel specific queries embeds logic to check that the query being cancelled belongs to the user.

  • BIGSQL.show_my_apps() returns information about all connections that belong to the currently connected user ID.
  • BIGSQL.cancel_my_apps() enables a user to cancel a query that is associated with a particular application ID.

Note: This code is created to safely expose privileged functions for end users (not SYSADMIN, like in Big SQL). If you are a SYSADMIN who wants to cancel arbitrary workloads, use SYSHADOOP.CANCEL_APPLICATION directly or FORCE APPLICATIONS from the CLP.

Build the code

Run the following DDL statements to create the stored procedures with the Big SQL administrator ID (usually “bigsql”). The first line invokes the Big SQL command line processor with the option to specify the @ symbol as the statement delimiter. This is required because our stored procedures are multi-line statements that use semicolon delimiters internally.

  db2 -td@    CONNECT TO BIGSQL@    CREATE PROCEDURE bigsql.cancel_my_app(v_app_id VARCHAR(128))  BEGIN  DECLARE v_auth_id, v_error_msg varchar(128);  DECLARE v_app_handle int;       SELECT session_auth_id, application_handle INTO v_auth_id, v_app_handle       FROM SYSHADOOP.APPLICATIONS      WHERE application_id = v_app_id       LIMIT 1;             IF (session_user != v_auth_id) THEN       SIGNAL SQLSTATE '12345' set message_text = 'Permission denied - You can only cancel your own applications.';     END IF;       CALL SYSHADOOP.CANCEL_APPLICATION(v_app_handle);      END@    CREATE PROCEDURE bigsql.show_my_apps()  RESULT SETS 1  BEGIN            DECLARE c1 CURSOR WITH RETURN FOR      SELECT distinct session_auth_id, application_name, application_id, request_state        FROM SYSHADOOP.APPLICATIONS apps       WHERE apps.session_auth_id = session_user;            OPEN C1;        END@  

These stored procedures are created and owned by the Big SQL administrator ID, which is required when calling the administrative stored procedure SYSHADOOP.CANCEL_APPLICATION. Other users can’t call SYSHADOOP.CANCEL_APPLICATION, because this procedure can cancel any active running application.

The next step is to grant the EXECUTE privilege on these procedures to either users, groups, roles, or PUBLIC. Granting to PUBLIC makes sense, because users should be able to cancel any of their own work and the administrator shouldn’t have to manage permissions on these procedures again.

  GRANT EXECUTE ON PROCEDURE bigsql.show_my_apps TO PUBLIC@  GRANT EXECUTE ON PROCEDURE bigsql.cancel_my_app TO PUBLIC@  

If you ever want to delete these stored procedures, use the DROP statement, as shown in the following example:

  DROP PROCEDURE bigsql.show_my_apps@  DROP PROCEDURE bigsql.cancel_my_app@  

To stop the CLP, use the QUIT command:

  quit@  

Test the code

Suppose an unprivileged user “paul” has a long-running query. Using a different connection, because the first connection is tied up with the long-running query, paul calls the following procedure:

  CALL bigsql.show_my_apps()  

The procedure returns the following output:

    Result set 1    --------------      SESSION_AUTH_ID         APPLICATION_ID           REQUEST_STATE    ----------------------- ------------------------ --------------------------------    PAUL                    *N0.bigsql.160423152734  TRANSIENT    PAUL                    *N0.bigsql.160423203659  UOWEXEC    PAUL                    *N0.bigsql.160423143359  UOWWAIT  

The REQUEST_STATE field shows that application *N0.bigsql.160423203659 is in “unit of work executing state” (UOWEXEC); it is an in-flight query. A state of UOWWAIT indicates that the connection is idle (Big SQL is waiting for a query to be submitted). UOWWAIT is often misinterpreted as a lock wait condition, or something similar, but it’s a normal and healthy state.

Although there might be thousands of connections belonging to various users, the show_my_apps() stored procedure lists only the connections that belong to user paul.

To cancel the in-flight query, paul calls the following procedure:

  CALL bigsql.cancel_my_app('*N0.bigsql.160423203659')  

The parameter should be wrapped in single quotation marks and include the leading star (*), if applicable, because it is actually part of the application connection name.

Execution is asynchronous and might require a few seconds to complete. The interrupted query receives the following error:

  SQL0952N  Processing was cancelled due to an interrupt.  SQLSTATE=57014  

If you try to cancel a query that you don’t own, the stored procedure returns the following error:

  SQL0438N  Application raised error or warning with diagnostic text:  "Permission denied - You can only cancel your own applications.". SQLSTATE=12345  

If this article helped you, please leave a comment!

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm16259955