Tutorial

Troubleshooting guide for MQ queue managers on AWS

A cheat sheet for developers for debugging IBM MQ queue managers on AWS

By

Soheel Chughtai

You have deployed your MQ app and queue manager to AWS, and something isn’t quite right. You’ve read the IBM MQ cheat sheet, and you’ve come to the point where you need to view the logs. The problem is this: How do you get to the logs on AWS? This troubleshooting guide will show you how.

Using this guide, you find the logs. Reading through the logs, you realize that you need to check the configuration using MQSC commands. To issue MQSC commands, you need to access the queue manager container, but you’re not sure how you do that. This troubleshooting guide will show you how to access the container, but the good news is that you don’t need to access the container to run MQSC commands. This guide will show you how to run MQSC commands from outside the container.

Using this guide, you access the queue manager container, run some MQSC commands, and then discover that you need to enable TLS. You read this tutorial, but you aren’t sure how you enable TLS for an MQ queue manager running in a container in AWS. This guide will show you how.

Using this guide, you have TLS enabled on your queue manager, but every now and again an application isn’t configured correctly to use TLS. You want to be alerted whenever a connecting application experiences TLS configuration errors. This guide introduces you to CloudWatch, which you can use to monitor your logs for errors.

In this guide, we will show you how to:

  • Get to the queue manager logs:
    • Logs that are external to the queue manager in CloudWatch
    • Logs that are internal to the queue manager (you need to access the container to do this)
  • Access the queue manager container using ECS Exec:
    • To be able to get to the logs internally
    • To be able to run administrator tasks internally using runmqsc
  • Run administration MQSC tasks from outside the container
  • Enable TLS for an MQ queue manager on AWS
  • Monitor the health of your queue manager using CloudWatch

Viewing the queue manager logs

The IBM MQ cheat sheet tells you that the queue manager logs can be found in this directory: /var/mqm/qmgrs/<your_queue_manager_name>/errors/.

But how do you get to that location when the queue manager is in a container running in AWS cloud? There are two ways. The first is external to the container using Amazon CloudWatch. The second is inside the container.

Let’s start with the logs that are available outside of the queue manager container.

Viewing queue manager logs using Amazon CloudWatch

When a queue manager runs in a container, the queue manager logs are formatted and rerouted to the container’s console logs. So, if you can find the container console logs, then you have access to the queue manager’s logs.

If your application is misbehaving, then you might need to shut it down. If you do shut down the application, you will lose the logs unless you create a log group to retain the messages.

A log group is created when you start your application services by running this command:

docker --context=mq-server compose -f docker-compose.yaml up

When a compose app (a set of services) is deployed, various AWS components are created, including a CloudWatch log group.

On AWS, the Docker Compose instructions are used to construct an AWS CloudFormation stack. The CloudFormation stack describes all of the AWS services that will be needed, and CloudFormation provisions and configures these services.

The CloudFormation stack creates a CloudWatch log group and log stream for you, which is why you had to grant access to the CloudWatch logs as part of the prerequisites of the “Get an IBM queue for development running on AWS” tutorial. The policy you used can be found in the mq-dev-patterns repo.

Amazon CloudWatch is a monitoring and management service for applications and services running on AWS. CloudWatch offers monitoring, dashboards, alarms, logs, and events.

Amazon CloudWatch is not just for logging, but we will come back to that.

The Docker Compose awslogs logging driver sends container logs to CloudWatch. The log entries can be retrieved through the command line.
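As a minimal sketch of retrieving those entries from the command line, the helpers below wrap two AWS CLI calls: aws logs describe-log-groups to list the log group names, and aws logs tail (available in AWS CLI version 2) to print recent events. The function names and the default one-hour window are illustrative choices, not part of the original tutorial.

```shell
# List the names of the CloudWatch log groups in the current region.
list_log_groups() {
  aws logs describe-log-groups \
    --query 'logGroups[].logGroupName' --output text
}

# Print recent events from one log group.
#   $1 = log group name, $2 = how far back to look (default: 1h)
tail_log_group() {
  aws logs tail "$1" --since "${2:-1h}" --format short
}
```

For example, run list_log_groups to find the name of your log group, and then pass that name to tail_log_group, optionally followed by a window such as 30m.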

If the app is misbehaving, you might need to shut it down while you debug it. You might be tempted to run this command:

docker --context=mq-server compose -f docker-compose.yaml down

However, the log group will be deleted when you shut down the app.

There is a way of retaining the logs so you can inspect them and work out why the app was misbehaving. It’s done by creating, and then referring to, a log group outside of the Docker Compose controlled CloudFormation stack.

Screen capture of Create log group in Amazon CloudWatch

In the figure above, we are creating a CloudWatch log group called “MQAppLogs”. After the log group is created, we can ask AWS CloudFormation, using Docker Compose, to use the log group for logging by adding the following snippet to the docker-compose.yaml file.

    logging:
      driver: awslogs
      options:
        awslogs-region: eu-west-2
        awslogs-group: MQAppLogs

You can find the unedited docker-compose.yaml file, the one before you apply the change, in the mq-dev-patterns repo.

See the Docker docs for details on all logging options.
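If you prefer the command line to the console for creating the log group, the following sketch uses the AWS CLI commands aws logs create-log-group and aws logs put-retention-policy. The function names are hypothetical, and setting a retention period is an optional extra that the original steps do not include.

```shell
# Create the log group that docker-compose.yaml refers to.
#   $1 = log group name, $2 = AWS region
create_log_group() {
  aws logs create-log-group --log-group-name "$1" --region "$2"
}

# Optionally expire old log events after a number of days.
#   $1 = log group name, $2 = AWS region, $3 = retention in days
set_log_retention() {
  aws logs put-retention-policy --log-group-name "$1" --region "$2" \
    --retention-in-days "$3"
}
```

With the values used in this guide, you would run create_log_group MQAppLogs eu-west-2. CloudWatch accepts only certain retention values, such as 1, 7, or 30 days.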

To restart the services, and send the log streams to the MQAppLogs log group, run the following command:

docker --context=mq-server compose -f docker-compose.yaml up

If you see the following message, then you have either mistyped the log group name or you forgot to create it.

Screen capture of error message, TaskFailedToStart

If you got it right, the log group and log stream remain after you run the following command, so you can still inspect the MQ queue manager logs:

docker --context=mq-server compose -f docker-compose.yaml down

When you open the MQAppLogs log group in the Amazon CloudWatch console, you’ll see the log stream still exists.

Screen capture of MQAppLogs log group in Amazon CloudWatch console

Click a log stream to inspect the log messages:

Screen capture of a log stream in Amazon CloudWatch console

Because the log group lifecycle is no longer managed by Docker Compose, remember that the log group will remain until its expiry date or until you delete it.

Viewing queue manager logs inside the container

In the queue manager container, you find the logs in their standard directory: /var/mqm/qmgrs/<your_queue_manager_name>/errors/.

To access the queue manager container, follow the instructions in the next section, “Accessing the queue manager container using Amazon ECS Exec.”
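Once you have a shell in the container, a small helper like the sketch below can print the newest entries. It relies on the standard MQ error log file name, AMQERR01.LOG, which is the most recent of the rotating error logs. The function names and the MQ_DATA_ROOT variable are hypothetical, and the sketch assumes a simple queue manager name such as QM1 that MQ uses unchanged as the directory name.

```shell
# Root of the MQ data directory; /var/mqm inside the queue manager container.
MQ_DATA_ROOT="${MQ_DATA_ROOT:-/var/mqm}"

# Print the error log directory for a queue manager.
#   $1 = queue manager name
qm_errors_dir() {
  printf '%s/qmgrs/%s/errors\n' "$MQ_DATA_ROOT" "$1"
}

# Show the last lines of the newest error log, AMQERR01.LOG.
#   $1 = queue manager name, $2 = number of lines (default: 50)
tail_qm_errors() {
  tail -n "${2:-50}" "$(qm_errors_dir "$1")/AMQERR01.LOG"
}
```

For example, tail_qm_errors QM1 20 prints the last 20 lines of the current error log for QM1.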

Accessing the queue manager container using Amazon ECS Exec

You don’t need to access the queue manager container to run administration tasks or to view the queue manager logs, but if for some unforeseen reason you do, then you can. The container can be accessed by using Amazon ECS Exec, which is a way to run an interactive shell or a single command against a container.

The following steps update the permissions associated with the task role (TaskRole). If you update the task role, the docker compose down command does not remove the TaskRole. Everything else will be gone, but the AWS CloudFormation stack and the TaskRole remain. When you manually delete the CloudFormation stack, you will need to either remove the TaskRole manually or tell CloudFormation to ignore it.

Make sure that you install the prerequisites for using ECS Exec, which include the AWS CLI, the Session Manager plugin for the AWS CLI, and the IAM permissions for ECS Exec, set on the correct TaskRole. To find the role, search for "MqTaskRole"; it will be called something like “compose-MqTaskRole-1TN2MVC12VX8U”. All of these prerequisites are defined on the main page for "Using Amazon ECS Exec for debugging" in the Amazon ECS Developer Guide.

To be able to ECS Exec into the container, you need to enable the execute-command command for the associated AWS ECS task.

You need the following information to enable the execute-command command on an AWS ECS task:

  • The cluster name
  • The task definition name
  • The task name
  • The service name
  • The container name

The steps that follow show you how to gather this information and then how to enable the execute-command command.

The container name is mq, as stated in the docker-compose.yaml file.

Screen capture of mq section of docker-compose.yaml file

You can get a list of all clusters by running the following command:

aws ecs list-clusters

You will see something like this returned:

Screen capture of output of the list-clusters command

In this example, our cluster name is “compose.”

Now that we know the cluster name, we can list the tasks by running the following command:

aws ecs list-tasks --cluster compose

You should see something like this returned:

Screen capture of output of the list-tasks command

In this example, the task name is “6fdf58d493e34d1e906013683beabaca”.
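The list-tasks command returns full task ARNs, and the task name used in this guide is the part after the final slash. As a sketch, shell parameter expansion can extract it. The function names are hypothetical.

```shell
# Strip everything up to the last "/" from a task ARN, leaving the task ID.
task_id_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Print the ID of the first task in a cluster.
#   $1 = cluster name
first_task_id() {
  task_id_from_arn "$(aws ecs list-tasks --cluster "$1" \
    --query 'taskArns[0]' --output text)"
}
```

For example, first_task_id compose prints the ID of the first task in the compose cluster. If the cluster has no tasks, the AWS CLI prints None for the query, so check the result before you use it.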

Now that we have the cluster name and the task name, you can get details of the task by running the following command:

aws ecs describe-tasks --cluster compose --tasks 6fdf58d493e34d1e906013683beabaca

To determine if ECS Exec is enabled for this task, we can look for the enableExecuteCommand property using the following command:

aws ecs describe-tasks --cluster compose --tasks 6fdf58d493e34d1e906013683beabaca --query 'tasks[0].enableExecuteCommand'

You should see the following output: false. False means that the task is not configured to allow us to run ECS Exec.

So, to enable ECS Exec, you need the service name. To get the service name, run the following command:

aws ecs describe-tasks --cluster compose --tasks 6fdf58d493e34d1e906013683beabaca | grep -i service

You should see something like this returned:

Screen capture of the output of the describe-tasks command

In this example, the service name is “compose-MqService-d02E1DygtdNX”.
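As an alternative to grep, the sketch below reads the task's group field, which takes the form service:<service-name> for tasks that were started by a service, and strips the prefix. The function name is hypothetical.

```shell
# Print the service that started a task, using the task's "group" field,
# which has the form "service:<service-name>" for tasks started by a service.
#   $1 = cluster name, $2 = task ID
service_for_task() {
  group=$(aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query 'tasks[0].group' --output text)
  printf '%s\n' "${group#service:}"
}
```

For example, service_for_task compose 6fdf58d493e34d1e906013683beabaca should print the same service name that the grep output shows.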

To get the task definition name, run the following command:

aws ecs list-task-definitions

You should see something like this returned:

Screen capture of the output of the list-task-definitions command

The number of values returned will depend on your ECS usage, but for this exercise, the task definition we are looking for is “compose-mq”.

Now, we have all the details that we need to enable ECS Exec on the container:

  • The cluster name: compose
  • The task definition name: compose-mq
  • The task name: 6fdf58d493e34d1e906013683beabaca
  • The service name: compose-MqService-d02E1DygtdNX
  • The container name: mq

First, to enable ECS Exec on the container, we need to stop the running MQ container by setting the desired-count on the service to 0 using this command:

aws ecs update-service --cluster compose --task-definition compose-mq --service compose-MqService-d02E1DygtdNX --desired-count 0

Check that the task no longer appears in the list by running this command:

aws ecs list-tasks --cluster compose

You should see something like this returned:

Screen capture of the output of the list-tasks command, with no tasks showing

To restart the MQ task with ECS Exec enabled, run the following command:

aws ecs update-service --cluster compose --task-definition compose-mq --service compose-MqService-d02E1DygtdNX --desired-count 1 --enable-execute-command

Now, run the list-tasks command again:

aws ecs list-tasks --cluster compose

This command should now report a new task running:

Screen capture of the output of the list-tasks command, with our new task showing

In our case the new task was 9fca0a49b93b4d71842769a08d77c967.

The describe-tasks command with the query for the enableExecuteCommand property should now return true.

aws ecs describe-tasks --cluster compose --tasks 9fca0a49b93b4d71842769a08d77c967 --query 'tasks[0].enableExecuteCommand'
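The stop and restart steps above can be sketched as one function. It uses the same update-service calls, and adds aws ecs wait services-stable so that each step completes before the next one starts. The function name is hypothetical, and the wait calls are an addition that the original steps do not include.

```shell
# Restart a service's task with ECS Exec enabled.
#   $1 = cluster, $2 = service, $3 = task definition
enable_ecs_exec() {
  # Stop the running task.
  aws ecs update-service --cluster "$1" --service "$2" \
    --task-definition "$3" --desired-count 0 || return
  # Block until the service has settled with no running tasks.
  aws ecs wait services-stable --cluster "$1" --services "$2" || return
  # Start a new task that accepts execute-command sessions.
  aws ecs update-service --cluster "$1" --service "$2" \
    --task-definition "$3" --desired-count 1 --enable-execute-command || return
  # Block until the new task is running.
  aws ecs wait services-stable --cluster "$1" --services "$2"
}
```

With the values gathered in this guide, you would run enable_ecs_exec compose compose-MqService-d02E1DygtdNX compose-mq. The new task has a different ID, so run list-tasks again afterwards.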

You can now open a shell on the MQ container using the execute-command ECS Exec command:

aws ecs execute-command --cluster compose --task 9fca0a49b93b4d71842769a08d77c967 --container mq --interactive --command "/bin/bash"

You should see a prompt like this one, which means you are in the container:

Screen capture of the command shell from running ECS Exec command
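ECS Exec can also run a single command instead of opening a shell. The sketch below wraps the same execute-command call with the command passed as an argument. The function name is hypothetical. The --interactive flag is kept because ECS Exec runs commands through an interactive session.

```shell
# Run one command in the mq container through ECS Exec.
#   $1 = cluster, $2 = task ID, $3 = command to run
ecs_exec_mq() {
  aws ecs execute-command --cluster "$1" --task "$2" \
    --container mq --interactive --command "$3"
}
```

For example, ecs_exec_mq compose 9fca0a49b93b4d71842769a08d77c967 dspmq runs the MQ dspmq command, which lists the queue managers in the container and their status.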

If you see the following error, then you’ve most likely made an error in setting up the IAM permissions and policy:

Screen capture of error from running the ECS Exec command

Make sure that you have a policy that looks like this in the AWS IAM console:

Screen capture of the policy configuration in the IAM console

And, make sure that you have attached the policy to the task role that was created by Docker Compose, as shown in the following screen capture.

Screen capture of the TaskRole in the IAM console

After you have corrected the IAM permissions and policy, you will need to allow several minutes for the updated role and policy to become effective.
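The policy in the screen capture is not reproduced here. As a sketch, the following helpers write a policy document that contains the four ssmmessages actions that the Amazon ECS documentation lists for ECS Exec, and attach it to a role as an inline policy with aws iam put-role-policy. The function names are hypothetical, and attaching an inline policy is an assumption; the policy in this guide might have been created and attached differently in the console.

```shell
# Write an IAM policy document granting the Session Manager channel
# permissions that ECS Exec needs.
#   $1 = output file
write_ecs_exec_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach the document to a task role as an inline policy.
#   $1 = role name, $2 = policy name, $3 = policy file
attach_ecs_exec_policy() {
  aws iam put-role-policy --role-name "$1" --policy-name "$2" \
    --policy-document "file://$3"
}
```

Pass the full name of your own task role, the one that contains MqTaskRole, as the first argument to attach_ecs_exec_policy.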

Run queue manager administration tasks using runmqsc

If all you want to do is run MQSC administration tasks using runmqsc, then you don’t need to connect to the queue manager container. Instead, you need either a redistributable client or, if you are running on macOS, the macOS toolkit.

After you’ve installed one of these clients, you can use the runmqsc command to run MQSC administration commands against your queue manager, which is running in a container on AWS cloud.

For help on the expected parameters, run the runmqsc -help command or review the parameters in the MQ documentation.

You will need to set the environment variable MQSERVER to point at the AWS load balancer, which is acting as a reverse proxy to your queue manager, by running this command:

export MQSERVER='DEV.ADMIN.SVRCONN/TCP/compo-LoadB-OB8IK8NLN9E4-0c95efaa2c51a926.elb.eu-west-2.amazonaws.com(1414)'

Don’t forget the DEV.ADMIN.SVRCONN/ prefix!

Then, run the following command:

runmqsc -c -u admin QM1

After the prompt, enter your queue manager admin password.
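For scripted use, the sketch below builds the MQSERVER value and runs MQSC commands without an interactive prompt. It assumes the documented runmqsc behavior that, when -u is specified and standard input is redirected, the first line of input is read as the password. The function names are hypothetical.

```shell
# Build an MQSERVER value for the admin channel.
#   $1 = load balancer host name, $2 = port (default: 1414)
mqserver_value() {
  printf 'DEV.ADMIN.SVRCONN/TCP/%s(%s)\n' "$1" "${2:-1414}"
}

# Run MQSC commands without an interactive prompt. The password is sent
# as the first line of standard input, ahead of the MQSC commands.
#   $1 = queue manager, $2 = admin user, $3 = password, $4 = MQSC commands
run_mqsc() {
  printf '%s\n' "$3" "$4" | runmqsc -c -u "$2" "$1"
}
```

For example, after exporting MQSERVER with the output of mqserver_value for your load balancer host name, run_mqsc QM1 admin followed by your password and 'DISPLAY CHSTATUS(*)' shows the status of the channels. Avoid typing the password directly on the command line, where it can end up in your shell history; read it from a variable or a protected file instead.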

If you do want to run the MQSC administration tasks inside the queue manager container, then you will first need to access the queue manager container. See the previous section, “Accessing the queue manager container using Amazon ECS Exec.”

Securing communication between MQ endpoints with TLS

To enable TLS on the container, you need a server certificate and a key. If you don’t have one from a certificate authority, you can create self-signed certificates by following the steps in the tutorial, “Secure communication between IBM MQ endpoints with TLS.”

The MQ queue manager container documentation states that the keys must be placed in the /etc/mqm/pki/keys/<Label> directory. However, the Docker docs state that Docker Compose secrets end up in the /run/secrets directory.

These clashing requirements do not mean that you cannot secure your MQ endpoints when deployed using Docker Compose. You can!

To enable TLS on your queue manager in AWS, you need to supply the relevant keys as secrets, and then move them from where Docker Compose puts them to where the IBM MQ container expects them. After that, you run the script to start MQ.

You can move the secrets in the docker-compose.yaml. First you need to supply the keys as secrets.

Screen capture of the secrets section of the docker-compose.yaml file

Next, you let the MQ service know about the secrets. This puts the secrets into /run/secrets/.

Screen capture of the secrets section in the mq section of the docker-compose.yaml file

Finally, you modify the container entry point to create the relevant MQ keys directory, copy the secrets, and then start MQ.

Screen capture of the entrypoint section in the docker-compose.yaml file

The modified docker-compose.yaml will look something like this:

Screen capture of modified docker-compose.yaml file in its entirety
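The copy step that the modified entry point performs can be sketched as a shell function. The secret file names tls.key and tls.crt, the label, and the function name are assumptions for illustration; use the names from your own docker-compose.yaml file.

```shell
# Copy a key and certificate from the Docker Compose secrets directory to
# the directory the MQ container reads keys from.
#   $1 = secrets directory (/run/secrets in the container)
#   $2 = keys root (/etc/mqm/pki/keys in the container)
#   $3 = label, used as the sub-directory name
stage_mq_keys() {
  mkdir -p "$2/$3" || return
  cp "$1/tls.key" "$1/tls.crt" "$2/$3/"
}
```

In the container, the entry point would call something like stage_mq_keys /run/secrets /etc/mqm/pki/keys mykey, and then run the script that the image normally uses to start MQ.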

Now that TLS has been enabled, any connecting clients, including the Messaging Playground app (which you learned about in the tutorial, “Build and deploy an IBM MQ app to AWS Cloud”), will need to specify a cipher suite and key repository, which you can learn how to do in the tutorial, “Secure communication between IBM MQ endpoints with TLS.”

TLS encrypts message data as it flows over the network. IBM MQ AMS (Advanced Message Security) provides a high level of protection for data as it flows through an IBM MQ network, including message signing and message encryption. For more details see the "Securing your application" section in the MQ messaging app coding challenge tutorial.

Monitoring your queue manager health using CloudWatch Monitoring

You can use Amazon CloudWatch to monitor your application. The queue manager that you deployed onto ECS comes with CloudWatch.

In addition to logging, CloudWatch allows you to monitor the services and track metrics.

Using CloudWatch, you can create dashboards to display metrics and alarms that are relevant to you, such as cost, CPU, or disk reads and writes. Alarms can be used to trigger start or terminate actions.

For example, you can create a log metric that raises an alarm when specific log events are detected.

Run some client apps with poorly configured TLS or cipher settings. These will show up as TLS errors in the logs.

Screen capture of TLS errors in the log file

In CloudWatch, select your log group and select Create metric filter from the Actions button.

Screen capture of the log group MQAppLogs in the Amazon CloudWatch console, with Actions button selected

Define a pattern that looks for specific error codes, such as ?AMQ9999E ?AMQ9665E.

Screen capture of the Define Pattern page in the Create metric wizard

Scroll down on the Define pattern page, and click Test pattern to test the metric filter against your log.

Screen capture of the Test Pattern page in the Create metric wizard

Click Next. Then, complete the metric details, including a namespace, name, value, and default value.

Screen capture of the Metric details page in the Create metric wizard

Click Next, and on the final screen of the wizard, click Create the metric filter.

Screen capture of the Review and Create page in the Create metric wizard
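The same metric filter can be created from the command line. The sketch below uses aws logs put-metric-filter with the pattern from the wizard and records a value of 1 for each matching log event, with a default of 0. The function name is hypothetical, and the filter name, namespace, and metric name are values that you choose.

```shell
# Create a metric filter that counts log events matching the TLS error codes.
#   $1 = log group, $2 = filter name, $3 = metric namespace, $4 = metric name
create_tls_error_filter() {
  aws logs put-metric-filter \
    --log-group-name "$1" \
    --filter-name "$2" \
    --filter-pattern '?AMQ9999E ?AMQ9665E' \
    --metric-transformations \
      "metricName=$4,metricNamespace=$3,metricValue=1,defaultValue=0"
}
```

In the pattern, each term that is prefixed with a question mark is optional, so a log event matches when it contains either error code.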

To make use of the newly created metric, go to All metrics.

Select your new metric to measure the occurrences. To create an alarm based on this metric, in the table of metrics, in the Actions column, click the bell button (Create alarm).

Screen capture of the Metrics page in the Amazon CloudWatch console

For the alarm, specify a metric name, statistic, and period.

Screen capture of the Create alarm wizard, specify metric and conditions page

Scroll down and specify the conditions.

Screen capture of the Create alarm wizard, specify metric and conditions page

Click Next. Then, you can use Amazon SNS (Simple Notification Service) to send an email alert. If this is a new topic, click Create topic. Then, click Add notification.

Screen capture of the Create alarm wizard, configure actions page

Click Next. Add a name and description for your alarm.

Screen capture of the Create alarm wizard, name and description page

Click Next, preview the alarm, and then click Create alarm.

Screen capture of the Create alarm wizard, preview page
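The alarm wizard steps can also be sketched with the AWS CLI, using aws sns create-topic, aws sns subscribe, and aws cloudwatch put-metric-alarm. The function names are hypothetical, and the statistic, period, threshold, and missing-data handling shown here are illustrative values, not the settings from the screen captures.

```shell
# Create an SNS topic and print its ARN.
#   $1 = topic name
create_alert_topic() {
  aws sns create-topic --name "$1" --query 'TopicArn' --output text
}

# Subscribe an email address to a topic; the recipient must confirm by email.
#   $1 = topic ARN, $2 = email address
subscribe_email() {
  aws sns subscribe --topic-arn "$1" --protocol email \
    --notification-endpoint "$2"
}

# Alarm when the metric records at least one matching event in a period.
#   $1 = alarm name, $2 = namespace, $3 = metric name, $4 = topic ARN
create_tls_error_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "$1" --namespace "$2" --metric-name "$3" \
    --statistic Sum --period 300 --evaluation-periods 1 \
    --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
    --treat-missing-data notBreaching \
    --alarm-actions "$4"
}
```

The namespace and metric name must match the ones that you gave the metric filter, otherwise the alarm watches a metric that never receives data.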

Summary

In this tutorial, you learned how to troubleshoot IBM MQ queue managers in AWS. Using this troubleshooting guide, you learned how to access the queue manager logs, how to use ECS Exec to access the queue manager container, how to run MQSC administrative tasks, how to enable TLS for a queue manager on AWS, and how to use Amazon CloudWatch to monitor the health of your queue manager using alerts for specific log entries.