I’ve had a number of inquiries recently from users wanting to know how well utilised their MQ Appliance is and how much room they have for additional workload or QM migration.
The questions and answers below explain how to monitor resource utilisation on the MQ Appliance and help determine how much capacity it has for additional workload.
How can I tell how much of the MQ Appliance is being used?
There are a number of statistics exposed through both the MQ Console and the CLI (MQCLI and DP) that can provide data to help establish resource utilisation in your appliance.
All of the data available through the MQ Console graphing interface is also available using the amqsrua command on the MQCLI. There are also additional mqcli commands like status and status <QM> that provide dynamic information.
The amqsrua code is also provided as a sample and can be used remotely to monitor the MQ appliance.
In general, if you are running non-persistent (NP) workload, you probably want to monitor CPU and network utilisation. If you are running HA workload, you are probably more interested in the IO utilisation and the network bandwidth in use by your HA replication link.
How can I see how much CPU the MQ appliance is using?
The first resource commonly monitored is CPU. Overall CPU utilisation can be obtained via the status command on the CLI, and the specific User and System CPU figures can be obtained through the Console. The load averages for the previous 1, 5 and 15 minute intervals are also provided. The M2001A MQ appliance contains 20 cores, so a load average of 20 corresponds to the machine being 100% utilised. A load average of 10 corresponds to the appliance being 50% loaded, and a load average of 40 corresponds to the machine being asked to do twice as much work as it can process.
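As a quick illustration of the rule of thumb above, the following minimal sketch converts a load average into a percentage of the available cores. The function name is my own, and the default of 20 cores is taken from the M2001A figure quoted in the text; adjust it for other models.

```python
def load_to_utilisation(load_average: float, cores: int = 20) -> float:
    """Express a load average as a percentage of the available cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return 100.0 * load_average / cores


# The three cases described in the text for a 20 core appliance.
for load in (10, 20, 40):
    print(f"load average {load}: {load_to_utilisation(load):.0f}% of capacity")
```

A result above 100% means more work is being requested than the cores can process.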
What information does the status cli command provide?
mqa(mqcli)# status
Memory: 16297MB used, 189.1GB total [8%]
CPU: 75%
CPU load: 6.62, 6.19, 5.07
Internal disk: 786432MB allocated, 2979.5GB total [26%]
System volume: 5270MB used, 14.7GB allocated [35%]
MQ errors file system: 175MB used, 2 FDCs, 15.8GB allocated [1%]
MQ trace file system: 3034MB used, 31.5GB allocated [9%]

mqa(mqcli)# status PERF0
QM(PERF0) Status(Running)
CPU: 7.29%
Memory: 209MB
Queue manager file system: 1230MB used, 63.0GB allocated [2%]
In the data above you can see the appliance was 75% CPU utilised, of which the Queue Manager (PERF0) was using approximately 7%.
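If you capture the status output to a file, a small script can pull out the figures for trending. The sketch below is only an illustration: it assumes the output arrives with one "name: value" field per line, as in the sample text embedded in the code, and the function names are hypothetical rather than part of any MQ tooling.

```python
import re
from typing import Dict, Optional

# Sample text copied from the status output shown in this article.
SAMPLE = """\
mqa(mqcli)# status
Memory: 16297MB used, 189.1GB total [8%]
CPU: 75%
CPU load: 6.62, 6.19, 5.07
Internal disk: 786432MB allocated, 2979.5GB total [26%]
System volume: 5270MB used, 14.7GB allocated [35%]
MQ errors file system: 175MB used, 2 FDCs, 15.8GB allocated [1%]
MQ trace file system: 3034MB used, 31.5GB allocated [9%]
"""


def parse_status(text: str) -> Dict[str, str]:
    """Split each 'name: value' line into a dictionary entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():  # skips the prompt line, which has no colon
            fields[key.strip()] = value.strip()
    return fields


def percent_used(value: str) -> Optional[float]:
    """Return the bracketed percentage, for example 35.0 from '[35%]'."""
    match = re.search(r"\[(\d+(?:\.\d+)?)%\]", value)
    return float(match.group(1)) if match else None


status = parse_status(SAMPLE)
print(status["CPU"], percent_used(status["System volume"]))
```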
How can I see how much memory the MQ appliance is using?
Memory utilisation can also be obtained in a similar way to CPU with both System wide and Queue Manager specific totals being available within the Console/amqsrua.
How can I see how much filesystem/disk space is available?
The status command will provide information on the System volume, the Trace and Errors file systems and how much of the internal disk subsystem is available for allocation to new Queue Managers. The status <QM> command additionally reports how much of that Queue Manager's file system is in use.
How can I see how much data is flowing through the network interfaces on the appliance?
There is a capability in the appliance CLI to enable network statistics and view the bandwidth flowing into each of the interfaces configured on the appliance:
mqa# config
Global configuration mode
mqa(config)# statistics
Statistics enabled
mqa(config)# show receive-kbps
mqa# show receive-kbps
Interface type Interface name 10 sec 1 min 10 min 1 hour 1 day
-------------- -------------- ------ ----- ------ ------ -----
Other          ip6tnl0             0     0      0      0     0
Other          lo                  1    20     78     78    78
Other          sit0                0     0      0      0     0
Ethernet       eth10               0     0      0      0     0
Ethernet       eth11               0     0      0      0     0
Ethernet       eth12               0     0      0      0     0
Ethernet       eth13               6     2    111    111   111
Ethernet       eth14               0     0      0      0     0
Ethernet       eth15               0     0      0      0     0
Ethernet       eth16               0     0      0      0     0
Ethernet       eth17               0     0    104    104   104
Ethernet       eth20               0     0      0      0     0
Ethernet       eth21            8107  7967   5031   5031  5031
Ethernet       eth22           26279 25845  16030  16030 16030
Ethernet       eth23               1     0      0      0     0
Ethernet       mgt0                3     8     11     11    11
Ethernet       mgt1                0     0      0      0     0
You can see that workload data is arriving at eth22 (10Gb network) at approximately 26,000 Kbps, which is about 3.2MB/s. There is also data being received from the second HA appliance on the replication interface (eth21) and on the HA group primary and alternate interfaces (eth13 and eth17).
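The conversion from the receive-kbps figure to megabytes per second is simple arithmetic, sketched below. I am assuming the column is in kilobits per second, as its name suggests. The 3.2MB/s quoted above corresponds to 1024 KB per MB; with decimal units (1000) the same reading comes out closer to 3.3MB/s. The function names are illustrative only.

```python
def kbps_to_mbytes_per_sec(kbps: float, kb_per_mb: int = 1024) -> float:
    """Convert kilobits/sec to megabytes/sec (8 bits per byte)."""
    return kbps / 8 / kb_per_mb


def link_utilisation_percent(kbps: float, link_gbps: float = 10) -> float:
    """Express a kilobits/sec reading as a percentage of the link speed."""
    return 100.0 * kbps / (link_gbps * 1_000_000)


reading = 26279  # eth22, 10 second column
print(f"{kbps_to_mbytes_per_sec(reading):.1f} MB/s")
print(f"{link_utilisation_percent(reading):.2f}% of a 10Gb link")
```

In this capture the workload interface is using well under 1% of a 10Gb link, so the network is nowhere near being the constraint.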
To disable the statistics collection enter the following:
mqa# config
Global configuration mode
mqa(config)# no statistics
Statistics disabled
How can I monitor the IO subsystem of the appliance?
There are a number of statistics exposed via the console and amqsrua which relate to how the MQ logger is performing. The most relevant are the average write latency (in microseconds) and the physical/logical bytes written in the reporting interval (along with the calculated rate).
mqa(mqcli)# amqsrua -m PERF0 -c DISK -t Log
Publication received PutDate:20170622 PutTime:14063084 Interval:10.000 seconds
Log - bytes in use 1073741824
Log - bytes max 1207959552
Log file system - bytes in use 1274994688
Log file system - bytes max 16910295040
Log - physical bytes written 311582720 31157263/sec
Log - logical bytes written 210666990 21066016/sec
Log - write latency 132 uSec
Log - current primary space in use 21.58%
Log - workload primary space utilization 42.88%
Log - write size 12280
Totalling the physical write bytes/sec across your active QMs will give you the amount of data that is flowing through the IO subsystem. It's difficult to state at which point you will start to notice IO saturation, as IO performance will vary depending on a number of factors:
- Number of active QM
- Log configuration of those active QM
- Messaging rate across active QM
- Level of concurrency (numbers of threads driving that workload)
- Client separation distance, network configuration and bandwidth
- HA separation distance, HA network configuration and bandwidth
- DR separation distance, DR network configuration and bandwidth
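To illustrate the totalling step described above, here is a minimal sketch that extracts the per-second figure from the "Log - physical bytes written" line of each queue manager's amqsrua output and adds them up. The PERF0 line is copied from the output in this article; PERF1 is a hypothetical idle queue manager added purely for illustration, and I am assuming that a line without a /sec figure means nothing was written in the interval.

```python
import re
from typing import Mapping

# Matches e.g. "Log - physical bytes written 311582720 31157263/sec"
RATE_RE = re.compile(r"Log - physical bytes written\s+\d+\s+(\d+)/sec")


def physical_write_rate(amqsrua_output: str) -> int:
    """Return the bytes/sec figure, or 0 if no rate was reported."""
    match = RATE_RE.search(amqsrua_output)
    return int(match.group(1)) if match else 0


def total_write_mb_per_sec(outputs: Mapping[str, str]) -> float:
    """Sum the physical write rate over all queue managers, in MB/s."""
    total = sum(physical_write_rate(text) for text in outputs.values())
    return total / 1_000_000


outputs = {
    "PERF0": "Log - physical bytes written 311582720 31157263/sec",
    "PERF1": "Log - physical bytes written 0",  # hypothetical idle QM
}
print(f"{total_write_mb_per_sec(outputs):.1f} MB/s to the IO subsystem")
```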
In lab testing, we have driven the IO subsystem to in excess of 500MB/s utilising MQ client workload, but not all environments would be able to achieve this rate. To offer some points of comparison: in non-HA testing with 10 QMs, 300 clients at a 2K message size in a requester-responder scenario will drive ~300MB/s to the IO subsystem. A 20K message size would drive over 500MB/s.
In HA (direct connected) testing with 10 QMs, 300 clients at a 2K message size in a requester-responder scenario will drive ~80MB/s to the IO subsystem. A 20K message size would drive over 200MB/s. Increasing the load to a total of 5000 clients will double the amount of data that is written.
How can I see how many messages are being processed by the appliance?
Using the console or amqsrua, you can monitor statistics at the QM level or on a particular queue:
mqa(mqcli)# amqsrua -m PERF0
Enter Class selection ==> STATMQI
Enter Type selection ==> PUT
Publication received PutDate:20170622 PutTime:13450725 Interval:1 minutes,46.295 seconds
Interval total MQPUT/MQPUT1 count 94614 890/sec
Interval total MQPUT/MQPUT1 byte count 193750280 1822749/sec
Non-persistent message MQPUT count 22
Persistent message MQPUT count 94592 890/sec
Failed MQPUT count 0
Non-persistent message MQPUT1 count 0
Persistent message MQPUT1 count 0
Failed MQPUT1 count 0
Put non-persistent messages - byte count 25864 243/sec
Put persistent messages - byte count 193724416 1822506/sec
MQSTAT count 0

mqa(mqcli)# amqsrua -m PERF0
Enter Class selection ==> STATQ
Enter Type selection ==> PUT
An object name is required for Class(STATQ) Type(PUT)
Enter object name ==> REQUEST1
Publication received PutDate:20170622 PutTime:13473663 Interval:23 minutes,40.909 seconds
REQUEST1 MQPUT/MQPUT1 count 743965 524/sec
REQUEST1 MQPUT byte count 1523640320 1072299/sec
REQUEST1 MQPUT non-persistent message count 0
REQUEST1 MQPUT persistent message count 743965 524/sec
REQUEST1 MQPUT1 non-persistent message count 0
REQUEST1 MQPUT1 persistent message count 0
REQUEST1 non-persistent byte count 0
REQUEST1 persistent byte count 1523640320 1072299/sec
REQUEST1 lock contention 0.19%
REQUEST1 queue avoided puts 0.00%
REQUEST1 queue avoided bytes 0.00%
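The per-second rates that amqsrua prints are the interval counts divided by the interval length, and the same numbers also give you the average message size. The sketch below reproduces that arithmetic from the first publication above. It assumes the Interval field only ever appears in the two forms seen in this article ("N minutes,S seconds" or "S seconds"); the function name is my own.

```python
import re


def interval_seconds(text: str) -> float:
    """Convert the Interval field of a publication header to seconds."""
    match = re.search(r"Interval:(?:(\d+) minutes,)?([\d.]+) seconds", text)
    if match is None:
        raise ValueError("no Interval field found")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


header = ("Publication received PutDate:20170622 PutTime:13450725 "
          "Interval:1 minutes,46.295 seconds")
put_count, put_bytes = 94614, 193750280  # from the STATMQI PUT output

seconds = interval_seconds(header)
print(f"{put_count / seconds:.0f} puts/sec")            # 890
print(f"{put_bytes / put_count:.0f} bytes per message")  # 2048
```

The average of roughly 2048 bytes per put shows the workload in this capture was using 2K messages.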
How do I view the monitoring system resource usage in the MQ Console?
In the MQ Console, select Add Widget and then select Chart. Select the cog wheel in the upper right corner of the new chart, choose the resource class/type/element you want to monitor, and select Save.
The full list of available monitoring statistics is available here:
How do I use the amqsrua utility?
The amqsrua utility can either run in interactive mode where the user will be prompted to enter the Class/Type/Element required to monitor, or these can be specified on the command line e.g. amqsrua -m PERF0 -c STATQ -t PUT -o REQUEST1
mqa(mqcli)# amqsrua -m PERF0
CPU : Platform central processing units
DISK : Platform persistent data stores
STATMQI : API usage statistics
STATQ : API per-queue usage statistics
Enter Class selection ==> CPU
SystemSummary : CPU performance - platform wide
QMgrSummary : CPU performance - running queue manager
Enter Type selection ==> SystemSummary
Publication received PutDate:20170622 PutTime:10504528 Interval:5.851 seconds
User CPU time percentage 36.40%
System CPU time percentage 35.35%
CPU load - one minute average 6.31
CPU load - five minute average 5.91
CPU load - fifteen minute average 4.89
RAM free percentage 91.58%
RAM total bytes 193547MB

mqa(mqcli)# amqsrua -m PERF0 -c STATQ -t PUT -o REQUEST1
Publication received PutDate:20170622 PutTime:13524324 Interval:28 minutes,47.524 seconds
REQUEST1 MQPUT/MQPUT1 count 779064 451/sec
REQUEST1 MQPUT byte count 1595523072 923589/sec
...
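The SystemSummary figures can be combined to give totals comparable with the status command. The sketch below adds User and System CPU and derives memory in use from the free percentage. The helper names are hypothetical, and because the amqsrua and status samples in this article were captured separately, close agreement between them should not be assumed in general.

```python
def ram_used_mb(total_mb: float, free_percent: float) -> float:
    """Derive memory in use from the total and the free percentage."""
    return total_mb * (100.0 - free_percent) / 100.0


def total_cpu_percent(user_percent: float, system_percent: float) -> float:
    """Overall CPU busy time is the sum of user and system time."""
    return user_percent + system_percent


# Values taken from the SystemSummary publication above.
print(f"{ram_used_mb(193547, 91.58):.0f}MB used")            # 16297MB used
print(f"{total_cpu_percent(36.40, 35.35):.2f}% CPU in use")  # 71.75% CPU in use
```

In this case the derived 16297MB matches the Memory figure reported by the status command earlier in the article.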