The time taken to restart a queue manager that was processing transactional messages depends on a number of factors, including the performance of the underlying file system hosting the recovery log and the depth of the queues.
A paper showing some examples of queue manager restart times in different situations is now available on our GitHub performance page here.
Measurements include cases where V9.1.1 optimisations dramatically improve restart times, for example:
- Reduced I/O Operations During Restart
We have reduced the number of I/O requests required to access the recovery log during restart (particularly beneficial when using NFS for multi-instance queue manager support).
The chart above shows the elapsed time taken to recover a queue manager that was killed abruptly while a requester/responder style workload was running at a rate of 2,500 round trips per second. The recovery log was hosted first on a local SAN, then on NFS over a local 10Gb link with increasing levels of latency:
For higher latency file systems, recovery time in this test was dramatically reduced in V9.1.1 (orange bars) compared with V9.1.0 (blue bars).
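To see why fewer I/O requests matter most on high-latency file systems, a simple back-of-the-envelope model helps: if each recovery-log read is a synchronous request, restart time is roughly the number of I/O operations multiplied by the per-operation latency. The function and all the numbers below are hypothetical, chosen only to illustrate the shape of the effect, not taken from the paper:

```python
# Illustrative model (not MQ internals): restart time dominated by
# synchronous recovery-log I/O over a remote file system.
def restart_time_s(io_ops, latency_ms, per_op_work_ms=0.05):
    """Estimated restart seconds: each log I/O pays the round-trip latency
    plus a small fixed processing cost (both values are assumptions)."""
    return io_ops * (latency_ms + per_op_work_ms) / 1000.0

# Hypothetical reduction from 200,000 to 20,000 log I/O requests.
for latency_ms in (0.2, 1.0, 5.0):  # round-trip latency to the NFS server
    before = restart_time_s(200_000, latency_ms)
    after = restart_time_s(20_000, latency_ms)
    print(f"{latency_ms} ms latency: {before:.0f}s -> {after:.0f}s")
```

The absolute saving grows linearly with latency, which matches the pattern in the chart: the higher the file-system latency, the bigger the win from cutting the I/O count.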
- Parallel Queue Loading
The chart above shows restart times for a controlled test where each of 10 queues has a single uncommitted message on it (ensuring every queue will need to be loaded on restart), followed by 10K committed messages on the recovery log. The 10 queues are initially empty, or populated with up to 1,000,000 2KB messages per queue. The queue manager is killed abruptly, then restarted, causing forward recovery of the transactions on the log, which in turn triggers all 10 of the queues to be loaded as part of the restart process. Queue loading is done in parallel in V9.1.1, which reduces the start-up time when compared to V9.1.0. The time saving increases with the initial depth of the queue.
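The benefit of loading queues concurrently rather than one after another can be sketched with a small simulation. This is not MQ's actual restart code; `load_queue` is a hypothetical stand-in that sleeps to mimic the I/O cost of reading a deep queue, so the comparison only illustrates why parallel loading shortens the serial chain of work:

```python
import concurrent.futures
import time

def load_queue(name, depth):
    """Stand-in for reading a queue's messages from disk (simulated I/O)."""
    time.sleep(0.05)  # pretend this cost grows with queue depth
    return name

queues = {f"Q{i}": 1_000_000 for i in range(10)}  # 10 deep queues

# Sequential load (V9.1.0-style): queues loaded one after another.
start = time.perf_counter()
loaded_seq = [load_queue(n, d) for n, d in queues.items()]
sequential_s = time.perf_counter() - start

# Parallel load (V9.1.1-style): queues loaded concurrently.
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
    loaded_par = list(pool.map(lambda item: load_queue(*item), queues.items()))
parallel_s = time.perf_counter() - start

print(f"sequential: {sequential_s:.2f}s, parallel: {parallel_s:.2f}s")
```

Because the simulated per-queue cost is fixed, the sequential time grows with the number of queues while the parallel time stays close to the cost of the single slowest queue, mirroring how the measured saving grows with initial queue depth.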