The time taken to restart a queue manager that was processing transactional messages depends on a number of factors, including the performance of the underlying file system hosting the recovery log and the depth of the queues.

A paper showing some examples of queue manager restart times in different situations is now available on our GitHub performance page.

Measurements include cases where V9.1.1 optimisations dramatically improve restart times, for example:

Reduced I/O Operations During Restart

We have reduced the number of I/O requests required to access the recovery log during restart. This is particularly beneficial when the log is hosted on a higher latency file system, for example when using NFS for multi-instance queue manager support.

The chart above shows the elapsed time taken to recover a queue manager that was killed abruptly when a requester/responder style workload was running at a rate of 2,500 round trips per second. The recovery log was hosted on a local SAN, and then on NFS, via a local 10Gb link, with increasing levels of latency:

  • NFS1 – Undelayed network link
  • NFS2 – Network link with 0.5 ms delay in each direction
  • NFS3 – Network link with 1 ms delay in each direction

For the higher latency file systems, recovery time was dramatically reduced in this test for V9.1.1 (orange bars) vs V9.1.0 (blue bars).
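To see why fewer log I/O requests matter more as latency grows, here is a rough back-of-envelope sketch. It assumes, purely for illustration, that each I/O request to the recovery log is a serial round trip, so the added delay is the request count multiplied by twice the one-way delay. The function names and the request count are hypothetical and are not taken from the paper or from measurements.

```python
# Illustrative model only: assumes every log I/O request is a serial
# network round trip. Request counts used with it are hypothetical.

# One-way delay (ms) added to each link in the test described above.
LINKS = {"NFS1": 0.0, "NFS2": 0.5, "NFS3": 1.0}


def added_restart_delay_s(io_requests: int, one_way_delay_ms: float) -> float:
    """Extra elapsed seconds caused by the injected network delay."""
    round_trip_ms = 2 * one_way_delay_ms
    return io_requests * round_trip_ms / 1000.0


def delay_table(io_requests: int) -> dict:
    """Added delay in seconds per link for a given number of I/O requests."""
    return {
        name: added_restart_delay_s(io_requests, delay)
        for name, delay in LINKS.items()
    }


if __name__ == "__main__":
    # Hypothetical request count, chosen only to make the arithmetic easy.
    for name, seconds in delay_table(10_000).items():
        print(f"{name}: +{seconds:.1f}s")
```

Under this simplified model the added delay scales linearly with the number of requests, so halving the requests halves the latency penalty. Real restart behaviour will also depend on request size, caching and any overlap between requests.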

Parallel Queue Loading

When queues need to be loaded as part of restart processing, they are now loaded in parallel, further speeding up start-up of the queue manager.

The chart above shows restart times for a controlled test where each of 10 queues has a single uncommitted message on it (ensuring every queue will need to be loaded on restart), followed by 10,000 committed messages on the recovery log. The 10 queues are initially empty, or populated with up to 1,000,000 2 KB messages per queue. The queue manager is killed abruptly, then restarted, causing forward recovery of the transactions on the log, which in turn triggers all 10 of the queues to be loaded as part of the restart process. Queue loading is done in parallel in V9.1.1, which reduces the start-up time compared to V9.1.0. The time saving increases with the initial depth of the queues.
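The general idea of loading queues in parallel instead of one after another can be sketched as below. This is a toy simulation and not how the queue manager is implemented: `load_queue`, its sleep-based cost model and the queue names are all hypothetical stand-ins, and a thread pool is just one way to express the concurrency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_queue(name, depth, seconds_per_message=1e-6):
    """Hypothetical stand-in for loading one queue; cost grows with depth."""
    time.sleep(depth * seconds_per_message)
    return name, depth


def load_sequential(queues):
    """Load each queue in turn: total time is the sum of the load times."""
    return [load_queue(name, depth) for name, depth in queues.items()]


def load_parallel(queues):
    """Load all queues concurrently: total time approaches the slowest load."""
    with ThreadPoolExecutor(max_workers=max(1, len(queues))) as pool:
        futures = [
            pool.submit(load_queue, name, depth)
            for name, depth in queues.items()
        ]
        # Collect in submission order so results match the sequential case.
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Ten queues of equal depth, mirroring the shape of the test above.
    queues = {f"Q{i}": 100_000 for i in range(1, 11)}
    for loader in (load_sequential, load_parallel):
        start = time.perf_counter()
        loader(queues)
        print(f"{loader.__name__}: {time.perf_counter() - start:.2f}s")
```

In this toy model the sequential time is the sum of the individual load times, while the parallel time is close to the longest single load, which is consistent with the saving growing as the queues get deeper.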
