I consider myself lucky if I get one comment for a blog entry. So far I have had 4, so this topic must be interesting! I have added more comments at the bottom to reflect the feedback I have had.

On distributed MQ, at a hand-waving level of understanding, whenever an application does a put or get to a queue, a lock is held on the queue for the duration of that request.
If an application puts or gets a persistent message out of syncpoint, then the queue lock is held across the IO request. If the log write time is 1 ms, this limits the number of requests on that queue to 1000 a second.
What tends to happen is that the application issues a log write request just after the previous log write has started, so the application has to wait for almost 2 log writes, and the throughput is now down to about 500 requests a second.

If your log write time is over 10 ms (which I have recently seen!), waiting for two log writes takes 20 ms, so you may achieve only 50 puts plus gets a second.
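
The arithmetic above can be sketched in a few lines of Python. This is a rough model under the same hand-waving assumptions, not a measurement: the function name and the `writes_waited` parameter are made up for illustration, and it assumes the queue lock is held for the whole time the request waits on the log.

```python
def max_requests_per_second(log_write_ms, writes_waited=1.0):
    """Rough ceiling on out-of-syncpoint persistent requests per second on one queue.

    Assumption (illustrative only): the queue lock is held while the request
    waits for `writes_waited` log writes of `log_write_ms` milliseconds each.
    """
    lock_held_ms = log_write_ms * writes_waited
    return 1000.0 / lock_held_ms


print(max_requests_per_second(1))      # 1000.0  (1 ms log write)
print(max_requests_per_second(1, 2))   # 500.0   (waiting for about 2 log writes)
print(max_requests_per_second(10, 2))  # 50.0    (10 ms log write, 2 writes)
```

Real log behaviour is more complicated than this, so treat the output as an upper bound for reasoning, not a prediction.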

If your throughput requirement is higher than this, you have several options:

  1. Improve your log IO time
  2. Put the puts and gets WITHIN syncpoint
  3. Review if your messages need to be persistent.
  4. Use more queues.  For example, instead of using the SYSTEM.CLUSTER.TRANSMISSION.QUEUE use multiple queues.  See here

On z/OS the lock is held for a shorter time, so you may experience performance problems when migrating application programs from z/OS to MQ Distributed or the MQ Appliance.  Of course you will spot this when you do your load testing.  You now know how to fix it.


Additional comments

Paul Harris said that if you are doing a GET out of syncpoint there is no advantage in using a persistent message.   It took me a moment to see the truth in this.

There can be an advantage in doing a PUT out of syncpoint.  For example, putting an “I got here” message, which is kept even if the transaction is backed out.

I was asked to explain from a performance perspective why out of syncpoint is bad.   I’ll use example numbers which are not very realistic – but make the point.

First we need to understand some terms.    The transaction time is the time for one request; the throughput is how many requests per second can be done when there are many transactions running concurrently.

Let the time to do a put within syncpoint be 1 ms, and the time (as seen by the application) to do a log write (a commit, or a put out of syncpoint) be 10 ms.

  • A put within syncpoint followed by a commit takes 1 ms for the put and 10 ms for the commit, so it takes 11 ms.
  • A put OUT of syncpoint (logically a put within syncpoint + commit) saves a little time (0.1 ms in this example) because there is only one MQ request.  So it takes 10.9 ms.
  • How many puts out of syncpoint can we do a second? The answer is 1000 ms / 10.9 ms = about 90 per second.   This is because the queue is locked for the duration of the put and the IO.
    • Multiple instances do not give increased throughput because the lock is held for 10.9 ms.   You will still only get about 90 a second.
  • How many puts within syncpoint can we do a second?   One transaction takes 1 ms + 10 ms (= 11 ms), so 1000 ms / 11 ms = about 90 a second, as above,
    • but we can do many transactions in parallel.  The queue is locked for only 1 ms per put, so we can do 1000 puts a second.  This is about 10 times the throughput of a put out of syncpoint.
    • If we do 10 puts within syncpoint, then commit (called batching), it takes 10 * 1 ms for the puts + 10 ms for the commit = 20 ms.   1000 ms / 20 ms is 50 of these a second, which is 50 * 10 = 500 messages put a second for the transaction instance.
    • If there were 3 instances of a transaction doing 10 * put + commit, they could in theory do 1500 puts a second, but the overall throughput would be capped at 1000 puts a second due to the lock on the queue during each put request.
    • If there are putting applications and getting applications then each get or put will take the lock, so perhaps 500 puts and 500 gets a second.
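
The bullets above can be put into a small Python model. It is only a sketch using the example figures from this post (1 ms put, 10 ms commit, 10.9 ms put out of syncpoint); the function names are invented for illustration, and it assumes the queue lock is the only bottleneck, which will not be true on a real system.

```python
PUT_MS = 1.0                # put within syncpoint (example figure)
COMMIT_MS = 10.0            # log write as seen by the application (example figure)
OUT_OF_SYNCPOINT_MS = 10.9  # put out of syncpoint; lock held for all of it


def out_of_syncpoint_rate(instances):
    """Puts per second out of syncpoint. The lock is held across the put and
    the IO, so extra instances just queue up for the lock and do not help."""
    return 1000.0 / OUT_OF_SYNCPOINT_MS


def in_syncpoint_rate(instances, puts_per_commit=1):
    """Puts per second within syncpoint, capped by the 1 ms lock per put."""
    txn_ms = puts_per_commit * PUT_MS + COMMIT_MS
    per_instance = puts_per_commit * 1000.0 / txn_ms
    lock_ceiling = 1000.0 / PUT_MS
    return min(instances * per_instance, lock_ceiling)


print(round(out_of_syncpoint_rate(1), 1))   # 91.7
print(round(out_of_syncpoint_rate(20), 1))  # 91.7 (more instances do not help)
print(round(in_syncpoint_rate(1), 1))       # 90.9 (one instance, put + commit)
print(in_syncpoint_rate(20))                # 1000.0 (many instances, lock-limited)
print(in_syncpoint_rate(1, 10))             # 500.0 (batching 10 puts per commit)
print(in_syncpoint_rate(3, 10))             # 1000.0 (3 batching instances, capped)
```

The point the model makes is that within syncpoint the slow log write happens outside the queue lock, so other instances can use the queue while one instance waits for its commit.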

 

These figures are very simplistic and are for illustration only, to give the general concepts.    To get more information look at the performance reports, or do your own measurements.

7 comments on "Why are persistent messages out of syncpoint a bad idea?"

  1. Thanks for showing the math – very helpful in understanding what’s happening here!

  2. Hi Colin,
    Thanks for the very important (if counter intuitive) blog post. I come from a long background in MQ Support and I like to think that I know a lot about how MQ works.
    My application developers speak JMS and it would be a great addition to explain how the options with acknowledgement mode and/or transactions relate to this.
    Thanks
    Tom

    • Tom,
      Thanks for your comments – often my blog posts get no comments, so it is good that this is being useful.
      I spoke to people who know about JMS and they said
      1) JMS transactions will cause messages to be processed inside syncpoint.
      2) JMS acknowledgements are only used in non-transacted sessions, and the implementation is undefined.
      I don’t know what 2) means, as I don’t use JMS
      hope this helps
      Colin

      • The JMS spec defines 5 different levels of transactions/ack modes:

        1) XA coordinated sessions – these aren’t coordinated by the queue manager, but instead by a third party such as WebSphere Application Server. These will map to messages being put/got in syncpoint in MQ, but the act of committing the messages uses a two phase commit, rather than the one phase option provided by MQ

        2) transactional sessions (created using the Session.SESSION_TRANSACTED flag depending on exactly which variant of JMS you are using). As Colin mentions these directly map on to putting/getting messages in syncpoint

        3) auto ack sessions (created using the Session.AUTO_ACKNOWLEDGE flag)
        4) client ack sessions (created using the Session.CLIENT_ACKNOWLEDGE flag)
        5) dups ok ack sessions (created using the Session.DUPS_OK_ACKNOWLEDGE flag)

        These last three (which map to Colin’s point 2 above) are really concerned with messages being got. They all still use syncpoints to some degree to prevent messages being lost in various failure windows in the code. For example with #3 the commit occurs after the message is successfully passed to the application code; a failure in the application results in a rollback. For #4 the commit is driven by the application calling message.acknowledge(); again rollback occurs in failure scenarios. I _think_ we treat #5 the same as #3.

        So basically the MQ JMS implementation uses MQ syncpoints to meet the transactional semantics of the JMS spec.

        Hopefully this gives a bit more detail for you.

  3. Peter Potkay August 26, 2017

    I have the same question as Roger. Can you add some background as to why it will perform better within syncpoint? I’m assuming it will only be better if you group 2 or more puts/gets in a syncpoint, as then those multiple puts/gets “share” the I/O of the single MQCMIT (or MQBACK). Would you see any performance increase for 1 persistent MQPUT per MQCMIT versus 1 persistent MQPUT outside of syncpoint? Why?

  4. carl_farkas August 25, 2017

    Nice article, Colin. Good stuff! Thanks!

  5. Hello Colin,

    > 2. Put the puts and gets WITHIN sycnpoint

    Please explain how using syncpoint will improve the speed of persistent messages for MQGET and/or MQPUT/1.

    The lock will still be put on the queue. Now with syncpoint, the persistent messages are marked as invisible until an MQCMIT, or removed if an MQBACK is issued. Hence, more work is being added to the queue manager, so I would like to understand how this will speed up the throughput rate.

    Regards,
    Roger Lacroix
    Capitalware Inc.
