Response time, memory footprint, and scalability

In the previous article, Jack considered buying a new car for his taxi business and looked at how long the car would take to get going (application startup time) and how quickly it would get up to speed (application ramp-up time). Having found a suitable vehicle, he then thought about how quickly he could get his customers to their destinations (throughput).

In this article, Jack needs to keep the customer satisfied (response time) at the same time as keeping his costs down (minimizing memory footprint). And if his business is successful, he’ll also have to consider how to accommodate even more customers (scalability).

4. Response time

How quickly will Jack’s customers get to their destinations? In other words, what is Jack’s response time?

The first part of the answer to that depends on how long it takes Jack to get to a customer in the first place.

This is perhaps akin to how long it takes OpenJ9 to find and start a thread for a particular task.

But once Jack has picked up a customer, how quickly can he deliver them to their destination? He might measure that either in terms of the average time taken to deliver all his customers, or as the time for the worst case: the longest journey time.

Usually, response-time requirements in the software domain revolve around Service Level Agreements (SLAs), in which the software provider guarantees a certain average or worst-case time for a task.

There is a difference between improving response time and improving response time variability, though. We can usually improve the former by using the same techniques that improve other metrics such as throughput and scalability. But improving variability is often a trade-off and there is typically a throughput or footprint cost for greater consistency.
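Jack's two ways of measuring delivery time correspond to two simple latency statistics. Here is a minimal sketch; the class and method names are made up for this example and are not part of any OpenJ9 API:

```java
import java.util.Arrays;

// Illustrative only: a tiny summary of response times, in the two
// forms an SLA typically constrains.
public class ResponseTimes {
    // Average journey time across all customers (mean latency).
    static double average(long[] millis) {
        return Arrays.stream(millis).average().orElse(0.0);
    }

    // Worst-case journey time (maximum latency) -- the customer who
    // waited longest.
    static long worstCase(long[] millis) {
        return Arrays.stream(millis).max().orElse(0L);
    }

    public static void main(String[] args) {
        long[] journeys = {120, 95, 310, 150, 88}; // times in ms
        System.out.println("average ms:    " + average(journeys));
        System.out.println("worst-case ms: " + worstCase(journeys));
    }
}
```

Note how the average (152.6 ms) looks healthy even though one customer waited 310 ms, which is why SLAs often bound the worst case (or a high percentile) rather than the mean.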

Both these factors are important to Jack. He wants everyone to arrive as quickly as possible, but he also wants to ensure that no single person takes an exceptionally long time to reach their destination. So Jack might decide to take a detour to drop off a customer who has been in his car for a relatively long time, even if that means a slightly longer journey for his remaining customers.

OpenJ9 is well aware of these trade-offs, and handles garbage collection accordingly. Its default GC policy (gencon) does a good job of balancing throughput with response-time variability. But you can choose other GC policies that either maximize throughput (optthruput) or minimize response-time variability (optavgpause), though each comes at some expense to the other metric.
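If you want to experiment, the GC policy is selected with the -Xgcpolicy command-line option (the application jar name below is just a placeholder):

```shell
# Default: balances throughput with response-time variability
java -Xgcpolicy:gencon -jar app.jar

# Maximize throughput, accepting longer and less predictable GC pauses
java -Xgcpolicy:optthruput -jar app.jar

# Minimize response-time variability, at some cost to throughput
java -Xgcpolicy:optavgpause -jar app.jar
```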

You can learn more about memory management and garbage collection in this article: Memory management, garbage collection (GC) policies, and GC diagnostic tools.

5. Memory footprint

Keeping the customer happy is all very well, but Jack also needs to keep his costs down. And one significant expense is fuel: miles per gallon, kilometres per litre… in computing terms, memory and CPU usage.

Can we minimize how much memory the process is using? In a virtualized system, memory usage is often a bottleneck, and in some commercial clouds it is a direct, metered expense. In some environments (such as z Systems), CPU consumption is also an expense, so both memory usage and CPU usage have to be minimized (and balanced).

Reducing fuel consumption – or memory/CPU usage – is a direct cost saving.

Garbage collection in the JVM helps reduce memory footprint: akin to finding and throwing out unnecessary rubbish from the car so that fuel consumption is reduced.

And when Jack chooses his new taxi, he should make sure he gets one with an “ECO” button: improved fuel consumption, although at the cost of peak performance.

Similarly, the OpenJ9 JVM has its own “ECO” button in the form of the command line option -Xtune:virtualized, which reduces CPU consumption when idle. Again, there’s a balance. Here, it’s the trade-off between memory usage and CPU time on the one hand, and performance and throughput on the other.
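Pressing the "ECO" button is a single command-line option (again, the jar name is a placeholder):

```shell
# Tune the JVM for virtualized or cloud environments: lower memory and
# CPU usage (particularly when idle), at some cost to peak throughput
java -Xtune:virtualized -jar app.jar
```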

6. Scalability

So Jack has measured and optimized all these parameters and is providing an efficient service. Now more and more people want a ride, and Jack needs to transport more people at the same time. He can either get a bigger taxi (I suppose Jack will turn from taxi driver into bus driver!), or he can talk to his friends and get more “regular” taxis on the road.

Of course, a single big taxi (aka “a bus”!) is potentially more economical to run, but there’s an advantage to having a fleet of taxis: even if one breaks down, the majority of passengers still get to their destination.

Also, using several smaller vehicles is far more flexible, and Jack can much more easily make sure that there’s transport available when a potential customer needs it. (Getting a taxi home from the station doesn’t usually involve a wait; getting a bus usually does!)

In computing, it’s the difference between “vertical scaling” (adding extra resources – bigger hard drives, better CPUs, more RAM, and so on – to a few, relatively expensive nodes in the system) and “horizontal scaling” (expanding the system by adding extra, relatively cheap and simple nodes that can be flexibly deployed). One bus vs several taxis!

However, even with a fleet of taxis, Jack still might not be able to meet demand at the busiest times. If there’s no taxi available, the customer has two choices: either they stand around waiting for their ride, or they might choose to go and do something useful until Jack texts them when there’s a taxi free to pick them up.

This is analogous to the different contention resolution strategies employed when a thread hits a scalability bottleneck. The thread must either “spin”, burning CPU cycles while it waits, or “yield”, giving up the CPU until the lock becomes available and the thread gets notified.
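The two strategies can be sketched in plain Java. This is illustrative only, assuming a simple shared flag; OpenJ9's internal spinning and lock-reservation heuristics are considerably more sophisticated:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative contention strategies: "spin" vs "yield and wait".
public class Contention {
    // "Spin": burn CPU cycles re-checking the flag until it flips.
    static void spinUntilSet(AtomicBoolean flag) {
        while (!flag.get()) {
            Thread.onSpinWait(); // hint that we are busy-waiting
        }
    }

    // "Yield": give up the CPU and block until notified.
    static void waitUntilSet(Object lock, AtomicBoolean flag)
            throws InterruptedException {
        synchronized (lock) {
            while (!flag.get()) {
                lock.wait(); // releases the lock and blocks this thread
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Object lock = new Object();

        Thread waiter = new Thread(() -> {
            try {
                waitUntilSet(lock, ready);
                System.out.println("waiter woke up");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();

        Thread.sleep(50); // let the waiter block first
        synchronized (lock) {
            ready.set(true);
            lock.notifyAll(); // "text the customer": a taxi is free
        }
        waiter.join();

        // The spin variant returns immediately once the flag is set.
        spinUntilSet(ready);
        System.out.println("spinner finished");
    }
}
```

Spinning wins when the wait is very short (no context switch); waiting wins when the wait is long (no wasted CPU), which is exactly the customer's dilemma at the taxi rank.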

These resolution strategies are built into Eclipse OpenJ9 to provide the scalability that modern day applications demand.

You can learn more about threads and locking in this article: Java concurrency and locking.

In conclusion


Thinking about Jack and his taxis has hopefully given you a better feel for the key performance metrics that characterize Java applications – and software in general.

Just as in computing, Jack is continually trying to maximize his throughput and minimize the waiting time for his customers – all the while keeping the resources he uses (read fuel and drivers or CPU cycles and memory, as you like!) to a minimum.

It’s all down to keeping expense low and satisfaction high. By monitoring the performance of our taxis – and computer systems – we can most effectively balance these goals. And just like my canny friend Jack, OpenJ9 does a pretty good balancing act to achieve the very best performance.


Authors: Vijay Sundaresan and Peter Hayward
