Saturday, January 10, 2009

Improving Jamon’s performance

I was browsing through Jamon’s code to see why it is so much slower than Simon under high contention (see my previous article). In this article I will present the differences I found and show you a way to speed up Jamon right now.

Differences
Jamon has more features than Simon, such as listeners and data ranges. These obviously need some computing, if only to check whether they are used. Another interesting difference is that Jamon uses doubles for aggregate statistics, whereas Simon uses longs. I don’t really see the advantage of doubles (please surprise me), and they are probably a bit slower as well.
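One concrete difference between the two types is exactness. The sketch below is only an illustration and uses neither library: the class AggregateTypes and its methods are made up for this post. It adds 1 to a running total of 2^53, first as a long and then as a double.

```java
// Hypothetical helper, not part of Jamon or Simon.
class AggregateTypes {
    // A long total stays exact until it overflows.
    static long addLong(long total, long value) {
        return total + value;
    }

    // A double total has a 53-bit mantissa, so large totals get rounded.
    static double addDouble(double total, double value) {
        return total + value;
    }

    public static void main(String[] args) {
        long big = 1L << 53; // 9007199254740992
        System.out.println(addLong(big, 1) - big);   // prints 1
        System.out.println(addDouble(big, 1) - big); // prints 0.0
    }
}
```

A total measured in nanoseconds needs more than a hundred days of accumulated time to reach 2^53, so this is rarely a practical problem, but it shows that a double buys range at the cost of exactness.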

Synchronization
Jamon uses 5 synchronization points during a timing operation. Three are needed to get or create the monitor: on the MonitorFactory method, on the map of all monitor data and on the map of all data range definitions. The last 2 synchronization points are for starting and stopping the timer (on the monitor data).

Simon, which does not support data ranges, uses only 3 synchronization points: one for getting the monitor (on the SimonManager method) and 2 for starting and stopping the timer (on the timer itself).
Simon 2.0 no longer needs synchronization on starting a timer and therefore only needs 2 synchronization points.
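
To illustrate how a timer can do without a lock on start, here is a minimal sketch. It is not Simon's actual implementation: the class LockFreeStartTimer and its methods are hypothetical. The idea is that the start time is handed back to the caller, so start touches no shared state and only stop has to synchronize.

```java
// Hypothetical timer, not the real Simon or Jamon class.
class LockFreeStartTimer {
    private long totalNanos; // guarded by "this"
    private long hits;       // guarded by "this"

    // No lock needed: the start time lives in the caller's local variable.
    long start() {
        return System.nanoTime();
    }

    // The only synchronization point of a measurement: update the shared totals.
    synchronized long stop(long startNanos) {
        long elapsed = System.nanoTime() - startNanos;
        totalNanos += elapsed;
        hits++;
        return elapsed;
    }

    synchronized long getHits() {
        return hits;
    }

    synchronized long getTotalNanos() {
        return totalNanos;
    }

    public static void main(String[] args) {
        LockFreeStartTimer timer = new LockFreeStartTimer();
        long split = timer.start();
        timer.stop(split);
        System.out.println(timer.getHits() + " hit(s), " + timer.getTotalNanos() + " ns");
    }
}
```

The price of this design is that the timer itself cannot tell how many measurements are currently active, unless that is tracked separately.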

Jamon defines its maps for monitors and data ranges as Collections.synchronizedMap(new HashMap(50)). However, it is not necessary to synchronize on these maps when the code that uses them is already synchronized on the MonitorFactory (in method getMonitor). Unfortunately, when the synchronization wrapper is removed, all code that uses these maps will need to be analyzed to check that it synchronizes properly.
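
The first option would look roughly like the sketch below. This is not Jamon's code: GuardedRegistry is a made-up class, and a long[] stands in for the monitor data. The map is a plain HashMap, and every method that touches it synchronizes on the registry itself, so a single lock is taken per lookup instead of two.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry, not Jamon's MonitorFactory.
class GuardedRegistry {
    // Plain HashMap: safe only because every access below holds the lock on "this".
    private final Map<String, long[]> monitors = new HashMap<>(50);

    // Get or create under one lock; no synchronized wrapper around the map.
    synchronized long[] getMonitor(String label) {
        long[] data = monitors.get(label);
        if (data == null) {
            data = new long[2]; // stand-in for real monitor data
            monitors.put(label, data);
        }
        return data;
    }

    // Any other access must take the same lock, which is the part that needs auditing.
    synchronized int size() {
        return monitors.size();
    }

    public static void main(String[] args) {
        GuardedRegistry registry = new GuardedRegistry();
        System.out.println(registry.getMonitor("a") == registry.getMonitor("a")); // prints true
        System.out.println(registry.size()); // prints 1
    }
}
```

The weak spot is exactly the audit mentioned above: one forgotten unsynchronized access to the map and the registry is no longer thread safe.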

Another solution would be to use a map that needs less synchronization: Java 5’s ConcurrentHashMap! Luckily Jamon provides a method to change the map implementation for exactly this reason. I included the following line in my test application (from my previous post):

MonitorFactory.setMap(new ConcurrentHashMap(200));

And these are the results. The measurements are in ms, the scale is logarithmic.

Quite astonishing: Jamon is no longer 5 to 200 times slower, but just a bit slower under low contention, up to 15 times slower under heavy contention, and 2 times slower under extreme contention. Note that only the map for monitors was changed; the map for range definitions (always empty in my test) was not.
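
To give an idea of why the map implementation matters so much, here is a small stand-alone sketch of a get-or-create lookup on a ConcurrentHashMap. It does not use Jamon: the class ConcurrentRegistry and the method hammer are made up, and an AtomicLong stands in for a monitor. A ConcurrentHashMap has no single map-wide lock, so threads asking for an existing monitor do not queue up behind each other the way they do on a synchronized map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry, not Jamon's MonitorFactory.
class ConcurrentRegistry {
    private final ConcurrentMap<String, AtomicLong> monitors = new ConcurrentHashMap<>(200);

    // Atomic get-or-create per key, without a lock on the whole map.
    AtomicLong getMonitor(String label) {
        return monitors.computeIfAbsent(label, key -> new AtomicLong());
    }

    // Lets several threads hit the same label and returns the final count.
    static long hammer(int threads, int hitsPerThread) {
        ConcurrentRegistry registry = new ConcurrentRegistry();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    registry.getMonitor("shared").incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.getMonitor("shared").get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000)); // prints 40000
    }
}
```

All threads end up with the same counter, so no hits are lost, and the only contention left is on the counter itself.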

Conclusions
Even though Jamon is quite a bit slower under heavy contention, there is room for improvement. By calling a single method, you can make Jamon 20 times faster right now.
