Wednesday, August 6, 2008

Asynchronous cache updates

This is the second article on Java concurrency. The first is Java thread control.

Many applications use more or less static reference data such as postal codes, exchange rates and externally stored text. Often this type of data can be safely cached in memory for performance reasons. To make sure that the cached data does not become stale for too long, it can be reloaded every 10 minutes, once a day or on any other schedule you might like.

Rather then creating custom code for every type of reference data, one can extract the hard concurrency code to a utility class. This article presents the PeriodicExecutor, a utility class I wrote years ago, and proved to be so useful and reliable that is has been traveling with me from project to project.

Here are the requirements I had in mind while writing PeriodicExecutor:

  1. Support any data retrieval and any cache mechanism.
  2. Refresh the cache in the background, do not slow down request handling.
  3. No idle threads, only start a thread when one is needed.
  4. Be resilient against errors, retry when they occur.
  5. Support flexible reloading schedules.
  6. Support initial loading of the memory cache.
Lets discuss these requirements and see how they are implemented.

1. Support any data retrieval and any cache mechanism.
Each type of data source needs its own retrieval code. E.g. JDBC code for database stuff and things like HttpClient for remote stuff. The cache implementation can also be different from case to case. For example in one case we just need to store a String, in another case we first transform the data to a Map. Therefore the utility class will only solve the the multi-threaded scheduling problem.

PeriodicExecutor solves this by letting the client code provide its reload task as a Runnable. The task is also responsible to make the results available when reloading is finished.

2. Refresh the cache in the background, do not slow down request handling.
When its time for a data reload, it does not make sense to delay normal request processing until the reload is done. Remember, we are caching for performance reasons! As a consequence PeriodicExecutor executes the client task in a separate thread.

PeriodicExecutor guarantees that the task is never run in parallel, but the client task must make sure that there are no concurrency problems the moment the cache is updated.

3. No idle threads, only start a thread when one is needed.
There are 2 approaches to thread handling. We can either start a thread upon initialization and keep that thread running all the time, or we can start a new thread each (short) time a reload is required. The first approach has the disadvantage that more resources are used then is necessary (though since Java 5 we could share an executor service with all reloaders). Inconvenient is that we need explicit code to shut down the thread (or executor service). Advantage of this approach is that a thread can schedule reloads precisely.

I chose the second approach: each time the cache is accessed by some code, that code must call method PeriodicExecutor#requestStart() to see if a reload is dictated by the schedule. If that is the case, the reload is started in a newly started background thread. A small disadvantage to this approach is that after a long period of inactivity, the first caller will trigger a reload, but until the reload is finished all cache users will see stale data.

4. Be resilient against errors, retry when they occur.
The reference data may be retrieved from an unreliable data source. When the reload fails, it should be tried again later. Normal processing can continue as the new background thread isolates it from exceptions. In addition, the reload task should only update the memory cache until after the reload succeeded.

PeriodicExecutor detects reload errors by catching exceptions thrown by the reload task. When an error occurred the next call to requestStart() will trigger a second reload attempt. After 2 failed attempts the next reload task is delayed for a couple of minutes to prevent trashing.

5. Support flexible reloading schedules.
PeriodicExecutor normally reloads with a fixed delay. However, it is easy to override this by providing your own next reload time. A future improvement would be to extract this code according to the strategy pattern.

6. Support initial loading of the memory cache.
Initially the memory cache is empty. As we are talking about reference data, it does not make sense to continue for most programs. PeriodicExecutor therefore also support synchronous loading and exception rethrowing. During a synchronous load the caller is blocked until the reload task is done. Before one task completed, the default is to run synchronously and rethrow exceptions from the reload task. Once one task completed, the default is to run asynchronously and to swallow exceptions (they are logged of course).

Example
Enough theory, how do you use it? Here is a simple example:

public class CachingPostalCodeService { private final Object postalCacheLock = new Object(); private List<PostalCode> postalCache; private PeriodicExecutor postalCacheReloader; public CachingPostalCodeService() { postalCacheReloader = new PeriodicExecutor( TimeUnit.MINUTES.toMillis(10), new Runnable() { public void run() { refreshPostalCache(); } public String toString() { return "Postal code cache reloader"; } }); } public List<PostalCode> getPostalCodes() { postalCacheReloader.requestStart(); synchronized (postalCacheLock) { return postalCache; } } private void refreshPostalCache() { List<PostalCode> newPostalCodes = .... // Don't forget to make newPostalCodes unmodifiable. synchronized (postalCacheLock) { postalCache = newPostalCodes; } } }
Notice that only a limited amount of code is needed to get all the discussed features. In this example a list of postal codes is cached in memory. The only user of the cache, method getPostalCodes, calls requestStart every time. Initially, when no data is in the cache, requestStart will block until the data has been loaded. After 10 minutes of use, the cache will be reloaded in the background. Note how the example synchronizes on postalCacheLock to prevent concurrency problems during the cache update.

Variations
Sometimes you only need to reload once a day, preferably early in the morning. The following example shows how one can override scheduled reload to 5 o'clock in the morning.

public CachingPostalCodeService() { postalCacheReloader = new PeriodicExecutor( 0, new Runnable() { // ...as above... }) { @Override protected long nextExecuteTime(long currentExecute) { return getNextTime(5); } }; } /** * @param hour the requested hour * @return the next time it is the given hour * in the JVM default time zone */ private static long getNextTime(int hour) { Calendar cal = Calendar.getInstance(); cal.set(Calendar.HOUR_OF_DAY, hour); cal.set(Calendar.MINUTE, 0); cal.set(Calendar.SECOND, 0); cal.set(Calendar.MILLISECOND, 0); if (cal.getTimeInMillis() < System.currentTimeMillis()) { cal.add(Calendar.DAY_OF_MONTH, 1); } return cal.getTimeInMillis(); }

If you need more control on when to execute synchronously or not, take a look at method PeriodicExecutor#requestStart(boolean).

Download
Download PeriodicExecutor. PeriodicExecutor can also be used with Java 1.4 (and perhaps 1.3 as well).

Other options
Probably the most complete FOS library for task scheduling is Quartz.

Enjoy!

2 comments:

  1. Erik, how are you doing?

    Thanks for sharing your experiences and making me think once more.

    After reading this piece, chewing on it for a while and indeed discovering your 'PeriodicExecutor' class in our code base (what does that say about code ownership?) I must say that I think your 'solution' is conceptually rather flawed.

    For reasons of apparent 'premature optimization' - What's wrong with an idle thread if it serves a well understood purpose? - you have tried to compromise two pretty well understood modes of operation (1. a traditional cache that only fetches data when not cached already or invalidated e.g. because of expiration; 2. a mirror that gets periodically synchronized in the background) and seem to accept the risk of incorrectness as if correctness should not be the number one requirement.

    When given an expiration timeout, both a cache and a mirror provide a pretty good guarantee that no expired/invalidated data is ever served. Failure is obviously always an option (no connection or whatever), but will get noticed and marked clearly. Depending on the kind of data and how it is used, such a synchronization problem will need to be considered fatal or not by the client.

    Your 'Asynchronous cache updates' idea happily serves expired data, both in the normal case of data being requested for the first time (long) after it has expired and in case of synchronization problems. Despite some complexish Exception handling (automatic retries usually have little chance of succeeding and make problems even harder to diagnose), you are effectively eating them. As a result the client application has no way of knowing that it gets served expired data. For a list of postal codes that gets used all the time, this might not be problematic, but in the general case I really don't see how this behaviour can be acceptable in light of any 'real (business) requirements' (quick attempt at examples: fetching a Certificate Revocation List, currency exchange rates, etc.).

    You do mention Quartz and I would say that using that for background synchronization would be a much better choice in most situations.

    In the mean time we're happy to have established that at least the way PeriodicExecutor is used in our code base, it won't do much harm... ;-)

    ReplyDelete
  2. Jurjan-Paul! Nice to hear from you.

    You are completely right about that the control flow of PeriodicExecutor can make you serve very old stale data in some circumstances. As with any caching/mirroring solution (nice distinction) you must make sure that these circumstances do not, or rarely occur and whether this is acceptable at all.

    Regarding the fact that the client can not see that there is a synchronization problem, this can both be a blessing and a big problem. I agree that for currency exchange rates this is rather undesirable. On the other hand, I have also seen that a site remained running while someone had changed the database and the postal codes were no longer available. This was found quickly because the logs were monitored. The solution took a couple of days but meanwhile the site just continued working properly. So again, you must make sure you can live with stale data.

    One way to make the alleviate the problem a bit is to update the cache synchronously when some unacceptable timeout has been observed. It should also not be very difficult to add the option to rethrow any failures when the retry failed.

    Regarding code ownership: this code was created way earlier, and in my spare time.

    In conclusion, I find calling PerioidExecutor flawed a rather too strong statement. There are many valid use cases, including your code base :) Still, the extra warning is appreciated.

    Regards,
    Erik.

    ReplyDelete