Wednesday, August 13, 2008

Java transaction boundary tricks

Controlling transactions is one of the fundamental things you must get right if your program has a database and is used by more than one user (which covers just about every web application). This article shows a little trick to get more out of transactions than most programs do. Since I am using Spring to demonstrate the principle, I will introduce Spring's transactions first. Feel free to skip that section if you know all about Spring transactions.

Using transactions from Spring
With Spring, controlling where a transaction is started can be as simple as adding the @Transactional annotation (and one extra line in the Spring configuration). The @Transactional annotation is typically placed on the class that implements the interface of your service layer. For example, you may have a MemberService:

public interface MemberService {
    List findMembers(MemberCriteria criteria);
    Member getMember(long id);
    Member updateMember(Member member);
}
with the following implementation:
@Transactional
public class DefaultMemberService implements MemberService {
    public List findMembers(MemberCriteria criteria) { ... }
    public Member getMember(long id) { ... }
    public Member updateMember(Member member) { ... }
}
When you let Spring instantiate DefaultMemberService as a bean, all of its public methods will be automatically proxied in such a way that the method is executed within a transaction.

Java transactions are bound to a thread (think ThreadLocal). The proxy can therefore join an existing transaction in the current call stack, and by default that is what happens. As a result you can safely call other transactional service methods from within your own transactional methods.
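
To make the thread binding concrete, here is a minimal plain-JDK sketch. It is not Spring's implementation: ToyTransactions and its methods are hypothetical names invented for this illustration, and the "transaction" is just a number kept in a ThreadLocal. It only shows that a nested call on the same thread joins the transaction that is already there, while a freshly started thread sees none.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a transaction manager (hypothetical, NOT Spring's implementation). */
final class ToyTransactions {
    // The "current transaction" lives in a ThreadLocal, so each thread sees only its own.
    private static final ThreadLocal<Integer> CURRENT = new ThreadLocal<>();
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    /** Id of the transaction bound to the calling thread, or null if there is none. */
    static Integer currentId() {
        return CURRENT.get();
    }

    /** Runs the command in the thread's existing transaction, or starts one if none exists. */
    static void inTransaction(Runnable command) {
        if (CURRENT.get() != null) {
            command.run();                       // join the transaction already bound to this thread
            return;
        }
        CURRENT.set(NEXT_ID.getAndIncrement());  // "begin"
        try {
            command.run();
        } finally {
            CURRENT.remove();                    // a real manager would commit or roll back here
        }
    }
}

final class ToyTransactionsDemo {
    public static void main(String[] args) {
        ToyTransactions.inTransaction(() -> {
            Integer outer = ToyTransactions.currentId();
            // Same thread: the nested call sees the outer transaction id.
            ToyTransactions.inTransaction(() ->
                System.out.println("nested call joined: " + outer.equals(ToyTransactions.currentId())));
            // Different thread: its ThreadLocal slot is empty.
            Thread worker = new Thread(() ->
                System.out.println("worker sees transaction: " + (ToyTransactions.currentId() != null)));
            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
```

The second half of the demo matters later in this article: work handed to another thread does not inherit the caller's transaction.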

Introducing the CommandService
Here is a kind of weird service that I came up with a couple of years ago, named after the Command pattern, the CommandService:

@Transactional
public class DefaultCommandService implements CommandService {

    public void inTransaction(Runnable command) {
        command.run();
    }

    public <T> T inTransaction(Callable<T> command) {
        try {
            return command.call();
        } catch (Exception e) {
            throw new RuntimeException(
                "CommandService#inTransaction(" + command.toString() + ")", e);
        }
    }
}
(Download link at bottom of article.)

As you can see, it doesn't do anything except execute the runnable or callable it is given. How can this be useful? Well, notice the @Transactional. Your block of code is now executed within a transaction!
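
The CommandService interface itself is not shown in this article. Judging from the implementation, it presumably has the shape sketched below. PlainCommandService is a hypothetical stand-in for trying the calling convention without Spring: it has the same method bodies but no @Transactional, so it starts no transaction at all.

```java
import java.util.concurrent.Callable;

/** Presumed shape of the interface, inferred from DefaultCommandService. */
interface CommandService {
    void inTransaction(Runnable command);
    <T> T inTransaction(Callable<T> command);
}

/** Hypothetical non-Spring stand-in: same bodies, no @Transactional, hence no transaction. */
final class PlainCommandService implements CommandService {
    @Override
    public void inTransaction(Runnable command) {
        command.run();
    }

    @Override
    public <T> T inTransaction(Callable<T> command) {
        try {
            return command.call();
        } catch (Exception e) {
            // Any exception thrown by call() is wrapped in a RuntimeException.
            throw new RuntimeException("CommandService#inTransaction(" + command + ")", e);
        }
    }
}

final class CommandServiceDemo {
    public static void main(String[] args) {
        CommandService commandService = new PlainCommandService();
        Callable<Integer> answer = new Callable<Integer>() {
            public Integer call() {
                return 6 * 7;
            }
        };
        // The Callable overload hands the result back to the caller.
        System.out.println(commandService.inTransaction(answer));
    }
}
```

Note that the Callable variant wraps every exception in a RuntimeException, so callers do not have to declare or catch checked exceptions.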

Again, how can this be useful? Here are 3 use cases:

Use case 1: prevent fine grained service methods
Suppose you have a webpage that will update only one property of an entity, e.g. the email address of a member, and another page to update the password. Now to prevent concurrent update problems you want to get and update the member within the same transaction. You could make a new method on the service interface for each property you want to change. However, it is a good practice to keep interfaces concise. So instead we do something like this:

final long memberId = ...;
final String newEmail = ...;
Member freshMe = commandService.inTransaction(new Callable<Member>() {
    public Member call() throws Exception {
        Member freshMe = memberService.getMember(memberId);
        freshMe.setEmail(newEmail);
        return memberService.updateMember(freshMe);
    }
});
The member entity is retrieved and updated in the same transaction. This way we only need 2 methods in the member service interface to safely update any property of a member.

Use case 2: bundle operations for performance reasons
I once wrote some code that would keep thousands of LDAP records in sync with a relational database. In my test environment, a dodgy laptop, the initial import of 15,000 records took 50 minutes. Each record was persisted to the database in its own transaction. After a small code change, I used the CommandService to persist the records in groups of 20. What do you think happened? The total time to import these 15,000 records dropped from 50 minutes to 2 minutes! Not bad: 25 times faster through 4 lines of extra code.

Careful positioning of transaction boundaries can have a dramatic effect on performance. CommandService can help you do that.
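
The import code itself is not part of this article, so the following is only a sketch of the grouping idea. BatchingSketch and persistInBatches are hypothetical names, and the inTransaction method here merely counts calls in place of a real commandService.inTransaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchingSketch {
    /** Counts how many "transactions" were opened; stands in for a real CommandService. */
    static int transactionsStarted = 0;

    static void inTransaction(Runnable command) {
        transactionsStarted++;
        command.run();
    }

    /** Persists the records in groups of batchSize, one transaction per group. */
    static <T> void persistInBatches(List<T> records, int batchSize, Consumer<T> persist) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int from = 0; from < records.size(); from += batchSize) {
            int to = Math.min(from + batchSize, records.size());
            List<T> batch = records.subList(from, to);
            // Every record in this group shares a single transaction.
            inTransaction(() -> batch.forEach(persist));
        }
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 15000; i++) {
            records.add(i);
        }
        List<Integer> saved = new ArrayList<>();
        persistInBatches(records, 20, saved::add);
        System.out.println(saved.size() + " records in " + transactionsStarted + " transactions");
    }
}
```

With groups of 20, the 15,000 records need 750 transactions instead of 15,000. The trade-off is that a failure rolls back the whole group rather than a single record.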

Use case 3: transactions in unusual places
Let us look at a slightly more detailed example of the PeriodicRetriever from my previous article (changes in italic).

public class CachingPostalCodeService {

    private final Object postalCacheLock = new Object();
    private List<PostalCode> postalCache;
    private PeriodicExecutor postalCacheReloader;
    private HibernatePostalCodeDao hibernatePostalCodeDao;

    public CachingPostalCodeService() {
        postalCacheReloader = new PeriodicExecutor(
            TimeUnit.MINUTES.toMillis(10),
            new Runnable() {
                public void run() {
                    refreshPostalCache();
                }
                public String toString() {
                    return "Postal code cache reloader";
                }
            });
    }

    public List<PostalCode> getPostalCodes() {
        postalCacheReloader.requestStart();
        synchronized (postalCacheLock) {
            return postalCache;
        }
    }

    private void refreshPostalCache() {
        List<PostalCode> newPostalCodes = Collections.unmodifiableList(
            hibernatePostalCodeDao.getAll()
        );
        synchronized (postalCacheLock) {
            postalCache = newPostalCodes;
        }
    }

    // ... setter for hibernatePostalCodeDao ...
}
If you try this, you may notice that the data is retrieved the first time, but not the second time! The exception message is clear too, something like "Hibernate requires a transaction". To understand this, let's follow the two execution paths.

The first time getPostalCodes() is called, it is called through Spring's transaction proxy, so the method is executed within a transaction. So when the PeriodicRetriever is called, the runnable (which calls refreshPostalCache()) is immediately called within the same thread and thus also within the same transaction. No problem!

The second time getPostalCodes() is called (11 minutes later), PeriodicRetriever is called again but returns immediately. Meanwhile it starts a new thread which executes the runnable defined in CachingPostalCodeService's constructor. The important thing here is that the new thread does not join the transaction of its parent thread as transactions are bound to threads. The runnable then calls refreshPostalCache(). You may think you'll get a new transaction at that moment. However, a method call within the same class will never go through the transactional proxy. This is because the instance on which we call refreshPostalCache() is simply this, and not the proxy that was obtained from Spring. In other words: refreshPostalCache() is not executed within a transaction.
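
The self-invocation effect can be reproduced with nothing but a JDK dynamic proxy. The sketch below is an illustration under assumptions, not Spring's proxy code: PostalService, PlainPostalService and SelfInvocationSketch are hypothetical names, and the handler only records which calls it intercepts, at the point where a transactional proxy would begin a transaction.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface PostalService {
    void getPostalCodes();
    void refreshPostalCache();
}

final class PlainPostalService implements PostalService {
    public void getPostalCodes() {
        refreshPostalCache();   // a call on `this`: it never passes through the proxy
    }

    public void refreshPostalCache() {
        // would hit the database here
    }
}

final class SelfInvocationSketch {
    /** Wraps target in a JDK proxy that records the name of every method it intercepts. */
    static PostalService proxied(PostalService target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            intercepted.add(method.getName());   // a transactional proxy would begin a transaction here
            return method.invoke(target, methodArgs);
        };
        return (PostalService) Proxy.newProxyInstance(
                PostalService.class.getClassLoader(),
                new Class<?>[] { PostalService.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> intercepted = new ArrayList<>();
        proxied(new PlainPostalService(), intercepted).getPostalCodes();
        System.out.println(intercepted);   // prints [getPostalCodes]
    }
}
```

Only the outer call shows up in the list. The inner call to refreshPostalCache() goes straight to the target object, which is why it gets no transaction of its own.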

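CommandService to the rescue. The changes: the class gains a commandService field with a setter, and the runnable now wraps the call to refreshPostalCache() in commandService.inTransaction(...):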

public class CachingPostalCodeService {

    private final Object postalCacheLock = new Object();
    private List<PostalCode> postalCache;
    private PeriodicExecutor postalCacheReloader;
    private HibernatePostalCodeDao hibernatePostalCodeDao;
    private CommandService commandService;

    public CachingPostalCodeService() {
        postalCacheReloader = new PeriodicExecutor(
            TimeUnit.MINUTES.toMillis(10),
            new Runnable() {
                public void run() {
                    commandService.inTransaction(new Runnable() {
                        public void run() {
                            refreshPostalCache();
                        }
                    });
                }
                public String toString() {
                    return "Postal code cache reloader";
                }
            });
    }

    // ... same as above ...

    // ... setters for hibernatePostalCodeDao and commandService
}

Download
Feel free to use the complete version of CommandService in any way you see fit.

Conclusions
Watching your transaction boundaries can be very rewarding, both in code size and in performance. A command service, such as the one presented in this article, can help you do that.

9 comments:

  1. Somehow I have a bit of a problem with this approach.

    Services should demarcate transactional boundaries of domain specific logic. With this approach you create a way to write logic in a layer above the service layer which might move logic to the 'wrong' place.

    But I agree, used correctly it is a powerful solution.

  2. I think I understand what you're saying, but I am not sure you are using the right words.

    You always have the option to write logic above the service layer. This is in fact quite normal and fine as long as you put that logic in the domain objects (which, in my view, the service layer is a part of).
    So in other words, even though the service layer normally functions as the transaction demarcation, I see no reason that prohibits you from putting this function in another kind of domain object.

    Now the latter may be difficult. If you look at use case 3 in the article for example, I see no clear way to rewrite it such that the transaction boundary is again within a domain object. It should be possible though..., just have to think some more about this.

  3. How is this different than using Spring's old TransactionalTemplate?

  4. I didn't know about TransactionalTemplate, it sounds about the same though.

    Btw, I can not find the API docs for TransactionalTemplate. Is it still supported?

  5. I assume kristof josza means TransactionTemplate. Surely it's still supported; it's very useful.

    In the post Erik wrote, the transactional logic still resides in the service layer, where it belongs IMHO.

  6. If I understand correctly the spawned thread will not join the same transaction as the parent thread so to make the call transactional you add the @Transactional annotation to the Command class that will ultimately scope the Callable/Runnable methods with transactional boundaries. Does this mean that a new transaction is created when a new thread is spawned and the new thread will run within that new transaction scope?

  7. Thanks for the response. I'm trying to get the new transactions to join the existing transaction in progress. Do you know of any solution?

    florin

  8. Sorry, I have never ventured in that direction.
