Re: multithreaded cache?

From:

Daniel Pitts <newsgroup.nospam@virtualinfinity.net>

Newsgroups:

comp.lang.java.programmer

Date:

Tue, 15 May 2012 09:56:39 -0700

Message-ID:

<bVvsr.9500$TC4.4320@newsfe14.iad>

On 5/15/12 2:14 AM, bugbear wrote:

I'm using various Apache Commons Maps as a
multithread cache, protected using
ReentrantReadWriteLock, so that getting() uses a read lock,
and putting() uses a write lock.

But I've got an issue; in the
case of a cache miss (protected by a read lock),
the required value is acquired using the "underlying function"
that the cache is over; this value is then put() into
the cache (protected by a write lock)

This is all perfectly thread safe, and gives
correct results.

However, if the underlying function is slow
and/or resource hungry (consider caching
a ray traced image!) many threads can
end up calling the real function (second
and subsequent threads to the first get a miss
during the first thread's call to the underlying function).

"obviously..." what I want is for only
the thread that FIRST has a cache miss
to call the underlying function, whilst other
threads (for the same key) wait.

This seems "obvious" enough that I'm guessing
there's a standard solution.

Googling led me to several "heavy" libraries;

This appears more a locking/caching issue
than a Java issue, although my implementation
is in Java.

Can anyone (please) point me at a
canonical solution/implementation?

BugBear

What I would do in this situation is make the "write" operation fast.
Instead of the computed value, it would put a "FutureTask<V>" object
into the map, one which hasn't been run yet.

The FutureTask<V> class would basically be a "lazy loader". The reason
to do this is to be able to release the lock on the Map, while still
blocking the return of the get until the value is ready.
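
To illustrate the "lazy loader" idea on its own, here is a minimal sketch.
The names LazyLoaderDemo, lazyValue and calls are made up for this example
and are not part of any library; the only real API used is
java.util.concurrent.FutureTask. Creating the task is cheap, nothing is
computed until run() is called, and a second run() does nothing.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

class LazyLoaderDemo {
    // Counts how often the (hypothetical) slow function actually runs.
    static final AtomicInteger calls = new AtomicInteger();

    // Builds a task for the given key without computing anything yet.
    static FutureTask<String> lazyValue(final String key) {
        return new FutureTask<String>(new Callable<String>() {
            @Override
            public String call() {
                calls.incrementAndGet();
                return "value-for-" + key;
            }
        });
    }

    public static void main(String[] args)
            throws ExecutionException, InterruptedException {
        FutureTask<String> task = lazyValue("a");
        System.out.println("done before run: " + task.isDone());
        task.run(); // computes the value in the calling thread
        task.run(); // no effect: a FutureTask runs its callable at most once
        System.out.println(task.get() + ", calls = " + calls.get());
    }
}
```

Any thread that calls get() before the task has finished simply blocks
until the value is ready, which is the behaviour the cache relies on.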

public V get(K key) throws ExecutionException, InterruptedException {
     // Fast path: look for an existing future under the read lock.
     final Future<V> cached;
     readLock.lockInterruptibly();
     try {
         cached = map.get(key);
     } finally {
         readLock.unlock();
     }
     // Wait outside the lock, so a slow computation never blocks writers.
     if (cached != null) { return cached.get(); }

     final FutureTask<V> future;
     writeLock.lockInterruptibly();
     try {
         // We need to double check, to make sure no one else
         // has added the future.
         final FutureTask<V> cachedFuture = map.get(key);
         if (cachedFuture != null) {
             future = cachedFuture;
         } else {
             future = new FutureTask<V>(new UnderlyingFunctionCallable<V>(key));
             map.put(key, future);
         }
     } finally {
         writeLock.unlock();
     }
     // run() does nothing if another thread has already started the task.
     future.run();
     return future.get();
}

Note, since the "get" and "put" operations should be fairly fast, a
read/write lock may be overkill, and the whole thing could be simplified:

public V get(K key) throws ExecutionException, InterruptedException {
     final FutureTask<V> future;
     synchronized (mySync) {
         final FutureTask<V> cachedFuture = map.get(key);
         if (cachedFuture != null) {
             future = cachedFuture;
         } else {
             future = new FutureTask<V>(new UnderlyingFunctionCallable<V>(key));
             map.put(key, future);
         }
     }
     // Outside the lock: the first caller computes, the others wait in get().
     future.run();
     return future.get();
}

This basically gives you "per key" synchronization, with the "whole map"
synchronization covering only the O(1) operation of "check for key, add
placeholder if absent".
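
For completeness, here is a self-contained sketch of the simplified
version that you can compile and run. The Loader interface, FutureCache
and FutureCacheDemo are names I made up for the example; Loader stands in
for the "underlying function", and the 100 ms sleep merely simulates a slow
computation. It synchronizes on the map itself rather than on a separate
mySync object, which is an arbitrary choice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the "underlying function" the cache sits over.
interface Loader<K, V> {
    V load(K key) throws Exception;
}

class FutureCache<K, V> {
    private final Map<K, FutureTask<V>> map = new HashMap<K, FutureTask<V>>();
    private final Loader<K, V> loader;

    FutureCache(Loader<K, V> loader) {
        this.loader = loader;
    }

    public V get(final K key) throws ExecutionException, InterruptedException {
        FutureTask<V> future;
        synchronized (map) {
            // Whole-map lock, held only for "check for key, add placeholder".
            future = map.get(key);
            if (future == null) {
                future = new FutureTask<V>(new Callable<V>() {
                    @Override
                    public V call() throws Exception {
                        return loader.load(key);
                    }
                });
                map.put(key, future);
            }
        }
        future.run();        // only the first run() executes the loader
        return future.get(); // everyone else blocks here until it is done
    }
}

class FutureCacheDemo {
    // Starts `threads` concurrent lookups of one key and
    // returns how many times the loader was actually invoked.
    static int loadsForSameKey(int threads) throws InterruptedException {
        final AtomicInteger loads = new AtomicInteger();
        final FutureCache<String, Integer> cache = new FutureCache<String, Integer>(
            new Loader<String, Integer>() {
                @Override
                public Integer load(String key) throws InterruptedException {
                    loads.incrementAndGet();
                    Thread.sleep(100); // simulate a slow underlying function
                    return key.length();
                }
            });
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        start.await();
                        cache.get("image");
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }
            });
            workers[i].start();
        }
        start.countDown(); // release all workers at once
        for (Thread t : workers) {
            t.join();
        }
        return loads.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("loader calls: " + loadsForSameKey(8));
    }
}
```

One caveat with this sketch: if the loader throws, the failed FutureTask
stays in the map, so every later get() for that key rethrows the same
ExecutionException. Whether to remove the entry on failure depends on what
you want from the cache.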