Re: Concurrent bidirectional one-to-many map?

From:

Tom Anderson <twic@urchin.earth.li>

Newsgroups:

comp.lang.java.programmer

Date:

Sun, 8 May 2011 04:51:58 +0100

Message-ID:

<alpine.DEB.2.00.1105080428540.28763@urchin.earth.li>

On Sat, 7 May 2011, Patricia Shanahan wrote:

On 5/7/2011 2:43 AM, Sebastian wrote:
...

To give an example, I'm trying to solve a problem like this:
Associate tasks with workspaces, where a workspace may hold many
tasks,but a task may be associate with at most one workspace.

...

I'd deal with that sort of problem by having a custom data structure
that uses java.util structures in its implementation.

That's an eminently sensible course of action.

The question then is what structures does it use, and how? Sebastian was
quite specific about the behaviour he needs from this custom structure;
simply telling him he should use standard structures to build it is not
that much more helpful than telling him he should use classes and methods
to build it.

I say this because this is not - that i can see - one of those cases where
the solution is a simple matter of combining existing parts. The central
problem is actually quite a tricky one; managing a bidirectional mapping
is one thing, doing it in a threadsafe concurrent manner is not that much
harder, nor is making it one-to-many, but i think the combination of
efficient thread safety and one-to-manyness is actually pretty tricky. You
can make it work, and work safely, by using a coarse-grained lock over a
suitable agglomeration of data structures (a map of keys to sets of
values, plus a backwards map of values to keys, say), but that has very
poor concurrency.

It's not obvious to me how to make it work correctly with good
concurrency; the ConcurrentHashMap approach is to stripe the locks, so as
to partition the hashtable into independently-locked domains, but to use
the same set of stripes for the reverse mapping, you would need to be able
to tell which stripe a value belongs to - and that can only be done by
looking it up in the very reverse mapping you are worrying about locking!
Can you have a separate set of stripes for the reverse mapping? Do you
need some multi-phase approach, where you determine the stripes without
locking, then acquire the locks to do the actual lookup or insertion? Is
there some lock-free voodoo which could be used? Is there any mileage in
using a union-find structure to collapse the sets of values into a single
representative which could participate in a 1:1 key-value mapping? I am
certainly not any sort of data structure guru, but i don't have answers to
any of those questions. That makes it a problem worth discussing here, in
my book. Does anyone have any ideas on how to do this?

tom

--
No man ever steps in the same river twice, for it's not the same river
and he's not the same man. -- Heraclitus