Re: Data structure to keep things in memory
On Sun, 28 Nov 2010, Mike Schilling wrote:
Without going into all the details, I have the following situation:
There are a set of objects RTS, all of the same type, that gather run-time
statistics about the running of a program. Various related objects with
different lifecycles pass them around and use them. I'm interested in when
each member of RTS is no longer being updated, because I want to collect its
final statistics , but there's no good way to tell when it's no longer in use
except via its collection, so when it's created I attach a weak reference to
it and associate the weak reference with a queue. When the reference is
delivered, I record the statistics. 
So far so good. The question is, what's the best way to keep the references
in memory until they can be delivered to the queue?
Extend the references to make them nodes in a doubly-linked list - give
them forward and back pointers to others of their kind. Put a pointer to a
reference somewhere in the rootset, perhaps right next to the reference
queue. When you create a reference, add it to the list. When you pull a
dead reference off the queue, remove it from the list - because the
reference itself is the link, this can be done directly, without needing
to do a lookup.
You can make the removal code slightly neater with the old trick of
hanging the list off a fake head/tail link which never gets removed, to
avoid having to special-case removing the true head or tail. I'm sure you
know that, but i mention it for completeness.
Both addition and removal need to be protected by locks. Adding something
to the list involves reading one pointer and storing three, with
essentially no computation, so although there may be contention for the
lock, it's unlikely to be a bottleneck. If it is, do what the concurrent
collections do, and stripe the work over multiple parallel lists. Removing
from the list shouldn't involve any contention, as removals will be well
spread out, and you may not even be removing in multiple threads.
I had a bit of a think about whether you could do this lock-free with
atomics, but i don't think it's a winning strategy; multiple threads
adding nodes to the same head will always compete, and if that doesn't
manifest as contention for a lock, it will manifest as retries in a
lock-free algorithm. Unless you can find a wait-free algorithm to do this;
i'll buy you a pint if you can.
eggflip, brandy, bits of Tia Maria, Beecham's powder, aspirin,
Benedictine, Alka-Seltzer, black currant juice, a touch of mustard and