Re: Vector (was Re: Change character in string)
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
---910079544-1088558220-1237169236=:22189
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8BIT
On Sun, 15 Mar 2009, Little Green Man wrote:
On Mar 15, 10:05?am, Tom Anderson <t...@urchin.earth.li> wrote:
static boolean order(List<?> a, List<?> b) {
? ? ? ? // this is the only tricky bit
? ? ? ? // but as long as it's consistent, it doesn't matter how you do it
}
Sun should add a construct
synchronized (x, y, z, ...) {
}
that allows any number of items in the parentheses and always locks
the same set of objects in a deterministic order.
As for how this could be implemented.
One way adds four bytes (eight on 64 bit systems) to an object: add an
"object number" hidden field that is globally unique. (It might also be
used as the identity hash, making those unique, possibly improving some
hash table performance). These might be issued sequentially in
allocation order, but that requires all allocations to go through a
global lock. Smarter might be to partition the space of possible object
numbers into non-overlapping ranges that are assigned to the
thread-local allocation buffers; objects allocated from these get
numbers from the associated range. When a TLAB needs to be replenished,
it also gets a new, previously-unused range. (And the TLAB now "needs to
be replenished" if it can't satisfy an allocation *either* of memory
*or* of an object number). This returns the performance of allocation
close to what it is in present Hotspot VMs.
See if you can find the thread where we dug up and interpreted the
implementation of identityHashCode - it's almost exactly as you've
described. Except that numbers are only assigned when an object's identity
has is first asked for, not at allocation, thus avoiding the work of
assigning numbers to objects which will never need them.
Another way would use the object's hashCode return, with tie-breaking by
assigning an "object number" on the fly to only the objects that come up
in ties (which gets reused for the same objects if they come up again).
A global lock only has to be acquired when a particular object comes up
in a tie for the first time. Tiebreaking could proceed first by hashCode
and then by (if different) identityHashCode, further reducing (in some
cases) the need to acquire that global lock.
The trouble with that is when you're dealing with mutable objects: a
change in state may mean a change in hashCode order, which means that two
attempts to lock the same pair of objects at different times may go in
different orders, which could lead to deadlock.
There's also a consideration with long-running systems eventually
going through the entire pool of numbers (though it might take an
astronomically long time in the 64-bit case). Two solutions to *that*:
* Use 64 bits no matter what, and pray to one of your cultures'
deities.
No - if we're going to come up with barmy schemes, we might as well come
up with ones which are guaranteed to work.
* In combination with the new "G1" collector, associate ranges (and
presumably TLABs) with GC spaces within the heap; whenever that
space
is free of live objects, that range of object numbers is noted as
available for reuse. Indeed, the spaces and the ranges can be
permanently associated to one another. Objects move less in G1, so
the
initial address of an object can be used most of the time. When an
object has been moved and is still live, though, the same initial
address mustn't be reused. If there is an efficient way to detect
that
such an object exists (with the GC presumably keeping track), the
address of a new object allocated at the same spot can be
"perturbed"
to give the "object number". The lowest three bits and many of the
highest bits are all likely to be zero, so flipping bits 0, 1, 2,
and
then 63 (or 31), 62 (or 30), etc. might be used to displace the
numbers of objects initially allocated in the same spot from one
another.
I think the detection would be the hard part.
Another option might be to have some kind of monotonically increasing
number - something like time, although it only has to increase when things
get moved - and use that in the high bits, so that objects allocated in
the same place at different times don't get the same number.
In fact, if you just the the offset of the object in the block in the low
bits, then this works as long as every block into which things are
alllocated gets a unique prefix for the high bits. So, you have a global
counter, and every time you designate a block as a nursery, you use that
counter to issue it with a prefix. If allocation blocks are 2**n bytes in
size and you allocate on 2**m-byte boundaries, you have wordsize - n + m
high bits to play with; with 1 MB blocks (a number pulled randomly from
the air) and 8-byte boundaries, that's 17 low bits, and 15 high bits on
32-bit machines, or 47 on 64-bit machines. 2**15 is probably not enough;
2**47 is.
* If an OS exists that can hand out huge virtual address spaces to
processes and efficiently use only as much page file and RAM as the
process uses from that address space, however distributed, then JVMs
on that platform can just allocate all objects at unique addresses!
As long as pages are bigger than objects, this is going to lead to
fragmentation - you end up keeping a whole page in core just for one
object.
A less efficient but robust solution is to associate with each object
a unique *ternary* string, stored using the bit-pairs 01, 10, and 11,
with 00 a sentinel value, and making this get as long as is needed.
Long-running systems will slowly lose efficiency of allocation,
identity hash codes, and multi-synchronization but never acquire the
risk of deadlock.
The ternary string with terminator exceeds 32 bits at 15 ternary
digits stored, so this becomes worse than the current situation, space-
wise, at around 3^15 = 14,348,907 objects allocated. That is a
dismayingly small number. But it only becomes worse, space-wise, than
64-bit binary integers at 3^30 = 205,891,132,094,649 objects
allocated, or a couple hundred trillion. (At this point, the ternary
strings jump from 62 bits to 65 bits, exceeding 64, at two bits per
ternary digit plus two sentinel zero bits.)
Speed-wise, it is slightly worse from the get-go due to the more
complex operations to be performed.
Normal speed efficiency can be restored by using integers of machine-
native size up until -1 (all Fs in hex; the "maximum" before it wraps
to numbers already used) and then layering on an alternative scheme
for all subsequent objects (which are marked by all having -1 in the
"object number" field). Ternary strings is one such alternative
scheme. So is a second integer field, used until it becomes -1, then a
third, and so forth, or a zero-terminated string of bytes or words
used as a bignum, or something. (A bignum that only has to support
order-comparison and addition, simplifying things somewhat.)
I am filing this under very clever but unlikely to be a good idea.
Since the deadlock risk only arises while a multi-synchronized is
actually in the process of acquiring locks, though, we can do a few
tricks:
Exactly. I think the idea of allocating persistent, unique-for-all-time
identity numbers to objects isn't necessary - as long as you can get the
GC and locking subsystems to cooperate. Bear in mind that it's doesn't
matter in the slightest if we lock a pair of objects in a different order
at different times, as long as we never do it at overlapping times. If we
do lock a-then-b, unlock both, then lock b-then-a, we're fine.
* Multi-synchronized may start by flagging all of the objects, then
proceed to get their numbers, then begin acquiring locks, then
unflag the objects. The garbage collector will refuse to move
objects
that are flagged at the time, waiting until the next GC or just
waiting until the object is unflagged. A volatile boolean suffices
for
the flag, obviating the need for global locking.
Risky - a contended lock could hold up movement, which sounds like bad
news.
* A GC moving objects can cause multi-synchronizeds that are in
progress
to abort -- that is, the lock-acquisition behavior is viewed as a
transaction. If it acquires a few locks but hasn't got them all when
a
GC occurs that moves an object, it releases all the acquired locks,
then begins acquiring locks all over again, in (possibly-changed)
sequence order. Since none of the objects has been modified by the
*body* of the synchronized block yet, releasing and reacquiring
locks
doesn't violate thread-safety.
Better. This reminds me of Richard Gabriel's PC losering story, and the
winning solution adopted by unix:
http://www.jwz.org/doc/worse-is-better.html
My solution would be to be a bit more MIT, and say that (a) multi-locks
are acquired in allocation order and (b) if objects which are currently
multi-locked (or perhaps just locked at all, for simplicity) are moved,
they must maintain their allocation order with respect to other locked
objects. If we arrange things so that memory is divided into zones with
known non-overlapping ranges of the allocation order (which corresponds
nicely to generations), then that just means taking care when moving
objects from one zone to another, or within a zone, to keep locked ones in
the same order in memory. This avoids the need for identity numbers at
all.
Although i don't know what happens if a multi-lock is underway during a
move; i worry that two threads could end up with different ideas about the
order, but i *think* that as long as they pay attention to when moves
occur, and adjust their locking sequence accordingly, it's okay.
Making all moves, regardless of locking, order-preserving would avoid any
problems. There's even a fighting chance that this is already what
happens.
tom
--
There is no violence or enmity in the LEGO universe until you, the
builder, decide what to build with the pieces. -- Pyrogenic
---910079544-1088558220-1237169236=:22189--