Re: where's your Vector, victor
On 5/30/2014 5:30 PM, Jukka Lahtinen wrote:
> Eric Sosman <esosman@comcast-dot-net.invalid> writes:
>> I'm "mostly" in agreement. For curiosity's sake, though, I ran
>> a few simple timing tests and found that Vector took about 35% more
>> time than ArrayList (for the mix of operations I tried).
>> "That's HU-U-U-GE!" somebody's shouting, but does it truly make a
>> difference? What else is the program doing, and how much time does it
>> spend on things other than List manipulation? Like, say, constructing
>
> Well, even when the program is spending more time in other tasks, why
> use a slow class when you can easily replace it with a faster one that
> does the same thing?

(Shrug.) A friend of mine once referred to such micro-optimizations
as cleaning the cigarette butts and bottle tops off the beach so the
sand would be nice and clean around the beached whales.
For new code I wouldn't use Vector (unless it's required by some
other API I can't control), but I wouldn't bother editing an old Vector-
using class simply to change it to ArrayList or something. I'd need to
perform a *lot* of operations on the List to recoup the editing time.
Perhaps you're a faster typist? ;)

> And about the synchronization: in many cases, when you need
> synchronization, you will need to synchronize something more than just
> the Collection manipulation. And if the Collection handling is already
> within a synchronized block, why use one more synchronization block
> inside another one?

In a multi-threaded setting you need to pay close attention to
exactly which manipulations need to be atomic. Vector or the Lists
produced by Collections.synchronizedList() will guarantee atomicity
for the individual method calls, but (as has been pointed out a few
times already) the program may require some combinations of those to
be atomic. The classic example is iterating over the collection:
Just because .add() and .remove() and so on (and the Iterator methods
themselves) are atomic doesn't imply that the entire iteration is so.
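
To make the "combinations" point concrete, here is a minimal sketch of a check-then-add sequence. The class and method names (AddIfAbsentDemo, addIfAbsent, run) are invented for illustration and appear nowhere in this thread. The sketch assumes the list comes from Collections.synchronizedList(), whose documentation says callers synchronize on the returned list itself; Vector likewise locks on itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class AddIfAbsentDemo {

    // contains() and add() are each atomic on a synchronized list, but the
    // pair is not: without the outer lock, two threads could both see
    // "absent" and both add. Holding the list's own lock closes that gap.
    static boolean addIfAbsent(List<String> theList, String item) {
        synchronized (theList) {
            if (theList.contains(item)) {
                return false;
            }
            return theList.add(item);
        }
    }

    // Several threads race to add the same item; returns the final size.
    static int run(int threadCount, int attemptsPerThread) {
        List<String> theList =
                Collections.synchronizedList(new ArrayList<String>());
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < attemptsPerThread; n++) {
                    addIfAbsent(theList, "theThing");
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return theList.size();
    }

    public static void main(String[] args) {
        System.out.println(run(8, 1000));
    }
}
```

With the synchronized block in place, the list ends up holding the item exactly once no matter how the threads interleave. Remove the block and duplicates become possible, though not guaranteed on any given run.
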
One approach, as you suggest, is to wrap the entire iteration
in a synchronized block:

    synchronized (theList) {
        for (Thing t : theList) {
            ...
        }
    }

... and, as you say, any synchronization provided by theList itself
is pure overhead here: You could just as well use an unsynchronized
List of some kind. But then you would have to ensure that every
*other* use of theList provided its own synchronization: You would
need to write

    synchronized (theList) {
        theList.add(theThing);
    }

... and lie awake nights wondering whether you (or that dopey intern)
had forgotten the dance somewhere, or had accidentally synchronized on
the wrong thing. I put it to you that you *would* use a synchronized
implementation for theList, even though the extra synchronization during
iteration would be overhead -- and with all that extra sleep, you'd be
in a better mood at breakfast. :)
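
As a rough, runnable sketch of that trade-off: below, a writer thread calls add() with no explicit locking and relies on the synchronized wrapper, while the reader holds the list's lock only for its iteration. The names LockedIterationDemo, sumUnderLock and run are made up for this example, and Integer stands in for Thing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class LockedIterationDemo {

    // The whole iteration runs under the list's lock, so no add() can
    // sneak in halfway through and trigger ConcurrentModificationException.
    static int sumUnderLock(List<Integer> theList) {
        int sum = 0;
        synchronized (theList) {
            for (int t : theList) {
                sum += t;
            }
        }
        return sum;
    }

    // Writer adds 1..count while the reader iterates repeatedly.
    // Returns the sum seen once the writer has finished.
    static int run(int count) {
        List<Integer> theList =
                Collections.synchronizedList(new ArrayList<Integer>());
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= count; i++) {
                theList.add(i);   // no "dance": the wrapper locks for us
            }
        });
        writer.start();
        while (writer.isAlive()) {
            sumUnderLock(theList);   // safe while the writer is active
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sumUnderLock(theList);
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```

The only place that has to remember the explicit lock is the iteration; every plain add() is covered by the wrapper, at the cost of a redundant lock acquisition inside the loop.
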
Finally, I question the wisdom of a program design that requires
iterating over a collection shared by multiple threads. You'll need to
hold the collection's lock for the entire iteration, including whatever
goes on inside the iterating loop, and it's a Bad Idea to hold locks
for "macroscopic" time. I think this is what Josip Almasi was getting
at with "or go to java.util.concurrent" -- which usually involves more
than just dropping in a ConcurrentLinkedDeque or something, but requires
a fresh look at what the program really, truly needs to do.
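
One java.util.concurrent option, offered only as an illustration and not as something proposed in this thread, is CopyOnWriteArrayList: its iterators walk a snapshot of the list, so no lock is held during the loop. The sketch below assumes a read-mostly list, since every write copies the backing array. SnapshotIterationDemo and iterateWhileAdding are invented names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class; the names are illustrative only.
class SnapshotIterationDemo {

    // Returns {elements seen by the loop, final size of the list}.
    static int[] iterateWhileAdding() {
        List<String> theList =
                new CopyOnWriteArrayList<String>(Arrays.asList("a", "b", "c"));
        int seen = 0;
        for (String t : theList) {
            // Modifying the list mid-iteration would throw
            // ConcurrentModificationException on ArrayList or Vector;
            // here the loop keeps walking its original snapshot.
            theList.add(t + "'");
            seen++;
        }
        return new int[] { seen, theList.size() };
    }

    public static void main(String[] args) {
        int[] result = iterateWhileAdding();
        System.out.println(result[0] + " seen, " + result[1] + " stored");
    }
}
```

The loop visits only the three original elements even though the list has grown to six by the time it finishes. Whether a possibly stale snapshot is acceptable is exactly the kind of question the "fresh look" has to answer.
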
Back to the original question: Vector is disparaged for two main
reasons: It's slower than ArrayList, and it's not fashionable. The
first reason is more often cited, but the latter probably has more
weight.
--
Eric Sosman
esosman@comcast-dot-net.invalid