Re: Why "lock" functionality is introduced for all the objects?

From: Lew <noone@lewscanon.com>
Newsgroups: comp.lang.java.programmer
Date: Tue, 28 Jun 2011 16:20:14 -0400
Message-ID: <iudd1t$9mu$1@news.albasani.net>

On 06/28/2011 02:52 PM,
supercalifragilisticexpialadiamaticonormalizeringelimatisticantations wrote:

On 28/06/2011 2:33 PM, Lew wrote:

On 06/28/2011 02:23 PM,
supercalifragilisticexpialadiamaticonormalizeringelimatisticantations
wrote:

On 28/06/2011 2:13 PM, Lew wrote:

Michal Kleczek wrote:

Lew wrote:

Show me the numbers. What penalty?

It is (almost) twice as much memory as it could be and twice as much GC
cycles. Almost because in real world the number of objects that you
need to

Nonsense. It's an extra 4 bytes per object. Most objects are much larger
than 4 bytes,

Bullpuckey, other than that a nontrivial object is always at least 12
bytes

So 4 bytes overhead is less than 100%, as I said.

I didn't dispute that. I disputed "most objects are much larger than 4 bytes".
Most objects are only a little bit larger than 4 bytes.

And yet you go on and on and on about how much larger than 4 bytes they are,
yourself. Which is it?

By your count, objects are at a minimum two to six times larger than four bytes,
even discounting the 4 bytes for the monitor, just for overhead alone.

So again, which is it?

Most strings in a typical program are non-empty and generally longer
than two bytes.

A lot longer, though, or only a little?

According to you, a lot longer.

A good percentage are interned.

Not in my experience.

In most of the programs I've seen they are. (Where "good" means something
large enough to notice.) String literals alone abound in all the real-world
Java code that I've seen. Dynamic string variables exist, too, of course, and
I'm not claiming that a majority are interned.

Strings in many runtime contexts refer to substrings of those already
in memory, saving overhead.

Not in my experience. And substring sharing is a three-edged sword, with two
possible downsides:

1. A small string hangs onto a much larger char array than is needed,
the rest of which is unused but can't be collected.

2. Small strings are even less efficient if one adds an offset as well
as a length field to the string, besides the pointer to the char
array.

And let's not forget that a string incurs the object overhead twice, once for
the string and once for the embedded array, assuming that array ISN'T (and it
usually isn't) shared.

But the overhead of the monitor is still only 4 bytes, less than 100% of the
object size.

From what you're saying, less than 100% of the object header alone.

Ergo the claim that the monitor doubles the allocation size is bogus.

So we're looking at one object header going along with 12 bytes of offset,
length, pointer to array; then another going along with 4 bytes of length and
2 per character for the actual array contents. For an eight-character string
we have 16 bytes of actual data and 32 bytes of overhead from two redundant
(if the array isn't shared) length fields, an offset field, a pointer, and two
8-byte object headers. That's 33% meat and 67% fat, folks. For an EIGHT
character string. A substrings-are-separate implementation fares somewhat
better: eight byte object header, four byte pointer, eight byte object header,
four byte length, array contents, for 24 rather than 32 bytes of cruft. Still
60% overhead. If Java had const and typedef and auxiliary methods so you could
just declare that String is another name for const char[] and tack on the
String methods, you'd get away with just 12 bytes of overhead: array object
header and length field. Now the 8 character string is actually more than 50%
meat instead of fat. Well, unless you count all the empty space between the
probably-ASCII-bytes ... encoding them internally as UTF-8 would save a lot
more space in the common case. Maybe we should assume that only about 60% of
the space taken up by the actual chars in the string is non-wasted, presuming
a low but nonzero prevalence of characters outside of ISO-8859-1; now a ten
character string has four wasted bytes internally, plus the object
header/various fields of overhead. Still somewhat icky.
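The arithmetic in the paragraph above can be checked mechanically. This is a sketch of the byte counting only, using the sizes assumed in this thread (8-byte object header, 4-byte int fields and references, 2 bytes per char); none of it is measured, and the class and method names are made up for illustration.

```java
// Arithmetic sketch only. The sizes are the thread's assumptions for a
// 32-bit JVM, not measurements. All names here are hypothetical.
final class StringOverheadSketch {
    static final int HEADER = 8; // assumed bytes per object header
    static final int FIELD = 4;  // assumed bytes per int field or reference
    static final int CHAR = 2;   // bytes per UTF-16 char

    // Bytes of actual character data for a string of the given length.
    static int dataBytes(int chars) {
        return chars * CHAR;
    }

    // String (header, offset, length, array pointer) plus char[] (header, length).
    static int sharedSubstringOverhead() {
        return HEADER + 3 * FIELD + HEADER + FIELD;
    }

    // String (header, array pointer) plus char[] (header, length).
    static int separateSubstringOverhead() {
        return HEADER + FIELD + HEADER + FIELD;
    }

    // A bare char[] with no wrapper object: header and length only.
    static int bareArrayOverhead() {
        return HEADER + FIELD;
    }

    // Overhead as a rounded percentage of the total footprint.
    static long overheadPercent(int overhead, int chars) {
        return Math.round(100.0 * overhead / (overhead + dataBytes(chars)));
    }

    public static void main(String[] args) {
        int chars = 8;
        int[] layouts = {
            sharedSubstringOverhead(),   // 32 bytes, 67%
            separateSubstringOverhead(), // 24 bytes, 60%
            bareArrayOverhead()          // 12 bytes, 43%
        };
        for (int overhead : layouts) {
            System.out.println(overhead + " bytes of overhead, "
                    + overheadPercent(overhead, chars) + "% of the total");
        }
    }
}
```

In every one of these layouts the overhead is several times the 4 bytes attributed to the monitor.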

Java strings are quite inefficient any way you slice 'em. But at least we can
get their lengths in O(1) instead of O(n). Take *that*, C weenies! Oh, wait,
most strings are short and it wouldn't take many cycles to find their lengths
at O(n) anyway ...

So that 4-byte overhead for a monitor is looking like less and less of a
problem by comparison.

Aren't you proving my point that objects are much larger than 4 bytes? We're
talking the overhead alone is 600% of the monitor overhead. That's "much
larger" in my book.

You're providing evidence for my point. Thanks.

Integer objects make up a small fraction of most programs. Many Integer
instances are shared, especially if one follows best practices. Not a
lot of memory pressure there.

Not my experience again, not since 1.5. Before autoboxing was introduced you
might have been right; now I expect there's a lot of "hidden" (i.e., the
capitalized classname doesn't appear in code much) use of boxed primitives,
particularly in collections.

Most of which are shared, and best practice militates against autoboxing, so
the scenario you describe represents bad programming. I was speaking to the
overhead assuming good programming. I mentioned that earlier.
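The sharing of small boxed values is easy to observe. A minimal sketch, with made-up names: the language specification guarantees that boxing an int in the range -128 to 127 yields the same Integer instance each time; outside that range sharing is not guaranteed, so the sketch makes no claim about larger values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; BoxingSketch and its methods are hypothetical names.
final class BoxingSketch {
    // Boxes the same int twice and compares the references, not the values.
    static boolean sameInstance(int value) {
        Integer a = value; // autoboxing
        Integer b = value; // autoboxing again
        return a == b;     // deliberate reference comparison
    }

    // The same sharing applies to values autoboxed into a collection.
    static boolean sharedInList(int value) {
        List<Integer> list = new ArrayList<Integer>();
        list.add(value);
        list.add(value);
        return list.get(0) == list.get(1);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(100)); // true: inside the guaranteed cache range
        System.out.println(sharedInList(5));   // true: both slots refer to one Integer
    }
}
```

How much this helps a real program depends on how many of its boxed values fall inside the cached range, which is exactly the kind of thing that needs measuring.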

You show only that the overhead of 4 bytes per object is less than 100%
of the object's memory footprint, which is what I said.

Keep on attacking that straw man ...

You're bringing in the straw man. The OP claimed that monitors doubled the
memory need for objects. This is the point I addressed, and therefore is not
a straw man. Calling it a straw man doesn't make it one.

You have, in fact, provided substantial evidence for my claim that the monitor
presents far less than 100% overhead.

How is directly addressing the main point remotely classifiable as a straw-man
argument?

Which footprint can be reduced by HotSpot, to the point of pulling an
object out of the heap altogether.

???

It's called "enregistration", and it's one of the optimizations available to
HotSpot, as is instantiating an object on the stack instead of the heap.
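Here is the kind of code where such an optimization has a chance of applying. A sketch only: Pair and sumOfSquares are made-up names, and whether HotSpot actually eliminates the allocation depends on the JVM version, its flags, and its inlining decisions. Nothing here proves that it happens on any given run.

```java
// Illustrative only; all names are hypothetical.
final class EscapeSketch {
    // A tiny value holder, similar in shape to java.awt.Point.
    private static final class Pair {
        final int x;
        final int y;

        Pair(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Each Pair is created, read, and dropped inside one loop iteration.
    // It is never stored in a field, passed to other code, or returned,
    // so a JIT compiler may be able to keep x and y in registers and
    // skip the heap allocation entirely.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Pair p = new Pair(i, i);
            total += (long) p.x * p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```

An object that never reaches the heap pays no header and no monitor cost at all, which is why paper arithmetic about per-object overhead can overstate what a running program sees.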

Where are the numbers? Everyone's arguing from speculation. Show me the
numbers.

Real numbers. From actual runs. What is the overhead, really? Stop
making shit up.

Stop accusing me of lying when I've done nothing of the sort.

Yet you don't show the numbers.

What other conclusion can I draw?

Tell verifiable truth if you don't want to be called to account for fantasy
facts. Don't get mad at me for pointing out your failure. Get mad at
yourself for the failure.

Show me the numbers.

http://c2.com/cgi/wiki?JavaObjectOverheadIsRidiculous

People ran tests and found an 8 byte overhead per object, much as was claimed
earlier in this thread. Oh, and that an array of java.awt.Points containing
pairs of ints is 60% overhead and 40% actual integer values in the limit of
many array elements -- so array-related overheads (object header, length
field) go away. That suggests something eating another 4 bytes per object *on
top of* the Points' object headers and integer values, showing that Point has
some extra field in it taking up space.

That was in Java 1.2.2, pre-HotSpot, and includes more than the 4 bytes
overhead of the monitor, which was the topic under discussion here despite
your attempts to reframe it.
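For what it's worth, the 60/40 split reported for arrays of Point is consistent with simple counting, without assuming a hidden field in Point. The sketch below is one possible accounting, not a measurement: it assumes an 8-byte object header, 4-byte ints, and one 4-byte reference slot in the array for each Point. The names are made up.

```java
// Arithmetic sketch only; sizes are assumptions for a 32-bit JVM.
final class PointArraySketch {
    static final int HEADER = 8;    // assumed bytes per object header
    static final int REFERENCE = 4; // assumed bytes per array slot holding a reference
    static final int INT = 4;       // bytes per int field

    // A Point carries two ints of actual data.
    static int dataBytesPerPoint() {
        return 2 * INT;
    }

    // Per-element overhead: the Point's own header plus the array slot
    // that refers to it. The array's own header and length are ignored
    // because they vanish in the limit of many elements.
    static int overheadBytesPerPoint() {
        return HEADER + REFERENCE;
    }

    static long overheadPercent() {
        int overhead = overheadBytesPerPoint();
        return Math.round(100.0 * overhead / (overhead + dataBytesPerPoint()));
    }

    public static void main(String[] args) {
        System.out.println(overheadPercent() + "% overhead"); // 12 of 20 bytes, 60%
    }
}
```

Under these assumptions the monitor's 4 bytes are a third of the per-element overhead and a fifth of the total.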

Michal Kleczek had written:
"It is (almost) twice as much memory as it could be and twice as much GC
cycles."

I said that was "nonsense", to which you replied "Bullpuckey", then proceeded
to demonstrate that I was correct. Thanks.

And how is that a straw-man argument on my part, again? Given that I directly
addressed that claim and you yourself provided evidence for my point? Hm?

And in regard to the original topic of this thread,

http://c2.com/cgi/wiki?EveryObjectIsaMonitor

raises some very good points, including that forcing people to use the
java.util.concurrent classes (while making "synchronized" exception-safely
lock a ReentrantLock, or similar) or having objects only be lockable if they
implemented a Lockable interface or inherited a Monitor class would have
resulted in code having to document its concurrency semantics and explicitly
declare which objects and which types of objects were meant to be used as
monitors. This point about self-documenting code, in which being intended for
locking is part of something's type and subject to type safety, was not raised
in this thread. Until now.
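A rough sketch of what that alternative might look like. The Lockable interface here is hypothetical (no such type exists in the JDK) and so are the other names; ReentrantLock and the lock/try/finally idiom are the real java.util.concurrent API.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical marker for "this type is meant to be locked".
// Not part of the JDK; invented for this sketch.
interface Lockable {
    ReentrantLock monitorLock();
}

// Only types that opt in carry a lock, and the type says so.
final class Account implements Lockable {
    private final ReentrantLock lock = new ReentrantLock();
    private long balance;

    public ReentrantLock monitorLock() {
        return lock;
    }

    void deposit(long amount) {
        lock.lock();
        try {
            balance += amount;
        } finally {
            lock.unlock(); // released even if the body throws
        }
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(5);
        account.deposit(7);
        System.out.println(account.balance()); // 12
    }
}
```

The trade-off is visible even in a sketch this small: the locking intent is documented in the type, at the price of more ceremony than a synchronized block.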

I'm not defending the decision to make every object a monitor, other than to
point out that it contributed mightily to Java's utility and popularity.
Surely there are better ways, despite the efforts of others to claim that I
said otherwise. But I am refuting the claim that the monitor adds 100% to the
object's memory footprint.

Meanwhile no one is showing me the numbers for non-obsolete Java in actual
runtime scenarios, taking HotSpot and realistic loads into account, nor has
anyone addressed the benefit side of the equation. The addition of monitors
to Java has a benefit. Is it worth the cost? That depends on the actual
cost, and the actual benefit, quantification of which is not in evidence in
this discussion.
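For anyone who wants to produce such numbers, a crude starting point might look like the sketch below. It is an assumption-laden approximation, not a proper benchmark: System.gc() is only a hint, heap readings from Runtime are noisy, the result varies by JVM, version, and flags, and it says nothing about which part of the footprint belongs to the monitor. The names are made up.

```java
// Rough measurement sketch; results are approximate and JVM-dependent.
final class ObjectSizeSketch {
    // Heap currently in use, after asking (not forcing) the JVM to collect.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    // Estimated bytes per plain Object, averaged over count instances.
    static double bytesPerObject(int count) {
        Object[] keep = new Object[count]; // allocated before the first reading
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = new Object();
        }
        long after = usedHeap();
        // Touch the array after measuring so the objects stay reachable.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("allocation was skipped");
        }
        return (after - before) / (double) count;
    }

    public static void main(String[] args) {
        System.out.println("approx. bytes per Object: " + bytesPerObject(1000000));
    }
}
```

Run it several times on the JVM in question, and repeat it with a class that has real fields, before drawing any conclusion about per-object overhead.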

A point neither you nor anyone else has yet addressed, choosing instead to
divert with side issues and straw men. That'd be you bringing in the straw
man, not me, dude.

Show me the numbers.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg