Re: how actually string store in java machine
On 3/28/2014 9:58 AM, taqmcg@gmail.com wrote:
On Thursday, March 27, 2014 11:50:22 AM UTC-4, Eric Sosman wrote:
On 3/27/2014 11:23 AM, taqmcg@... wrote:
On Thursday, March 27, 2014 8:54:36 AM UTC-4, Eric Sosman wrote:
On 3/26/2014 7:37 PM, markspace wrote:
On 3/26/2014 10:54 AM, Eric Sosman wrote:
...
I think so, too. It feels like an implementation detail
that might better have been left unspecified.
But that would mean that the behavior of programs that compared
constant strings was subject to subtle implementation dependencies,
anathema to the spirit behind the development of Java. I'd suggest
the fact that the designers spelled this out is a testament to
the concern they had for making Java a really portable language and
the thoroughness of their analysis.
Yes, it would change the behavior -- of programs that relied
on the dubious practice of comparing "value" objects with == rather
than with equals(). All I'm saying is that I think it would have
been better not to encourage the dubious practice. Too late now,
of course.
As for "subtle implementation dependencies" -- Well, Java is
certainly not free of them. It has fewer such dependencies than
many other languages I've used, but still has enough to make life
interesting.
Personally I've been very impressed with how carefully consistency of execution is addressed in the Java design. The only places I've seen real implementation dependency are in the interaction with the world external to the Java environment (notably the file system) and in multithreading. Not sure that there's really anything one can do about the first. I don't use multithreaded tasks much, but I gather the technology was not sufficiently developed to fully specify this. Regardless, my sense is that most of the variability in multithreaded applications comes not so much from differences in the implementation of threading as from the interactions between the threads and the scheduler, which again might be thought of as something external.
I'd be interested in understanding other areas where Java programs can legally give different answers absent some difference in external inputs. The only one I'm aware of is the non-strict handling of floating point. They tried to enforce consistency there originally -- strict was the original requirement -- but the performance cost was simply too great, and the tiny degree of implementation dependency that's allowed there now affects a minuscule fraction of calculations.
My sense of the spec was not that they wanted to encourage users to use == to compare constant strings -- nor to discourage it -- but they recognized that it was a legal operation and so that it needed a defined result. From that perspective they might have chosen that constants in different classes would use different instances. I don't know that that would be a better or worse choice, but either would -- from the perspective of Write-once, run many times -- be better than leaving it unspecified. I used the word anathema above, and I think that was appropriate for their (i.e., the designers of Java) view of leaving something undefined as a mechanism to discourage its use.
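For concreteness, here is a minimal sketch of the behavior under discussion. The class and field names (ConstantsA, ConstantsB, InternDemo, GREETING) are made up for illustration. It relies only on the language rule that string literals and constant expressions are interned, and that `new` always creates a distinct object.

```java
// Two unrelated classes that each declare the same string constant.
class ConstantsA {
    static final String GREETING = "hello";
}

class ConstantsB {
    static final String GREETING = "hello";
}

class InternDemo {
    /** Builds an equal string at run time; "new" always yields a distinct object. */
    static String runtimeCopy() {
        return new String("hello");
    }

    public static void main(String[] args) {
        String copy = runtimeCopy();

        // Literals are interned, even across classes, so these share one instance.
        System.out.println(ConstantsA.GREETING == ConstantsB.GREETING); // true

        // A compile-time constant expression is interned as well.
        System.out.println(ConstantsA.GREETING == "hel" + "lo");        // true

        // A string created at run time is a different object with equal contents.
        System.out.println(copy == ConstantsA.GREETING);                // false
        System.out.println(copy.equals(ConstantsA.GREETING));           // true

        // intern() maps it back to the canonical instance.
        System.out.println(copy.intern() == ConstantsA.GREETING);       // true
    }
}
```

Only the equals() line expresses what a program usually means. The == lines are well defined for constants, but they stop holding as soon as one operand is computed at run time.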
As I said, Java behaves more consistently across platforms than
other languages in my experience. "Write Once, Run Anywhere" was a
goal of the language, but like "Don't Be Evil" it's not a goal that
was attained in perfection. A few examples of variability:
- Integer.valueOf(int): "This method will always cache values in
the range -128 to 127, inclusive, and *may* [emphasis mine]
cache other values outside of this range." Hence, a test
like `System.identityHashCode(Integer.valueOf(200)) ==
System.identityHashCode(Integer.valueOf(200))' may yield
either true or false, depending on the implementation. Much
the same holds for other primitive wrapper classes, too.
- HashMap: "This class makes no guarantees as to the order of
the map; in particular, it does not guarantee that the order
will remain constant over time." Also, "Note that the fail-fast
behavior of an iterator cannot be guaranteed."
- Map: "The behavior of a map is not specified if the value of
an object is changed in a manner that affects equals comparisons
while the object is a key in the map." Also, "Implementations
are free to implement optimizations whereby the equals invocation
is avoided, for example, by first comparing the hash codes of the
two keys," so you might *or might not* hit a breakpoint (etc.)
in an equals() method when searching or inserting.
- Class.getMethods(): "The elements in the array returned are
not sorted and are not in any particular order." You might
get differently-ordered arrays for the same class from
     different JVMs, and the same goes for getConstructors()
and so on, too.
- Evaluate `Object.class.hashCode() > System.class.hashCode()'.
Discuss.
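
Two of the items above can be sketched in a few lines. This is an illustration only, and the class and method names (VariabilityDemo and its helpers) are hypothetical. It contrasts the guaranteed part of the Integer cache with the unspecified part, and shows one way a program might make itself independent of getMethods() ordering by sorting the result.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class VariabilityDemo {
    /** Guaranteed true: values in -128..127 are always cached by valueOf(int). */
    static boolean smallValuesShareInstance() {
        return Integer.valueOf(127) == Integer.valueOf(127);
    }

    /** May be true or false: 200 lies outside the range the spec promises to cache. */
    static boolean largeValuesShareInstance() {
        return Integer.valueOf(200) == Integer.valueOf(200);
    }

    /** Always true: equals() compares values, not object identity. */
    static boolean largeValuesEqual() {
        return Integer.valueOf(200).equals(Integer.valueOf(200));
    }

    /** getMethods() order is unspecified, so sort the signatures for a stable order. */
    static String[] sortedMethodSignatures(Class<?> type) {
        Method[] methods = type.getMethods();
        String[] signatures = new String[methods.length];
        for (int i = 0; i < methods.length; i++) {
            signatures[i] = methods[i].toString();
        }
        Arrays.sort(signatures);
        return signatures;
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShareInstance());   // true on every JVM
        System.out.println(largeValuesShareInstance());   // implementation-dependent
        System.out.println(largeValuesEqual());           // true on every JVM
        for (String signature : sortedMethodSignatures(Object.class)) {
            System.out.println(signature);
        }
    }
}
```

The second println is the one whose output the specification leaves open; the others print the same thing everywhere.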
I'll grant that a "sane" program would not be affected by these
or similar implementation dependencies. It seems to me, though,
that a "sane" program wouldn't compare String instances with `=='.
--
Eric Sosman
esosman@comcast-dot-net.invalid