Re: one interview question, 17 lines in java, 3 lines in ruby.
Lew wrote:
Well, mine is stored in bytes -- see the "Content-Type" field of my
message. :-)
That tells how your message is sent, not how your source is stored.
Well, not necessarily stored as such. But believe me, on my disk a copy
of my message is stored exactly as it was transmitted, that is, as the
bytes (encoded using the ISO-8859-2 charset) representing the source
characters that you (and others) received later.
The same goes for the original source file (C.java) of the published
piece of code, whose size is exactly 177 bytes. The only minor
difference between the posted message and the file is that the source
code was converted into bytes using the Cp1250 charset, which for this
particular source code gives exactly the same sequence of bytes as the
ISO-8859-2 charset does.
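To illustrate the Cp1250 vs. ISO-8859-2 point, here is a minimal sketch
(not the original C.java, which is not reproduced here). The class name
CharsetCompare, the helper sameBytes and the sample strings are made up
for illustration, and it assumes the JVM provides both charsets. ASCII-only
text should encode to the same bytes under both, while a character such
as U+0105 should not:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class CharsetCompare {
    // Returns true when both charsets encode the text to identical bytes.
    static boolean sameBytes(String text, String csA, String csB) {
        byte[] a = text.getBytes(Charset.forName(csA));
        byte[] b = text.getBytes(Charset.forName(csB));
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for an ASCII-only source file.
        String ascii = "class C { }";
        // U+0105 (a with ogonek), written as an escape so that this
        // example itself stays pure ASCII.
        String ogonek = "\u0105";

        // ASCII characters map to the same byte values in both charsets.
        System.out.println(sameBytes(ascii, "ISO-8859-2", "windows-1250"));  // true
        // U+0105 is 0xB1 in ISO-8859-2 but 0xB9 in windows-1250 (Cp1250).
        System.out.println(sameBytes(ogonek, "ISO-8859-2", "windows-1250")); // false
    }
}
```

So "same bytes in both charsets" holds only as long as the source sticks
to characters that the two charsets place at the same byte values.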
According to the Java Language Specification, Java source files are in
characters:
Programs are written in Unicode ...
and
Programs are written using the Unicode character set.
You see? Are *written*, not necessarily *stored* as such.
Other way around. JLS Chapter 3:
lexical translations are provided (§3.2) so that Unicode escapes
(§3.3) can be used to include any Unicode character using only ASCII
characters.
The /Unicode escapes/ are completely unrelated to what we are talking
about. They are processed after the bytes of a source file have been
converted into characters (ASCII or Unicode). In other words, the input
is already characters -- called a /raw Unicode character stream/ --
which are then translated into other Unicode characters. Translation
into a sequence of input tokens begins only after that translation.
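A small sketch of that ordering, assuming nothing beyond a standard Java
compiler (the class name EscapeDemo and the field are made up for
illustration). Because the escapes are translated before tokenization,
an escape can even stand in for part of an identifier:

```java
public class EscapeDemo {
    // The escape below is translated to the letter 'a' before the
    // compiler tokenizes this line, so the field is simply named "a".
    static int \u0061 = 42;

    public static void main(String[] args) {
        System.out.println(a);                     // 42
        // Inside a string literal the escape is also just the character 'A'.
        System.out.println("\u0041".equals("A"));  // true
        System.out.println("\u0041".length());     // 1
    }
}
```

None of this depends on how the file itself was stored; the escapes are
already characters by the time this step runs.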
But you are right: to avoid confusion in our small contest, it is better
to count characters. :)
The JLS requires it.
Nope. AIUI, I can _store_ the source code in whatever form I like, and
the JLS cannot prevent me from doing that. The only requirement is to
tell my compiler (normally via the -encoding option) how the Unicode
(or ASCII) characters are encoded (as bytes) in my Java source files.
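Why the compiler needs to be told can be sketched like this (DecodeDemo
and decode are hypothetical names, and the single byte is a made-up
sample, not taken from C.java). The same stored byte becomes a different
character depending on which charset is declared, which is the choice
that something like "javac -encoding Cp1250 C.java" makes for a whole
file:

```java
import java.nio.charset.Charset;

public class DecodeDemo {
    // Decodes stored bytes into characters using the declared charset.
    static String decode(byte[] stored, String charsetName) {
        return new String(stored, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] stored = { (byte) 0xB9 };  // one stored byte, sample value

        // Print code points in hex to keep the output pure ASCII.
        String asCp1250 = decode(stored, "windows-1250");
        String asLatin2 = decode(stored, "ISO-8859-2");
        System.out.println(Integer.toHexString(asCp1250.charAt(0))); // 105
        System.out.println(Integer.toHexString(asLatin2.charAt(0))); // 161
    }
}
```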
piotr