Re: Why do you deserve a better IO library

From: "kanze" <kanze@gabi-soft.fr>
Newsgroups: comp.lang.c++.moderated
Date: 15 Jun 2006 11:02:57 -0400
Message-ID: <1150359643.178974.202390@y41g2000cwy.googlegroups.com>

Alf P. Steinbach wrote:

* kanze:

Alf P. Steinbach wrote:

as mentioned, you cannot implement "cat" for Windows using
only standard C++ functionality. That's because you cannot
access the underlying binary i/o functionality of the
standard input and output streams: you can only access an
interface that's on top of a modal translation engine where
the mode cannot be changed (all three aspects are horrible!
and are not things that occur naturally, they must be
intentionally designed in).

But that's something we inherited from C, and that we won't be
able to change as long as our IO is defined in terms of and by
reference to the C standard.

Currently there's no assumption of e.g. std::cout being based
on stdout.

Directly, no, but the semantics of ostream are defined in terms
of printf, the semantics of filebuf::open are defined in terms
of fopen, and so on.

On one hand, this is a good thing. It ensures that there are at
least two languages (C and C++) with compatible file semantics.
On the other hand, it also ensures that where C is broken, C++
is too -- the undefined behavior when reading numeric values is
an obvious example (and one that would have been trivial to fix,
without breaking the link with C otherwise).
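
For comparison, the strtol family already reports out-of-range input instead of leaving the behavior undefined the way the scanf family does. The following is only a sketch of what a defined result could look like; parse_long is a hypothetical name, and nothing here claims to be how any iostream implementation reads numbers.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parses a decimal long from text. Returns true and
// stores the value only if the whole string is a number that fits in a
// long; otherwise returns false and leaves result untouched.
bool parse_long(const char* text, long& result)
{
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);

    if (end == text)        // no digits were consumed
        return false;
    if (*end != '\0')       // trailing characters after the number
        return false;
    if (errno == ERANGE)    // value does not fit in a long
        return false;

    result = value;
    return true;
}
```

An input such as "999999999999999999999999999" makes this function return false, whereas the C standard leaves the equivalent fscanf("%ld") conversion undefined.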

The problem is that that infernal translation, or rather, the
/possibility/ of such translation occurring, has been
misguidedly designed in for C++ i/o, in the same way as in the
C streams.

The problem is that iostream, like FILE*, tries to be all things
to all people. Logically, we need separate types for binary and
text IO.

But this doesn't address your complaint that you can't write cat
in standard C++ (or in standard C). The problem there is more
fundamental, and unavoidable if we accept the constraint that
C++ must be implementable on systems other than Windows and
Unix.

The only positive effect is to make it "easy" for a C++
implementor to reuse the C library's translation, and for that
we pay the price of totally crippled i/o functionality (so
that it's often not used anyway).

The other positive effect is that we do have two languages with
the same file semantics. That we can (or should be able to)
easily read files -- including binary files -- in one language
that were written in the other. If your reference is Unix, this
seems obvious, since Unix only has one file type. Windows is
similar, and the only problems between Windows and Unix are due
to C++ imposing the non-standard interpretation of '\n' adopted
by Unix internally. But once you leave these two, things get a
lot trickier.

[snip]

I think you're missing the point that the distinction
binary/text is based on a distinction of actual file types
in many systems. This means that there really isn't any way
to handle it otherwise.

Sorry, James, that's incorrect.

Have you actually used C or C++ on a mainframe? I have, and I
can assure you that a file written in binary mode cannot be
opened in text mode, and vice versa.

There are systems for which stream i/o isn't possible, but
that doesn't stop us from having it in the standard. There
are /conceivably/ systems where binary i/o isn't possible, but
hey, it's in the standard anyway. There's no system where
static type checking can guarantee that you're using only
allowable operations for the right kind of file: that's a
run-time thing, always with the possibility of run-time
failure, and it provides an abundance of options of how to
handle it, very otherwise.

OK. If I understand you correctly, what you are asking for is a
function (in filebuf, at least) which allows specifying the mode
of an open file. If the system can't handle it (because we're
changing the mode, and the system doesn't allow that), the
function returns an error code, much as if we tried to open a
text file in binary mode on such a system.

That sounds legitimate, and not too difficult to implement. Just
returning an error would be a legal implementation, although
under Unix or Windows, QoI would require something more. I
could knock up a proof of concept implementation under Unix in
about five minutes, since all the function would do is return
successfully:-); maybe someone who has actually implemented
iostream under Windows could comment on how difficult it would
be there. (It occurs to me that use of the function could be
restricted to before the first actual IO, like modifications of
the codecvt facet are. I can imagine that handling already
buffered input from a non-seekable device would be rather
tricky.)
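
A minimal sketch of such a function follows, with loudly stated assumptions: the name set_binary_mode is hypothetical, and it takes a FILE* rather than a filebuf, because the standard filebuf exposes no handle to work with. The Windows branch relies on the Microsoft C runtime's _setmode and _fileno; the Unix branch is the "return successfully" implementation described above.

```cpp
#include <cstdio>

#if defined(_WIN32)
#include <fcntl.h>   // _O_BINARY
#include <io.h>      // _setmode
#endif

// Hypothetical helper: asks the system to treat an already open stream
// as binary. Returns false if the request cannot be honored.
bool set_binary_mode(std::FILE* stream)
{
    if (stream == nullptr)
        return false;
#if defined(_WIN32)
    // _setmode returns the previous mode, or -1 on failure.
    return _setmode(_fileno(stream), _O_BINARY) != -1;
#else
    // Assumes a Unix-like system, which makes no distinction between
    // text and binary files, so there is nothing to change.
    return true;
#endif
}
```

On a system that cannot change the mode of an open file, the whole body could legally be "return false;". The sketch deliberately says nothing about already buffered data, which is the hard part noted above, so it should only be called before the first actual IO on the stream.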

I can see one problem: although not required by the standard,
any quality implementation will keep std::cin and stdin in sync.
How do you do this if you have this function?

Just to show how silly that "can't be done otherwise because
of system XYZ" really is, consider systems with only fixed
size record files. Can't have anything else than fixed size
record i/o in the C++ standard library, because really, it
can't be handled otherwise.

The standard was conceived in such a way that it could be
implemented using fixed size records. In fact, the first system
I actually used C on had fixed sized records -- really fixed
sized, in fact, since they were always 128 bytes, regardless of
the file. This is why binary files may be padded at the end,
and why trailing white space in a text file may disappear when
read. (Curiously enough, the only reference in the C standard
that I can find to a limit on line length is in a non-normative
footnote. Obviously, however, if the implementation maps lines
to records, and the system has a maximum record size, you have a
maximum line length.)
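
The padding rule is simple arithmetic. The sketch below computes how large a binary file appears when read back on a system that stores whole records only; the function name padded_size is hypothetical, and the 128 in the usage note is the record size mentioned above.

```cpp
#include <cstddef>

// Hypothetical helper: size a file occupies (and reports when read back)
// on a system that can only store whole fixed size records.
std::size_t padded_size(std::size_t bytes_written, std::size_t record_size)
{
    if (record_size == 0)   // no record structure: nothing to pad
        return bytes_written;

    // Round up to a whole number of records.
    const std::size_t records =
        bytes_written / record_size + (bytes_written % record_size != 0);
    return records * record_size;
}
```

With 128 byte records, padded_size(130, 128) is 256: a program that wrote 130 bytes reads back 256, and nothing in the file says where the real data stopped. That is why the C standard allows an implementation-defined number of null characters to be appended to a binary stream.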

--
James Kanze GABI Software
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
