Re: mixed-sign arithmetic and auto
Jerry Coffin wrote:
Yes, it would. It's entirely possible and indeed quite practical and
reasonable to write C and C++ code that works perfectly fine depending
only upon what C and C++ guarantee for char (for one example). The fact
that under some circumstances char might be 16, 32 or even 64 bits
doesn't bother the code a bit. Even code that is written to depend on 8-
bit chars is usually quite easy to convert to remove that dependency.
Except when you're following the (very common) practice of using char
types for byte manipulation. Or when you're dealing with Unicode encodings.
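For example, typical byte-twiddling code quietly assumes CHAR_BIT == 8.
A minimal sketch (the function is hypothetical):

/* Serialize a 32 bit value as four 8 bit "bytes", little endian.
   On a DSP where char is 32 bits, each out[i] can hold the whole
   value and the memory layout no longer matches the wire format. */
void put_u32le(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)( v        & 0xFF);
    out[1] = (unsigned char)((v >>  8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}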
D goes the opposite route: by guaranteeing that char will always be 8
bits, it (tacitly?) encourages people to write their code to depend on
that fact. It's a bit difficult to guess at how easy it is to write D
code that won't break with other sizes of char, since there's apparently
no way to test such code right now.
If you were worried about that, you could:
typedef char mychar; // 8 bit chars
or:
typedef wchar mychar; // 16 bit chars
or:
typedef dchar mychar; // 32 bit chars
I suspect that this is an even better solution for those programming
such machines, because they won't labor under the delusion that code
that has never been tested under such conditions was "portably" written
according to some mistaken notion of portability.
I think this is a delusion. People who normally work with such machines
are unlikely to be deluded about the amount of code that's really
portable.
I agree that people who are used to porting between machines A and B
will have long since figured out how to write code that is portable from
A to B. My point was for people who had no experience with such ports
attempting to write code that is portable.
Writing code that is portable in C++ is a process of accumulating
techniques, tricks, and methods over a period of time from experience.
Just reading the spec isn't sufficient (and few C++ programmers have
actually read the whole thing).
Nonetheless, if somebody _wants_ to do so, they most certainly
_can_ write C and/or C++ code that works fine with various sizes of
char. With D they can't do any such thing, because anything with a
different size of char, by definition, isn't D.
See the above typedef's.
A more relevant question is how many programmers are programming for
these oddballs, vs programming for mainstream computers?
That, of course, is a difficult question to answer. The ratio of
programmers to CPUs is almost certainly smaller than with mainstream
computers, but sales volume is also _quite_ high in many cases, making
it difficult to figure out the actual number.
The sales volume of embedded processors is quite meaningless for
determining the number of programmers working on them. It's just as
worthless as estimating the number of Apple engineers by counting iPod sales.
UB does not imply reliable or repeatable behavior, so any dependence on
UB is inherently unreliable _and_ unportable.
This is nonsense and you know it. UB does not _imply_ reliable or
repeatable, but it does not preclude either one. The standard
specifically allows an implementation to define the result of particular
undefined behavior -- and when portability isn't a concern, the fact
that it's defined only by the implementation is entirely irrelevant.
The compiler implementor can guarantee whatever he wants to, but the
specification does not guarantee anything with regards to UB. Erratic,
random behavior is certainly allowed by the spec, and indeed happens
with UB like buffer overflows.
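A minimal sketch of how erratic it can get: because signed overflow is
UB, a compiler is entitled to assume it never happens and quietly delete
this "check":

/* Intended as an overflow test, but since signed overflow is UB
   the optimizer may assume x + 1 > x always holds and fold the
   function to return 0 -- the check silently vanishes. */
int will_overflow(int x)
{
    return x + 1 < x;
}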
But when you've got a
million lines of code, suddenly even the obscure cases become probable.
And when you don't have a thorough test suite (who does?) how can you be
*sure* you don't have an issue there?
Being truly "*sure*" of anything is pretty rare -- even if it's required
by the language, it's nearly impossible to be sure you couldn't run
across some obscure bug in the compiler.
I agree, but that doesn't justify not doing what we can to improve the
odds of things being correct. Boeing can never guarantee their planes
won't crash, but they work very hard at addressing every cause of
failure that they know about.
I'm very interested in building languages which can offer a high degree
of reliability. While D isn't a language that gets one there 100%, it
gets a lot closer than C++ does.
I've yet to see convincing evidence of that -- though I'll admit I
haven't looked very hard, and such evidence would be off-topic here in
any case.
I don't believe such evidence would be off-topic in a discussion about
whether C++ should fix its UB problems or not.
I would vote for C++ to explicitly ditch 16 bit support. No problem there!
To accomplish what?
To get rid of the possibility of 16 bit 'int' types.
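The hazard is the classic one (a minimal sketch):

/* Fine when int is 32 bits; UB when INT_MAX is 32767, because
   300 * 300 == 90000 overflows a 16 bit int. */
int area = 300 * 300;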
For most practical purposes, 16-bit platforms were
"ditched" when exception handling was added. OTOH, if somebody's willing
to go to the time, effort, etc., of implementing it for a 16-bit
platform, and meets all the requirements, more power to them.
Digital Mars C++ supports 16 bit code with exception handling. It works
technically, but is not practical. I agree that 32 bits is needed for EH.
While
portability is a useful attribute, there are also many perfectly good
reasons to write code that's not portable.
I agree. But each issue of UB and IDB should be evaluated in terms of
its costs and benefits. C++ has too much UB and IDB whose costs are
significant and whose benefits are dubious. D also has UB and IDB, just
a lot less of it.
On Intel, I believe it worked fine with every
compiler I had handy at the time as well (at least with MS, Borland,
Intel and GNU).
Windows compilers are pretty compatible in their handling of UB and IDB,
quite deliberately so as each tried to lure customers away from their
competitors. It's a lot easier to lure a customer if their code compiles
and *works* without changes.
Hence, evidence of portability among Windows C++ compilers is not much
evidence in favor of allowing UB and IDB.
I define it as the likelihood that, if a program compiles *and* works on
X, it will also compile *and* work on Y, regardless of how good the
programmer is.
That makes it sound a great deal like D probably won't support most of
what I develop at all.
I was defining portability with that statement, not D. D isn't intended
to be 100% portable. But D does intend to remove non-portable aspects of
language design whose costs exceed the benefits.
1) reliance on UB or IDB is not mechanically detectable, making programs
*inherently* unreliable
You're overstating the situation. Certainly some reliance on some UB
and/or IDB is mechanically detectable.
Runtime integer overflow isn't.
Yes and no. A compiler or static analyzer can detect when runtime
integer overflow becomes a possibility, and even (for example) show the
source of the values that could result in the overflow.
This could result in overflow:
int sum(int a, int b) { return a + b; }
A compiler that nagged about this would be more of a nuisance than a help.
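Where overflow genuinely matters, the guard has to be written out by
hand in either language anyway. A minimal sketch (the function name is
mine):

#include <limits.h>

/* Hand-rolled guard: the portable way to add two ints when the
   language neither defines nor detects signed overflow. */
int checked_sum(int a, int b, int *overflowed)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
    {
        *overflowed = 1;
        return 0;
    }
    *overflowed = 0;
    return a + b;
}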
A typical C++ compiler has a bewildering array of switches that change
its behavior.
Change what behavior?
man g++
will list 40 of them under "C++ Language Options".
How much effort have you seen, time and again, going into dealing with
the implementation-defined size of an int? Everybody deals with it, 90%
of them get it wrong, and nobody solves it the same way as anybody else.
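The usual result looks something like this (a hypothetical sketch;
every real codebase spells these names differently):

#include <limits.h>

#if UINT_MAX == 0xFFFF
typedef long int32;        /* int is 16 bits here */
#else
typedef int int32;         /* silently assumes int is exactly 32 bits */
#endif
typedef short int16;       /* assumes short is exactly 16 bits */
typedef signed char int8;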
How is this not very expensive?
I've seen a lot of effort put into it repeatedly, but I'd say over 99%
of the time, it's been entirely unnecessary from beginning to end.
In D, the effort to deal with it is 0 because the problem is defined out
of existence.
I suppose that depends on your viewpoint. At least in Java, the attempt
at defining the problem out of existence has created a problem that I'd
say is at least twice as bad.
The size of an int is a bad problem in Java?
Ideally, if a program
compiles, then its output should be defined by the language.
I disagree. There is a great deal of room for non-portable code in the
world. Attempting to define the output for all code is pointless and
foolish.
D doesn't eliminate all UB or IDB. But it does try to eliminate all the
ones where the costs exceed the benefits.
In other words, C++ has de facto standardized around 32 bit ints.
I disagree. If anything, the prevalence of 32-bit ints has created
problems, not cured them.
What problems?
languages like Java and D don't even allow that.
In D, you can use a variable sized int if you want to:
typedef int myint;
and use myint everywhere instead of int. To change the size, change the
typedef. Nothing is taken away from you by fixing the size of int. It
just approaches it from the opposite direction:
That doesn't fix the problem.
Why not?
C++: use int for variable sizes, typedef for fixed sizes
D: use int for fixed sizes, typedef for variable sizes
Java: doesn't have typedefs, oh well :-)
You've got things backwards: in C++ you get code that works correctly on
different sizes of machines, unless you take steps to stop it from doing
so. In D you get code that works correctly on different sizes of
machines only by going through massive brain damage to undo its
mistakes.
A typedef is massive brain damage? I'm not following this at all.
Contrary to your previous claims, targets you see fit to
ignore have not gone away, nor are they likely to do so anytime soon.
I think 16 bit DOS and 36 bit PDP-10s are dead and are not likely to
rise from their graves.
So why did you make statements about 16-bit DOS as if it was a relevant
target?
I used it as an example of where C++ tried to be portable, but failed.
C++98 *did* try to accommodate 16 bit targets.
Yes, and Digital Mars C++ does, too. I know of nobody who actually tests
their code using those switches.
You do now!
There's one!
Most switches of that sort are of limited utility anyway, because they
screw up the ABI with existing compiled libraries.
What problems have you verified along that line? I've used unsigned char
quite a bit without seeing such problems.
Well, anything that depends on CHAR_MAX and CHAR_MIN being constant
throughout the program, for example.
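For instance (a minimal sketch, assuming a -funsigned-char style switch):

/* With plain char signed (the usual default) this is true for
   bytes >= 0x80; built with a -funsigned-char style switch it is
   always false. Mix the two settings across translation units and
   CHAR_MIN/CHAR_MAX are no longer constant throughout the program. */
int high_bit_set(char c)
{
    return c < 0;
}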
The C++ compiler for the SHARC has many SHARC-specific extensions. It
isn't hard to imagine a D variant that would do the same. You'd have the
same difficulties porting C++ code to the SHARC C++ compiler as you
would porting standard D to SHARC-specific D.
You've got things backwards -- while the SHARC extensions certainly make
it easy to write C++ code for it that won't port elsewhere, they do NOT
prevent portable code from working on the SHARC.
That's kinda obvious: code that's portable to the SHARC works on the
SHARC <g>.
In the D case, however, there seems to be no way to write code that's
reasonably portable TO the SHARC.
If you eschew char and short in favor of dchar and int, your code will
port to the SHARC.
At best D appears to make that much more difficult, and for most
practical purposes appears to rule it out completely.
That's just false. If the SHARC had 7 bit bytes, I'd concede the point.
But that's not the case. If you stick with D types that do match SHARC
types, it's just as portable as the C++ code is.
I challenge you to port zlib (written in C) to the SHARC. I think you'll
find it every bit as much work as if it were in D.
--------
Walter Bright
http://www.digitalmars.com
C, C++, D programming language compilers
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]