Re: Descriptive exceptions

From: "Alf P. Steinbach" <alfps@start.no>
Newsgroups: comp.lang.c++.moderated
Date: Wed, 21 Feb 2007 17:02:48 CST
Message-ID: <5435q2F1u7k52U1@mid.individual.net>

* Eugene Gershnik:

Alf P. Steinbach wrote:

* Eugene Gershnik:

On Feb 20, 6:58 pm, "Alf P. Steinbach" <a...@start.no> wrote:

[huge snip of everything related to i18n and exceptions]

Let's just say that many people disagree with your position with regards
to the language of error messages. Myself I would prefer the UN to proclaim
English as the only language on Earth starting from tomorrow but this has
little chance of happening ;-)

Good that we agree on an ideal, then, but as I understand it you see
some practical obstacles that I don't. What would those be?

I'm not sure what you're arguing against, or for.

As I said in my original post I argue that storing exception 'description'
strings within an exception is unnecessary and wasteful.

Well, that depends.

Presumably by "wasteful" you're not talking about data baggage, since
you wrote, quote, "it is a very good idea to include as much internal
information as you can".

Programmer time, then.

And for the id based scheme you're advocating, maintaining a repository
of strings is added programmer or at least project time, not to mention
for a repository of strings in various languages, as you've argued for
(e.g. above).

It seems to me that is the choice of the worst from all worlds: added
programmer/project work load, /and/ inefficiency, /and/ complexity.

On the other hand, an id based scheme doesn't in principle need to have
all that overhead, if (for example) exception messages are all in
English, and/or the number of different id's is very restricted or
relying on someone else's or an existing standardized repository of
message strings, e.g. the Windows API. But on the third & gripping
hand, the problem with an existing repository is the overhead of looking
up a suitable string and id, and creating new ones if no suitable ones
are found. My experience with using and maintaining such repositories
(admittedly years ago) is that in practice it is a /lot/ of overhead,
enough so that small helper tools are developed, for very little gain.

With regards to what follows I think I can summarise your arguments as
follows

1. throw/catch process is unreliable and so it is better to log any
problem prior to using it

The throw/catch process isn't unreliable (where on earth did you get
that idea?).

Client code, on the other hand, is unreliable (it's in the nature of
libraries to be better tested and more reliable than their driver code),
especially in the context where the log is needed, where the client code
has failed. But perhaps best to repeat & rephrase. When you have
client code that has failed so that you need the log, what can you say
about the reliability of the client code in that particular situation?

If you're going to rely on logs, as you've stated you are relying on,
don't you think it's better to guarantee that you have the log entries
you need, or as much as possible, than to not guarantee it?

2. it is better to log as soon as any problem is detected because there is
no guarantee that any code above would do it.

When you need the logs, yes.

It all depends on the kind of system.

I think that 1 is simply wrong (at least with modern compilers and code
that doesn't throw from destructors)

Yes.

and 2 is bad engineering.

You'd need to justify that position in some way. Saying it's so doesn't
make it so.

More details are
given below in response to your particular points.

I don't propose always logging exceptions at the throw point, only,
as I wrote, if you can, and that was in the context of exceptions
being logged anyway. If you're going to log all exceptions anyway,
making an initial log entry at the throw point guarantees that that
exception will be logged. Such a guarantee helps you guard against (1)
client code that doesn't log as it should, (2) crashes, with no (useful)
log entry, and in addition it (3) centralizes things, which is generally
good.

I think I still cannot get my point through. To address each of your
points above

1) "if you can" - you never can since you don't know whether the calling
code wants you to log

Why shouldn't one know that? It's easy to arrange.

2) "If you're going to log all exceptions anyway" - unless you have the
entire application code under your control how do you know this?

It seemed you knew that because you wrote, quote, "all exceptions are
caught".

3) "client code that doesn't log as it should" - you cannot decide what
amount of logging is proper for client code and what it should do.

That one's easy: let the client code control the logging. Note the
difference between "control" and "do".

4) "crashes" - more about it below
5) "centralizes things" - on the contrary, instead of having one place to
log in the catch block you would have it accompany every throw. In any
decent code there are many more throw-s than catch-es.

To "accompany" every throw with a logging call is a very bad idea, yes,
I agree :-). Doing that kind of code duplication would indeed be UnGood
to the extreme. On the other hand, wrapping the throw in a (usually
member) function centralizes the throw and logging logic in one place,
instead of duplicating it in an unbounded number of places.

Note that for every throw there is an unbounded number of possibly
receiving catch clauses.

With the wrapping you have the logging, or at least an initial logging,
in exactly 1 place -- which happens to be the place where you have the
most detailed information available (otoh., not contextual information,
which suggests logging at more than one level, placing responsibility
for each action X where the information to support action X resides).

Without the wrapping you have the logging code potentially duplicated in
an unbounded number of places.

With the throw/log wrapping, if the client code programmer so wishes,
she or he can achieve that duplication of code by turning off all
automatic initial logging and do it all in the catch clauses with
multiply duplicated, redundant code. Which usually translates as:
duplicated bugs, complexity, and a maintenance nightmare. Without the
throw/log-wrapping, the client code can only achieve centralized logging
code by relying on conventions for other duplicated code, such as always
using a set of macros, or always calling a rethrow-catch-and-log
function: the code duplication can't be avoided, AFAICS.

About the only time throw-site logging makes sense is when the client
code or the end-user explicitly asks you to via some environment "verbose
mode on" switch.

I mentioned filtering logs in an earlier posting, and had hoped that
you'd connected that with e.g. your point 5 above. There is of course
also the possibility of selecting the degree of logging via a policy at
compile time.
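
One way a compile-time policy could look, as a sketch only: the throw wrapper takes the logging policy as a template parameter. NoLogging, MemoryLogging, throw_logged and throws_with are illustrative names, not part of any existing library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical policies: each provides a static log() function.
struct NoLogging {
    static void log(const std::string&) {}  // does nothing
};

struct MemoryLogging {
    static std::vector<std::string>& entries() {
        static std::vector<std::string> e;
        return e;
    }
    static void log(const std::string& m) { entries().push_back(m); }
};

// The throw wrapper is parameterized on the policy, so the degree of
// logging is fixed when the code is compiled, not at run time.
template <class LogPolicy>
bool throw_logged(const std::string& message) {
    LogPolicy::log(message);
    throw std::runtime_error(message);
}

// Returns true if the call threw std::runtime_error.
template <class LogPolicy>
bool throws_with(const std::string& message) {
    try {
        throw_logged<LogPolicy>(message);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Usage: the same throw site, compiled with two different policies.
inline void demo_policies() {
    assert(throws_with<NoLogging>("quiet"));
    assert(MemoryLogging::entries().empty());
    assert(throws_with<MemoryLogging>("loud"));
    assert(MemoryLogging::entries().size() == 1);
}
```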

[snip]

Let's try again. How are you going to log? What log facility are you
going to use if you write a library for an unknown client? Do you
intend to require a logging callback? Or write to your own
mylibrary.log file and ruin my nice logging scheme?
Additionally why would you force your logging (necessarily
synchronized and necessarily slow) on my thread running your code
*even if I don't want you to*?

Those are engineering decisions. Obviously the Windows API, say,
cannot be forced to log on your behalf (well, unless you're using a part
of the API that can be configured that way: one should never say never). So
mostly the answer is, "it depends": there's no silver bullet, but hard
trade-off decisions to make.

You've avoided answering my questions ;-) These are engineering decisions
that need to be made following a decision to log at the throw site. Which
makes one question the original decision. Let me summarise the engineering
perspective

1) Usually there are many throw sites all over the code in every module.
It is very hard to organize logging from all these places

Nope. As a simplified example

bool throwX( char const s[] )
{ log( s ); throw std::runtime_error( s ); }

can be used in every module (except in libraries you don't control).

Calls to third-party libraries need to be wrapped, at some level, if
logging is desired, but then, they need to be wrapped at some level no
matter which logging philosophy one subscribes to.
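
To make the wrapping point concrete, here is a sketch with throwX filled out and a wrapper around a call into a non-logging library. The sketch namespace, log_entries, third_party_open and open_resource are stand-ins invented for the example; a real log function would write somewhere more useful than a vector.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for a real logging facility (hypothetical).
inline std::vector<std::string>& log_entries() {
    static std::vector<std::string> entries;
    return entries;
}
inline void log(char const s[]) { log_entries().push_back(s); }

// Log first, then throw. The bool result type is never produced;
// it only lets the call appear on the right-hand side of `||`.
inline bool throwX(char const s[]) {
    log(s);
    throw std::runtime_error(s);
}

// Stand-in for a third-party, non-logging C-style API (hypothetical).
inline int third_party_open(char const name[]) {
    return (name[0] != '\0') ? 0 : -1;  // 0 means success
}

// The wrapper is the single place where this library's failures get logged.
inline void open_resource(char const name[]) {
    (third_party_open(name) == 0) || throwX("open_resource: third_party_open failed");
}

// Usage: the log entry exists before any catch clause has run.
inline void demo_wrapped_call() {
    open_resource("data.txt");  // succeeds, nothing logged
    assert(log_entries().empty());
    try {
        open_resource("");  // fails inside the wrapper
        assert(false);
    } catch (std::runtime_error const&) {
        assert(log_entries().size() == 1);
    }
}

}  // namespace sketch
```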

2) Usually there are few catch sites which concentrate in 'decision-making'
portions of the code that already have access to logging facilities for
other reasons.

That sounds like a monolithic system designed around a purely
hierarchical control flow. Yes, it's unfortunately not uncommon, when
structuring from "business cases". And C++ is a multi-paradigm language
supporting also that.

However, I fail to see the significance as a general argument, and as an
argument for a specific case it only makes it /not impractical/ to only
do logging at some top-dog control level, and only for that case.

In particular, the idea of assuming a few 'decision-making' points in
client code seems to me overly restrictive for library design (which I
mention because my impression is that at least some of your arguments
earlier have been based on the view of library design with no knowledge
of the client code), and without that overly restrictive assumption
logging in catch clauses (without an initial logging) causes a lot of
redundant logging, resulting in, well, a mess.

[snip]

The point I was making is that if you don't know how to handle
some situation as far as the larger system is concerned you have no
business logging it.

It seems you're now discussing a situation where you want some, but not
all, exceptions logged.

Yes

In that case you can use a filtering log
facility. The throws will then attempt to log, but whether that
results in actual log entries will then depend on the call context.

But it also has to depend on the kind of exception. So you will do
something like

my_exception ex;
log_context.log(ex, more, info);
throw ex;

Don't you see that this is simply a different form of catch block only
invoked before unwinding?

No.

First, the above code really should read

throwX( more, info );

centralizing the logging etc.
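
A possible shape for such a throwX, sketched under the assumption of C++17 (for the fold expression) and a very simple filter: the throw site only offers the entry, and the current context's threshold decides whether it is recorded. Severity, FilteringLog and current_log are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

enum class Severity { debug, error };

// Hypothetical filtering log: the call context sets the threshold,
// the throw site only attempts to log.
struct FilteringLog {
    Severity threshold = Severity::error;
    std::vector<std::string> entries;

    void log(Severity s, const std::string& text) {
        if (s >= threshold) entries.push_back(text);
    }
};

inline FilteringLog& current_log() {
    static FilteringLog instance;
    return instance;
}

// Formats all arguments into one message, offers it to the log, then throws.
template <class... Args>
bool throwX(Severity s, const Args&... args) {
    std::ostringstream text;
    (text << ... << args);  // C++17 fold expression
    current_log().log(s, text.str());
    throw std::runtime_error(text.str());
}

// Usage: two throw sites, only one of which passes the filter.
inline void demo_filtered_throw() {
    FilteringLog& log = current_log();
    log.threshold = Severity::error;
    try {
        throwX(Severity::debug, "cache miss for key ", 42);
    } catch (const std::runtime_error&) {
    }
    try {
        throwX(Severity::error, "lost connection to ", "db1");
    } catch (const std::runtime_error&) {
    }
    assert(log.entries.size() == 1);
    assert(log.entries[0] == "lost connection to db1");
}

}  // namespace sketch
```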

Second, the job of a catch clause is to... Ah, this requires some
background. So. Assume that each function (let's focus on functions
instead of smaller code blocks) has a well-defined goal, a job to do,
with some requirements for doing the job, and some guarantees if the
requirements are met. If a function f can achieve its goal, fine. If
not then it's obligated to /not return without signalling failure/. Now
further assume that failure is signalled via exceptions: it could just
as well be a special return value, what I'm discussing here is failure
versus success, however it's signalled.

What should the function f do if some function it relies on, one that's
necessary for the goal, signals failure?

f can either try to achieve its contractual goal in some other way
(perhaps by just retrying), or it can in turn signal failure to its
caller. There's nothing else it can do without breaking its contract.

There are only three possible roles for catch clauses within that
scenario: cleanup/logging, failure indicator translation, and "achieve
its contractual goal in some other way". For anything else, success or
immediate failure, is handled simply by not doing anything special.

Note that I'm not advocating doing cleanup or logging via catch clauses
instead of RAII, but I'm mentioning the possible roles to be thorough.
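
The three roles might look roughly like this in code. It is only an illustration: StorageError, read_block and the other function names are invented, and the retry-once strategy is an arbitrary choice.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {

struct StorageError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical low-level operation that fails on the first `failures` calls.
inline int read_block(int& failures) {
    if (failures > 0) {
        --failures;
        throw std::runtime_error("transient read error");
    }
    return 42;
}

// Role: achieve the goal some other way (here, by retrying once).
inline int read_with_retry(int& failures) {
    try {
        return read_block(failures);
    } catch (const std::runtime_error&) {
        return read_block(failures);  // a second failure propagates
    }
}

// Role: failure indicator translation (low-level error to this layer's type).
inline int load_setting(int& failures) {
    try {
        return read_with_retry(failures);
    } catch (const std::runtime_error& e) {
        throw StorageError(std::string("load_setting: ") + e.what());
    }
}

// Role: cleanup, shown only for completeness; RAII is usually preferable.
inline int load_with_cleanup(int& failures, bool& cleaned_up) {
    try {
        return load_setting(failures);
    } catch (...) {
        cleaned_up = true;
        throw;  // still signals failure to the caller
    }
}

inline void demo_catch_roles() {
    int failures = 1;
    bool cleaned_up = false;
    assert(load_with_cleanup(failures, cleaned_up) == 42);  // retry succeeded
    assert(!cleaned_up);

    failures = 2;  // both attempts fail
    try {
        load_with_cleanup(failures, cleaned_up);
        assert(false);
    } catch (const StorageError& e) {
        assert(std::string(e.what()) == "load_setting: transient read error");
    }
    assert(cleaned_up);
}

}  // namespace sketch
```

In every branch the function either reaches its goal or signals failure; none of the catch clauses swallows the failure and returns as if nothing had happened.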

Now, the throwX call above isn't cleanup, it's not failure indicator
translation, and it's not "achieve its contractual goal in some other
way". So it doesn't perform any of the possible main roles of a catch
clause in a well-designed program. The only thing it has in common with
a catch clause is that a top-level catch clause, or in a badly designed
program an intermediate level catch clause, may be used for logging (the
intermediate one redundantly logging the original exception). Hence the
throw+log is not a "different form of catch block": it simply has
nothing in common with a catch clause in a well-designed program, except
the logging that may be in the single uppermost top-level catch clause
to handle catastrophic failure.

It may happen, for example, that an exception leads to catastrophic
failure (all the more likely when it's outfitted with a lot of data
baggage).

I don't understand what exactly do you mean by "catastrophic failure"
here.

A catastrophic failure is, for example, depending on the system, an
integer divide by zero, or executing an invalid machine code
instruction resulting from a corrupted stack.

None of which have any higher probability to happen during throw/catch
than during normal code execution.

On the contrary, an exception is a transfer of control to a place you
don't control, in a failure state, and with std::terminate the only
mechanism for "double" exceptions. All three mean, separately, a higher
probability of catastrophic failure.

(Incidentally is there a term that covers the
entire throw/catch process starting from construction of throw arguments
and ending inside the catch block?)

Yes, it's called "stack unwinding".

But I'm still searching for a term for the phases of stack unwinding
where a new throw will result in std::terminate.

I.e., stack unwinding outside try-catches in automatically invoked
destructors.
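
For illustration, a destructor that runs during unwinding and absorbs its own failure instead of letting it escape, which is the usual way to stay clear of std::terminate in that phase. Connection, flush and run are made-up names for this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical cleanup step that can itself fail.
inline void flush(bool fail) {
    if (fail) throw std::runtime_error("flush failed");
}

struct Connection {
    bool flush_fails;
    bool& cleanup_error_seen;

    Connection(bool fails, bool& seen)
        : flush_fails(fails), cleanup_error_seen(seen) {}

    // If this destructor runs during stack unwinding and lets an exception
    // escape, std::terminate is called. So the failure is absorbed here.
    ~Connection() {
        try {
            flush(flush_fails);
        } catch (...) {
            cleanup_error_seen = true;  // record it; do not rethrow
        }
    }
};

// Returns the message of the exception that reaches the catch clause.
inline std::string run(bool& cleanup_error_seen) {
    try {
        Connection c(true, cleanup_error_seen);
        throw std::runtime_error("original failure");
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}

inline void demo_unwinding() {
    bool seen = false;
    assert(run(seen) == "original failure");  // the first exception survives
    assert(seen);                             // the second one was absorbed
}

}  // namespace sketch
```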

Or, as may be likely
with a lot of baggage data in an exception object (as you argued for), a
std::bad_alloc exception during stack unwinding. Generally such
failures prevent after-the-fact logging of the original exception.

You can hit bad_alloc during logging too. In fact you can hit many more
things since you are going to do formatting and I/O.

Am I going to do formatting and i/o?

Why?

OK, I'm playing devil's advocate here. Actually I agree that in
practice logging will involve some formatting and i/o. But not
necessarily memory allocation. With centralized logging you control the
logging, including aspects such as whether it uses dynamic allocation or
not, whether the I/O's done on the current thread or not, etc. In
short, the "you can hit" assumes some kind of lack of control, whereas
what centralization (of code, not flow control) buys you is control --
to the extent possible in a failure situation.
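
As a small sketch of that kind of control, a log record can format into a buffer it owns, so the logging code itself requests no dynamic memory. LogRecord and write_record are hypothetical, and whether snprintf allocates internally is up to the C library, although for simple formats like this one it typically does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

namespace sketch {

// Hypothetical fixed-capacity log record: formatting happens in a
// buffer inside the object, so no dynamic allocation is requested here.
struct LogRecord {
    char text[128];

    LogRecord(char const what[], int code) {
        // snprintf truncates instead of overflowing and always terminates.
        std::snprintf(text, sizeof text, "%s (code %d)", what, code);
    }
};

// A real sink might hand the record to another thread or write it out
// directly; this one just copies into a caller-supplied buffer.
inline void write_record(LogRecord const& r, char* out, std::size_t out_size) {
    std::snprintf(out, out_size, "%s", r.text);
}

inline void demo_fixed_buffer_log() {
    char sink[128];
    write_record(LogRecord("connection lost", 7), sink, sizeof sink);
    assert(std::strcmp(sink, "connection lost (code 7)") == 0);
}

}  // namespace sketch
```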

So in both cases there is a danger of nested error. For some applications
(but only for some - many would never see bad_alloc at all) it might make
sense to guard against it.

Yes, I agree that std::bad_alloc is a rare event, at least on current
desktop machines and larger. And as mentioned earlier, IMHO throwing an
exception or letting that std::bad_alloc propagate isn't the way to
handle memory exhaustion. Better to install a logging and terminating
new-handler, because termination is where it's heading anyway.
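
A minimal sketch of such a new-handler, using std::set_new_handler from <new>. The names log_and_terminate and install_out_of_memory_handler are made up, and writing a fixed string to stderr stands in for whatever logging the system really has.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <new>

namespace sketch {

// Called by operator new when an allocation fails. It writes a fixed
// string and aborts, and asks for no dynamic memory of its own.
inline void log_and_terminate() {
    std::fputs("fatal: memory exhausted\n", stderr);
    std::abort();
}

// Installs the handler and returns the previous one so it can be restored.
inline std::new_handler install_out_of_memory_handler() {
    return std::set_new_handler(log_and_terminate);
}

inline void demo_install() {
    std::new_handler previous = install_out_of_memory_handler();
    assert(std::get_new_handler() == &log_and_terminate);
    std::set_new_handler(previous);  // restore for the rest of the program
}

}  // namespace sketch
```

Because the handler never returns, operator new never gets as far as throwing std::bad_alloc once it is installed.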

At the point of throw it's not known what may happen later.

Hence at that point, the code must act as if anything can happen.

Precisely ;-) Which to me means not do anything other than throw.

What do you trust more to not screw up: arbitrary client code above
that has demonstrably violated your preconditions or resource
requirements,
or a true and tested logging facility?

Sorry but this doesn't make any sense other than to appeal to feelings.
First usually while writing a function I have no idea what it means to
"screw up" for my client's logging.

A screw-up means that things aren't done as they should be, and that
some kind of mess results.

But I apologize: I uncritically adopted just one viewpoint, the one of
implementing a library, because that was the context of this argument.

First consider implementing library code, with unknown client code.
You're throwing an exception, passing all info you can, but not logging.
The client code works OK and logs at a higher level. In this case you
only need the log if there's a problem with your library. And if there
is, it greatly helps testing if it logs or at least can log at the throw
sites. Then there is the case where the client code doesn't work OK and
perhaps crashes or hangs. It's probable that in this case it doesn't
manage to log the exception: after all, it's crashing, or hanging. But
this is where you or the client code programmer most need the log. And
would have it if the library logged at the throw sites.

Then consider implementing client code, using third-party non-logging
library code. The library is throwing an exception at you. If this is
because of an error in your client code, then your code may easily have
other such errors, or the process might be in an unstable state, and
won't necessarily manage to log the exception. But this is where you
most need the log. If on the other hand your client code works
perfectly, and logs the exception, then it's most probably a problem
with the library. And this rare case, for a well-tested non-logging
library, is the only one where there is guaranteed to be a log on hand
when someone needs it.

Second even if you throw as a result of
violated precondition or resource requirement (and some would argue that
if you throw in this case rather than assert() and crash there is no
violation to begin with) this says nothing about stability or quality of
the calling code.

Depends what one calls precondition or resource requirement. Consider
that there must be /some/ reason for throwing an exception, some
failure. Perhaps time's run out, time's a resource. Perhaps there's no
more disk space or available network connections. Those are resources.
Anyway, the question is not the stability or quality of the client
code in general. It's the stability or quality in the situation where
something so Bad has happened that the only way out is throw, to fail.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
