Re: Descriptive exceptions
 
* Eugene Gershnik:
Alf P. Steinbach wrote:
* Eugene Gershnik:
On Feb 20, 6:58 pm, "Alf P. Steinbach" <a...@start.no> wrote:
[huge snip of everything related to i18n and exceptions]
Let's just say that many people disagree with your position with regards
to the language of error messages. Myself I would prefer the UN to
proclaim English as the only language on Earth starting from tomorrow but
this has little chance of happening ;-)
Good that we agree on an ideal, then, but as I understand it you see 
some practical obstacles that I don't.  What would those be?
I'm not sure what you're arguing against, or for.
As I said in my original post I argue that storing exception 'description'
strings within an exception is unnecessary and wasteful.
Well, that depends.
Presumably by "wasteful" you're not talking about data baggage, since 
you wrote, quote, "it is a very good idea to include as much internal 
information as you can".
Programmer time, then.
And for the id based scheme you're advocating, maintaining a repository 
of strings is added programmer or at least project time, not to mention 
for a repository of strings in various languages, as you've argued for 
(e.g. above).
It seems to me that is the choice of the worst from all worlds: added 
programmer/project work load, /and/ inefficiency, /and/ complexity.
On the other hand, an id based scheme doesn't in principle need to have 
all that overhead, if (for example) exception messages are all in 
English, and/or the number of different id's is very restricted or 
relying on someone else's or an existing standardized repository of 
message strings, e.g. the Windows API.  But on the third & gripping 
hand, the problem with an existing repository is the overhead of looking 
up a suitable string and id, and creating new ones if no suitable ones 
are found.  My experience with using and maintaining such repositories 
(admittedly years ago) is that in practice it is a /lot/ of overhead, 
enough so that small helper tools are developed, for very little gain.
With regards to what follows I think I can summarise your arguments as
follows
1. throw/catch process is unreliable and so it is better to log any
problem prior to using it
The throw/catch process isn't unreliable (where on earth did you get 
that idea?).
Client code, on the other hand, is unreliable (it's in the nature of 
libraries to be better tested and more reliable than their driver code), 
especially in the context where the log is needed, where the client code 
has failed.  But perhaps best to repeat & rephrase.  When you have 
client code that has failed so that you need the log, what can you say 
about the reliability of the client code in that particular situation?
If you're going to rely on logs, as you've stated you are relying on, 
don't you think it's better to guarantee that you have the log entries 
you need, or as much as possible, than to not guarantee it?
2. it is better to log as soon as any problem is detected because there is
no guarantee that any code above would do it.
When you need the logs, yes.
It all depends on the kind of system.
I think that 1 is simply wrong (at least with modern compilers and code 
that doesn't throw from destructors)
Yes.
and 2 is bad engineering.
You'd need to justify that position in some way.  Saying it's so doesn't 
make it so.
More details are given below in response to your particular points.
I don't propose always logging exceptions at the throw point, only,
as I wrote, if you can, and that was in the context of exceptions
being logged anyway.  If you're going to log all exceptions anyway,
making an initial log entry at the throw point guarantees that that
exception will be logged.  Such a guarantee helps you guard against (1)
client code that doesn't log as it should, (2) crashes, with no (useful)
log entry, and in addition it (3) centralizes things, which is generally
good.
I think I still cannot get my point through. To address each of your
points above
1) "if you can" - you never can since you don't know whether the calling
code wants you to log
Why shouldn't one know that?  It's easy to arrange.
2) "If you're going to log all exceptions anyway" - unless you have the
entire application code under your control how do you know this?
It seemed you knew that because you wrote, quote, "all exceptions are 
caught".
3) "client code that doesn't log as it should" - you cannot decide what
amount of logging is proper for client code and what it should do.
That one's easy: let the client code control the logging.  Note the 
difference between "control" and "do".
4) "crashes" - more about it below
5) "centralizes things" - on the contrary, instead of having one place to
log in the catch block you would have it accompany every throw. In any
decent code there are many more throw-s than catch-es.
To "accompany" every throw with a logging call is a very bad idea, yes, 
I agree :-).  Doing that kind of code duplication would indeed be UnGood 
to the extreme.  On the other hand, wrapping the throw in a (usually 
member) function centralizes the throw and logging logic in one place, 
instead of duplicating it in an unbounded number of places.
Note that for every throw there is an unbounded number of possibly 
receiving catch clauses.
With the wrapping you have the logging, or at least an initial logging, 
in exactly 1 place  --  which happens to be the place where you have the 
most detailed information available (otoh., not contextual information, 
which suggests logging at more than one level, placing responsibility 
for each action X where the information to support action X resides).
Without the wrapping you have the logging code potentially duplicated in 
an unbounded number of places.
With the throw/log wrapping, if the client code programmer so wishes, 
she or he can achieve that duplication of code by turning off all 
automatic initial logging and do it all in the catch clauses with 
multiply duplicated, redundant code.  Which usually translates as: 
duplicated bugs, complexity, and a maintenance nightmare.  Without the 
throw/log-wrapping, the client code can only achieve centralized logging 
code by relying on conventions for other duplicated code, such as always 
using a set of macros, or always calling a rethrow-catch-and-log 
function: the code duplication can't be avoided, AFAICS.
About the only time throw-site logging makes sense is when the client
code or the end-user explicitly asks you to via some environment "verbose
mode on" switch.
I mentioned filtering logs in an earlier posting, and had hoped that 
you'd connected that with e.g. your point 5 above.  There is of course 
also the possibility of selecting the degree of logging via a policy at 
compile time.
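
As a rough illustration of the compile-time variant, and only a sketch: the policy is a template argument, so a build that selects NoLogging pays nothing for the call.  The names NoLogging, StderrLogging and the template form of throwX are made up for this example.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical policies: each supplies a static log() with the same signature.
struct NoLogging
{
    static void log( char const* ) {}           // Logging compiled out.
};

struct StderrLogging
{
    static void log( char const* s ) { std::fprintf( stderr, "throw: %s\n", s ); }
};

// Centralized throw + initial logging; the degree of logging is chosen
// at compile time via LogPolicy.  Never returns normally.
template< class LogPolicy >
bool throwX( char const s[] )
{
    LogPolicy::log( s );
    throw std::runtime_error( s );
}
```

A project would typically pick the policy once, e.g. via a typedef in a configuration header, rather than naming it at every call.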
[snip]
Let's try again. How are you going to log? What log facility are you
going to use if you write a library for an unknown client? Do you
intend to require a logging callback? Or write to your own
mylibrary.log file and ruin my nice logging scheme?
Additionally why would you force your logging (necessarily
synchronized and necessarily slow) on my thread running your code
*even if I don't want you to*?
Those are engineering decisions.  Obviously the Windows API, say,
cannot be forced to log on your behalf (well, unless you're using a part
of the API that can be configured that way: one should never say never).
So mostly the answer is, "it depends": there's no silver bullet, but hard
trade-off decisions to make.
You've avoided answering my questions ;-) These are engineering decisions
that need to be made following a decision to log at the throw site. Which
makes one question the original decision. Let me summarise the engineering
perspective
1) Usually there are many throw sites all over the code in every module.
It is very hard to organize logging from all these places
Nope.  As a simplified example
   bool throwX( char const s[] )
   { log( s ); throw std::runtime_error( s ); }
can be used in every module (except in libraries you don't control).
Calls to third-party libraries need to be wrapped, at some level, if 
logging is desired, but then, they need to be wrapped at some level no 
matter which logging philosophy one subscribes to.
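
A slightly fuller sketch of the same idea, for illustration only: the wrapper stays the single place that throws and logs, while the client code decides whether and where the entry goes.  The namespace mylib and the names LogSink, logSink and setLogSink are hypothetical; the bool return type is simply kept from the throwX above.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>

namespace mylib {   // Hypothetical library namespace.

    using LogSink = std::function<void( char const* )>;

    // The sink installed by the client code; an empty sink means "don't log".
    inline LogSink& logSink()
    {
        static LogSink sink;
        return sink;
    }

    // Called by the client code: it controls the logging, the library does it.
    inline void setLogSink( LogSink sink ) { logSink() = std::move( sink ); }

    // Centralized throw + initial logging.  Never returns normally.
    inline bool throwX( char const s[] )
    {
        if( logSink() ) { logSink()( s ); }
        throw std::runtime_error( s );
    }

}   // namespace mylib
```

Client code that wants no throw-site logging never installs a sink; client code that wants filtering puts the filter inside the sink it installs.  This sketch ignores thread safety of the sink.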
2) Usually there are few catch sites which concentrate in
'decision-making' portions of the code that already have access to
logging facilities for other reasons.
That sounds like a monolithic system designed around a purely 
hierarchical control flow.  Yes, it's unfortunately not uncommon, when 
structuring from "business cases".  And C++ is a multi-paradigm language 
supporting also that.
However, I fail to see the significance as a general argument, and as an 
argument for a specific case it only makes it /not impractical/ to only 
do logging at some top-dog control level, and only for that case.
In particular, the idea of assuming a few 'decision-making' points in 
client code seems to me overly restrictive for library design (which I 
mention because my impression is that at least some of your arguments 
earlier have been based on the view of library design with no knowledge 
of the client code), and without that overly restrictive assumption 
logging in catch clauses (without an initial logging) causes a lot of 
redundant logging, resulting in, well, a mess.
[snip]
The point I was making is that if you don't know how to handle
some situation as far as the larger system is concerned you have no
business logging it.
It seems you're now discussing a situation where you want some, but not
all, exceptions logged.
Yes
In that case you can use a filtering log
facility.  The throws will then attempt to log, but whether that
results in actual log entries will then depend on the call context.
But it also has to depend on the kind of exception. So you will do 
something like
my_exception ex;
log_context.log(ex, more, info);
throw ex;
Don't you see that this is simply a different form of catch block only
invoked before unwinding?
No.
First, the above code really should read
   throwX( more, info );
centralizing the logging etc.
Second, the job of a catch clause is to...  Ah, this requires some 
background.  So.  Assume that each function (let's focus on functions 
instead of smaller code blocks) has a well-defined goal, a job to do, 
with some requirements for doing the job, and some guarantees if the 
requirements are met.  If a function f can achieve its goal, fine.  If 
not then it's obligated to /not return without signalling failure/.  Now 
further assume that failure is signalled via exceptions: it could just 
as well be a special return value, what I'm discussing here is failure 
versus success, however it's signalled.
What should the function f do if some function it relies on, one that's 
necessary for the goal, signals failure?
f can either try to achieve its contractual goal in some other way 
(perhaps by just retrying), or it can in turn signal failure to its 
caller.  There's nothing else it can do without breaking its contract.
There are only three possible roles for catch clauses within that 
scenario: cleanup/logging, failure indicator translation, and "achieve 
its contractual goal in some other way".  For anything else, success or 
immediate failure, is handled simply by not doing anything special.
Note that I'm not advocating doing cleanup or logging via catch clauses 
instead of RAII, but I'm mentioning the possible roles to be thorough.
Now, the throwX call above isn't cleanup, it's not failure indicator 
translation, and it's not "achieve its contractual goal in some other 
way".  So it doesn't perform any of the possible main roles of a catch 
clause in a well-designed program.  The only thing it has in common with 
a catch clause is that a top-level catch clause, or in a badly designed 
program an intermediate level catch clause, may be used for logging (the 
intermediate one redundantly logging the original exception).  Hence the 
throw+log is not a "different form of catch block": it simply has 
nothing in common with a catch clause in a well-designed program, except 
the logging that may be in the single uppermost top-level catch clause 
to handle catastrophic failure.
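
To make the roles concrete, here is a minimal sketch of the two that RAII doesn't cover: failure indicator translation, and achieving the goal in some other way (here by retrying).  ConfigError, parsePort and withRetry are hypothetical names invented for the example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type of the calling layer.
struct ConfigError : std::runtime_error
{
    explicit ConfigError( std::string const& s ) : std::runtime_error( s ) {}
};

// Role: failure indicator translation.  The catch clause only converts
// the failure into the form this function's caller expects.
inline int parsePort( std::string const& text )
{
    try
    {
        return std::stoi( text );
    }
    catch( std::logic_error const& )    // std::invalid_argument, std::out_of_range
    {
        throw ConfigError( "bad port: " + text );
    }
}

// Role: achieve the contractual goal in some other way, here by retrying.
template< class Func >
auto withRetry( Func f, int maxAttempts ) -> decltype( f() )
{
    for( int attempt = 1; ; ++attempt )
    {
        try { return f(); }
        catch( std::exception const& )
        {
            // Out of options: signal failure to the caller.
            if( attempt >= maxAttempts ) { throw; }
        }
    }
}
```

Neither catch clause logs anything, and neither swallows a failure: each either achieves the goal or passes the failure on.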
It may happen, for example, that an exception leads to catastrophic
failure (all the more likely when it's outfitted with a lot of data
baggage).
I don't understand what exactly you mean by "catastrophic failure"
here.
A catastrophic failure is, for example, depending on the system, an
integer divide by zero, or executing an invalid machine code
instruction resulting from a corrupted stack.
None of which have any higher probability to happen during throw/catch
than during normal code execution.
On the contrary, an exception is a transfer of control to a place you 
don't control, in a failure state, and with std::terminate the only 
mechanism for "double" exceptions.  All three mean, separately, a higher 
probability of catastrophic failure.
(Incidentally is there a term that covers the entire throw/catch process
starting from construction of throw arguments and ending inside the catch
block?)
Yes, it's called "stack unwinding".
But I'm still searching for a term for the phases of stack unwinding 
where a new throw will result in std::terminate.
I.e., stack unwinding outside try-catches in automatically invoked 
destructors.
Or, as may be likely
with a lot of baggage data in an exception object (as you argued for), a
std::bad_alloc exception during stack unwinding.  Generally such
failures prevent after-the-fact logging of the original exception.
You can hit bad_alloc during logging too. In fact you can hit many more
things since you are going to do formatting and I/O.
Am I going to do formatting and i/o?
Why?
OK, I'm playing devil's advocate here.  Actually I agree that in 
practice logging will involve some formatting and i/o.  But not 
necessarily memory allocation.  With centralized logging you control the 
logging, including aspects such as whether it uses dynamic allocation or 
not, whether the I/O's done on the current thread or not, etc.  In 
short, the "you can hit" assumes some kind of lack of control, whereas 
what centralization (of code, not flow control) buys you is control -- 
to the extent possible in a failure situation.
So in both cases there is a danger of nested error. For some applications
(but only for some - many would never see bad_alloc at all) it might make
sense to guard against it.
Yes, I agree that std::bad_alloc is a rare event, at least on current 
desktop machines and larger.  And as mentioned earlier, IMHO throwing an 
exception or letting that std::bad_alloc propagate isn't the way to 
handle memory exhaustion.  Better to install a logging and terminating 
new-handler, because termination is where it's heading anyway.
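
A minimal sketch of such a new-handler, using the standard std::set_new_handler.  The names logAndTerminate and installLoggingNewHandler are made up for the example, and the message text is arbitrary.

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

// Called by operator new when an allocation fails.  Reports without
// allocating in this code, then terminates instead of letting
// std::bad_alloc propagate.
inline void logAndTerminate()
{
    std::fputs( "out of memory, terminating\n", stderr );
    std::abort();
}

// Installs the handler and returns the previously installed one.
inline std::new_handler installLoggingNewHandler()
{
    return std::set_new_handler( &logAndTerminate );
}
```

Calling installLoggingNewHandler() once, early in main, is enough; a real system would probably route the message through its own logging facility instead of stderr.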
At the point of throw it's not known what may happen later.
Hence at that point, the code must act as if anything can happen.
Precisely ;-) Which to me means not do anything other than throw.
What do you trust more to not screw up: arbitrary client code above
that has demonstrably violated your preconditions or resource
requirements, or a true and tested logging facility?
Sorry but this doesn't make any sense other than to appeal to feelings.
First usually while writing a function I have no idea what it means to
"screw up" for my client's logging.
A screw-up means that things aren't done as they should be, and that 
some kind of mess results.
But I apologize: I uncritically adopted just one viewpoint, the one of 
implementing a library, because that was the context of this argument.
First consider implementing library code, with unknown client code. 
You're throwing an exception, passing all info you can, but not logging. 
  The client code works OK and logs at a higher level.  In this case you 
only need the log if there's a problem with your library.  And if there 
is, it greatly helps testing if it logs or at least can log at the throw 
sites.  Then there is the case where the client code doesn't work OK and 
perhaps crashes or hangs.  It's probable that in this case it doesn't 
manage to log the exception: after all, it's crashing, or hanging.  But 
this is where you or the client code programmer most need the log.  And 
would have it if the library logged at the throw sites.
Then consider implementing client code, using third-party non-logging 
library code.  The library is throwing an exception at you.  If this is 
because of an error in your client code, then your code may easily have 
other such errors, or the process might be in an unstable state, and 
won't necessarily manage to log the exception.  But this is where you 
most need the log.  If on the other hand your client code works 
perfectly, and logs the exception, then it's most probably a problem 
with the library.  And this rare case, for a well-tested non-logging 
library, is the only one where there is guaranteed to be a log on hand 
when someone needs it.
Second even if you throw as a result of violated precondition or
resource requirement (and some would argue that if you throw in this case
rather than assert() and crash there is no violation to begin with) this
says nothing about stability or quality of the calling code.
Depends what one calls precondition or resource requirement.  Consider 
that there must be /some/ reason for throwing an exception, some 
failure.  Perhaps time's run out, time's a resource.  Perhaps there's no 
more disk space or available network connections.  Those are resources. 
  Anyway, the question is not the stability or quality of the client 
code in general.  It's the stability or quality in the situation where 
something so Bad has happened that the only way out is throw, to fail.
-- 
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?