Re: Throwing error is potentially buggy

From:

wij@seed.net.tw

Newsgroups:

comp.lang.c++

Date:

Thu, 10 Jun 2010 07:54:25 -0700 (PDT)

Message-ID:

<2a59e53c-a55b-4a3b-80d3-9fd0167146c9@j36g2000prj.googlegroups.com>

On 6=A4=EB10=A4=E9, =A4U=A4=C83=AE=C947=A4=C0, Goran <goran.pu...@gmail.com=

wrote:
On Jun 10, 6:58 am, w...@seed.net.tw wrote:

I am just saying one thing that "throwing error is potentially buggy".
Not that it is convenient for an error string to be written to a log
file, nor that it is elegant to mean stack unwind to exit. I am not
even really talking about simplicity neither, even whether it is an
real error or not, but merely correctness of the error reporting
mechanism. Again, I am not talking about any implement of your or his
or her correct throwing strategy, but the prevailing, the frequently
taught, probably the 'standard' one.

Let's first confine errors to only errno-like classes, assume they
all inherit from std::exception.
class E_PERM;
class E_NOENT;
class E_SRCH;
class E_INTR;
class E_IO;
class E_BADF;
class E_AGAIN;
class E_NOMEM;
class E_NOSPC;
class E_INVAL;
class E_PIPE;
class E_TIMEDOUT;
class E_NAMETOOLONG;
class E_FBIG;
class E_LOOP;

Well, now... Why would you do something like this!? Honestly, you just
replaced a single number (errno) by an exception class! That really
does not sound reasonable. When I want something similar, I do this:

class crt_exception : ...
{
public:
crt_exception() _errno(errno) {}

};

Note that if you do this, then that large swaths of code you put down
here change drastically (they get shorter) and you do not lose
anything in any sort of precision. In other words, what you propose is
too fine-grained and I think you'll get not enough benefit for doing
it.

(As a side note... exceptions are a significantly better mechanism to
report errors then error-return in real-world code that uses errno,
because if used like above, the copy of the original errno is in
exception object, so any cleanup operation during stack unwinding,
that might e.g. reset errno, does no affect error reporting. That's
simply not the case with error-return. If it's important to preserve
"original" errno (which I believe is), then things get quite messy.)

// Class String has all similar members as std::string
// except throwing E_... class
//
class String {
// Take one for instance:
// Throw: E_INVAL $index is invalid
// E_FBIG String length too long
// E_NOMEM Not enough memory
//
void insert(size_t index, const String& str);

};

Now we are writting a funcion process_msg. All called functions throw
E_... in exceptional condition.

// Throw: E_INVAL $in is invalid
// E_FBIG result string is too long
// E_NOMEM not enough memory
//
void process_msg(const String& in, String& out);
{
MutexLock lock(mutex); // Not likely to fail. But shouldn't
// a robust program
catch the possible
// E_INVAL, E_PERM,
E_AGAIN from leak?

I reckon you mean "from lock", not "from leak"? I believe that a
robust program should __not__ catch such error. Basic question is why
would it? Look at it this way: if locking a mutex returns an error
code, what would you do if "lock" fails? I say, you can't continue.
You can only clean up whatever there is to clean and leave the
function. And that, that is the same as throwing an exception! (Bar
the "if" statement in "if (lock(mutex)) {whatever}") So throwing (and
not catching) an exception will result in shorter code.

String s1,s2;
if(in.size()<=0) {
throw E_INVAL;

Honestly, this "if" here is just completely wrong. First, at this
point in code, you don't care about the size of the string, so why is
there a check? Second, why is the code written so that size can be
negative?

}
db.read(s1,1000); // may throw E_INVAL, E_FBIG, E_NOMEM,...
parse(s1,s2); // may throw E_INVAL, E_FBIG, E_NOMEM,...
process(s1,s2,out); // may throw E_INVAL, E_FBIG, E_NOMEM,...

}'

Yes, that's OK. All calls may throw who knows what exception, and in
the real world, there will be a great number of possible failure modes
in them.

Then, a tiny part of the loop in the server-like program:

try { process_msg(in,out); }
catch(E_INVAL& e) {
  send("Invalid argument",e ); // peer end could have a wrong message
  // Probably one want to log message here, but there are more story
  // about stack unwinding that can occur here. If error is
  // misinterpreted, the program goes to std::terminate for the good.
  // The error report is not just for logging purpose.
  // This is one reason a function should be somewhat responsible
  // to its declared thrown objects.}

Ok, here, the problem is that process_msg throws E_INVAL due to a poor
"in". There are two ways to handle this, and it's not what you have
here.

First, presuming that "in" comes from an external system that can't be
trusted, I would argue that process_msg should simply fill in "out"
with content saying that input was invalid. No exception needed.

Second, even if process_msg threw, some response is probably better
than no response, so perhaps overall loop structure should be like
this:

loop
{
  try
  {
    receive(in);
    process(in, out);
    send(out);
  }
  catch(const exception& e)
  {
    try { send(ERROR_OUT(e)); } // e.g. ERROR_OUT derives from OUT
    catch(const exception& e) { /*we're screwed. log error?*/ }
  }

}

catch(E_FBIG& e) {
send("Result too long",e); // peer end could have a wrong message

}

This pattern has been explained several times. It should be clear
that at least all process_msg(..) documented errors should be caught
(probably convert to another one) in the function process_msg(..).

They should not. Why do you think that?

Look at things this way: what are you going to do if e.g. your lock
throws? I'd say, there's nothing meaningful you do. You will typically
give up with the rest of processing, you might want to signal the
error to the operator (e.g. log), and you might want to send and
"error occurred" response out.

I agree that, when something goes wrong, it's beneficial you can
convert the error in another one, but so far, your examples do not
warrant that. On top of that, it's a horrible idea to lose original
error. E.g. one of worst sins Java people do is to catch one error,
then throw another without the original error as "inner". (And due to
checked exceptions of Java, they are forced to employ said "catch -
throw another exception type" more than needed, so much so that some
well-established Java code-bases reach for (gasp!) unchecked
RuntimeException as a base for all their exception types).

The prevailing "catch only those you know how to handle" style and
the "rethrow error" preaches won't work well in such error handling
or message passing scenario.

I don't know where you got this, but it's simply not true in reality,
and it's not true in the code snippets we've seen so far. Again, I
have to press you do show what is it, in externally-observable
behavior, that you want to achieve, where use of exceptions will be in
the way. Particularly, consider my loop from above. It will pretty
much do what you want, if you just fill a couple of blanks in.

So the revised version (not the popular style):

void process_msg(const String& in, String& out)
try {
MutexLock lock(mutex); // Note: E_INVAL, E_PERM, E_AGAIN might be
                                             // thrown here, but no
way for a local
                                             // catch. So we added a
try/catch layer.
String s1,s2; // E_NOMEM might be thrown
if(in.size()<=0) {
   throw E_INVAL; // this is the responsible one
}

try { db.read(s1,1000); }
catch(E_INVAL&) {
   // What to throw? process_msg(..) only documented 3 errors.
   // Do we need more errors to report? Yes, so add E_IO to mean
   // vague error.
   throw E_IO;
}
catch(E_INTR&) {
   // Let this be handled correctly
}
catch(E_IO&) {
   // E_IO had been decided for a vague error. So this error
   // can't be directly reported.
   throw E_IO;
}
catch(E_PERM&) {
   // process_msg(..) intends to hide the implement detail that it
   // reads data from a database or config file. So throw E_IO.
   throw E_IO;
}
catch(E_FBIG&) {
   // To be defensive. Not knowing how 'db.read(..)' will actually be
   throw E_IO;
};
// There are other possible thrown type by definition.

s1+=in; // Lucky, E_FBIG is appropriate

try { parse(s1,s2); }
catch(E_INVAL&) {
   throw E_IO;
}
catch(E_NOENT&) {
   throw E_IO;
}
catch(E_FBIG&) {
   throw E_IO;
};
// There are other possible thrown type by definition.

try { process(s1,s2,out); }
catch(E_INVAL&) {
   throw E_IO;
}
catch(E_RANGE&) {
   throw E_IO;
}
catch(E_BADMSG&) {
   throw E_IO;
}
catch(E_FBIG&) {
   throw E_IO;
};
// There are other possible thrown type by definition.

// Can the above annoying catch handlers be saved?
// My answer is no.}

catch(E_INVAL&) {
throw; // *** BUG ***
                 //E_INVAL may be from the lock failure. There
                // are occasionally similar codes causing such
                // conflicts.

};

This snippet is completely, utterly wrong.

First, as I pointed out above, using one exception class per errno is
silly.

Second, you just lost original error, a __major__ sin in any error
reporting. E.g. you go from from E_BADMSG to E_IO). So what are you
going to send out (or log) eventually? You will see E_IO somewhere up
the stack, so only thing you can say will be "there was an IO error",
but that is obviously a blatant lie!

Third, you are catching/rethrowing way too much without achieving...

=BE\=C5=AA=A7=F3=A6h >>

What should a function(e.g. process()) respond if it caught
std::invaid_argument (many are from elementary functions)?

Let's say you don't (or do) use Standard Library. Then, what should
a user function respond to your crt_exception (having gone through
many layers of fucntions) in a server loop (should do
everything possible to keep on servicing) in your design?