Re: Reference to show that if (this == NULL) implies undefined
behaviour
On Nov 7, 12:02 pm, Carlos Moreno <moreno_n...@mailinator.com> wrote:
So, the issue: I'm trying to convince a colleague of mine that
using recursion in a tree structure, and allowing it to step outside
the tree by calling the method on the NULL pointers is not legal
(as in, it invokes undefined behaviour).
The thing is, it's proving to be an "uphill battle", in that the
solution, legal or not, simply happens to work (quite likely works
on every compiler that has ever existed --- at least every "real"
compiler, as opposed to academic/proof-of-concept compilers),
and it simplifies the code a bit. For example, to compute the
sum of all nodes in a tree (say that it's a binary tree of integer
values). So, the solution would go more or less like:
int Node::sum () const
{
if (this == NULL)
{
return 0; // no data members accessed (and this is why
// it *happens* to work)
}
else
{
return value() + left_node()->sum() + right_node()->sum();
}
}
I know you asked for standardese, but let me try to reach your actual
goal (of convincing your colleagues) from a different angle. Since
the standard implicitly makes calling a non-static member function
through a NULL pointer undefined behavior, there is nothing to stop a
future revision of a compiler from optimizing out the check for NULL,
and still being standards-compliant. So a compiler could easily end
up producing object code from the above that is the equivalent of
this:
int Node::sum() const
{ return value() + left_node()->sum() + right_node()->sum(); }
which will clearly break if any child nodes are NULL.
People have been bitten before by new versions of compilers optimizing
away NULL checks in cases where having a NULL pointer would cause
undefined behavior anyway. This has caused for instance an exploit in
the Linux kernel:
http://lwn.net/Articles/342330/
and in fact GCC has been making this optimization in at least some
cases for > 10 years now:
http://gcc.gnu.org/news/null.html
Let me give you a second argument, too. In practice, the code above
is also fragile as soon as you involve inheritance even if the NULL
check isn't optimized away. Suppose you refactor and make Node a
derived class of, let's say, NodeBase, all the child nodes are
pointers to NodeBase rather than Node, and suppose sum() is made
virtual. This seems not an unreasonable thing to do if you also want
to have binary trees where the sum() is only over leaves, or only over
branches.
In the body of Node::sum(), suppose that 'this' is non-NULL but that
left_node() is NULL. So you reach the call to left_node()->sum(). In
a typical implementation, that will try to read the class vtable from
left_node() to find the correct override of sum(). But that's NULL.
Oops. Note that this happens BEFORE the actual call to left_node()-
Node::sum() and its own check for 'if (this == NULL)' can be reached.
- Kevin B. McCarty
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]