Re: Aliasing, casting and undefined behaviour

From: "Alf P. Steinbach" <alfps@start.no>
Newsgroups: comp.lang.c++.moderated
Date: Sat, 12 May 2007 12:49:17 CST
Message-ID: <5al2naF2omifhU1@mid.individual.net>

* Edward Rosten:

Consider a very simple implementation of a matrix class. The class has
the option of storing data as row major, or column major, so
transposes can be done efficiently by type casting:

class RowMajor{};
class ColMajor{};

template<int rows, int cols, class Type = RowMajor>
struct Matrix: public Container<rows * cols>
{
    double& operator()(int row, int col)
    {
        return Container<rows*cols>::data[row * cols + col];
    }

    Matrix<rows, cols, ColMajor>& T()
    {
        return static_cast<Matrix<rows, cols, ColMajor>&>(
            static_cast<Container<rows*cols>&>(*this));
    }

};

template<int rows, int cols>
struct Matrix<rows, cols, ColMajor>: public Container<rows * cols>
{
    double& operator()(int row, int col)
    {
        return Container<rows*cols>::data[row + rows * col];
    }

    Matrix<rows, cols, RowMajor>& T()
    {
        return static_cast<Matrix<rows, cols, RowMajor>&>(
            static_cast<Container<rows*cols>&>(*this));
    }
};

In this case, a good compiler will optimize away calls to T(), making
the transpose operation zero cost. My guess is that the code is
technically undefined behaviour (though I'd be surprised if there was
an implementation on which it didn't work). Is this the case? And if
so, is there a portable, well-defined way of achieving the free
transpose?

The different matrix types will never have extra data members; they
simply interpret the existing data differently.

As I recall I've replied to this article before, but Thunderbird lists
it without any follow-ups.

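Technically it's Undefined Behavior, because the outer static_cast
downcasts a Container reference to a Matrix specialization that the
object is not actually an instance of. Still, it's difficult to think of
any compiler where it wouldn't work as a simple reinterpretation of the
Container data.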

However, that reinterpretation may not necessarily do what you think it
will do: it will not necessarily transpose the matrix. One way to see
that is to (hypothetically) let the column major operator() forward to
the row major one, with the arguments switched. Then you see that to get
the effect of a transposition the correct indexing expression is not
"row+rows*col" but "col*cols+row", and the two agree for all valid
indices if and only if rows == cols.
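
To make the mismatch concrete, here is a minimal sketch that only
computes the offsets involved. The function names are made up for this
illustration and are not part of the quoted code; rows and cols stand
for the template arguments of the original row major matrix.

```cpp
#include <cstddef>

// Offset of element (row, col) in a matrix stored row major,
// as in the quoted primary template.
constexpr std::size_t row_major_offset(std::size_t row, std::size_t col,
                                       std::size_t cols)
{
    return row * cols + col;
}

// Offset used by the quoted ColMajor specialization, which keeps the
// same rows and cols arguments as the matrix it was cast from.
constexpr std::size_t quoted_col_major_offset(std::size_t row, std::size_t col,
                                              std::size_t rows)
{
    return row + rows * col;
}

// Offset that actually yields a transposition: forward to the row major
// formula with the arguments switched, i.e. col*cols + row.
constexpr std::size_t transposed_offset(std::size_t row, std::size_t col,
                                        std::size_t cols)
{
    return row_major_offset(col, row, cols);
}
```

With rows = 2 and cols = 3, element (1, 1) lands on offset 3 with the
quoted formula but on offset 4 with the switched-argument one. With
rows = cols = 3 the two formulas give the same offset for every index.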

Anyway, one way to more safely do what(ever) you're doing above is to
use encapsulation instead of inheritance, another way is to use virtual
inheritance, a third is to let client code differentiate between row and
column major indexing instead of differentiating between classes that
offer these operations, a fourth is to use run-time selection by setting
an invert flag in the matrix, and there may be yet other ways.
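
As a rough sketch of the first alternative (encapsulation), assuming a
view object that holds a pointer to the matrix is acceptable: T()
returns a small view that forwards to the row major operator() with the
arguments switched, so no cast is involved. Storage, TransposedView and
the std::size_t template parameters are assumptions of this sketch, not
the original poster's code, and whether the view disappears entirely
depends on the compiler's inlining.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical storage type standing in for the Container of the quoted code.
template<std::size_t N>
struct Storage
{
    double data[N];
};

template<std::size_t Rows, std::size_t Cols>
class TransposedView;

template<std::size_t Rows, std::size_t Cols>
class Matrix
{
public:
    Matrix() : storage_() {}   // value-initialization zeroes all elements

    double& operator()(std::size_t row, std::size_t col)
    {
        assert(row < Rows && col < Cols);
        return storage_.data[row * Cols + col];
    }

    // Returns a light-weight view; no element is copied or moved.
    TransposedView<Rows, Cols> T()
    {
        return TransposedView<Rows, Cols>(*this);
    }

private:
    Storage<Rows * Cols> storage_;   // held as a member, not a base class
};

// Presents a Rows x Cols matrix as its Cols x Rows transpose.
template<std::size_t Rows, std::size_t Cols>
class TransposedView
{
public:
    explicit TransposedView(Matrix<Rows, Cols>& m) : matrix_(&m) {}

    // Forward to the row major operator() with the arguments switched.
    double& operator()(std::size_t row, std::size_t col)
    {
        return (*matrix_)(col, row);
    }

    // Transposing the view gives back the original matrix.
    Matrix<Rows, Cols>& T() { return *matrix_; }

private:
    Matrix<Rows, Cols>* matrix_;
};
```

The view must not outlive the matrix it refers to; that lifetime rule
is the price paid for avoiding the cast.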

I'd strive to get rid of the UB, not least because the casting is a
complex solution to something that probably isn't a problem: in other
words, a case of evil premature optimization.

Hth.,

- Alf

%% The campaign to answer articles with no follow-ups so far.
%% Posted 12.05.2007 07:46 (my articles often linger in the queue since
%% I can't process my own articles, hence the datetime stamp).

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
