Re: Conditionally initializing a const reference without making a copy

From: "James Kanze" <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: 12 Apr 2007 03:44:26 -0700
Message-ID: <1176374666.123676.3810@q75g2000hsh.googlegroups.com>

On Apr 11, 3:40 pm, JurgenvonOert...@hotmail.com wrote:

Consider the classes Base, Derived1 and Derived2. Both Derived1 and
Derived2 derive publicly from Base.

Given a 'const Base &input' I want to initialize a 'const Derived1
&output'.
If the dynamic type of 'input' is Derived1, then 'output' should
become a reference to 'input'.
Otherwise 'output' should become a reference to the (temporary) result
of the member function 'input.to_der1()' which returns a Derived1
object by value.

You want a reference which sometimes points to an existing
object, and sometimes to a temporary? In other words, you want
a reference for which the referred to object must sometimes be
destructed when the reference goes out of scope, and other times
not.

For performance reasons the copy constructor of Derived1 must not be
called.

That's a compiler detail. The copy constructor may always be
called.

In trying to achieve this I used the following code:

Example #1:
const Derived2 *der2_ptr = dynamic_cast<const Derived2*>(&input);
const Derived1 &output =
    (der2_ptr)?
      der2_ptr->to_der1() :
      static_cast<const Derived1 &>(input);

This works fine if 'der2_ptr != NULL', but when 'der2_ptr == NULL'
then Derived1's copy constructor is called (possibly because the
second and third argument of the conditional operator have to be
converted to the same type).

Not just possibly. And it's not just a question of type. The
result of the ?: expression here is an rvalue; it can only be
an lvalue if both the second and the third operands are
lvalues. You get an lvalue-to-rvalue conversion on the third
operand, which is a copy.
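The effect can be observed with a small sketch. The names here (`Counted`, `make`, and the two helper functions) are hypothetical and not from the original code; `Counted` merely stands in for Derived1 and counts calls to its copy constructor.

```cpp
// Hypothetical stand-in for Derived1 that counts copy constructions.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Returns by value, playing the role of to_der1().
Counted make() { return Counted(); }

// One operand is an rvalue, the other an lvalue: the whole ?: expression
// is an rvalue, so selecting `existing` copies it into a temporary.
int copies_with_mixed_operands(const Counted& existing, bool use_temporary) {
    Counted::copies = 0;
    const Counted& ref = use_temporary ? make() : existing;
    (void)ref;
    return Counted::copies;
}

// Both operands are lvalues of the same type: the result is an lvalue
// and the reference binds to the selected object without a copy.
int copies_with_two_lvalues(const Counted& a, const Counted& b, bool pick_a) {
    Counted::copies = 0;
    const Counted& ref = pick_a ? a : b;
    (void)ref;
    return Counted::copies;
}
```

Calling `copies_with_mixed_operands(c, false)` on some `Counted c` should report one copy, while `copies_with_two_lvalues(c, c, true)` should report none. The rvalue branch is left out of the comparison on purpose, since whether the compiler elides the copy there depends on the language version and the compiler.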

Alternatively I used the following code:

Example #2:
const Base &base =
    (der2_ptr)?
      der2_ptr->to_der1() :
      input;
const Derived1 &output = static_cast<const Derived1&>(base);

This is even worse. It exposes a bug in g++ 3.4.x (the temporary
result of 'to_der1()' is destroyed before it has been used) and with g++
4.x the copy constructor of Base (!) is called on the result of
'to_der1()' making the static_cast from 'base' to 'Derived1' invalid
(and still, a copy is made).

I'm not sure there's an error in g++ there, either. You can't
mix rvalues and lvalues in a ?: expression. If either of the
two expressions is an rvalue, the other will be converted to an
rvalue as well. And rvalues don't support polymorphism; an
rvalue has a specific type.
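A minimal sketch of the slicing this causes, assuming a hypothetical virtual `name()` function added to the classes purely for illustration (the original classes are not shown in the thread):

```cpp
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived1 : Base {
    virtual std::string name() const { return "Derived1"; }
};

struct Derived2 : Base {
    virtual std::string name() const { return "Derived2"; }
    Derived1 to_der1() const { return Derived1(); }
};

// Mirrors Example #2: the operands are a Derived1 rvalue and a Base
// lvalue. Both are converted to a Base rvalue, so the reference ends up
// bound to a temporary whose dynamic type is exactly Base.
std::string dynamic_type_after_mixed_conditional(const Base& input) {
    const Derived2* der2_ptr = dynamic_cast<const Derived2*>(&input);
    const Base& base = der2_ptr ? der2_ptr->to_der1() : input;
    return base.name();
}
```

Under these assumptions the function should return "Base" whichever derived type is passed in, which is why the later static_cast to Derived1 in Example #2 is invalid.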

So why is this copy necessary?

In the end, because the compiler has to know whether to call the
destructor on what the reference refers to when the reference
goes out of scope. Here, because you can't have an ?:
expression that is sometimes an rvalue, and sometimes an lvalue.

All the objects are there. It should be
possible to setup a reference to them without making a meaningless
copy.

All the objects aren't there. That's the problem. Sometimes,
the object is there, and sometimes it's not, and must be
created. And if it's created, it must be destructed. So the
answer is: create it every time.

G++ (incl. 4.x) even makes a copy in the following case:

Example #3:
Base my_base;
const Base &output =
   (der2_ptr)?
      der2_ptr->to_base() :
      my_base;
The 'to_base' function returns a Base object by value.

Obviously. As soon as you have an rvalue, everything must be an
rvalue. Otherwise, you have serious lifetime of object issues.

I don't see why this copy is necessary while arguments 2 and 3 of
the ?: operator are both of type 'Base'.

If they're both lvalues, it isn't. If they're both rvalues, I
think the compiler may be able to skip the final copy as well
(but I wouldn't swear to it).

I would say that at least in example #3 the situation is as described
by the C++ standard in 5.16:3 (first bullet).

I don't see how. E1 (der2_ptr->to_der1(), in your first
example) cannot be converted to reference to T2 (i.e Derived1&).

So either no conversions
or implicit conversions should occur. However, it looks like g++
treats the situation as in the second bullet, where a conversion
occurs by creating a temporary object.

What else could it do? It tries conversion in both directions
(i.e. aligning E1 to the type of E2, and vice versa). The first
case fails, because E2 is an lvalue (point 1), and E1 cannot be
converted to a compatible lvalue. Reversing the roles of E1 and
E2, however, results in the second point matching.

Another possibility is that I am wrong and g++ is right because
paragraph 8.5.3:5 last bullet applies and the copy is 'necessary' to
initialize the reference (although it just looks like a waste to me).

My questions are:
- Why is the copy constructor called in the examples above? Is that
because of the standard (if so, what section) or is it because of g++?
- If the copies are necessary, is there another way to set up 'output'
without making copies?

The basic problem here is that the lifetime of the objects is
different. So you'll have to manage it yourself. Something
like:

    Derived2 const* der2_ptr
        = dynamic_cast< Derived2 const* >( &input ) ;

    // Owns the converted object, if one had to be created.
    std::auto_ptr< Derived1 >
                    der1_ptr( der2_ptr == NULL
                               ? NULL
                               : new Derived1( der2_ptr->to_der1() ) ) ;

    // Both operands are lvalues here, so the ?: makes no copy.
    Derived1 const& output( der2_ptr == NULL
                            ? dynamic_cast< Derived1 const& >( input )
                            : *der1_ptr ) ;
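For readers on a newer compiler, the same idea can be sketched with std::unique_ptr, which replaced std::auto_ptr in C++11 (auto_ptr was removed in C++17). This is a substitution, not what was originally posted, and the `value` member and the `read_value` function are hypothetical additions that only make the result observable.

```cpp
#include <memory>

struct Base {
    virtual ~Base() {}
};

struct Derived1 : Base {
    int value;  // hypothetical payload, for illustration only
    explicit Derived1(int v) : value(v) {}
};

struct Derived2 : Base {
    int value;  // hypothetical payload, for illustration only
    explicit Derived2(int v) : value(v) {}
    Derived1 to_der1() const { return Derived1(value); }
};

int read_value(const Base& input) {
    const Derived2* der2_ptr = dynamic_cast<const Derived2*>(&input);

    // Owns the converted object only when a conversion was needed.
    std::unique_ptr<Derived1> owned;
    if (der2_ptr) {
        owned.reset(new Derived1(der2_ptr->to_der1()));
    }

    // Both operands are lvalues, so the reference binds directly to
    // whichever object is selected; an existing Derived1 is not copied.
    // The dynamic_cast throws std::bad_cast if input is neither type.
    const Derived1& output =
        der2_ptr ? *owned : dynamic_cast<const Derived1&>(input);
    return output.value;
}
```

The owning pointer has to be declared before the reference and in the same scope, so that the heap object outlives every use of `output`.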

A more elegant solution might be based on boost::shared_ptr.
Base would contain a virtual function which returns a
boost::shared_ptr< Derived1 >: in Derived1, this function
constructs the results from this and a no-op deleter; in the
other classes, it constructs a Derived1 on the heap, and returns
a shared_ptr to the new object.
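A rough sketch of that design follows, using std::shared_ptr (C++11) in place of boost::shared_ptr. The function name `as_der1`, the `NoOpDeleter` type and the `value` member are assumptions made for illustration, and the pointer is to const here because the original `input` is a const reference.

```cpp
#include <memory>

struct Derived1;

// Deleter that deliberately does nothing: the shared_ptr does not own
// the object it points to.
struct NoOpDeleter {
    void operator()(const Derived1*) const {}
};

struct Base {
    virtual ~Base() {}
    virtual std::shared_ptr<const Derived1> as_der1() const = 0;
};

struct Derived1 : Base {
    int value;  // hypothetical payload, for illustration only
    explicit Derived1(int v) : value(v) {}

    // Already a Derived1: hand out a non-owning pointer to *this.
    std::shared_ptr<const Derived1> as_der1() const override {
        return std::shared_ptr<const Derived1>(this, NoOpDeleter());
    }
};

struct Derived2 : Base {
    int value;  // hypothetical payload, for illustration only
    explicit Derived2(int v) : value(v) {}

    // Not a Derived1: build one on the heap, owned by the shared_ptr.
    std::shared_ptr<const Derived1> as_der1() const override {
        return std::shared_ptr<const Derived1>(new Derived1(value));
    }
};
```

A caller would then write something like `std::shared_ptr<const Derived1> p = input.as_der1();` and use `*p`. The catch is that the non-owning pointer returned by Derived1 must not outlive the object it refers to.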

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34