Re: Implicit conversions for value subtyping

From: David Barrett-Lennard <davidbl@iinet.net.au>
Newsgroups: comp.lang.c++.moderated
Date: Mon, 26 Apr 2010 03:29:30 CST
Message-ID: <b4678120-f19f-4a77-8230-6ad02d9fb337@w32g2000prc.googlegroups.com>

On Apr 24, 8:15 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

On Apr 22, 9:09 am, David Barrett-Lennard <davi...@iinet.net.au>
wrote:

On Apr 22, 3:35 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

And no
matter what definition you choose, if it allowed this to be a
"form of inheritance" I would argue that said definition is at
total odds with any common sense meaning of "inheritance".

I was thinking of it as "inheritance" in the following sense: Let a
data type mean a set of abstract values plus operators on those values
(using operator in an algebraic sense). The "behaviour" of a datatype
is only externally visible through those operators. Relating this
back to C++ code, assuming all operators are expressed as free
functions without in-out parameters, an implicit conversion between
two datatypes means that all the operators of one are available to the
other - because of value substitutability. I think it is reasonable
to call that "inheritance" treating the word as simply meaning that
"stuff" associated with one "thing" is automatically available to
another "thing". Anyway that was the only point I wanted to make! I
just looked for synonyms of "inherit" at synonym.com and the only
alternatives it gave me are "get" and "acquire".
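
To make that sense of "inheritance" concrete, here is a minimal sketch. The member names (w, h, side) and the use of a conversion operator are my assumptions for illustration; only Rect, Square, width, height and area echo names used in this thread.

```cpp
// Hypothetical value types; the member names are assumptions.
struct Rect {
    int w;
    int h;
};

struct Square {
    int side;
    // Implicit, infallible coercion: every square value is a rectangle value.
    constexpr operator Rect() const { return Rect{side, side}; }
};

// Operators on Rect, written as free functions that take values.
constexpr int width(Rect r)  { return r.w; }
constexpr int height(Rect r) { return r.h; }
constexpr int area(Rect r)   { return width(r) * height(r); }

// None of the functions above was written for Square, yet these compile
// because the argument is coerced to Rect at the call site.
static_assert(width(Square{3}) == 3, "width is available to Square");
static_assert(area(Square{3}) == 9, "area is available to Square");
```

Compiling the fragment is enough to exercise the checks; nothing here needs a main().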

What about all the numerous
other types we may add in future, are we now talking about
multiple inheritance?

In the sense of inheritance I described above, definitely!

Then as they say "Houston we have a problem" ;-) That is because
traditionally (and I personally haven't seen any exceptions to
this so I would tentatively say "universally") the "inheritance"
concept in both type theory and practical programming, is thought
of if not /defined/ as a partial ordering. And in that case we
have the antisymmetry law of partial order logic ie

x <: y AND y <: x IMPLIES x = y

Informally let

     value type = set of values + algebraic operators on those values

and T1 is a subtype of T2, written T1 <: T2 if

   1) the values of T1 are a subset of the values of T2; and
   2) the operators of T1 are a superset of the operators of T2.

The subtype relation is reflexive, antisymmetric, and transitive, so
it is a partial order. Note that T1 = T2 means T1,T2 have the same set
of values and have the same operators.

The definition of subtype is appropriate for defining implicit
conversions that avoid surprises. Condition 1 ensures that the
implicit conversion is infallible. Condition 2 ensures that the 'copy'
created by the implicit coercion can only be used in operators that
are already available to the original.

It is important to distinguish between a C++ class and the abstract
value-type it represents (if any). Indeed two distinct classes can
represent the same value-type. For example, simply copy and paste a
class definition and give the copy a different name.

Classes named Polar and Cartesian could be regarded as distinct
classes that together represent a single underlying value-type.
Implicit conversions in both directions help to ensure this is the
case, for then all operators defined on one are available to the
other. The only reason for the implementation to supply alternative
representations is for physical performance reasons. From a "logical"
point of view there is no need for it. Indeed one could consider the
alternative implementations and use of constructors as only hints to
the compiler.

One can distinguish between an instance of an encoding (i.e. a
variable in memory of a value-type like Square) and the abstract value
it is deemed to represent. This is analogous to a formal semantics on
a first order logic, where a type name 'Square' plays the role of a
function symbol, and interpretation of the symbol is a function that
maps the parameters of the underlying implementation of the variable
to an abstract square value. One can also draw a similar analogy when
considering the constructors of C++ classes that represent value
types, because they are like functions that select abstract values.

With the analogy to a formal semantics on a first order logic there is
nothing wrong with two different function symbols 'Polar' and
'Cartesian' that under interpretation represent different functions
that happen to have the same codomain. This distinction for C++ value
types only manifests itself in two ways. One is in how a variable in
memory is interpreted according to its type, and the other is in the
constructors of a class which are used to construct a variable in the
first place. One could consider the underlying type to have the union
over all the constructors, and these can be modelled as operators.

where "<:" is "inherits from". So, if we take your view that it's
reasonable to have (since you allow coercions to define "inheritance")

    Rect <: Square
    Square <: Rect

then

    Square = Rect

which is obviously false. So I just think it's very unwise from

Yes but it is obviously false that

Rect <: Square

It is a bad idea to provide implicit conversions just because two
types overlap in the values they can represent. As far as the
abstract type system is concerned implicit coercions must be
infallible.

Only some rectangle values are square values, so this conversion must
be explicit and regarded as fallible. In the context of particular
client code it may be provable that the conversion will succeed and so
an assertion should probably be used to verify the design. Otherwise
a run time test is required. E.g. call

bool IsSquare(Rect r) { return width(r) == height(r); }
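
As a sketch of what an explicit, fallible conversion could look like (assuming C++17 for std::optional; the function name to_square and the member names are hypothetical, not from anyone's actual code):

```cpp
#include <optional>

// Hypothetical value types; member names are assumptions.
struct Rect   { int w; int h; };
struct Square { int side; };

constexpr int width(Rect r)  { return r.w; }
constexpr int height(Rect r) { return r.h; }

// The run time test from the text.
constexpr bool IsSquare(Rect r) { return width(r) == height(r); }

// Explicit, fallible conversion: the result is empty when r is not a
// square value. There is deliberately no implicit Rect -> Square coercion.
constexpr std::optional<Square> to_square(Rect r) {
    if (IsSquare(r)) return Square{width(r)};
    return std::nullopt;
}
```

Client code that can prove the conversion succeeds might assert on has_value(); other callers have to test the result before using it.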

a communication perspective to call '"stuff" associated with one
"thing" is automatically available to another "thing"' inheritance.
Other phrases such as convertible, compatible, associated, etc come
to mind. Perhaps there is a traditional word for it in algebraic
type theory but I can't recall.

Suppose I create a coercion from complex<int,int> to rectangle,
does complex<int,int> now "inherit" from rectangle? What if we
add a coercion from rectangle to complex<int,int>, does rectangle
now circularly inherit from complex?

That's just weird so I'm not sure what your point is here.

Geometric interpretation of complex is VERY common. And the
interpretation of a vector defining a corner of a rectangle
rooted at the origin is also VERY common. So I don't see why
you call this "weird". But this is a nit anyhow.

Fair enough, but then the source code shouldn't care to draw a
distinction either, so why not use a single class for both?

A better
example for me would be classes named Polar and Cartesian and we would
like implicit conversions in both directions. I haven't studied the
rules of implicit conversions for C++ to know whether such a thing
works in practice, but in principle I can't see any problems with
alternative representations of abstract values.

Sure you can define implicit conversion in both directions
(for user defined types anyhow).

BTW the conversions between polar and cartesian representations
involve cos, sin, sqrt, atan2 which are inexact for rational numbers
(including floats). That could mean we have an example of the
following assertion breaking which I find extremely unappealing:

    T1 x;
    T2 y = x;
    assert(x == y);

This can fail if coercions aren't invertible and y is coerced in order
to perform comparison using T1::operator==. On that basis I would
reject implicit conversions between polar and cartesian
representations.
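
A minimal sketch of the situation, assuming double-based representations (the member names and the helper round_trip_is_exact are mine, purely for illustration). It shows that implicit conversions in both directions do compile, and where the round trip enters the comparison:

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical representations; member names are assumptions.
struct Polar { double r; double theta; };

struct Cartesian {
    double x;
    double y;
    Cartesian(double x_, double y_) : x(x_), y(y_) {}
    // Polar -> Cartesian, implicit.
    Cartesian(Polar p)
        : x(p.r * std::cos(p.theta)), y(p.r * std::sin(p.theta)) {}
    // Cartesian -> Polar, implicit.
    operator Polar() const {
        return Polar{std::sqrt(x * x + y * y), std::atan2(y, x)};
    }
};

// Equality is defined on one representation only.
inline bool operator==(Polar a, Polar b) {
    return a.r == b.r && a.theta == b.theta;
}

// Mirrors "T1 x; T2 y = x; assert(x == y);" with T1 = Polar, T2 = Cartesian.
// y is coerced back to Polar for the comparison, so the result depends on
// whether cos/sin/sqrt/atan2 happened to round-trip exactly for this value.
inline bool round_trip_is_exact(Polar x) {
    Cartesian y = x;
    return x == y;
}

static_assert(std::is_convertible<Polar, Cartesian>::value &&
              std::is_convertible<Cartesian, Polar>::value,
              "implicit conversions exist in both directions");
```

I would expect round_trip_is_exact to return false for many inputs, but I have not tabulated which ones, so treat that as a claim to test rather than a measured result.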

Meh, that's an implementation issue that can be solved in a
variety of ways.

I can only imagine solving it with symbolic logic, and that would have
very specialised application.

I would hope that implicit conversions between datatypes satisfy
reasonable "laws" to make program correctness easy to reason about.

Unfortunately C++ doesn't give you a transitive closure of the

Well, unfortunately or fortunately depending on your viewpoint,
the standards committee decided that disallowing such transitive
closure was a "reasonable law" to help ensure correctness. Maybe
one of the members will comment. I vaguely recall this briefly
discussed in either D&E or ARM but I can't be asked to track that
reference down right now. (I need grep'able copies of my books LOL).

Yes, I would be very interested in knowing the justification.

We can define an equivalence relation on encodings according to the
different ways to represent the same value. By convention in C++ this
is associated with operator==(). Unfortunately this is typically
defined using a class member function, so with implicit conversions
only acting on the rhs it doesn't tend to be symmetric. In addition,
since implicit coercions aren't transitive it is all too easy for
operator==() to not be transitive as well.

E.g.

   T1 x;
   T2 y;
   T3 z;
   if (x == y && y == z)
   {
     assert(x == z); // Oops. May not compile!!!
   }
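
Both problems can be checked at compile time. The sketch below uses hypothetical Square, Rect and Quad classes (layout and member names are my assumptions) with a member operator== on Rect and single-step converting constructors:

```cpp
#include <type_traits>

struct Square { int side; };

struct Rect {
    int w;
    int h;
    constexpr Rect(int w_, int h_) : w(w_), h(h_) {}
    constexpr Rect(Square s) : w(s.side), h(s.side) {}   // Square -> Rect
    // Member operator==: only the right-hand operand can be coerced.
    constexpr bool operator==(const Rect& rhs) const {
        return w == rhs.w && h == rhs.h;
    }
};

struct Quad {
    int x[4];
    int y[4];
    // Rect -> Quad, corners listed anticlockwise from the origin.
    constexpr Quad(Rect r) : x{0, r.w, r.w, 0}, y{0, 0, r.h, r.h} {}
};

// Each single step is an implicit conversion...
static_assert(std::is_convertible<Square, Rect>::value, "Square -> Rect");
static_assert(std::is_convertible<Rect, Quad>::value, "Rect -> Quad");
// ...but the compiler will not chain two user-defined conversions.
static_assert(!std::is_convertible<Square, Quad>::value,
              "Square -> Rect -> Quad is not applied implicitly");

// The rhs is coerced, so this compiles. The mirrored expression
// Square{2} == Rect(2, 2) is rejected before C++20, which can
// synthesise it from the reversed operands.
static_assert(Rect(2, 2) == Square{2}, "rhs coerced to Rect");
```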

So far this thread has only provided examples using unary functions.
Specialisations become more labour intensive with binary functions.
E.g. == could be written differently for the following signatures:

     Quad == Quad
     Quad == Rect
     Quad == Square
     Rect == Quad
     Rect == Rect
     Rect == Square
     Square == Quad
     Square == Rect
     Square == Square

I think in large programs one would hope to avoid writing lots of
special cases. It helps to use free functions, transitively applied
coercions and decent compiler optimisation.
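
For instance, one free operator== on the most general type can serve all nine signatures, provided every narrower type converts to it in a single step. Since C++ will not chain Square -> Rect -> Quad, this sketch spells out Square -> Quad by hand; the class layouts are assumptions and C++14 is assumed for the constexpr loop.

```cpp
// Hypothetical value types; member names are assumptions.
struct Square { int side; };
struct Rect   { int w; int h; };

struct Quad {
    int x[4];
    int y[4];
    // Corners listed anticlockwise from the origin.
    constexpr Quad(Rect r)
        : x{0, r.w, r.w, 0}, y{0, 0, r.h, r.h} {}
    constexpr Quad(Square s)
        : x{0, s.side, s.side, 0}, y{0, 0, s.side, s.side} {}
};

// The only equality written: a free function, so both operands may be coerced.
constexpr bool operator==(const Quad& a, const Quad& b) {
    for (int i = 0; i < 4; ++i) {
        if (a.x[i] != b.x[i] || a.y[i] != b.y[i]) return false;
    }
    return true;
}

// Mixed comparisons in either order resolve to the function above.
static_assert(Square{2} == Rect{2, 2}, "Square == Rect");
static_assert(Rect{2, 2} == Square{2}, "Rect == Square");
```

Whether the optimiser removes the temporary Quad objects is a separate question that I have not measured.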

defined implicit conversions. Consider that we extend my previous
example with an additional type to represent an arbitrary
quadrilateral and we recognise that square value is-a rectangle value
is-a quadrilateral value.

*sigh* you keep forcing the "is-a" thinking in. Do you really see
is-a as a necessary concept? Even a particularly helpful one? Do
you not see any value in my suggestion to drop that hierarchical
thinking in favor of "flatter" relational thinking ie algebraic
type theory?

I cannot answer that without knowing more about what you mean. For
example, what do you see as the underlying basis for using implicit
coercions in practice? Do you require implicit coercions to be
transitive? Do you require x==y to be true after performing
assignment x=y? Do you require f(x)==f(y) whenever x==y? Do you
require == to be an equivalence relation? What does it mean if x==y
is undefined? Do implicit coercions need to be infallible? Can they
give the copy access to operators not available on the original?

I know an ostream is not a value type, but what are your impressions
of this?

   void foo(std::ostream& os1, std::ostream& os2)
   {
     if (os1 == os2)
     {
         // What does this mean?
     }
   }

Not anywhere close in my mind. How would
such "inheritance" thinking or terminology even be useful?

In the case of square value isa rectangle value, it is useful in the
sense that it is a reminder that one doesn't necessarily need to
implement operators on square that have already been written for
rectangles. If one forgets that then one may end up writing more code
than needed (and the C++ compiler won't complain of course).

Is that it?? A useful reminder? Sorry, that seems quite a tiny
(if any) benefit to force "is-a" on to every (perhaps any)
analysis of syntax or concepts.

No, there is a miscommunication. I think 'is-a' has a far more
involved purpose (see below). I was only talking about the utility of
the word "inheritance" in the context of value-types.

For example, it is far more useful to keep in mind that division
is not closed over integers than it is to sing "an integer is-a
rational" all day long. The first viewpoint emphasizes knowing
the algebra, the second knowing some shoe-horned "hierarchy".

Those two viewpoints are somewhat orthogonal. The rationals can be
defined axiomatically or else as equivalence classes over pairs of
integers. Either way, that is not the same as pointing out that there
exists a unique mapping from the integers into the rationals such that
the operators on the integers map to restrictions of corresponding
operators on the rationals, and therefore that there is a subset of
the rationals that is isomorphic to the integers. Since the integers
are only unique up to isomorphism it is convenient to say this subset
of the rationals is the integers. That rather complicated idea is
what 'is-a' means.

Very significantly, the nonzero integers inherit rational valued
multiplicative inverses. Indeed I recall a discussion on a database
newsgroup a few years ago where someone claimed Chris Date's notion of
subtype = subset made no sense, for example because although the
integers are a subset of the reals, they don't inherit the axiom of
multiplicative inverses and therefore it was claimed that it cannot be
said that the integers are a subtype of the reals. However this
problem disappears if we say that the integers "inherit"
multiplicative inverses from the larger system. Although it may be
questionable whether mathematicians find this notion of subtype
useful, it does appear useful in computer science because it relates
to where implicit conversions work and behave nicely.
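
A rough sketch of both ideas, using a deliberately naive Rational (no normalisation, no overflow or zero checks; the class and the function name inverse are hypothetical). The converting constructor plays the role of the mapping from the integers into the rationals, and the checks show that + and * agree with their integer counterparts and that an integer argument picks up a rational-valued inverse.

```cpp
// Naive rational number, for illustration only.
struct Rational {
    int num;
    int den;
    constexpr Rational(int n, int d) : num(n), den(d) {}
    constexpr Rational(int n) : num(n), den(1) {}   // embedding of the integers
};

// Cross-multiplication, so 1/2 and 2/4 compare equal without normalising.
constexpr bool operator==(Rational a, Rational b) {
    return a.num * b.den == b.num * a.den;
}
constexpr Rational operator+(Rational a, Rational b) {
    return Rational(a.num * b.den + b.num * a.den, a.den * b.den);
}
constexpr Rational operator*(Rational a, Rational b) {
    return Rational(a.num * b.num, a.den * b.den);
}
// Precondition: q is nonzero.
constexpr Rational inverse(Rational q) { return Rational(q.den, q.num); }

// The embedding preserves the integer operators...
static_assert(Rational(3) + Rational(4) == Rational(3 + 4), "+ is preserved");
static_assert(Rational(3) * Rational(4) == Rational(3 * 4), "* is preserved");
// ...and a nonzero integer "inherits" a rational-valued inverse.
static_assert(inverse(3) * 3 == 1, "3 has a multiplicative inverse");
```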

Square inherits functions defined on rectangles but not

No it doesn't "inherit" the functions. Instead you have created
an algebra in which an implicit coercion makes syntax such as

width(square)

valid. Suppose I now declare an explicit function for square

int area ( Square s ) { return side(s) * side(s) ; }

would you now propose to say that I have "overridden the
inherited (from rectangle) area function"? That doesn't make
any bit of sense to me. "inheritance" has no place here.

I'm not sure why. I thought the "override" metaphor was quite a good
one. I would however say it's a very dangerous analogy because it
means something quite different to overrides of virtual methods in the
context of subclassing. However, on the other hand we are talking

Yes, that was careless. I was using "override" in the type theory
usage forgetting that it has a restricted meaning in C++.

about data types here and we know that subclassing is completely and
universally useless for data types so I would suggest that confusion
cannot occur for C++ programmers that properly understand datatypes.
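
To show how this kind of "override" differs from a virtual one, here is a small sketch. The Impl tag and the Area result struct are hypothetical and exist only so the chosen overload can be observed.

```cpp
// Hypothetical value types; member names are assumptions.
struct Rect { int w; int h; };

struct Square {
    int side;
    constexpr operator Rect() const { return Rect{side, side}; }
};

// Tag recording which overload produced a result (illustration only).
enum class Impl { ForRect, ForSquare };
struct Area { int value; Impl chosen; };

constexpr Area area(Rect r)   { return Area{r.w * r.h, Impl::ForRect}; }
constexpr Area area(Square s) { return Area{s.side * s.side, Impl::ForSquare}; }

// The static type at the call site decides; there is no dynamic dispatch.
constexpr Area area_of_rect(Rect r) { return area(r); }

// Called directly with a Square, the exact match beats the coercion.
static_assert(area(Square{3}).chosen == Impl::ForSquare, "specialised version");
// Once coerced to Rect, the specialisation is no longer selected.
static_assert(area_of_rect(Square{3}).chosen == Impl::ForRect, "general version");
```

For value types both versions should return the same value, so the choice is an optimisation and not a change in behaviour.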

Eh, I would rather just avoid words like "inheritance" for this
mechanism since it already has such a strong nearly universal
connotation of ordering that does not apply here.

Perhaps, but I think OO programmers tend to emphasise state machines
instead of value types, even at the expense of layering. E.g. OODBMS
misused to make data (i.e. values) look like persistable state
machines, or an abstract syntax tree where nodes represent state
machines rather than values.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]