Re: Announcing Xrtti - Extended Runtime Type Information for C++
On May 6, 8:50 am, bji-gg...@ischo.com wrote:
Although I wonder if maybe you are thinking about the deserialize
case, where there must be an identifier encoded in the serialized form
of the object that allows the serialization library to re-acquire the
type information about the serialized object. In this case I can
think of a couple of ways:
Yes, I was thinking of the classic problem of deserializing polymorphic
objects. If the complete type of a class could be captured as a string
whose generation from the type is specified and standardized, then it
would be possible to send that string over the wire before sending the
object, whereupon the target of serialization would use the string as a
key into a map<string, Deserializer *>, where Deserializer would be a
function that yields a pointer to the base class of the serializable
objects, synthesizing the object by extraction from the wire.
1. Store the full class name of the object in the serialized form,
i.e. Foo::Bar::Baz. Then assume that if the program has Xrtti info
for the class named Foo::Bar::Baz (i.e. const Xrtti::Context *pContext
= Xrtti::LookupContext("Foo::Bar::Baz") returns a non-NULL Context),
then it must be the "same" class and can be deserialized with the type
information thus returned. This would fail if a program happened to
use the same full name for a class as another program, but the classes
had different forms. This could be avoided by all programs using
their own private namespace to define their classes, but this is non-
ideal because it requires cooperation between all programs ever
written.
Hmm..."with the type information thus returned."
There is a more fundamental problem, I think, which gets back to the
flaw in the "my code is intelligent" mode of thought (not that you
were promoting that mode of thought). Knowing the types of the bases
and members of a class will not be sufficient for "automatic"
serialization. A programmer, a human being, must ultimately decide
which data members should be serialized, and which are irrelevant. No
computer program can do that without the aid of the programmer, which
brings us back to a simple fact: without artificial intelligence, in
C++, the programmer will have to be a significant participant in
determining how code should behave.
While we are on this subject, I have a rhetorical question for those
reading this thread:
Question: What do:
1. COM
2. .NET
3. XML
4. Introspection in C++
5. CORBA
6. Polymorphism As Default Paradigm
7. Class Factories
[many other concepts which I cannot think of immediately]
....all have in common?
Answer:
If you step back away from the intricacies of these "technologies" and
regard them with healthy objectivity, you will see a subtle pattern -
all of them subscribe to the (IMO erroneous) notion that code is
intelligent, at least at some basic level. The classic illustration
of this is XML. Many a programmer will look at an XML file, see the
hierarchy present, see the hierarchy present in the definition of
their C++ classes, and say, "Wow!! I can see the mapping between the
hierarchy in my C++ objects and the hierarchy of my configuration
file. That's incredible!!!" Then they will dive head first into trying
to figure out a way to automatically generate XML data from a C++
object hierarchy and vice-versa, just like the other 1000 programmers
who have tried. Some of them will abort at this point, stopped by the
tedium of compiler theory. Others will forge ahead, determined to
capture the gold at the end of the rainbow. After all, "The parallels
are obvious!!!!"
Sigh.
This is a rat hole. Of course there is a parallel. That parallel
exists because models typically have multiple manifestations, and in
almost all of those manifestations, the parallels will be obvious. If
someone were to come up with a new OO language, for example, then even
without knowing this language, I would be able to pre-determine the
structure of my C++ data structures within that language.
So what is going on here? Why do so many people march so quickly into
this rat hole?
False assumption.
The human programmer looks at a binary configuration file and is
unable to make sense of it, though it is known that it is the
repository of the configuration data. That same programmer looks at
an XML file, and is not only able to make sense of it, but something
peculiar happens: The programmer is able to interpret almost *any* XML
configuration file. Wow! Looking at an XML file containing
information about doctor/patient records? Wow, I can understand it!!
The programmer can understand it. The computer cannot. The programmer
makes the erroneous assumption that the computer can do what a human
can, which is simply wrong. Another mistake that is made is,
"Once I get my code to generate the XML file from C++ objects, I will
be basking in truckloads of flexibility, because XML is so
extensible."
Computers do not like "extensible". Unlike a human, a computer _can_
_not_ look at an XML file and say, "Hmm... I see that the XML data has
changed slightly in 4 places... fortunately, the new fields that have
been added are not pertinent to the current object structure that I
have in my program."
Another mistake that is made is that XML eliminates the need to think
about type. "It's all just data. Maybe if I keep massaging my C++, it
will spit out code into the XML that captures its types
automatically." No, it will not. A compiler has to do that.
This last point also illustrates why the notion of a "Common Type
System" (http://en.wikipedia.org/wiki/Common_Type_System) is flawed.
Microsoft tried to sell the idea that languages are not so distinct
that they could not inter-operate seamlessly. Again, this was done to
eliminate the *N problem of Microsoft having to generate libraries for
all the languages they needed to support, in each native language. So
they tried hard to get the types between languages to "work
together."
But type is most fundamental to a computer language. The whole
purpose of type is to give form to that which is formless, a
collection of bits with no semantic structure. Type within a language
is distinct among its elements. Certainly, between languages, the only
way for two types to be equal is for them to be equal. Note that the
idea of a conversion between types, if successful under all
situations, is a fraud, as the types must have been truly equal to
start with. To try to get languages to pretend that the types are
separate-but-equal leads to a mess, like intermediate languages to
"assist" the beleaguered languages and coerce their type systems into
something more homogeneous.
Microsoft decided to clean up COM by going a step further. They
would define a new platform (.NET), a new language (C#), and they
would gradually invade the existing languages with newer, "better
behaved", fundamental types, ones that would still be separate-but-
equal, yet would avoid all the trouble that the old fundamental types
had. This change would apply only to scalar types, not super-complex
aggregates of the kind I routinely use in C++. There is something
strange about super-complex aggregates. Since a programmer defines
the form, when you try to get them to move between languages, there is
significant resistance. Some languages have no idea what a template
is, for example. And even if they do, the model of construction,
destruction, etc. ... these seem to want to go with the type. Good
grief! We have to be careful!!! If we go too far in allowing the types
to move "seamlessly" between languages, our customers might discover
that they are, in fact, programming in the same language! Anyhow,
those super-complex aggregates are bad, bad, bad, and should never
have been created. Scalar types are good, good, good, and with enough
procedural programming, no one will need super-complex structure
anyway.
I think you get the point. We must all be careful to spot wishful
thinking in software engineering, like the idea of self-repairing code
that was popular in the 1980's. Code is not human. It does not eat
or drink or sleep. Unlike physical machines, it does not even
degrade. In this way, from the architect's point of view, it is the
purest embodiment of form that could possibly exist.
And at present, that embodiment can only be determined by a human, not
a machine.
-Le Chaud Lapin-
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]