Re: serialization

From:

Maxim Yegorushkin <maxim.yegorushkin@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Sun, 08 Nov 2009 23:39:53 +0000

Message-ID:

<4af756ca$0$9751$6e1ede2f@read.cnntp.org>

On 08/11/09 21:09, Joshua Maurice wrote:

On Nov 7, 5:42 pm, Brian Wood<woodbria...@gmail.com> wrote:

On Nov 7, 4:00 pm, aegis<ae...@mad.scientist.com> wrote:

Thus, the ideal way/portable way, would be to write out
the value of each member of the given class.

I think so. Each member is handled separately and ideally
the process should be automated so users don't have to
maintain serialization functions by hand. Some existing
serialization libraries get the first part right, but
they don't automate the generation of the serialization
functions. To my knowledge only the C++ Middleware Writer,
http://www.webEbenezer.net/cgi-bin/samb.cgi,
automates that step of the process.

Besides the fact that C++0x is behind schedule, there's
the fact that if it does eventually get finalized, it
won't have reflection support. That's a serious problem
in my opinion. I'm not a fan of Java or C#, but I think
their reflection support serves those languages well.

At the very least, IMAO, C++ should have reflection, but only at
compile time. Possibly / preferably through some template-like
facilities. Being able to iterate over members of a class at compile
time in a generic way would impose no additional costs, contrary to
the oft-cited objection of "pay only for what you use".

Absolutely true.

My own company is forced to write its own fragile, intrusive
serialization framework because of this lack of C++ compile time
reflection.

In the company where I work we use a Perl script to generate reflection
code from annotated C++ header files. You may be pleased to know that the
main feature of the generated files is a reflect() function template that
iterates over the base sub-objects and members of an object.

Here is what an annotated header looks like:

 struct /* @reflect_class */ A
 {
 int /* @reflect_member */ abc;
 double /* @reflect_member */ def;
 };

 struct /* @reflect_class */ B
 : /* @reflect_base */ A
 {
 A /* @reflect_member */ aaa;
 };

For every reflectable class the script generates reflection code like
the following:

 #ifdef REFLECT_NAMESPACE_BEGIN
 REFLECT_NAMESPACE_BEGIN
 #endif

 // start of generated code for class A

 meta::Yes isReflectable(A const&);

The above function declaration (it has no implementation) makes an
IsReflectable<T> trait class possible. The trait tells reflectable
classes from non-reflectable ones at compile time (similar to boost
type traits).
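
The trait itself is not shown in this post. A minimal sketch of how such a trait could be built with the classic sizeof trick follows. The definitions of meta::Yes and meta::No, the catch-all isReflectable(...) overload, the make() helper and the struct Plain are assumptions made for illustration, not the actual generated or library code.

```cpp
#include <cassert>

namespace meta {
    // Assumed definitions: two types with different sizes,
    // so that sizeof can tell the overloads apart.
    typedef char Yes;
    struct No { char padding[2]; };
}

// Assumed catch-all overload: selected only when no generated
// isReflectable(T const&) declaration matches the argument.
meta::No isReflectable(...);

template<class T>
struct IsReflectable
{
    static T const& make(); // declared only, never defined or called
    // sizeof does not evaluate its operand, so no function needs a body.
    enum { value = sizeof(isReflectable(make())) == sizeof(meta::Yes) };
};

struct A { int abc; double def; };
meta::Yes isReflectable(A const&); // what the generator emits for A

struct Plain { int x; }; // hypothetical class without reflection

// Small run-time check of the compile-time results.
inline void checkTrait()
{
    assert(IsReflectable<A>::value);
    assert(!IsReflectable<Plain>::value);
}
```

The generated declaration is found through argument-dependent lookup when the trait is instantiated, which is why it may appear after the trait template.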

 template<class T> struct BaseIndexOf;
 template<> struct BaseIndexOf<A>
 {
 enum Type {
 ENUM_NIL = -1
 , ENUM_END
 , ENUM_BEGIN = 0
 };
 };

 template<class T> struct MemberIndexOf;
 template<> struct MemberIndexOf<A>
 {
 enum Type {
 ENUM_NIL = -1
 , abc
 , def
 , ENUM_END
 , ENUM_BEGIN = 0
 };
 };

The indexes of reflectable base classes and members are accessible as
BaseIndexOf<A>::<base_name> and MemberIndexOf<A>::<member_name>. The
enumeration is organized in a way that makes it easy to iterate over all
members or base classes using the range [ENUM_BEGIN, ENUM_END).
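
To illustrate the range, here is a sketch of a loop over all member indexes of a class. The names forEachMemberIndex, CountingVisitor and countMembersOfA are hypothetical helpers introduced for this example only; the enum is copied from the generated code above.

```cpp
#include <cassert>

struct A { int abc; double def; };

template<class T> struct MemberIndexOf;
template<> struct MemberIndexOf<A>
{
    enum Type { ENUM_NIL = -1, abc, def, ENUM_END, ENUM_BEGIN = 0 };
};

// Hypothetical helper: calls visit(index) once for each member
// index of T in the half-open range [ENUM_BEGIN, ENUM_END).
template<class T, class Visitor>
void forEachMemberIndex(Visitor& visit)
{
    typedef typename MemberIndexOf<T>::Type Index;
    for(int i = MemberIndexOf<T>::ENUM_BEGIN; i != MemberIndexOf<T>::ENUM_END; ++i)
        visit(static_cast<Index>(i));
}

// Hypothetical visitor that records how often it was called
// and which index it saw last.
struct CountingVisitor
{
    int calls;
    int last;
    CountingVisitor() : calls(0), last(-1) {}
    void operator()(MemberIndexOf<A>::Type index) { ++calls; last = index; }
};

inline int countMembersOfA()
{
    CountingVisitor v;
    forEachMemberIndex<A>(v);
    assert(v.last == MemberIndexOf<A>::def); // def is visited last
    return v.calls;
}
```

Because ENUM_END comes right after the last member and ENUM_BEGIN is reset to 0, the loop visits abc and def and nothing else.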

 template<class Functor>
 void reflect(A& object, Functor& f)
 {
 f.onObjectBegin(object);
 f.onMember(object, object.abc, MemberIndexOf<A>::abc);
 f.onMember(object, object.def, MemberIndexOf<A>::def);
 f.onObjectEnd(object);
 }

 template<class Functor>
 bool reflect(A& object, Functor& f, MemberIndexOf<A>::Type member_index)
 {
 switch(member_index) {
 case 0: f.onMember(object, object.abc, MemberIndexOf<A>::abc); return true;
 case 1: f.onMember(object, object.def, MemberIndexOf<A>::def); return true;
 default: return false;
 }
 }

These are the fundamental reflect function templates, which iterate over
all members or over one particular member. They accept a functor that
gets invoked for members (onMember() call), base sub-objects
(onBaseSubobject() call, see reflect for B below), and object begin/end
so that the functor can handle object nesting.

The functor passed to reflect() does the actual job of
serializing/deserializing the object. The simple beauty of this approach
is that only one functor class is needed for each serialization format.
That functor handles any reflectable class using the rest of the
generated C++ code.
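
As an illustration of what such a functor might look like, here is a sketch of a trivial text writer. TextWriter and writeA are hypothetical names made up for this example, and the output format is arbitrary; only A, MemberIndexOf<A> and reflect() follow the generated code shown above. Nested reflectable members and base sub-objects are left out to keep it short.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct A { int abc; double def; };

template<class T> struct MemberIndexOf;
template<> struct MemberIndexOf<A>
{
    enum Type { ENUM_NIL = -1, abc, def, ENUM_END, ENUM_BEGIN = 0 };
};

// Same shape as the generated reflect() for A.
template<class Functor>
void reflect(A& object, Functor& f)
{
    f.onObjectBegin(object);
    f.onMember(object, object.abc, MemberIndexOf<A>::abc);
    f.onMember(object, object.def, MemberIndexOf<A>::def);
    f.onObjectEnd(object);
}

// Hypothetical functor: writes each member as "<index>:<value>".
// The callbacks are templates, so the one class works for any
// reflectable class whose members can be streamed.
struct TextWriter
{
    std::ostringstream out;

    template<class Object>
    void onObjectBegin(Object&) { out << "{ "; }

    template<class Object, class Member, class Index>
    void onMember(Object&, Member& member, Index index)
    {
        out << static_cast<int>(index) << ':' << member << ' ';
    }

    template<class Object>
    void onObjectEnd(Object&) { out << '}'; }
};

// Hypothetical convenience wrapper around reflect().
inline std::string writeA(int abc, double def)
{
    A a;
    a.abc = abc;
    a.def = def;
    TextWriter w;
    reflect(a, w);
    return w.out.str();
}
```

With this sketch, writeA(1, 2.5) returns "{ 0:1 1:2.5 }". A reader for the same format would be another functor with the same three callbacks that assigns to the member instead of printing it.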

 inline Sref toId(Type<A>)
 {
 return Sref("A", 1);
 }

 inline Sref toId(BaseIndexOf<A>::Type base_index)
 {
 switch(base_index) {
 default: return Sref();
 }
 }

 inline Sref toId(MemberIndexOf<A>::Type member_index)
 {
 switch(member_index) {
 case MemberIndexOf<A>::abc: return Sref("abc", 3);
 case MemberIndexOf<A>::def: return Sref("def", 3);
 default: return Sref();
 }
 }

These are the functions that return base class and member identifiers
for the generated indexes. A functor uses the base/member index (passed
by reflect() to its onBaseSubobject/onMember callback) to get any meta
information associated with that particular base/member. This relies on
the fact that the indexes (enums) are strongly typed, so that function
overloading picks the correct overload for a particular index type.
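
A sketch of how a callback could use this overload dispatch follows. Sref is not defined in this post, so the struct below is only an assumed stand-in (a pointer plus a length) with a hypothetical str() member; nameOf is likewise a hypothetical helper representing what a functor might do inside onMember or onBaseSubobject.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed stand-in for Sref: a non-owning pointer/length pair.
struct Sref
{
    char const* data;
    std::size_t size;
    Sref() : data(""), size(0) {}
    Sref(char const* d, std::size_t n) : data(d), size(n) {}
    std::string str() const { return std::string(data, size); }
};

struct A { int abc; double def; };

template<class T> struct BaseIndexOf;
template<> struct BaseIndexOf<A>
{
    enum Type { ENUM_NIL = -1, ENUM_END, ENUM_BEGIN = 0 };
};

template<class T> struct MemberIndexOf;
template<> struct MemberIndexOf<A>
{
    enum Type { ENUM_NIL = -1, abc, def, ENUM_END, ENUM_BEGIN = 0 };
};

// A has no reflectable bases, so every base index maps to an empty id.
inline Sref toId(BaseIndexOf<A>::Type) { return Sref(); }

inline Sref toId(MemberIndexOf<A>::Type member_index)
{
    switch(member_index) {
    case MemberIndexOf<A>::abc: return Sref("abc", 3);
    case MemberIndexOf<A>::def: return Sref("def", 3);
    default: return Sref();
    }
}

// Hypothetical helper: the index type alone selects the toId()
// overload, so the caller never names the class explicitly.
template<class Index>
std::string nameOf(Index index)
{
    return toId(index).str();
}
```

Here nameOf(MemberIndexOf<A>::def) yields "def", while a value of type BaseIndexOf<A>::Type with the same numeric value goes to the base overload and yields an empty string.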

In practice, we use more annotations to associate more meta information
with members. We also generate reflection code for annotated enums.

 // end of generated code for class A

 // start of generated code for class B

 meta::Yes isReflectable(B const&);

 template<class T> struct BaseIndexOf;
 template<> struct BaseIndexOf<B>
 {
 enum Type {
 ENUM_NIL = -1
 , A
 , ENUM_END
 , ENUM_BEGIN = 0
 };
 };

Here class B has a reflectable base class A.

 template<class T> struct MemberIndexOf;
 template<> struct MemberIndexOf<B>
 {
 enum Type {
 ENUM_NIL = -1
 , aaa
 , ENUM_END
 , ENUM_BEGIN = 0
 };
 };

And a reflectable member aaa.

 template<class Functor>
 void reflect(B& object, Functor& f)
 {
 f.onObjectBegin(object);
 f.onBaseSubobject(static_cast<A&>(object), BaseIndexOf<B>::A);
 f.onMember(object, object.aaa, MemberIndexOf<B>::aaa);
 f.onObjectEnd(object);
 }

 template<class Functor>
 bool reflect(B& object, Functor& f, MemberIndexOf<B>::Type member_index)
 {
 switch(member_index) {
 case 0: f.onMember(object, object.aaa, MemberIndexOf<B>::aaa); return true;
 default: return false;
 }
 }

 inline Sref toId(Type<B>)
 {
 return Sref("B", 1);
 }

 inline Sref toId(BaseIndexOf<B>::Type base_index)
 {
 switch(base_index) {
 case BaseIndexOf<B>::A: return Sref("A", 1);
 default: return Sref();
 }
 }

 inline Sref toId(MemberIndexOf<B>::Type member_index)
 {
 switch(member_index) {
 case MemberIndexOf<B>::aaa: return Sref("aaa", 3);
 default: return Sref();
 }
 }

 // end of generated code for class B

 #ifdef REFLECT_NAMESPACE_END
 REFLECT_NAMESPACE_END
 #endif

The Perl script that parses the annotations and generates the reflection
code is under 700 lines.

--
Max