Re: How to write a library with a static object?

From:

Adrian Hawryluk <adrian.hawryluk-at-gmail.com@nospam.com>

Newsgroups:

comp.lang.c++

Date:

Sun, 11 Mar 2007 04:17:05 GMT

Message-ID:

<5pLIh.30404$lY6.7626@edtnps90>

On Mar 10, 3:20 am, John Harrison <john_androni...@hotmail.com> wrote:
> Adrian Hawryluk wrote:
> > John Harrison wrote:
>
> >> Adrian Hawryluk wrote:
>
> >>> John Harrison wrote:
>
> >>>> Adrian Hawryluk wrote:
>
> >>>>> Excuse me, but does anyone know why this fiasco even exists? Why is
> >>>>> there not some dependency mechanism in place? Or something like
> >>>>> the static local object being constructed when the control flows to
> >>>>> them (as opposed to over them)?
>
> >>>> It's a good question. One issue would be that with such a scheme the
> >>>> order of initialisation would be unpredictable, and that wouldn't
> >>>> mesh well with the one guarantee that the standard does give, which
> >>>> is that definitions within a single file are initialised in the
> >>>> order that they occur in that file. I know at times I found that
> >>>> certainty to be useful.
>
> >>> I wasn't saying that order of initialisation as they occur in the
> >>> source file should change. I meant that the order of initialisation in
> >>> the group of different source files should change based on the
> >>> dependency of one source file to another. It seems reasonable and
> >>> doable. 'make' can do it based on these dependencies (when given
> >>> that some compilers can spit out dependencies based on the included
> >>> files).
>
> >> That can't work, for one thing dependencies only occur at runtime,
> >> they can't be worked out in advance. Possible for constructors to
> >> contain if statements so a compiler or linker doesn't know which
> >> branch of the if statement will be taken and so can't work out which
> >> globals are dependent on which.
>
> >> Secondly it's perfectly possible for files to be mutually dependent.
> >> One global in one file requires another global in a second file, but a
> >> different global in that second file requires yet another global in
> >> the first file.
>
> > Yeah, I had thought about this the night I posted it. Initialisation
> > would have to be worked out at compile time (very difficult, but not
> > impossible using path analysis) or run time (much easier to implement).
> > > Mutual dependencies would be a problem if done on the file level, so it
> > > would have to initialise in order of occurrence in the file first, and
> > > then order of dependency second.
>
> > This may cause some objects in a file to be initialised out of order,
> > but that shouldn't be a problem since the order of initialisation is
> > based on need. Since it is more difficult to perform path analysis on
> > the code, this could be done by the following algorithm:
>
> > Init static object:
> > 1. Put object at the end of the initialisation double linked list
> > 2. Store insertion point marker
> > 3. Init Object.
> > 4. Object needs other Object that is not yet initialised, so insert
> > other in initialisation list at insertion marker.
> > 5. Init other Object. Loop back to 4 if another dependency is found.
>
> > > So you see, this is done for each object found in a file. Destruction
> > > would be done in reverse order of the list. Everything is fixed up
> > > nicely.
>
> >>>> There may be reasons why what you're suggesting isn't easy to
> >>>> achieve, I don't know, but representatives of the major compiler
> >>>> writers work on the C++ standards board so they wouldn't have
> >>>> decided on this fiasco without good reason.
>
> >>> Sounds like they're lazy. :)
>
> >> I think you're underestimating the problem.
>
> > No, I don't think so.
>
> > Adrian
>
> Global initialisation dependencies can only be done at run time. Imagine
> a constructor which reads a file and then decides which path to go
> depending on what it reads.
>
> Something like what you propose could work (I don't follow it exactly)
> although you've made one simplification. When a global object is being
> constructed, a reference to another global that appears in the
> initialisation list would be constructed before the original object,
> whereas something that appears in the body of the constructor would be
> constructed after the original object. Objects are considered
> constructed when the body of the constructor is entered. But still
> something like that could work.
>
> But consider the cost. The compiler cannot know when it's compiling
> function f in file A.cpp whether any other file might use function f
> during the construction of a global object. So the compiler must put
> 'has it been constructed yet?' logic before every reference to a global
> object. And note that global object doesn't just mean global variable,
> it also means every global array element.

> Checks would also have to be done on every pointer indirection, since a
> pointer could be made to point to an uninitialised global variable. The
> compiler cannot know when compiling function f which takes pointer p as
> a parameter whether function f might be called during the initialisation
> of a global with the p containing the address of another uninitialised
> global.

_Assuming_ that the compiler is stupid and cannot optimise, the cost is
still not that great in comparison to code that works cleanly.

However, compilers are not that stupid anymore. The compiler knows
whether an object 'A' is static or not (it is initialising it, after
all). When you touch another static object from A's constructor (or
from a member function that it calls), the generated code would check a
static bool that indicates whether the programme is in initialisation
mode or in running mode. If in running mode, it can assume that
everything is initialised and use the object without further checks. If
in initialisation mode, it would determine whether that object is
initialised and, if not, initialise it if and only if it is about to
call a member function on it.
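
To make that concrete, here is a rough hand-written sketch of the check
I have in mind. It is only an approximation of what a compiler could
emit on its own. The names LazyStatic, g_in_init_mode, order(), Logger
and Config are all made up for illustration, and destruction in reverse
order is left out.

```cpp
#include <new>
#include <string>
#include <vector>

// Hypothetical flag: true while statics are being set up, false once main runs.
bool g_in_init_mode = true;

// Records construction order, purely so the effect can be observed.
std::vector<std::string>& order() {
    static std::vector<std::string> v;
    return v;
}

// Hypothetical wrapper standing in for compiler-generated bookkeeping.
template <typename T>
class LazyStatic {
public:
    constexpr LazyStatic() noexcept {}

    T& get() {
        // In running mode the per-object test is skipped entirely.
        if (g_in_init_mode && ptr_ == nullptr) {
            ptr_ = ::new (static_cast<void*>(storage_)) T();
        }
        return *ptr_;
    }

private:
    alignas(T) unsigned char storage_[sizeof(T)] = {};
    T* ptr_ = nullptr;  // null means "not constructed yet"
};

struct Logger {
    Logger() { order().push_back("Logger"); }
    int lines = 0;
};
LazyStatic<Logger> g_logger;

struct Config {
    // Touching g_logger here constructs it on demand, before Config finishes.
    Config() {
        g_logger.get().lines++;
        order().push_back("Config");
    }
};
LazyStatic<Config> g_config;
```

Asking for g_config first still constructs Logger before Config's
constructor completes, which is the ordering by need that I am arguing
for.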

As for passing pointers and references around, that is not a concern
unless the pointer or reference refers to an object of a class that is
instantiated in the static space, and then only if a member function or
friend is called through it. In that case, if in running mode, assume
all is well. Otherwise it is in initialisation mode, so it would have to
determine whether the pointer points into the static space. If it does
not, then assume all is well (as the object has been allocated on the
heap or is some I/O object); otherwise determine whether that object is
initialised, and initialise it if it is not.
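
A rough sketch of that pointer test, again written by hand. A real
implementation would presumably use the linker's knowledge of the static
data segment; here a small table of address ranges stands in for it.
StaticSlot, slots(), ensure_initialised, Counter and the registration
helper are all hypothetical names.

```cpp
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical record for one object living in the static space.
struct StaticSlot {
    std::uintptr_t begin;
    std::uintptr_t end;
    bool constructed;
    void (*construct)(void*);
};

bool g_in_init_mode = true;

std::vector<StaticSlot>& slots() {
    static std::vector<StaticSlot> table;
    return table;
}

// Called before a member function is invoked through p.
// Returns true only if p referred to a static object that had to be built now.
bool ensure_initialised(const void* p) {
    if (!g_in_init_mode) return false;  // running mode: assume all is well
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    for (StaticSlot& s : slots()) {
        if (addr < s.begin || addr >= s.end) continue;
        if (s.constructed) return false;
        s.construct(reinterpret_cast<void*>(s.begin));
        s.constructed = true;
        return true;
    }
    return false;  // not in the static space: heap, stack, etc.
}

struct Counter {
    int value;
    Counter() : value(42) {}
};

alignas(Counter) unsigned char g_counter_storage[sizeof(Counter)];
Counter* g_counter = nullptr;

void construct_counter(void* where) { g_counter = ::new (where) Counter(); }

void register_counter() {
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(&g_counter_storage[0]);
    const std::uintptr_t e = b + sizeof(Counter);
    slots().push_back(StaticSlot{b, e, false, &construct_counter});
}
```

The linear search is only there to keep the sketch short. The point is
that the test is needed in initialisation mode only, and a pointer to
the heap or the stack falls straight through.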

You say 'consider the cost'. Well, what is that cost right now when you
call:

obj_t& getObj()
{
    static obj_t* pObj = new obj_t();
    return *pObj;
}
// -- or --
obj_t& getObj()
{
    static obj_t obj;   // not 'obj_t obj();', which declares a function
    return obj;
}

You think that is free? It is just the same as what I am describing, as
that code requires the 'has it been constructed yet?' logic in it as
well. You are losing very little, if anything, in speed in running mode
(a _slight_ slowdown for pointers to classes that have been declared
in the static space). There will be a little more overhead in
initialisation mode, but then initialisation is usually a little slow to
begin with and that overhead will be negligible in comparison. However,
you are gaining assured initialisation. What exactly is the cost of
running down an initialisation bug? Or any bug, for that matter, that is
beyond the control of the programmer? A lot higher than the cost of one
that they are in control of, that is for sure.
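
For what it is worth, this is roughly the hidden test that a
function-local static already pays for on every call, spelled out by
hand. The definition of obj_t and the name getObjExplicit are only
assumptions for the sake of the example, and real compilers use their
own guard variables rather than this exact code.

```cpp
// Hypothetical definition of obj_t that counts how often it is constructed.
struct obj_t {
    static int constructions;
    int id;
    obj_t() : id(7) { ++constructions; }
};
int obj_t::constructions = 0;

// What the programmer writes.
obj_t& getObj() {
    static obj_t obj;
    return obj;
}

// Approximately what that costs, with the guard made explicit.
obj_t& getObjExplicit() {
    static bool constructed = false;  // the 'has it been constructed yet?' flag
    static obj_t* pObj = nullptr;
    if (!constructed) {               // tested on every single call
        pObj = new obj_t();
        constructed = true;
    }
    return *pObj;
}
```

Both versions construct the object exactly once and test a flag on each
later call, which is the same kind of check I am proposing for
initialisation mode.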

Is it easy? Not like writing a Hello World programme, but writing good
programmes/algorithms rarely is. And since this is the foundation of
what we are coding on, I don't give a flying f***. I say, "Do it
right!"

> I think this would be a completely unacceptable overhead.

There are many kinds of overhead, and run-time overhead is only a small
part. Maintenance is far greater. If a programme that is written on the
edge of working is modified and then breaks because of some
initialisation problem, that could be a far worse overhead in terms of
cost of maintenance.

However, if you really think it is, there is always execution code path
analysis which is used in optimisers. This too could be used to ensure
proper dependency initialisation. You said:

> >> That can't work, for one thing dependencies only occur at runtime,
> >> they can't be worked out in advance. Possible for constructors to
> >> contain if statements so a compiler or linker doesn't know which
> >> branch of the if statement will be taken and so can't work out which
> >> globals are dependent on which.

This may be partially true, but not entirely. Yes, you may not know
which execution path is going to be taken, but you can still set up a
dependency tree and use the methods that I have already described for
the remaining 'indeterminable' paths, which would reduce the overhead
that I stated *by far*.
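
As a sketch of the statically known part: given whatever dependencies
the analysis can determine, an initialisation order falls out of a plain
depth-first topological sort. The Graph type, the object names and the
functions below are all hypothetical, and a cycle is simply reported so
that those objects can fall back to the run-time check.

```cpp
#include <map>
#include <string>
#include <vector>

// object name -> names of the objects its constructor is known to use
using Graph = std::map<std::string, std::vector<std::string>>;

enum class Mark { None, Visiting, Done };

// Depth-first visit; returns false if a dependency cycle is found.
bool visit(const Graph& g, const std::string& name,
           std::map<std::string, Mark>& marks,
           std::vector<std::string>& order) {
    Mark& m = marks[name];
    if (m == Mark::Done) return true;
    if (m == Mark::Visiting) return false;  // cycle
    m = Mark::Visiting;
    const Graph::const_iterator it = g.find(name);
    if (it != g.end()) {
        for (const std::string& dep : it->second) {
            if (!visit(g, dep, marks, order)) return false;
        }
    }
    m = Mark::Done;
    order.push_back(name);  // dependencies come first, then the object itself
    return true;
}

// Fills 'order' with a valid initialisation order, or returns false on a cycle.
bool init_order(const Graph& g, std::vector<std::string>& order) {
    std::map<std::string, Mark> marks;
    order.clear();
    for (const auto& entry : g) {
        if (!visit(g, entry.first, marks, order)) return false;
    }
    return true;
}
```

Anything this pass can order needs no check at run time at all. Only
the objects on paths it cannot resolve would keep the initialisation
mode test.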

It can be done, and it should be done. As for my remark that the
'representatives of the major compiler writers work on the C++ standards
board' are lazy: I'm probably wrong. They're not lazy, they just
don't want to take responsibility.

Adrian

--
========================================================
          Adrian Hawryluk BSc. Computer Science
--------------------------------------------------------
  Specialising in: OOD Methodologies in UML
                    OOP Methodologies in C, C++ and more
                                 RT Embedded Programming
--------------------------------------------------------
------[blog: http://adrians-musings.blogspot.com/]------
--------------------------------------------------------
    This content is licensed under the Creative Commons
     Attribution-Noncommercial-Share Alike 3.0 License
     http://creativecommons.org/licenses/by-nc-sa/3.0/
=========================================================