Re: How to write a library with a static object?

From: Adrian Hawryluk <adrian.hawryluk-at-gmail.com@nospam.com>
Newsgroups: comp.lang.c++
Date: Sun, 11 Mar 2007 14:24:07 GMT
Message-ID: <biUIh.32784$lY6.13253@edtnps90>

John Harrison wrote:

Adrian Hawryluk wrote:

On Mar 10, 3:20 am, John Harrison <john_androni...@hotmail.com> wrote:
> Adrian Hawryluk wrote:
> > John Harrison wrote:
>
> >> Adrian Hawryluk wrote:
>
> >>> John Harrison wrote:
>
> >>>> Adrian Hawryluk wrote:
>
> >>>>> Excuse me, but does anyone know why this fiasco even exists? Why is
> >>>>> there not some dependency mechanism in place? Or something like
> >>>>> the static local object being constructed when the control flows to
> >>>>> them (as opposed to over them)?
>
> >>>> It's a good question. One issue would be that with such a scheme the
> >>>> order of initialisation would be unpredictable, and that wouldn't
> >>>> mesh well with the one guarantee that the standard does give, which
> >>>> is that definitions within a single file are initialised in the
> >>>> order that they occur in that file. I know at times I found that
> >>>> certainty to be useful.
>
> >>> I wasn't saying that order of initialisation as they occur in the
> >>> source file should change. I meant that the order of initialisation in
> >>> the group of different source files should change based on the
> >>> dependency of one source file to another. It seems reasonable and
> >>> doable. 'make' can do it based on these dependencies (given
> >>> that some compilers can spit out dependencies based on the included
> >>> files).
>
> >> That can't work, for one thing dependencies only occur at runtime,
> >> they can't be worked out in advance. Possible for constructors to
> >> contain if statements so a compiler or linker doesn't know which
> >> branch of the if statement will be taken and so can't work out which
> >> globals are dependent on which.
>
> >> Secondly it's perfectly possible for files to be mutually dependent.
> >> One global in one file requires another global in a second file, but a
> >> different global in that second file requires yet another global in
> >> the first file.
>
> > Yeah, I had thought about this the night I posted it. Initialisation
> > would have to be worked out at compile time (very difficult, but not
> > impossible using path analysis) or run time (much easier to implement).
> > Mutual dependencies would be a problem if done on the file level, so it
> > would have to initialise in order of occurrence in the file first, and
> > then order of dependency second.
>
> > This may cause some objects in a file to be initialised out of order,
> > but that shouldn't be a problem since the order of initialisation is
> > based on need. Since it is more difficult to perform path analysis on
> > the code, this could be done by the following algorithm:
>
> > Init static object:
> > 1. Put object at the end of the initialisation double linked list
> > 2. Store insertion point marker
> > 3. Init Object.
> > 4. Object needs other Object that is not yet initialised, so insert
> > other in initialisation list at insertion marker.
> > 5. Init other Object. Loop back to 4 if another dependency is found.
>
> > So you see, this is done for each object found in a file. Destruction
> > would be done in reverse order of the list. Everything is fixed up nicely.
>
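The five steps quoted above can be emulated at run time with an explicit registry. The following is only a minimal sketch under stated assumptions: `InitList`, `declare`, `require` and `order` are hypothetical names invented for illustration, objects are identified by strings, and no compiler or linker is known to generate anything like this.

```cpp
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical run-time registry; nothing like this exists in standard C++.
class InitList {
public:
    using InitFn = std::function<void(InitList&)>;

    InitList() = default;
    InitList(const InitList&) = delete;             // marker_ points into order_
    InitList& operator=(const InitList&) = delete;

    // Announce an object and the code that constructs it.
    void declare(const std::string& name, InitFn init) {
        pending_[name] = std::move(init);
    }

    // Initialise `name` unless that has already happened or is in progress.
    // A cyclic dependency therefore sees a partially constructed object.
    void require(const std::string& name) {
        auto found = pending_.find(name);
        if (found == pending_.end()) return;
        InitFn init = std::move(found->second);
        pending_.erase(found);

        // Steps 1-2: insert the object in front of the current marker (the
        // end of the list at top level) and make it the new insertion marker.
        auto saved = marker_;
        marker_ = order_.insert(marker_, name);
        // Steps 3-5: run the constructor; each require() it makes lands
        // in front of this object, i.e. earlier in the list.
        init(*this);
        marker_ = saved;
    }

    // Final initialisation order; destruction would walk it in reverse.
    const std::list<std::string>& order() const { return order_; }

private:
    std::unordered_map<std::string, InitFn> pending_;
    std::list<std::string> order_;
    std::list<std::string>::iterator marker_ = order_.end();
};
```

Calling `require` once per declared object, in file order, corresponds to "this is done for each object found in a file"; objects that were already pulled in as dependencies are skipped.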
> >>>> There may be reasons why what you're suggesting isn't easy to
> >>>> achieve, I don't know, but representatives of the major compiler
> >>>> writers work on the C++ standards board so they wouldn't have
> >>>> decided on this fiasco without good reason.
>
> >>> Sounds like they're lazy. :)
>
> >> I think you're underestimating the problem.
>
> > No, I don't think so.
>
> > Adrian
>
> Global initialisation dependencies can only be done at run time. Imagine
> a constructor which reads a file and then decides which path to go
> depending on what it reads.
>
> Something like what you propose could work (I don't follow it exactly)
> although you've made one simplification. When a global object is being
> constructed, a reference to another global that appears in the
> initialisation list would be constructed before the original object,
> whereas something that appears in the body of the constructor would be
> constructed after the original object. Objects are considered
> constructed when the body of the constructor is entered. But still
> something like that could work.
>
> But consider the cost. The compiler cannot know when it's compiling
> function f in file A.cpp whether any other file might use function f
> during the construction of a global object. So the compiler must put
> 'has it been constructed yet?' logic before every reference to a global
> object. And note that global object doesn't just mean global variable,
> it also means every global array element.

> Checks would also have to be done on every pointer indirection, since a
> pointer could be made to point to an uninitialised global variable. The
> compiler cannot know when compiling function f which takes pointer p as
> a parameter whether function f might be called during the initialisation
> of a global with the p containing the address of another uninitialised
> global.

_Assuming_ that the compiler is stupid and cannot optimise, the cost
is still not that great in comparison to code that works cleanly.

Well I guess the point is that the compiler cannot make these kinds of
optimisations. Only the linker which sees the whole program can.
Traditionally linkers are dumb. They have to cope with many different
languages so have a pretty simple model of how a program works. C++
specific optimisations are not part of traditional linkers.

It's true that more sophisticated linkers are now on the market. I don't
know of any that do the kind of optimisation that you're suggesting, but
it would be possible. In fact this discussion has increased my
understanding of what the C++ standard says and why. They have
deliberately left the door open for exactly what you are proposing. But
I don't know of any compiler/linker that actually does this kind of thing.

True. But there are compilers that generate an executable. Are all of
these compilers executing a linker programme? If not, then the compiler
can do the smarts. Otherwise, perhaps you are right in that the linker
would have to become smarter, and/or the object format would have to
change to describe dependencies to simplify this work.

However, compilers are not that stupid anymore. A compiler should know
whether an object 'A' is static (it is initialising it, after all).
And when you touch another static object from that constructor (or
member function that it calls),

or global function that it calls. Almost no code can be assumed to not
require this kind of checking.

Yeah, I meant to say any function that it calls that operates on a
static object.

it would check a static bool that would

indicate if the programme is in initialisation mode or in running
mode. If in running mode, it could assume that everything is
initialised and not worry about using it. If in initialisation mode,
it would determine if that object is initialised and if not,
initialise it if and only if it is about to call a member function.

As for passing pointers and references around, that is not a concern
unless it is referencing a class that is instantiated in the static
space and only if a member function or friend is called on the
reference or pointer.

Any form of pointer dereference would require a check. The compiler
cannot know if a particular class is instantiated in the static space
because it doesn't have access to the whole program, so it must include
the check on every pointer dereference. The linker could then remove the
check, but traditional linkers don't.

You are right, this sounds like it would have to be done on the linker side.

In which case, if in running mode assume all is well,

otherwise it is in initialisation mode so it would have to determine
if the pointer points to the static space. If it is not, then assume
all is well (as this has been allocated in the heap or is some io
object), otherwise determine if that object is not initialised and
initialise it if it is not.
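The "initialisation mode versus running mode" check described here can be approximated by hand with a wrapper. What follows is a rough sketch, not something any compiler emits: `g_init_mode`, `CheckedStatic`, `Logger`, `Service`, `g_logger` and `g_service` are all hypothetical names, the wrapped object is never destroyed, and the sketch assumes every wrapper has been touched at least once before the flag is cleared.

```cpp
#include <cassert>
#include <new>

// Hypothetical flag: true while static initialisation is still in progress.
// It is constant-initialised, so it is valid before any constructor runs.
bool g_init_mode = true;

// Wrapper intended for objects with static storage duration only.
template <typename T>
class CheckedStatic {
public:
    // constexpr constructor: the wrapper itself needs no dynamic
    // initialisation, so obj_ is already null when get() is first called.
    constexpr CheckedStatic() : storage_{}, obj_(nullptr) {}

    T& get() {
        // Initialisation mode: 'has it been constructed yet?'
        if (g_init_mode && obj_ == nullptr)
            obj_ = new (storage_) T();
        // Running mode: assume everything is initialised.
        assert(obj_ != nullptr && "object was never constructed");
        return *obj_;
    }

private:
    alignas(T) unsigned char storage_[sizeof(T)];
    T* obj_;
};

struct Logger {
    int lines = 0;
    void log() { ++lines; }
};
CheckedStatic<Logger> g_logger;

// Service depends on g_logger from inside its constructor.
struct Service {
    Service() { g_logger.get().log(); }
};
CheckedStatic<Service> g_service;
```

In running mode the extra work per access is one test of a bool, which is the cost being argued about in this thread; the pointer-dereference case discussed above is not covered by this sketch.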

You say 'consider the cost'. Well, what is that cost right now when
you call:

obj_t& getObj()
{
   static obj_t* pObj = new obj_t();
   return *pObj;
// -- or --
   static obj_t obj;   // not 'obj_t obj();', which would declare a function
   return obj;
}

You think that is free? It is just the same as what I am describing
as that code requires the 'has it been constructed yet?'

Yes it is, but you've chosen to accept that cost by wrapping your global
access in a function. This is a good situation, if you want the safety
and you're prepared to accept the cost, then there is a way that you can.

No, I've chosen to accept that accessing a static should by default have
this type of semantics.
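For reference, the wrapped-access idiom under discussion can be exercised as below. This is a small illustrative sketch: `obj_t` and `getObj` come from the post, while `initLog`, `user_t`, `g_user` and the member names are made up for the example.

```cpp
#include <string>
#include <vector>

// Records construction order; itself built on first use.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

struct obj_t {
    obj_t() { initLog().push_back("obj_t"); }
    int value = 42;
};

obj_t& getObj() {
    // The compiler emits the 'has it been constructed yet?' guard here.
    static obj_t obj;
    return obj;
}

// A global whose constructor depends on the object behind getObj().
struct user_t {
    user_t() : seen(getObj().value) { initLog().push_back("user_t"); }
    int seen;
};

user_t g_user;
```

Whichever translation unit `g_user` lives in, the `obj_t` is constructed before it is read, because construction happens when control first flows through `getObj()`.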

logic in it as

well. You are losing very little if anything in speed (_slight_
pointer speed reduction for pointers to classes that have been
declared in the static space) in running mode. There will be a little
more overhead in initialisation mode, but then initialisation is
usually a little slow to begin with and that overhead will be
negligible in comparison. However you are gaining assured
initialisation. What exactly is the cost of running down an
initialisation bug? Or any bug for that matter that is beyond the
control of the programmer? A lot higher than one that they are in
control of, that is for sure.

Is it easy? Not like writing a Hello World programme, but writing
good programmes/algorithms rarely is. And since this is the
foundation of what we are coding on, then I don't give a flying f***.
I say, "Do it right!"

> I think this would be a completely unacceptable overhead.

There are lots of different kinds of overhead; running overhead is only a
small part. Maintenance is far greater. If a programme is written on the
edge of working, and someone modifies it and it breaks because of some
initialisation problem, that could be a far worse overhead in terms of
cost of maintenance.

However, if you really think it is, there is always execution code
path analysis which is used in optimisers. This too could be used to
ensure proper dependency initialisation. You said:

> >> That can't work, for one thing dependencies only occur at runtime,
> >> they can't be worked out in advance. Possible for constructors to
> >> contain if statements so a compiler or linker doesn't know which
> >> branch of the if statement will be taken and so can't work out which
> >> globals are dependent on which.

This may be partially true, but not entirely. Yeah you may not know
which execution path is going to be taken, but you can still set up a
dependency tree and use the methods that I have already described for
the remaining 'indeterminable' paths, which would reduce the overhead that
I stated *by far*.

It can be done, it should be done. As for my remark about
'representatives of the major compiler writers work on the C++
standards board' being lazy: I'm probably wrong, they're not lazy,
they just don't want to take responsibility.

Your original question was 'why does this problem even exist'. I didn't
know the answer when you first asked. This discussion has meant I have a
much better understanding of why it is so. I guess we are going to have
to disagree on whether the current situation is a good thing or not.
Maybe a future revision of the standard will make your suggestions
compulsory.

I definitely have a better grasp as well. It is not a trivial problem,
just a fairly difficult one, and one that most likely needs to be
deferred to the linker. The object file would also probably have to
somehow contain semantics of dependencies for this to be resolved fully.

Thanks for the discussion.

Adrian

--
========================================================
          Adrian Hawryluk BSc. Computer Science
--------------------------------------------------------
  Specialising in: OOD Methodologies in UML
                    OOP Methodologies in C, C++ and more
                                 RT Embedded Programming
--------------------------------------------------------
------[blog: http://adrians-musings.blogspot.com/]------
--------------------------------------------------------
    This content is licensed under the Creative Commons
     Attribution-Noncommercial-Share Alike 3.0 License
     http://creativecommons.org/licenses/by-nc-sa/3.0/
=========================================================