Re: calling convention stdcall and cdecl call

From: "Alf P. Steinbach" <alfps@start.no>
Newsgroups: microsoft.public.vc.language
Date: Tue, 22 Jul 2008 18:19:38 +0200
Message-ID: <P86dnTRuDKYWlRvVnZ2dnUVZ_ofinZ2d@posted.comnet>

* Ben Voigt [C++ MVP]:

[snip]

And second, your article contained four individual fallacies which I
pointed out individually, no one individual pointing-out-of-fallacy
relying on others.

Well, by assigning neatly numbered points to each sequential piece of the
argument, I think we've shown the argument is valid, you contest one of the
premises. Which is quite different from having four fallacies, but that is
the advantage of laying things out so neatly.

This statement could easily be misinterpreted by some reader other than me, due
to the quoting technique you employed here.

The four fallacies were mentioned in the context of your previous article, call
it X, not the one you're talking about above, and we're talking about here, call
it Y, with numbered points.

In article Y I thought the reasoning was fine, but premise #1 didn't hold.

I'm now going to quote you out-of-order but I think not out of context.

[snip]

QED, by contradiction of consequence #3 with point #1.

Point #1 does not hold. :-)

Ok, now that we've gotten that squared away, we can address only Point #1 in
the future.

[snip]

I will present my argument in a simpler form to make it easier to
respond to.

Good, thank you.

Premise #1 -- A calling convention implements stdcall iff it is
binary compatible with every other conformant implementation.

This either places very strong constraints on "binary compatible", or
is simply wrong.

In general it's simply wrong -- but see comment at end of this
section below.

Although I don't consider mangled names to be part of the stdcall
convention, here is a first example where binary compatibility doesn't
hold: name mangling. g++ produces the mangled name "__Z3food@8" for
"void __stdcall foo( double ) {}", whereas MSVC produces the mangled
name "?foo@@YGXN@Z". I'm sorry for addressing this since you don't bring
it up explicitly, but the vagueness of "binary compatible" means it
could mean just about anything. If you need an even more forceful
argument, consider e.g. the non-mangled names of the MessageBox routines
in user32.dll.

Arguing against myself, it's not unreasonable to consider /C/ mangled
names as part of the stdcall convention.
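
To make the "/C/ mangled names" point concrete, here is a minimal sketch of the C-level decoration commonly described for 32-bit x86 stdcall functions: a leading underscore, the plain name, then "@" and the number of argument bytes. The helper name decorate_stdcall_c is hypothetical and only illustrates the pattern; it is not part of any toolchain.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration, not a real API):
// builds the C-level decorated name for a 32-bit x86 stdcall function,
// i.e. "_" + name + "@" + total size of the arguments in bytes.
std::string decorate_stdcall_c(const std::string& name, std::size_t arg_bytes)
{
    return "_" + name + "@" + std::to_string(arg_bytes);
}
```

For foo taking one double (8 bytes of arguments) this gives "_foo@8", which matches the "@8" suffix visible in the g++ name quoted above.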

What if we restricted ourselves to "only calls made through a raw function
pointer (i.e. not functor, not pointer-to-member)"? Then we get rid of the
whole "locating-the-function" issue and focus on "calling-the-function".

Not a bad idea. However, at least as I've learned to use these terms, and how
they're used by e.g. Microsoft, "stdcall" is in practice somewhat relative to
context. Within a given language implementation it includes language
implementation specific name mangling. Then there is the simplified C name
mangling as common denominator between language implementations. Then there is
the raw machine code calling convention level, the call-via-pointer level.
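
As a rough sketch of the call-via-pointer level: the calling convention is part of the function pointer type, so the caller needs only an address and a signature, never a symbol name. DEMO_STDCALL, BinaryOp, add_ints and call_through_pointer are hypothetical names introduced here; the macro expands to nothing on toolchains that don't know __stdcall, so the fragment still compiles there.

```cpp
// Hypothetical portability macro (an assumption for this sketch): __stdcall
// is only meaningful on 32-bit x86 Windows toolchains.
#if defined(_MSC_VER) || defined(__MINGW32__)
#  define DEMO_STDCALL __stdcall
#else
#  define DEMO_STDCALL
#endif

// The calling convention is written inside the pointer declarator,
// so it is part of the pointer's type.
typedef int (DEMO_STDCALL *BinaryOp)(int, int);

// An ordinary function declared with the same convention.
int DEMO_STDCALL add_ints(int a, int b) { return a + b; }

// The call site never mentions a name, mangled or otherwise:
// only the address held in 'op' and the signature in its type are used.
int call_through_pointer(BinaryOp op, int a, int b)
{
    return op(a, b);
}
```

A call such as call_through_pointer(&add_ints, 2, 3) then exercises only the "calling-the-function" part, with no "locating-the-function" step.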

Arguing back against myself, hey, that's a neat notion, but only for
a subset of cases; e.g. it falls flat on its face when compared to
the Windows API reality, where names are not mangled that way, but
are certainly stdcall.

Then, if that's hopefully a complete enough exposition of whether
mangling is part of the convention or not (yes in some cases, no in
general, e.g. for the Windows API), here is an example of different
machine code for the same __stdcall routine.

    #include <iostream>

    struct Blah { int x; Blah(): x(666) {} };

    Blah __stdcall foo() { Blah x; return x; }     // This routine.

    int main()
    {
        Blah const b = foo();
        std::cout << b.x << std::endl;
    }

Here g++ (default options) returns the result in register EAX,

Looks like g++ has chosen sizeof(x) <= sizeof (EAX), if I might be so bold
as to illegally mix sizeof with a register name -- it's not valid C nor
assembler, do we all understand what this pseudo-code expression means?

This is OK. But stdcall as /documented/ by Microsoft requires using register
pair if necessary, as I recall. So even if Visual C++ padded the struct up to 8
bytes (it did not) then that should not force it to do anything but return
result via register.

    .def __Z3foov@0; .scl 2; .type 32; .endef
    __Z3foov@0:
    push ebp
    mov ebp, esp
    mov eax, 666
    pop ebp
    ret

while MSVC (default options) employs RVO where the caller must supply
the address where the result should be placed,

Looks like VC++ has chosen sizeof(x) > sizeof (EAX),

Nope (see below), but it hardly matters. :-)

we might ask ourselves
why, but in that case the binary compatibility problem stems from having
arguments with different layouts, not compiler-dependent choices in how to
implement the calling convention. Or something, because it looks like size
= 4 for the return value.

    _x$ = -4 ; size = 4

As you can see here MSVC sets aside 4 bytes for the struct. That fits in EAX,
even without removing any end-padding (and there isn't any). If it didn't fit
then MSVC should, without RVO influencing the decision, choose a register pair.
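
The size argument can be checked at compile time. This is only a sketch: fits_in_eax and fits_in_edx_eax are hypothetical helpers that compare sizes against 4 and 8 bytes, and they say nothing about what any particular compiler actually does with the return value.

```cpp
#include <cstddef>

struct Blah { int x; Blah(): x(666) {} };   // same struct as in the example

// Hypothetical helper: true when T is no larger than one 32-bit register.
template <typename T>
constexpr bool fits_in_eax()
{
    return sizeof(T) <= 4;
}

// Hypothetical helper: true when T is no larger than the EDX:EAX pair.
template <typename T>
constexpr bool fits_in_edx_eax()
{
    return sizeof(T) <= 8;
}

// Assumes the usual case where the struct has no padding around its int.
static_assert(sizeof(Blah) == sizeof(int), "no padding expected in Blah");
```

With a 4-byte int, fits_in_eax<Blah>() is true, so size alone does not explain the hidden pointer.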

    ___$ReturnUdt$ = 8 ; size = 4
    ?foo@@YG?AUBlah@@XZ PROC NEAR ; foo
    push ebp
    mov ebp, esp
    push ecx
    lea ecx, DWORD PTR _x$[ebp]
    call ??0Blah@@QAE@XZ ; Blah::Blah

RVO is not used, it would have eliminated the argument x and directly
constructed the return value at [ebp + __$ReturnUdt$__].

Right, in this concrete case. I think I formulated that badly. The possibility
of RVO means that MSVC's simplistic implementation always (at least as far as I
know, I don't have that documented) passes that storage pointer, for non-POD.

But it constructs in the local variable space (< ebp).

    mov eax, DWORD PTR ___$ReturnUdt$[ebp]
    mov ecx, DWORD PTR _x$[ebp]

And here it copies the variable x to the return value, whose address is in the
parameter space (> ebp). No RVO.

    mov DWORD PTR [eax], ecx
    mov eax, DWORD PTR ___$ReturnUdt$[ebp]

And it also returns the value in eax, like g++ does.

    mov esp, ebp
    pop ebp
    ret 4

I hope you understand this, that in the g++ case above the caller
pushes nothing, calls foo() and gets a result back in register eax,
while in the MSVC case the caller must push the caller's result
storage address before calling foo, i.e., that the generated machine
code for the two tools *is not binary compatible* in spite of both
tools generating perfectly acceptable stdcall.
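
The difference between the two listings can be modelled in plain C++ by writing the hidden argument out explicitly. This is only a source-level sketch of the two shapes, not what either compiler emits; all four function names are hypothetical.

```cpp
struct Blah { int x; Blah(): x(666) {} };

// Shape of the g++ listing: the caller pushes nothing and the 4-byte
// result comes back by value (in EAX at machine level).
int foo_register_style()
{
    Blah x;
    return x.x;
}

// Shape of the MSVC listing: the caller passes the address of its result
// storage as a hidden argument; the callee copies the local into it and
// also returns that address (in EAX at machine level).
Blah* foo_hidden_pointer_style(Blah* result_storage)
{
    Blah x;
    *result_storage = x;
    return result_storage;
}

// Caller matching the first shape: no storage is supplied.
int call_register_style()
{
    return foo_register_style();
}

// Caller matching the second shape: storage must be supplied up front.
int call_hidden_pointer_style()
{
    Blah storage;
    storage.x = 0;                       // overwritten by the callee
    return foo_hidden_pointer_style(&storage)->x;
}
```

Both callers end up with 666, but a caller written for one shape cannot call a callee written for the other.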

So it appears. But struct Blah is not POD in C++03 due to the existence of
a constructor, so it's not particularly relevant to any discussion on
variadic functions (I learned my lesson about passing non-PODs to variadic
functions -- the standard forbids it!).
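
For readers with a newer compiler, the non-POD status can be checked with type traits. This sketch assumes C++11 or later, where the C++03 POD notion is split into "trivial" and "standard layout"; PlainBlah and is_pod_like are hypothetical names added for comparison.

```cpp
#include <type_traits>

struct Blah { int x; Blah(): x(666) {} };   // user-provided constructor
struct PlainBlah { int x; };                // hypothetical counterpart without one

// Hypothetical helper: approximates "POD" as trivial plus standard layout.
template <typename T>
constexpr bool is_pod_like()
{
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}
```

Here is_pod_like<Blah>() is false and is_pod_like<PlainBlah>() is true, even though both types have the same size and layout.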

On the contrary, it's very relevant.

One issue was whether you need absolute binary compatibility in order to have
stdcall, that was the issue of your premise #1.

I guess nobody here, well, almost nobody, would deny that the above two binary
incompatible routines are both stdcall, and from the same source code.

So whatever lets this work in the context of the imagined hypothetical problems
with binary compatibility, also lets stdcall variadic functions work even if
they're implemented in wildly different ways by different tools.

Or put another way, if stdcall variadic functions can't work and be stdcall when
they're implemented differently by different tools, then neither can the above.
The above does work no problem. Hence.

Another issue was whether a hidden argument can be passed by compiler, and still
have stdcall.

It is above.

I can elaborate if you want, but the point is, evidenced by facts
above, your premise #1 is simply wrong in general -- although, I
hasten to emphasise, it *is* an important consideration for a large
subset of cases.

I agree. We do need to define what it means to be stdcall better.
Otherwise we could say that fastcall and cdecl functions are stdcall, which
we know isn't true in general.

What would be Good would IMHO be

   * An authoritative definition of what registers must be preserved by function
     (two different sets are stated by Microsoft, natural choice would be the
     smallest one, which I think is also the latest, in order not to break code).

   * An authoritative definition of what registers can be clobbered by e.g.
     a trampoline function such as Liviu suggested (natural choice would be
     none, then let non-conforming code, if any, fight it out in market).

   * An authoritative definition of how non-static member function's this pointer
     is passed, and perhaps ditto for args total size for variadic function.

   * An authoritative definition of for which argument types the stdcall
     convention requires a certain way of passing them / returning them. This is
     the essential thing saying that if you have routine with such restricted
     signature, you really know how to call it at machine code level,
     irrespective of tool.

   * And what it allows (tool-dependent) for other argument types.

and not the least, as I think you're saying above and as I've lamented on (is
that correct English?) several times in this thread,

   * A *clear separation* of what's specific to Windows API stdcall ("general
     stdcall"), what's specific to MSVC "C", and what's specific to MSVC "C++".

plus of course linking from all the five or six or more current pages to one
common page.

It might seem a little bit late for this, as we're passing over into x64 world.

But I think x86 32-bit code will continue to be produced and maintained for a
few years still.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?