Re: calling convention stdcall and cdecl call
 
* Ben Voigt [C++ MVP]:
[snip]
And second, your article contained four individual fallacies which I
pointed out individually, no one individual pointing-out-of-fallacy
relying on others.
Well, by assigning neatly numbered points to each sequential piece of the 
argument, I think we've shown the argument is valid, you contest one of the 
premises.  Which is quite different from having four fallacies, but that is 
the advantage of laying things out so neatly.
This statement could easily be misinterpreted by some reader other than me, due 
to the quoting technique you employed here.
The four fallacies were mentioned in the context of your previous article, call 
it X, not the one you're talking about above, and we're talking about here, call 
it Y, with numbered points.
In article Y I thought the reasoning was fine, but premise #1 didn't hold.
I'm now going to quote you out-of-order but I think not out of context.
[snip]
QED, by contradiction of consequence #3 with point #1.
Point #1 does not hold. :-)
Ok, now that we've gotten that squared away, we can address only Point #1 in 
the future.
[snip]
I will present my argument in a simpler form to make it easier to
respond to.
Good, thank you.
Premise #1 -- A calling convention implements stdcall iff it is
binary compatible with every other conformant implementation.
This either places very strong constraints on "binary compatible", or
is simply wrong.
In general it's simply wrong  --  but see comment at end of this
section below.
Although I don't consider mangled names to be part of the stdcall
convention, here is a first example where binary compatibility doesn't
hold: name mangling. g++ produces the mangled name "__Z3food@8" for
"void __stdcall foo( double ) {}", whereas MSVC produces the mangled
name "?foo@@YGXN@Z". I'm sorry for addressing this, since you don't
bring it up explicitly, but the vagueness of "binary compatible" means
it could mean just about anything. If you need an even more forceful
argument, consider e.g. the unmangled names of the MessageBox routines
in user32.dll.
Arguing against myself, it's not unreasonable to consider /C/ mangled
names as part of stdcall convention.
What if we restricted ourselves to "only calls made through a raw function 
pointer (i.e. not functor, not pointer-to-member)"?  Then we get rid of the 
whole "locating-the-function" issue and focus on "calling-the-function".
Not a bad idea. However, at least as I've learned to use these terms, and how 
they're used by e.g. Microsoft, "stdcall" is in practice somewhat relative to 
context.  Within a given language implementation it includes language 
implementation specific name mangling. Then there is the simplified C name 
mangling as common denominator between language implementations. Then there is 
the raw machine code calling convention level, the call-via-pointer level.
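The lowest of those levels, the call through a raw function pointer, can be sketched roughly as below. The macro SKETCH_STDCALL and the names BinaryOp, add_two_ints and call_through_pointer are hypothetical, introduced only for illustration; the macro expands to nothing on targets other than 32-bit x86, where stdcall does not apply.

```cpp
#include <cassert>

// Hypothetical portability macro: selects stdcall only on 32-bit x86.
#if defined(_MSC_VER) && defined(_M_IX86)
#   define SKETCH_STDCALL __stdcall
#elif defined(__GNUC__) && defined(__i386__)
#   define SKETCH_STDCALL __attribute__((stdcall))
#else
#   define SKETCH_STDCALL
#endif

// A raw function pointer type: no name lookup and no mangling is involved
// once the address is known, only the machine-level call sequence.
typedef int (SKETCH_STDCALL *BinaryOp)(int, int);

// A routine with a restricted signature (plain ints in, plain int out).
int SKETCH_STDCALL add_two_ints(int a, int b)
{
    return a + b;
}

// Calls whatever routine the pointer designates, using the convention
// recorded in the pointer type.
int call_through_pointer(BinaryOp op, int a, int b)
{
    assert(op != 0);
    return op(a, b);
}
```

For example, call_through_pointer(&add_two_ints, 2, 3) evaluates to 5. Nothing in this sketch says how the pointer was obtained, which is the "locating-the-function" part being set aside.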
Arguing back against myself, hey, that's a neat notion, but only for
a subset of cases; e.g. it falls flat on its face when compared to
the Windows API reality, where names are not mangled that way, but
are certainly stdcall.
Hopefully that's a complete enough exposition of whether mangling
is part of the convention or not (yes in some cases, no in general,
e.g. for the Windows API). Next, an example of different machine
code for the same __stdcall routine.
    #include <iostream>
    struct Blah { int x; Blah(): x(666) {} };
    Blah __stdcall foo() { Blah x; return x; }          // This routine.
    int main()
    {
        Blah const b = foo();
        std::cout << b.x << std::endl;
    }
Here g++ (default options) returns the result in register EAX,
Looks like g++ has chosen sizeof(x) <= sizeof (EAX), if I might be so bold 
as to illegally mix sizeof with a register name -- it's not valid C nor 
assembler, do we all understand what this pseudo-code expression means?
This is OK.  But stdcall as /documented/ by Microsoft requires using a register 
pair if necessary, as I recall.  So even if Visual C++ padded the struct up to 8 
bytes (it did not) then that should not force it to do anything but return the 
result via register.
    .def __Z3foov@0; .scl 2; .type 32; .endef
    __Z3foov@0:
    push ebp
    mov ebp, esp
    mov eax, 666
    pop ebp
    ret
while MSVC (default options) employs RVO where the caller must supply
the address where the result should be placed,
Looks like VC++ has chosen sizeof(x) > sizeof (EAX),
Nope (see below), but it hardly matters. :-)
we might ask ourselves 
why, but in that case the binary compatibility problem stems from having 
arguments with different layouts, not compiler-dependent choices in how to 
implement the calling convention.  Or something, because it looks like size 
= 4 for the return value.
    _x$ = -4 ; size = 4
As you can see here MSVC sets aside 4 bytes for the struct. That fits in EAX, 
even without removing any end-padding (and there isn't any). If it didn't fit 
then MSVC should, without RVO influencing the decision, choose a register pair.
    ___$ReturnUdt$ = 8 ; size = 4
    ?foo@@YG?AUBlah@@XZ PROC NEAR ; foo
    push ebp
    mov ebp, esp
    push ecx
    lea ecx, DWORD PTR _x$[ebp]
    call ??0Blah@@QAE@XZ ; Blah::Blah
RVO is not used, it would have eliminated the argument x and directly 
constructed the return value at [ebp + __$ReturnUdt$__].
Right, in this concrete case. I think I formulated that badly. The possibility 
of RVO means that MSVC's simplistic implementation always (at least as far as I 
know, I don't have that documented) passes that storage pointer, for non-POD.
But it constructs in the local variable space (< ebp).
    mov eax, DWORD PTR ___$ReturnUdt$[ebp]
    mov ecx, DWORD PTR _x$[ebp]
And here it copies the variable x to the return code whose address is in the 
parameter space (> ebp).  No RVO.
    mov DWORD PTR [eax], ecx
    mov eax, DWORD PTR ___$ReturnUdt$[ebp]
And it also returns the value in eax, like g++ does.
    mov esp, ebp
    pop ebp
    ret 4
I hope you understand this, that in the g++ case above the caller
pushes nothing, calls foo() and gets a result back in register eax,
while in the MSVC case the caller must push the caller's result
storage address before calling foo, i.e., that the generated machine
code for the two tools *is not binary compatible* in spite of both
tools generating perfectly acceptable stdcall.
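The difference on the caller's side can be modelled in portable C++, as a rough sketch only. The two function names below are hypothetical, and the code imitates the two listings at source level rather than reproducing what either compiler emits.

```cpp
#include <cassert>
#include <new>

struct Blah { int x; Blah(): x(666) {} };

// Models the g++ listing: the caller passes nothing, and the 4-byte
// result comes back as a plain value (register EAX in the listing).
int foo_result_in_register()
{
    Blah x;
    return x.x;
}

// Models the MSVC listing: the caller supplies the address of the result
// storage as a hidden argument; the routine copies its local object there
// and hands the same address back (also in EAX in the listing).
Blah* foo_result_via_hidden_pointer(void* result_storage)
{
    assert(result_storage != 0);
    Blah x;                                  // local object, like _x$[ebp]
    return new (result_storage) Blah(x);     // copy into the caller's storage
}
```

A caller written for the first form never sets up storage, while a caller of the second form must, so neither caller can be linked against the other routine even though the source-level signature is the same.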
So it appears.  But struct Blah is not POD in C++03 due to the existence of 
a constructor, so it's not particularly relevant to any discussion on 
variadic functions (I learned my lesson about passing non-PODs to variadic 
functions -- the standard forbids it!).
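The non-POD status of Blah can be checked at compile time, sketched here with C++11 type traits as an approximation of the C++03 rule (the two definitions differ in detail, but agree for these types). The helper is_pod_like and the struct PlainBlah are hypothetical names added for comparison.

```cpp
#include <type_traits>

struct Blah { int x; Blah(): x(666) {} };   // user-provided constructor
struct PlainBlah { int x; };                // hypothetical POD counterpart

// C++11-style approximation of "POD": trivial plus standard-layout.
template <typename T>
constexpr bool is_pod_like()
{
    return std::is_trivially_default_constructible<T>::value
        && std::is_trivially_copyable<T>::value
        && std::is_standard_layout<T>::value;
}

// The constructor alone is what disqualifies Blah.
static_assert(!is_pod_like<Blah>(), "Blah has a user-provided constructor");
static_assert(is_pod_like<PlainBlah>(), "PlainBlah is a plain aggregate");

// Both types have the same size, so size is not what separates them.
static_assert(sizeof(Blah) == sizeof(PlainBlah), "same size");
```

The two structs have identical size and layout, so any difference in how a compiler returns them comes from the type category rather than from the bytes involved.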
On the contrary, it's very relevant.
One issue was whether you need absolute binary compatibility in order to have 
stdcall, that was the issue of your premise #1.
I guess nobody here, well, almost nobody, would deny that the above two binary 
incompatible routines are both stdcall, and from the same source code.
So whatever lets this work in the context of the imagined hypothetical problems 
with binary compatibility, also lets stdcall variadic functions work even if 
they're implemented in wildly different ways by different tools.
Or put another way, if stdcall variadic functions can't work and be stdcall when 
they're implemented differently by different tools, then neither can the above. 
The above does work no problem. Hence.
Another issue was whether a hidden argument can be passed by the compiler, and 
still have stdcall.
It is, above: MSVC passes the result storage address as a hidden argument.
I can elaborate if you want, but the point is, evidenced by facts
above, your premise #1 is simply wrong in general  --  although, I
hasten to emphasise, it *is* an important consideration for a large
subset of cases.
I agree.  We do need to define what it means to be stdcall better. 
Otherwise we could say that fastcall and cdecl functions are stdcall, which 
we know isn't true in general.
What would be Good would IMHO be
   * An authoritative definition of what registers must be preserved by a
     function (two different sets are stated by Microsoft, natural choice
     would be the smallest one, which I think is also the latest, in order
     not to break code).
   * An authoritative definition of what registers can be clobbered by e.g.
     a trampoline function such as Liviu suggested (natural choice would be
     none, then let non-conforming code, if any, fight it out in market).
   * An authoritative definition of how a non-static member function's this
     pointer is passed, and perhaps ditto for the args' total size for a
     variadic function.
   * An authoritative definition of for which argument types the stdcall
     convention requires a certain way of passing them / returning them. This
     is the essential thing saying that if you have a routine with such a
     restricted signature, you really know how to call it at machine code
     level, irrespective of tool.
   * And what it allows (tool-dependent) for other argument types.
and not the least, as I think you're saying above and as I've lamented on (is 
that correct English?) several times in this thread,
   * A *clear separation* of what's specific to Windows API stdcall ("general
     stdcall"), what's specific to MSVC "C", and what's specific to MSVC "C++".
plus of course linking from all the five or six or more current pages to one 
common page.
It might seem a little bit late for this, as we're passing over into x64 world.
But I think x86 32-bit code will continue to be produced and maintained for a 
few years still.
Cheers,
- Alf
-- 
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?