Re: compiler smarts: register variables and catching exceptions
andrew_nuss@yahoo.com wrote:
> I put the register keyword there primarily for effect. Let me
> be more explicit. This is a virtual machine loop, and the
> four stack pointers were an exaggeration; there are only 3 of
> them, plus an opcode array whose offset doesn't change.
> I'm assuming that the compiler has profiled the switch
> statement and seen that so many of the sp1..sp3 accesses are
> happening, that it will want to use registers for those
> stacks.

Maybe. Or maybe just some of the time.
Remember, too, that on some architectures (including IA-32), the
number of registers is extremely limited. One of the reasons
the register keyword has become counterproductive is that
compilers have become very good at knowing which variables to
put in registers, and when: rather than keeping the same value
constantly in the same register, the compiler will switch values
around depending on what it is doing at any one particular point
in the function.

> As to the issue of function calls and throws, that happens in
> only a few of the switch cases. Most of the cases are just 2
> or 3 statements plus some inline function calls. However,
> from the compiler's standpoint, any of the function calls made
> in any of the switch cases could throw SpecialException, and
> unfortunately, I need to catch it, clean up the stacks, and
> resume processing in many cases. That means that the values
> of sp1..sp3 at the time of the function call that throws are
> needed in the catch block, and my guess is that this is
> easiest for the compiler if those values are held in the
> frame.

It's easiest for the compiler to just keep all variables
(including compiler generated temporaries) in memory. The whole
point of optimization is that the compiler does extra work, to
find a solution which is better than the obvious one.

> main {
>     // assume that these are used so frequently that
>     // compiler would choose to put them in registers on its own
>     // (aside from catch block issues seen below)
>     register int* sp1 = ...;
>     register int* sp2 = ...;
>     register int* sp3 = ...;
>     register int* bytecodes = ...;
>     register int cursor = 0;

On an IA-32, I'd be surprised if you got more than two or three
of them in a register. Once the stack frame has been set up, the
compiler only has six registers to play with, and even very
simple expressions can require three registers to implement.
On a Sparc, of course, there would be no problem.

>     do {
>         try {
>             do {
>                 int opcode = bytecodes[cursor++];
>                 switch (opcode) {
>                 case DUP1: {
>                     // very important for sp1 to be a register
>                     // because this is a common opcode
>                     // even though it's only one case in the switch
>                     // and there are only 2 statements here
>                     int temp = *(sp1-1);
>                     *--sp1 = temp;
>                     break;
>                 }
>                 case CALLNATIVE: {
>                     // call a function which could throw special exception
>                     // this case does not happen frequently and does
>                     // not need to be fast.
>                     *--sp2 = MyFunctionWhichCanThrow(*sp1++, *sp1++);

Well, this expression has undefined behavior: sp1 is modified
twice with no intervening sequence point. You need to break
out at least one of the auto-increments into a separate
statement.
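
One way of doing that, as a minimal sketch: pop each operand in
its own statement, and only push once the call has returned. The
body of MyFunctionWhichCanThrow below is a made-up stand-in (the
real one isn't shown in the thread), and callNative is just a name
chosen for the example.

```cpp
#include <cassert>

// Hypothetical stand-in: the real MyFunctionWhichCanThrow is not shown.
// Subtraction is used so that swapping the arguments would change the result.
int MyFunctionWhichCanThrow(int first, int second)
{
    return first - second;
}

// Handles CALLNATIVE for stacks that grow downward:
// *sp++ pops, *--sp pushes.
void callNative(int*& sp1, int*& sp2)
{
    int const first  = *sp1++;   // each pop is a complete statement, so
    int const second = *sp1++;   // the two increments are fully ordered
    int const result = MyFunctionWhichCanThrow(first, second);
    *--sp2 = result;             // only reached if the call did not throw
}
```

Keeping the result in a local before pushing also means that sp2
has not been touched if the call throws, which makes the state seen
by a catch block easier to reason about.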
Beyond that, it also depends on the calling conventions of the
local API: does a call save registers, or not? On a Sparc, the
conventions (actually, the underlying hardware) ensure that 16
registers will be saved; the compiler would doubtless put the
register variables in those, and not bother spilling to memory
(and of course, the exception handling routine can also access
those registers). I'm less familiar with the conventions used
on modern Intel processors; back in the days of the 8086, the
usual convention was that all of the six general registers were
volatile, and that it was up to the caller to save anything it
needed. In such cases, the compiler will spill the registers to
memory, exceptions or no.
If the compiler is using some sort of dynamic register
assignment scheme, of course, it will have to ensure that the
values are somewhere the exception handling routine can find
them. Spilling to memory is the easiest solution, but the
compiler could just as well arrange to use a different point of
entry to the catch block, which would move the values from
wherever they happened to be to wherever the catch block
expected them.

>                     break;
>                 }
>                 }
>             } while (true);
>         } catch (SpecialException& e) {
>             ...
>             // manipulate sp1..sp3 as seen in first posting
>             // does the fact that sp1..sp3 are being used in the
>             // catch block mean that registers will not be chosen
>             // by the compiler for them???
>             // otherwise, the compiler would have to unwind not only
>             // the frame but also the complete register state. I just
>             // cannot see how a compiler could do this.

It depends on the hardware, and how exceptions are being
handled. With the standard handling algorithm, the compiler
generates a table which maps return addresses to exception
handling code, and there is absolutely nothing to prevent it
from using different mappings for different call sites, even in
cases where they all end up in the same user-written catch block.
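
To make the idea concrete, here is a purely conceptual model of
such a table (C++11 or later). It is not the layout used by any
real ABI; the struct, the field names and the addresses are all
invented for the illustration. The point is only that each call
site can carry its own entry point and its own record of where sp1
happens to live at that call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Conceptual model only: real unwind tables are ABI-specific.
enum class Where { Register, StackSlot };

struct CallSiteEntry {
    std::uintptr_t returnAddress;  // address just after the call instruction
    int            landingPad;     // which entry point into the catch block
    Where          sp1Location;    // where sp1 is held at this call site
};

// Sorted by returnAddress; every value here is made up.
CallSiteEntry const callSites[] = {
    { 0x1010, 1, Where::Register  },
    { 0x1048, 2, Where::StackSlot },
    { 0x10a0, 1, Where::Register  },
};

// Roughly what the runtime does while unwinding: look up the return
// address found in the frame. Returns nullptr if no handler is registered.
CallSiteEntry const* findHandler(std::uintptr_t returnAddress)
{
    auto const* end = std::end(callSites);
    auto const* it = std::lower_bound(
        std::begin(callSites), end, returnAddress,
        [](CallSiteEntry const& entry, std::uintptr_t address) {
            return entry.returnAddress < address;
        });
    return (it != end && it->returnAddress == returnAddress) ? it : nullptr;
}
```

Two call sites may share a landing pad when the values are already
where the catch block expects them; a third may get its own, which
first moves the values into place.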

>         }
>     } while (true);
> }

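
For anyone who wants to experiment, a cut-down, compilable version
of the loop might look like the following. It is a sketch under
several assumptions of mine: the HALT opcode, the divide-by-zero
behaviour of the native function and the "push -1" recovery policy
are all invented, and DUP1 reads *sp1 on the assumption that
*--sp pushes and *sp++ pops throughout.

```cpp
#include <cassert>
#include <cstddef>

struct SpecialException {};

enum Opcode { DUP1, CALLNATIVE, HALT };  // HALT added so the sketch terminates

// Hypothetical native function: throws on a zero divisor.
int MyFunctionWhichCanThrow(int dividend, int divisor)
{
    if (divisor == 0) throw SpecialException();
    return dividend / divisor;
}

// Runs the bytecodes and returns the top of stack 2 at HALT.
// sp1 and sp2 are the initial stack pointers; both stacks grow downward.
int run(int const* bytecodes, int* sp1, int* sp2, int& recoveries)
{
    std::size_t cursor = 0;
    for (;;) {
        try {
            for (;;) {
                switch (bytecodes[cursor++]) {
                case DUP1: {
                    int const temp = *sp1;   // current top of stack 1
                    *--sp1 = temp;
                    break;
                }
                case CALLNATIVE: {
                    int const first  = *sp1++;
                    int const second = *sp1++;
                    int const result = MyFunctionWhichCanThrow(first, second);
                    *--sp2 = result;
                    break;
                }
                case HALT:
                default:
                    return *sp2;
                }
            }
        } catch (SpecialException const&) {
            // sp1, sp2 and cursor have the values they had at the throw,
            // wherever the compiler chose to keep them.
            *--sp2 = -1;     // assumed recovery policy: push an error marker
            ++recoveries;
        }
    }
}
```

Whatever the compiler does with registers, the language guarantees
that the catch block sees the values the variables had when the
exception was thrown; how it arranges that is its business.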
In the end, the only real answer one can give is to profile
different variants, to see what the actual effect is on
performance.
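
As a starting point, a very small harness along these lines will
do (C++11 or later). The Vm struct, the two variants and
millisecondsFor are names invented for the example, and the loop
body is far too simple to stand in for a real interpreter; an
optimizer may well reduce both variants to the same code, or to
almost nothing. Measure the real loop with realistic bytecode.

```cpp
#include <cassert>
#include <chrono>

struct Vm { int* sp; };

// Variant 1: manipulates the stack pointer through the Vm object.
long long runInPlace(Vm& vm, int iterations)
{
    long long checksum = 0;
    for (int i = 0; i < iterations; ++i) {
        int const temp = *vm.sp;   // DUP1
        *--vm.sp = temp;
        checksum += *vm.sp++;      // pop again, so the stack stays bounded
    }
    return checksum;
}

// Variant 2: copies the pointer into a local, writes it back at the end.
long long runWithLocal(Vm& vm, int iterations)
{
    int* sp = vm.sp;
    long long checksum = 0;
    for (int i = 0; i < iterations; ++i) {
        int const temp = *sp;
        *--sp = temp;
        checksum += *sp++;
    }
    vm.sp = sp;
    return checksum;
}

// Wall-clock time taken by one call of function, in milliseconds.
template <typename Function>
double millisecondsFor(Function function)
{
    auto const start = std::chrono::steady_clock::now();
    function();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call millisecondsFor with a lambda wrapping each variant, use the
returned checksum so that the work cannot be discarded, and repeat
the measurement several times before believing any difference.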

-- 
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]