I am currently working on a multithreaded framework that is not really
related to this query, except that during testing I stumbled upon
something I hadn't given any thought to before. The scenario was:
I had two threads communicating through a (CAS2-based) lock-free queue
[1]. 10^8 integers were passed from one thread to the other.

The 8-byte words needed by CAS2 could be made either as structs or as
classes, both trivially implemented.

typedef struct double_word_t
{
     int a;
     int b;
} int8b;

struct double_word_t toInt8b(int a, int b)
{ return (struct double_word_t){a,b};}


class int8b
{
public:
     int a;
     int b;

     int8b():a(0), b(0) {}
     int8b(int ia, int ib):a(ia), b(ib) {}
};


Usage of these structures was, in short:
CAS2(... , ... , toInt8b(a,b));
CAS2(... , ... , int8b(a,b));
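
For reference, here is a minimal sketch of how either representation can be handed to an 8-byte compare-and-swap. It is not the CAS2 from the queue in [1]: std::atomic<std::uint64_t> stands in for it, which is an assumption, as is C++11 and a 4-byte int. The names WordStruct, WordClass, toBits and cas2 are hypothetical and chosen only so they do not clash with int8b above.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins for the struct and class versions of the 8-byte word.
struct WordStruct {
    int a;
    int b;
};

class WordClass {
public:
    int a;
    int b;

    WordClass() : a(0), b(0) {}
    WordClass(int ia, int ib) : a(ia), b(ib) {}
};

// Copy the 8 bytes of a word into an integer that the atomic can compare.
template <typename T>
std::uint64_t toBits(const T& w) {
    static_assert(sizeof(T) == sizeof(std::uint64_t), "word must be 8 bytes");
    static_assert(std::is_trivially_copyable<T>::value, "word must be trivially copyable");
    std::uint64_t bits;
    std::memcpy(&bits, &w, sizeof bits);
    return bits;
}

// Assumed stand-in for CAS2: succeeds only if target still holds `expected`.
template <typename E, typename D>
bool cas2(std::atomic<std::uint64_t>& target, const E& expected, const D& desired) {
    std::uint64_t e = toBits(expected);
    return target.compare_exchange_strong(e, toBits(desired));
}
```

Both types pass the same static_asserts and end up as the same 8 bytes, so as far as the compare-and-swap is concerned it makes no difference which one built the word.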

Now, what really makes me wonder, is that using the struct, copying
the hundred million integers takes roughly 33 secs, while using the
class version takes nearly 90 secs. I realize that this might not be
very surprising at all, and that the real overhead arises from my using
a constructor in one case and a simple cast in another. However, the
gain is significant, and I know that I for one will be considering
this more thoroughly next time.
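
To separate the cost of building the word from the cost of the queue itself, something like the following could be timed on its own. This is only a sketch under assumptions: C++11, hypothetical names (PairStruct, PairClass, makeStruct, sumPairs, secondsTaken), and no queue or CAS2 involved at all.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-ins for the struct and class versions of the 8-byte word.
struct PairStruct {
    int a;
    int b;
};

class PairClass {
public:
    int a;
    int b;

    PairClass() : a(0), b(0) {}
    PairClass(int ia, int ib) : a(ia), b(ib) {}
};

// Struct version: built from a brace initializer, no constructor involved.
inline PairStruct makeStruct(int a, int b) { return PairStruct{a, b}; }

// Build n words with `make` and add up their fields, so the result depends
// on every word that was constructed.
template <typename Make>
std::uint64_t sumPairs(int n, Make make) {
    std::uint64_t sum = 0;
    for (int i = 0; i < n; ++i) {
        auto w = make(i, i + 1);
        sum += static_cast<std::uint64_t>(w.a) + static_cast<std::uint64_t>(w.b);
    }
    return sum;
}

// Wall-clock seconds taken by one call to f.
template <typename F>
double secondsTaken(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling secondsTaken around sumPairs(100000000, makeStruct), and again with a lambda returning PairClass(a, b), gives one number per variant. Comparing a build without optimization against an optimized one (for example -O2 with GCC or Clang) would show how much of any gap comes from the constructor call itself; an optimizer may well reduce both loops to the same code.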

If my assertion is wrong, or if there are other factors which I
haven't taken into account, I gather someone around here will indulge

I would not have expected any difference at all. My guess is that your
compiler settings are wrong - remember to optimize the code.
If I guessed wrong, perhaps you are using a sub-standard/very old


