Re: Overhead differences between structs and classes
On 28 Aug., 23:58, Acinonyx <stian.l...@gmail.com> wrote:
On 27 Aug, 08:04, peter koch larsen <peter.koch.lar...@gmail.com>
wrote:
On 26 Aug., 17:50, Acinonyx <stian.l...@gmail.com> wrote:
I am currently working on a multithreaded framework not really related to
this query, except that during testing I happened to stumble upon
something I really hadn't given any thought earlier. The scenario was:
I had two threads communicating through a (CAS2 based) lock-free queue
[1]. 10^8 integers were passed from one to another.
The 8-byte words needed by CAS2 could be made either as structs or as
classes, both trivially implemented.
typedef struct double_word_t
{
int a;
int b;
} int8b;
struct double_word_t toInt8b(int a, int b)
{ return (struct double_word_t){a,b};}
or
class int8b
{
public:
int a;
int b;
int8b():a(0), b(0) {}
int8b(int ia, int ib):a(ia), b(ib) {}
};
Usage of these structures was, in short:
CAS2(... , ... , toInt8b(a,b));
and
CAS2(... , ... , int8b(a,b));
Now, what really makes me wonder, is that using the struct, copying
the hundred million integers takes roughly 33 secs, while using the
class version takes nearly 90 secs. I realize that this might not be
very surprising at all, and that the real overhead arises from me using
a constructor in one case and a simple cast in another. However, the
gain is significant, and I know that I for one will be considering
this more thoroughly next time.
If my assertion is wrong, or if there are other factors which I
haven't taken into account, I gather someone around here will indulge
me.
I would not have expected any difference at all. My guess is that your
compiler settings are wrong - remember to optimize the code.
If I guessed wrong, perhaps you are using a sub-standard/very old
compiler?
{ Edits: quoted signature & clc++m banner removed. Challenge: how to make people
aware that the banner is automatically appended to every article? And Note: this
is not strictly about C++ since the toInt8b function isn't valid C++. -mod }
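For anyone who wants to build the struct variant with a C++ compiler, here is a minimal sketch. The compound literal in the original toInt8b is a C99 feature, so this version uses plain aggregate initialization instead. The names double_word_t and toInt8b come from the post; the local variable and the inline qualifier are assumptions made for illustration.

```cpp
// Sketch: a C++-valid equivalent of the C99 compound-literal helper.
struct double_word_t
{
    int a;
    int b;
};

// Aggregate initialization of a plain struct is valid in every C++
// standard, unlike the C99 compound literal (struct double_word_t){a, b}.
inline double_word_t toInt8b(int a, int b)
{
    double_word_t result = { a, b };
    return result;
}
```

A call such as toInt8b(3, 4) then yields a value with a == 3 and b == 4, which can be passed where the original snippet passed the compound literal.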
True, optimization seems to do the trick. In fact, with -O2
optimization, the class version is faster, completing the transfer in
~25s vs. the struct's ~30s. Didn't see that one coming. What sort of
shortcuts are taken here to quadruple the speed? And why isn't the
struct given the same consideration?
The shortcut probably is to inline the constructor call - inlined not
in the sense of the C++ keyword, but in the sense that no function call
is performed. Also, the structure is analyzed as being memcpy-able.
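The "memcpy-able" point can be checked at compile time. The sketch below is only an illustration, assuming a C++11 compiler and a 4-byte int; toWord is a hypothetical helper that does not appear in the original code, and it merely shows how the pair could be viewed as the single 8-byte word a CAS2-style primitive operates on.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// The class version from the post: two ints plus two simple constructors.
class int8b
{
public:
    int a;
    int b;
    int8b() : a(0), b(0) {}
    int8b(int ia, int ib) : a(ia), b(ib) {}
};

// User-provided constructors like these do not affect trivial copyability;
// only the copy/move operations and the destructor matter.
static_assert(std::is_trivially_copyable<int8b>::value,
              "int8b may be copied with memcpy");

// Assumption: int is 4 bytes, so the pair fills exactly one 64-bit word.
static_assert(sizeof(int8b) == sizeof(std::uint64_t),
              "int8b is expected to occupy 8 bytes");

// Hypothetical helper: copy the pair into one 64-bit word.
inline std::uint64_t toWord(const int8b& v)
{
    std::uint64_t w = 0;
    std::memcpy(&w, &v, sizeof v);
    return w;
}
```

With optimization enabled, a compiler is free to build int8b(a, b) directly in place and treat the copy as a plain 8-byte move, which is consistent with the class version no longer being slower than the struct.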
In general, C++ performance lives and dies with a good optimizer. All
these abstractions which are so nice for us human beings need to be
optimized away in order for the code to perform optimally. A factor of
10 is not at all uncommon in the code I create. Not that it always
matters. In the old days (10 years ago), I only applied trivial
optimisation to my code, as the code had a tendency to break because
of buggy optimisations. Today I find that the compilers I know have
improved a lot in that area, and I have not found any bugs in that
part of the code-generation.
/Peter
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]