Le Chaud Lapin <>
Tue, 8 Dec 2009 18:41:41 CST
On Dec 7, 1:55 pm, Goran Pusic <> wrote:

On Dec 6, 1:07 am, Le Chaud Lapin <> wrote:
(I find it hard to believe that your concern is code size, it's about
speed, right? If so...)

I was going on the assumption that code A that is 15 times larger than
code B is generally slower than code B.

Are you sure about that overhead? I just made the smallest possible
memmove function I could think of (cld, init esi/edi/ecx, rep movsd).
I compared speed of that (inlined and noninlined), with stock

Speed-wise, I see only statistically irrelevant differences, inlined
version being (in some, not all, runs) less than 5% faster than the
other two. IOW, I think you have fallen into the trap of optimization
without measurement and this whole discussion is HORRIBLY irrelevant.

Having written quite a bit of x86 assembly in Ye Olden Days, I find it
hard to believe that the difference is "statistically irrelevant"
between a movs and the 200+ instructions in the full version of memcpy,
at least 95 of which get executed for a stock operator =. That's
excluding stack manipulation and function calls.
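
For anyone who wants to settle this by measurement rather than by
instruction counting, here is a minimal sketch of a timing harness. It
is not the test described above: simple_copy is a hypothetical, portable
stand-in for the hand-written rep movsd routine (a plain 32-bit word
loop, no inline assembly), and library_copy and time_copies are names
made up for illustration. An optimizing compiler may turn the loop into
a memcpy call or drop repeated copies altogether, so treat any numbers
as rough.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a "rep movsd" style copy: one 32-bit word per
// iteration. Assumes n is a multiple of 4 and the buffers do not overlap.
void simple_copy(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<std::uint32_t*>(dst);
    auto* s = static_cast<const std::uint32_t*>(src);
    for (std::size_t i = 0; i < n / 4; ++i) {
        d[i] = s[i];
    }
}

// Thin wrapper so the library routine has the same signature as simple_copy.
void library_copy(void* dst, const void* src, std::size_t n) {
    std::memcpy(dst, src, n);
}

// Copies src into dst `reps` times with the given routine and returns the
// elapsed wall-clock time in milliseconds. dst must be at least as large
// as src.
template <typename Copy>
double time_copies(Copy copy,
                   std::vector<std::uint32_t>& dst,
                   const std::vector<std::uint32_t>& src,
                   int reps) {
    const std::size_t bytes = src.size() * sizeof(std::uint32_t);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        copy(dst.data(), src.data(), bytes);
    }
    const auto stop = std::chrono::steady_clock::now();

    // Read the destination back through a volatile so the result is used.
    // This discourages, but does not guarantee against, dead-copy removal.
    volatile std::uint32_t sink = dst.empty() ? 0u : dst[0];
    (void)sink;

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_copies(simple_copy, dst, src, reps) and
time_copies(library_copy, dst, src, reps) from main with the same
buffers, for both small and large sizes, gives two figures to compare.
For very small sizes the timer overhead dominates, so reps needs to be
large.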

-Le Chaud Lapin-

