Computers and heat density
spc at conman.org
Tue Aug 15 14:56:13 CDT 2006
It was thus said that the Great Kevin Handy once stated:
> Chuck Guzis wrote:
> Never trust an optimization unless you see it for yourself.
> Looking at a real-world example of generated code, I wrote
[ some C code to test a loop vs. memset() ]
> This generated the following output using O3 level optimization
[ Assembly code from GCC ]
> So, in this case the loop operates inline, while the memset
> version pushes parameters onto the stack, then calls an
> external function.
> I suspect that the loop version, which generates inline code with
> hardwired values, is likely to be slightly faster than the call to
> memset, unless memset is doing something very odd.
Well ... how ... disappointing.
I played around with GCC 18.104.22.168 (yes, old) and the code produced wasn't
different at all, and I tried several variants of the memset() call
(declaring the variable local, reducing the amount set to 4 bytes,
changing the type, etc.).
Now, I could understand if the clear routine accepted a char pointer as a
parameter, since then there are alignment assumptions the compiler can't
make, but that wasn't the case (you were calling memset() directly on arrays,
which will have proper alignment). The only unusual thing I can think of
that memset() might do is that, on Pentium class machines, using the
floating point hardware to write to memory is faster than using the 32 bit
registers, but I could only see that being used for Pentium specific code
(and even then, only for certain generations of the Pentium).
-spc (Maybe they felt that adding compiler support for memset() wasn't
worth the time since most code uses explicit loops?)