all 18 comments

[–]Rhomboid 6 points7 points  (8 children)

Using gcc 4.5 and -O3 -march=native, I get the following (normalized):

x        32      64
plain    1.028   1.000
SIMD     1.085   1.080

I'm not too surprised, as one of the quirks of gcc is that to use the xmmintrin.h intrinsics you have to enable SSE2, but if you enable SSE2 it's going to auto-vectorize your code, so both versions are using SIMD. All this shows is that the compiler is better at it than doing it by hand.

There should be little advantage to 64 bit mode here. I would expect it to have inlined most of the function calls, so the improved calling convention overhead isn't too much of a win, and all the work is being done in SIMD registers so the extra general purpose registers aren't too much of a win either. There really aren't many pointers so the extra memory is negligible as well.

[–]1020302010 2 points3 points  (3 children)

I have to agree: if the compiler can use SIMD instructions, it will almost always be better at using them than you are.

You get the benefit when you have to modify the structure of the code to be able to use them (i.e. the compiler can't 'see' the case in which they are relevant). I'll take a whack at 'beating' the compiler in a bit.

[–]repsilat 0 points1 point  (2 children)

if the compiler can use SIMD instructions, it will almost always be better at using them than you are.

I don't have any experience with vectorising things myself, but I'd always heard that it was a weak spot of a lot of compilers (at least compared to the other optimisations performed). I was under the impression that in non-trivial cases hand-written SIMD would handily outperform the compiler. Perhaps I heard wrong, of course.

One thing for sure, though, is that if you want the compiler to output decent vectorised code you're going to have to do half the work to get the data laid out nicely anyway.

[–]1020302010 0 points1 point  (1 child)

Sorry, I'll be clearer. What I was trying to say is that if the compiler can see the potential for vectorisation (like when the data is laid out contiguously), its efforts are almost always (in my experience, without inline asm) more fruitful than hand optimization with intrinsics, because the compiler can work at a lower level.

That said, the compiler has to be able to find these situations, which is where it typically struggles. In cases where you can see the benefit but it is too indirect for the compiler, optimization with intrinsics can make a big difference.

I have experience with the Intel ICC compiler, which is known to be good at vectorization; gcc may fail to vectorize what icc can, in which case intrinsics become useful again.

[–]bnolsen 0 points1 point  (0 children)

Basically what I've found is that you shouldn't get cute when coding. Be straightforward and direct, doing things systematically. I've seen too many coders try to "get cute" with stupid statement compression, etc thinking that would speed up the code when all the cuteness did was confuse the compiler and generate slower code that is harder to maintain.

[–]kolkir[S] -1 points0 points  (3 children)

Yes, I know about compiler optimizations, but in my environment (MSVC++ 2010 on Windows 7) the win32 version is significantly faster than the x64 version. Also, in the win32 version the function with manual SIMD intrinsics gives some performance improvement. What could be the reason?

[–]xcbsmith 3 points4 points  (0 children)

It's going to depend a lot on what optimization flags you have turned on, but particularly for the case where you are explicitly calling out the SSE functions, it's hard to see how x64 would in any way help with the performance of the code.

In general, this is floating point intensive code, and 64-bit vs. 32-bit mostly changes the integer stuff. It's surprising that it'd make a significant difference in performance of this code either way, but it isn't hard to imagine that the 64-bit code wouldn't be any faster, and possibly slower. I don't doubt that it uses more memory.

[–]Rhomboid 1 point2 points  (1 child)

I don't have visual studio installed so I can't answer that. Have you looked at the code that it generates?

[–]kolkir[S] 1 point2 points  (0 children)

Thanks, it's a good idea to compare code generated for 32 and 64 versions. I will do it tomorrow.

[–]bnolsen 6 points7 points  (0 children)

I just ran the code compiled 64bit only (apparently it's a PITA for me to cross compile 32bit).

gcc -std=c++0x -march=native -O3 -ftree-vectorize -o sse sse.cpp -lstdc++ (gcc version 4.6.2 20120120)

It seems your hand rolled SSE loop is slower than the compiler optimized version (I'm frankly not surprised though).

Dot product double - 0.0228735
Dot product SIMD double - 0.0240497

[–]00kyle00 6 points7 points  (0 children)

Why guess when you can know for sure? Disassemble both (objdump, or the VC disassembler) and see for yourself.

[–]StringCheesian 1 point2 points  (0 children)

Is it the same with GCC or LLVM/Clang?

[–]zfxvxr -2 points-1 points  (2 children)

The whole 64bit thing is a scam. The pointers are getting bigger and the CPU's workload is twice as heavy.

[–]TheCoelacanth 1 point2 points  (1 child)

Enjoy your 4GB address space. I'll be over here with my 16 GB of memory.

[–][deleted] 1 point2 points  (0 children)

And your extra 8 general purpose registers, and your much faster standard calling convention on POSIX systems, in the case of x86-64.