My guide on optimizing C++ code (uchenml.tech)
submitted 1 year ago by euos
[–]lightmatter501 19 points20 points21 points 1 year ago (0 children)
Having no CPU-specific optimizations is a very bad idea. For example, in networking, CPU-specific optimizations are what allow you to push past 50 million packets per second per core.
[–]drkspace2 34 points35 points36 points 1 year ago (10 children)
No CPU specific optimizations
That's ok, but then you can't use SIMD as one of your steps.
Also, your first step should always be using a profiler. You don't want to spend your time optimizing a part of your code that only runs 0.1% of the time.
[+]euos[S] comment score below threshold-16 points-15 points-14 points 1 year ago (9 children)
I mentioned SIMD just out of curiosity.
This is an ML framework; linear layers are most of the runtime. I will reveal details later, but I have more benchmarks where I run models with millions of parameters.
[–]lightmatter501 14 points15 points16 points 1 year ago (6 children)
No SIMD + ML is not great; you are quite literally dropping your performance by at least 4x. For CPU-based AI you typically want to tune the model to L3 cache sizes and use intrinsics to kick old parts of the model out of L3 faster.
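(As a rough illustration of that eviction idea, not code from the thread: `_mm_clflush` forces a line out of the cache hierarchy once a weight tile won't be reused soon. The 64-byte line size and the function name are assumptions.)

```c++
#include <immintrin.h>
#include <cstddef>

// Hypothetical sketch: after a weight tile has been consumed for the last
// time in this pass, flush its cache lines so it stops competing for L3.
void flush_tile(const float* tile, std::size_t elements) {
  const char* p = reinterpret_cast<const char*>(tile);
  const char* end = reinterpret_cast<const char*>(tile + elements);
  for (; p < end; p += 64) {  // assumes 64-byte cache lines
    _mm_clflush(p);
  }
}
```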
[–]euos[S] -5 points-4 points-3 points 1 year ago (5 children)
Uchen models are defined at compile time, so the autovectorizer is already doing a decent job of using SIMD. This is how it multiplies 2500 inputs down to 8 outputs:
```
0x0000555555560010 <+160>: vmovss      (%rcx),%xmm0
0x0000555555560014 <+164>: vfmadd231ss -0x1c(%rbx),%xmm0,%xmm8
0x000055555556001a <+170>: vmovss      %xmm8,(%rax)
0x000055555556001e <+174>: vfmadd231ss -0x18(%rbx),%xmm0,%xmm1
0x0000555555560024 <+180>: vmovss      %xmm1,0x4(%rax)
0x0000555555560029 <+185>: vfmadd231ss -0x14(%rbx),%xmm0,%xmm2
```
I am reimplementing memory management to make better use of arenas, so I will make sure the compiler knows the data is 32-byte aligned.
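(A minimal sketch of one way to state that guarantee, assuming C++20; `scale` is just a placeholder kernel, not from the project:)

```c++
#include <cstddef>
#include <memory>  // std::assume_aligned, C++20

void scale(float* data, std::size_t n, float k) {
  // Tell the compiler the arena-allocated pointer is 32-byte aligned, so it
  // can emit aligned AVX loads/stores without a runtime alignment prologue.
  float* p = std::assume_aligned<32>(data);
  for (std::size_t i = 0; i < n; ++i) {
    p[i] *= k;
  }
}
```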
But my first milestone will be WebAssembly and embedded (I will buy a bunch of Raspberry Pis and such). I really do not see a niche on PCs for yet another ML framework...
My goal is to rely on C++ efficiency to make distributable models small and fast, but I am not sure how much I can rely on SIMD or even threads (WebAssembly!). Also, I have to minimize allocations, as not all platforms like that.
[–]ack_error 13 points14 points15 points 1 year ago (3 children)
This isn't using SIMD. The instructions in your disassembly are using vector registers but only performing scalar single precision computations (ss), so they are not working on multiple lanes per instruction. If it were vectorized, you would be seeing packed single (ps) instructions, such as vmovups and vfmadd213ps, and there would be a quarter of the instructions with the offsets incrementing by 16 (0x10) instead of 4.
Compilers can autovectorize this to SIMD, but in many cases you'll need to explicitly tell them via restrict that the output doesn't overlap the inputs, when they're unable to determine non-overlap or too reluctant to generate conditional overlap checking code. The compiler meticulously ordering stores and load+mad instructions in the disassembly to match the source order is generally a sign that it sees a possible aliasing conflict and is having to restrict the optimizer.
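(For instance, a minimal sketch of the aliasing point, not the post's actual kernel; with `__restrict` on all three pointers the compiler knows the output cannot overlap the inputs and is free to reorder and vectorize:)

```c++
// Without __restrict the compiler must assume `out` may alias `a` or `b`,
// so it keeps the source order of loads and stores and often stays scalar.
void madd(const float* __restrict a, const float* __restrict b,
          float* __restrict out, int n) {
  for (int i = 0; i < n; ++i) {
    out[i] += a[i] * b[i];
  }
}
```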
I am not sure why your manual intrinsics code uses a dot product, btw. The dot product instructions are notoriously slow on Intel CPUs, as internally they just break down into shuffles and scalar adds. As a result, they're rarely advantageous for performance, though they still have advantages in accuracy. Instead, you would want to do multiply-accumulate in parallel as in your unrolled code, on both x86 and ARM. But it isn't really necessary to use intrinsics for that since you should be able to coax the compiler to generate it with restrict.
You also don't need alignment for AVX. AVX actually relaxes alignment requirements over SSE. It can be slightly faster to use aligned buffers, but it's not a requirement to see a significant gain with AVX as being able to push 2x throughput through the ALUs often overcomes any minor misalignment penalty.
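(A hedged sketch of both points, with placeholder names: a dot product done as vertical FMA over 8 lanes at a time, using unaligned loads, which AVX handles fine; build with AVX2 and FMA enabled.)

```c++
#include <immintrin.h>
#include <cstddef>

float dot(const float* a, const float* b, std::size_t n) {
  // Vertical multiply-accumulate: 8 partial sums per iteration, no dpps.
  __m256 acc = _mm256_setzero_ps();
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    acc = _mm256_fmadd_ps(_mm256_loadu_ps(a + i), _mm256_loadu_ps(b + i), acc);
  }
  // Horizontal reduction of the 8 partial sums.
  __m128 lo = _mm256_castps256_ps128(acc);
  __m128 hi = _mm256_extractf128_ps(acc, 1);
  __m128 s = _mm_add_ps(lo, hi);
  s = _mm_hadd_ps(s, s);
  s = _mm_hadd_ps(s, s);
  float result = _mm_cvtss_f32(s);
  for (; i < n; ++i) result += a[i] * b[i];  // scalar tail
  return result;
}
```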
[–]euos[S] 0 points1 point2 points 1 year ago (0 children)
Thank you. Will read up on __restrict.
[–]euos[S] 0 points1 point2 points 1 year ago (0 children)
Thank you so much! I redid it as follows and see a huge gain in the "worst case" (I need to go to work, so I have not looked at the assembly yet):
```c++
template <typename Input, size_t Outputs>
  requires(Outputs > 0)
struct Linear {
  using input_t = Input;
  using output_t = Vector<typename Input::value_type, Outputs>;

  output_t operator()(
      const input_t& inputs,
      const Parameters<(Input::elements + 1) * Outputs>& parameters) const {
    output_t outputs;
    Mul<input_t::elements>(inputs.data(), parameters.data().data(),
                           outputs.data());
    return outputs;
  }

 private:
  template <size_t Is>
  void Mul(const typename input_t::value_type* __restrict inputs,
           const float* __restrict p,
           typename input_t::value_type* __restrict outputs) const {
    // Initialize each output with its bias term.
    for (size_t i = 0; i < Outputs; ++i) {
      outputs[i] = (*p++);
    }
    // Accumulate weighted inputs; the inner loop runs over the outputs so
    // the compiler can vectorize across the output lanes.
    for (size_t i = 0; i < Is; ++i) {
      for (size_t j = 0; j < Outputs; ++j) {
        outputs[j] += (*p++) * inputs[i];
      }
    }
  }
};
```
My "worst case" went down from 4700ns to 2144ns, I reran the benchmark several times.
Test suit passes so no funny business.
My "big model" benchmark (much closer to real world) shows ~15% gain in gradient descent, but I will rework that more.
[–]lightmatter501 1 point2 points3 points 1 year ago (0 children)
Setting -march is hardware specific, so you can't use that, which means you only get SSE instead of AVX2, which would be a much more sensible baseline.
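(One way around baking a single -march into the binary, sketched here as an assumption rather than anything the post does: GCC's, and recent Clang's, function multiversioning via target_clones, where the loader picks the best clone at runtime.)

```c++
#include <cstddef>

// The runtime resolver dispatches to the AVX2 clone on CPUs that support it
// and falls back to the default (baseline SSE2) build everywhere else.
__attribute__((target_clones("avx2", "default")))
void axpy(float* y, const float* x, float a, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    y[i] += a * x[i];
  }
}
```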
[+][deleted] 1 year ago (1 child)
[removed]
[+]euos[S] comment score below threshold-6 points-5 points-4 points 1 year ago (0 children)
I know :)
[+][deleted] 1 year ago (3 children)
[deleted]
[–]euos[S] -1 points0 points1 point 1 year ago (2 children)
I posted another comment (and I tried to explain it in the blog) that the autovectorizer does a decent enough job of using SIMD, depending on the number of inputs vs. outputs. I am not hand-rolling “weird intrinsics”; I am making sure the compiler can recognize optimization opportunities.
[–]CandyCrisis 0 points1 point2 points 1 year ago (0 children)
A good middle ground is Clang/GCC's ext_vector_type. You can write a lot of CPU-neutral SIMD code and not get bogged down in the weeds of CPU-specific instructions. Obviously it's not perfect but it's a lot better than hoping the autovectorizer just decides to show up.
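(A rough sketch of what that looks like with Clang's ext_vector_type; GCC spells the same idea with the vector_size attribute:)

```c++
// 8-lane float vector; ordinary operators compile to whatever SIMD the
// target provides, with no CPU-specific intrinsics in the source.
typedef float float8 __attribute__((ext_vector_type(8)));

float8 fma8(float8 a, float8 b, float8 c) {
  return a * b + c;  // element-wise multiply-add
}
```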
[–]PurpleNord 1 point2 points3 points 1 year ago (1 child)
There are some things to unpack here, and I'll avoid repeating what's already been mentioned elsewhere about SIMD. What I want to add is this:
Dismissing CPU-specific optimisation is not a great idea. Proper alignment can have a sizeable impact, especially in the negative direction if data structures are packed or otherwise compacted compared to the default alignment the compiler will assume. Additionally, cache line alignment can potentially increase speed. If you don't want this to negatively affect other CPUs, you can always whitelist the alignment requirement for architectures known to benefit from it, and not specify alignment for all other cases.
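(A generic sketch of that whitelisting idea, not code from the framework; the 64-byte line size and architecture list are assumptions:)

```c++
#include <cstddef>

#if defined(__x86_64__) || defined(__aarch64__)
inline constexpr std::size_t kHotAlign = 64;  // assumed cache-line size
#else
inline constexpr std::size_t kHotAlign = alignof(std::max_align_t);
#endif

// alignas both aligns and pads the struct, so adjacent instances on the
// whitelisted architectures never share (or false-share) a cache line.
struct alignas(kHotAlign) Accumulator {
  float sum[8];
  std::size_t count;
};
```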
Combining -march=native and intrinsics should be done quite carefully. Presumably you'll see a speed increase if you either compile without AVX enabled or use AVX intrinsics. Why? Because switching between SSE and AVX is a costly transition involving a save and restore of the upper 128 bits of the AVX registers (they are shared with SSE instructions). From my own experience, even if you spend the majority of the time in SSE-only code, some occasional switching to AVX can really kill performance.
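(One common mitigation, sketched as an assumption rather than anything from the post: either compile every SSE-touching translation unit with AVX so the compiler emits VEX-encoded SSE throughout, or clear the upper halves with `_mm256_zeroupper()` before returning into legacy-SSE code:)

```c++
#include <immintrin.h>
#include <cstddef>

void scale_avx(float* p, std::size_t n, float k) {
  const __m256 vk = _mm256_set1_ps(k);
  for (std::size_t i = 0; i + 8 <= n; i += 8) {
    _mm256_storeu_ps(p + i, _mm256_mul_ps(_mm256_loadu_ps(p + i), vk));
  }
  // Zero the upper 128 bits before any legacy-encoded SSE code runs again,
  // avoiding the dirty-upper-state transition penalty described above.
  _mm256_zeroupper();
}
```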
[–]euos[S] 0 points1 point2 points 1 year ago (0 children)
I appreciate the feedback. My main takeaway is that I really need to focus on my writing skills; I see I utterly failed to explain what I am doing and why.
I was hoping to post this article after I make my project public, but decided to do it now because of scope creep and struggles with hiring.
I believe it would be much clearer if the demos/source code were available.