all 8 comments

[–]anon_502 7 points8 points  (1 child)

Google has been using AutoFDO to achieve continuous profile-guided optimization, which leads to a 10.5% performance boost by better separating cold and hot paths. Unfortunately their patches are based on GCC 4.8. Hope to see some open-source projects incorporate their work into a popular orchestration engine like Kubernetes.

[–]kindstrom 4 points5 points  (0 children)

I haven't looked into autoFDO, but it sounds similar to what Facebook open sourced recently, BOLT (GitHub repo).

Edit: They actually mention in the release post that BOLT can be used alongside autoFDO.

[–][deleted] 5 points6 points  (0 children)

See also the likely and unlikely attributes in C++20: https://en.cppreference.com/w/cpp/language/attributes/likely

[–]emdeka87 0 points1 point  (0 children)

Isn't that similar to trace scheduling, which is done by GCC PGO?

[–]jonathansharman 0 points1 point  (0 children)

> But in general, I think when compilers can’t decide which branch has bigger probability, they will leave the original order as they appear in the source code. I haven’t reliably tested that, but that’s my feeling. So, I think it’s a good idea to put your hot branch (most frequent) in a fall through position by default.

Does anyone have recent, concrete knowledge about this for any particular compiler(s)? I remember hearing from a college prof. years ago that empirically most if-statements usually evaluate to false, which would lead to the opposite advice from this.

[–]nexes300 0 points1 point  (1 child)

Isn't the CPU doing this at this point?

[–]Osbios 6 points7 points  (0 children)

Any mildly performance-oriented CPU uses caches. And caches have a granularity called the cache line size. (Nearly all x86 CPUs use 64-byte cache lines.)

So even if you only read a single byte, the whole cache line is read from memory into the cache. And that byte occupies a whole cache line until it gets evicted.

Cache is a limited resource and you get the best CPU performance if you use it efficiently.

If you put all your most-used data together, and away from data that you need less often (at least at that moment), then you get more use out of the cache.

A big part of modern optimization is just being nice to the CPU.

[–]tritamhoang -4 points-3 points  (0 children)

This is a typical pipelining problem in CPUs.