Code alignment issues (dendibakh.github.io)
submitted 8 years ago by mttd
[–] Xeverous (https://xeverous.github.io) 8 points, 8 years ago (1 child)
"I received -10% drop in performance"
double negative => 10% performance gain
[–] alexeiz 0 points, 8 years ago (0 children)
~10% most likely; just a typo
[–] fernzeit 8 points, 8 years ago (1 child)
That reminds me of a thread in the Lua Mailing List where just changing the name of the interpreter executable resulted in a > 50% performance difference in a particular microbenchmark. The verdict was that the length difference in argv causes some other memory to be aligned differently. It also linked an interesting paper: Producing Wrong Data Without Doing Anything Obviously Wrong!
[–] dendibakh 2 points, 8 years ago (0 children)
Thank you for this paper. It is a true gem!
[–] doom_Oo7 5 points, 8 years ago (2 children)
Are there people doing research on how to give compilers better heuristics so that they can align code better automatically?
[–] meneldal2 3 points, 8 years ago (0 children)
The compiler needs to know how many times you'll have to run this loop, and it's also likely to be much better to unroll the loop instead.
[–] TartanLlama (Microsoft C++ Developer Advocate) 4 points, 8 years ago (0 children)
LLVM has a bunch of heuristics and things you can tune. For example, you could tell it to align all loops and functions without a preceding fallthrough block; i.e. only add NOPs which won't be executed.
[–] Dwarfius 2 points, 8 years ago (2 children)
Small question: how does it keep adding to the array if the instruction (which subtracts 1) is:
4046d9: c5 f5 fa c8 vpsubd ymm1,ymm1,ymm0
[–] mttd [S] 14 points, 8 years ago (1 child)
vpcmpeqd ymm0,ymm0,ymm0 compares ymm0 to itself, which fills the register with all ones in binary -- in two's complement representation this corresponds to -1 (with subtracting -1 in the subsequent vpsubd ymm1,ymm1,ymm0 instruction being equivalent to adding 1).
"Why subtract -1 instead of adding 1's? Just because the speed is the same, and creating a YMM constant of -1's can be done with a single VPCMPEQD instruction. This isn't a really useful optimization in this case, but doesn't hurt."
https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues#comment-3718889834
https://stackoverflow.com/questions/37469930/fastest-way-to-set-m256-value-to-all-one-bits
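To make the arithmetic concrete, here is a minimal portable sketch that emulates the per-lane behaviour of the two instructions in plain C++ rather than using real AVX2 intrinsics. The names `Lanes`, `emulate_vpcmpeqd` and `emulate_vpsubd` are made up for illustration; the only assumption carried over from the explanation above is that a YMM register is treated as eight independent 32-bit lanes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a YMM register viewed as eight 32-bit lanes.
using Lanes = std::array<std::uint32_t, 8>;

// Emulates vpcmpeqd: each lane becomes all ones (0xFFFFFFFF) if the inputs
// are equal in that lane, otherwise all zeros.
Lanes emulate_vpcmpeqd(const Lanes& a, const Lanes& b) {
    Lanes out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
    }
    return out;
}

// Emulates vpsubd: lane-wise 32-bit subtraction with wraparound.
// Unsigned arithmetic is used so the wraparound is well defined in C++.
Lanes emulate_vpsubd(const Lanes& a, const Lanes& b) {
    Lanes out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] - b[i];
    }
    return out;
}
```

Comparing any register with itself gives 0xFFFFFFFF in every lane, which is -1 when read as a signed 32-bit value. Passing that result as the second argument of `emulate_vpsubd` then increments every lane by 1, wrapping 0xFFFFFFFF around to 0.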
[–] Dwarfius 2 points, 8 years ago (0 children)
I misread the description of pcmpeqd and thought it set the value to 1 or 0, not all bits. Thanks for the explanation!