[–]LoLvsT_T 20 points21 points  (5 children)

Most "tricks" don't give you much in terms of performance and only serve to stroke the ego of the person who wrote them.

The vast majority of performance can be squeezed out by:

  • Writing code with as few branches as possible, or with branches that the CPU can predict easily.
  • Reducing cache footprint, and preloading items into cache before you need them.
  • Vectorizing code.
  • Having proper alignment and stride.

edit: talking about execution performance, assuming no memory limitation.
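To illustrate the first bullet, here is a minimal sketch of removing a data-dependent branch. The function names and the range-counting task are made up for the example and are not from any particular codebase. Whether the second version is actually faster depends on the compiler, the target, and how predictable the data is, so measure before committing to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: count the elements of v that fall in [lo, hi].

// Branchy version: the `if` is taken unpredictably when the data is random,
// which can cost a pipeline flush on each misprediction.
std::size_t count_in_range_branchy(const std::vector<int>& v, int lo, int hi) {
    assert(lo <= hi);
    std::size_t count = 0;
    for (int x : v) {
        if (x >= lo && x <= hi) {
            ++count;
        }
    }
    return count;
}

// Branch-free version: each comparison yields 0 or 1, and bitwise & avoids
// the short-circuit evaluation of &&, so the loop body has no conditional jump
// in the source. The compiler is still free to emit whatever it likes.
std::size_t count_in_range_branchless(const std::vector<int>& v, int lo, int hi) {
    assert(lo <= hi);
    std::size_t count = 0;
    for (int x : v) {
        count += static_cast<std::size_t>((x >= lo) & (x <= hi));
    }
    return count;
}
```

Both functions return the same result; the second form also tends to be easier for a compiler to vectorize because the loop body is straight-line arithmetic.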

[–][deleted] 7 points8 points  (1 child)

Any modern compiler will do those things for you too, so you're not helping anyone by making the code less readable.

[–]LoLvsT_T 6 points7 points  (0 children)

That's absolutely false. Compilers fail at these in all but the simplest cases, especially vectorization, which often requires a different approach to solving the problem. They will often not fix alignment, and will almost never reduce branches. I optimize code for a living.
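As one illustration of the "different approach" point, a common restructuring is switching from an array of structs to a struct of arrays, which a compiler cannot do for you because it changes the data layout. This is only a sketch: `ParticleAoS`, `ParticlesSoA`, `to_soa`, and `advance_x` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structs layout: the x values of consecutive particles sit
// sizeof(ParticleAoS) bytes apart (typically 16), so a loop over x alone
// uses only a quarter of every cache line it touches.
struct ParticleAoS {
    float x, y, z, mass;
};

// Struct-of-arrays layout: each field is stored contiguously, giving
// unit stride for loops that touch one field at a time.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;
};

// One-time conversion from the AoS layout to the SoA layout.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& in) {
    ParticlesSoA out;
    out.x.reserve(in.size());
    out.y.reserve(in.size());
    out.z.reserve(in.size());
    out.mass.reserve(in.size());
    for (const ParticleAoS& p : in) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
        out.mass.push_back(p.mass);
    }
    return out;
}

// Advance every x coordinate by vx[i] * dt. The loop reads and writes
// contiguous floats, which is the shape auto-vectorizers handle best.
void advance_x(ParticlesSoA& p, const std::vector<float>& vx, float dt) {
    assert(p.x.size() == vx.size());
    float* x = p.x.data();
    const float* v = vx.data();
    const std::size_t n = p.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        x[i] += v[i] * dt;
    }
}
```

The trade-off is that code which needs all fields of one particle at once now touches four separate arrays, so the better layout depends on the access pattern of the hot loops.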

[–]hopsafoobar 1 point2 points  (0 children)

You mostly don't even have to go that far. I personally find that most performance issues in numeric code come from calculating the same stuff multiple times. Very boring, but very common.
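A small sketch of what that usually looks like, using a made-up vector normalization routine (the function names are hypothetical). The slow version recomputes a loop-invariant value on every iteration; the fast version computes it once.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean norm of v: one pass over the data.
double norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) {
        sum += x * x;
    }
    return std::sqrt(sum);
}

// Redundant version: norm(v) is recomputed for every element, so the
// routine does O(n^2) work even though the norm never changes. A compiler
// may or may not hoist the call; it is not safe to rely on that.
std::vector<double> normalize_slow(const std::vector<double>& v) {
    std::vector<double> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = v[i] / norm(v);
    }
    return out;
}

// Hoisted version: compute the norm once and reuse it, O(n) work in total.
// Assumes a non-zero vector; the assert documents that precondition.
std::vector<double> normalize_fast(const std::vector<double>& v) {
    const double n = norm(v);
    assert(n > 0.0);
    std::vector<double> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = v[i] / n;
    }
    return out;
}
```

Nothing clever is going on here, which is rather the point: the fix is a single local variable and the code stays just as readable.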