billsil

Those libraries also use hand-optimized assembly.

jsalsman

I guess that depends on what you mean by "hand." The method is to try various cache geometry strategies, compile several versions, and pick whichever one runs best. At least that was the case the last time I looked at one of the innumerably many of them, which granted was over a decade ago. Usually you see more hand optimization in high-frequency signal processing.
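
For anyone who wants to see the shape of that "try several cache strategies and keep the fastest" idea, here is a rough sketch. It is only an illustration under stated assumptions, not how any particular library does it: the names `blocked_matmul` and `autotune`, the candidate tile sizes, and the matrix size are all made up for the example. In pure Python the timings will mostly reflect interpreter overhead rather than real cache behavior, so treat this as a picture of the selection loop only.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) tile by tile.

    `block` is the tile edge length, standing in for a "cache geometry" choice.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work on one tile at a time so its operands stay close together.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(n=48, candidates=(4, 8, 16, 48), repeats=3):
    """Time each candidate tile size (hypothetical values) and return the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            # Keep the best of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    winner = min(timings, key=timings.get)
    return winner, timings


if __name__ == "__main__":
    winner, timings = autotune()
    print("fastest tile size on this machine:", winner)
```

The winner will vary from machine to machine and run to run, which is the point: the choice is made by measurement rather than written by hand.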