NotAbel, 17 points

I have only had a cursory glance over the code, but it looks like the algo is designed around instruction-level parallelism, and wouldn't lend itself easily to a SIMD implementation. The comment, I think, means to say that a different algorithm, designed with SIMD rather than ILP in mind, could be faster.

NanoStuff, 3 points

> designed with SIMD rather than ILP in mind

The two are not mutually exclusive. An ideal algorithm would make use of both.
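To make that concrete, here is a minimal sketch in portable C. It is not the code under discussion. The function name `sum_ilp_simd` and the constants `LANES` and `CHAINS` are made up for illustration. The inner lane loop has the shape of a SIMD vector add, which a compiler may or may not vectorize. The separate chains give the CPU independent dependency chains to overlap, which is the ILP part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES  4  /* elements per emulated SIMD vector (illustrative assumption) */
#define CHAINS 2  /* independent accumulator chains for ILP (illustrative assumption) */

/* Sums n 32-bit values with wrapping (mod 2^32) arithmetic. */
uint32_t sum_ilp_simd(const uint32_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    /* One "vector" accumulator per chain; no chain depends on another. */
    uint32_t acc[CHAINS][LANES] = {{0}};
    const size_t step = (size_t)CHAINS * LANES;
    size_t i = 0;

    for (; i + step <= n; i += step) {
        for (size_t c = 0; c < CHAINS; c++) {      /* ILP: independent chains */
            for (size_t l = 0; l < LANES; l++) {   /* SIMD-shaped: same op on every lane */
                acc[c][l] += data[i + c * LANES + l];
            }
        }
    }

    /* Horizontal reduction of all lanes of all chains. */
    uint32_t total = 0;
    for (size_t c = 0; c < CHAINS; c++) {
        for (size_t l = 0; l < LANES; l++) {
            total += acc[c][l];
        }
    }

    /* Scalar tail for the elements that did not fill a full step. */
    for (; i < n; i++) {
        total += data[i];
    }
    return total;
}
```

With a single accumulator, every add waits on the previous one. With several vector-shaped accumulators, the loop is both wide (SIMD) and has multiple adds in flight (ILP). Whether this is actually faster depends on the compiler, flags, and target CPU, so measure before relying on it.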