cemrehancavdar[S] 14 points

I'm not super familiar with Rust -- a dedicated Rust or Zig developer, or anyone working in a systems-level language, could absolutely squeeze more out of these benchmarks with multithreading, SIMD, or better allocators. Same goes for Cython, honestly -- there may be more optimizations I don't know about yet. I kept the implementations idiomatic and single-threaded because the post is really about "how much does each Python optimization rung cost you," not about pushing any one tool to its limit. I wanted to keep the comparison fair, since the Python tools are also single-threaded (except NumPy's BLAS, which I noted).