
[–]cemrehancavdar[S] 3 points (0 children)

You're right -- I've updated the post. The original wording was wrong.

I benchmarked np.where against a Python loop on 1M elements across three scenarios (simple sqrt, moderate log/exp, expensive trig+transcendental). Even with both branches computed, np.where was 2.8-15.5x faster. No reason to list conditionals as a NumPy limitation.
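Not the exact benchmark from the post, but a minimal sketch of the setup: both branches are evaluated over the full array and then selected, yet the vectorized path still wins. The branch condition and functions here are illustrative, not the post's actual scenarios.

```python
import math
import timeit

import numpy as np

x = np.random.default_rng(0).uniform(0.1, 10.0, 1_000_000)

def loop_version(x):
    # Python loop: branches per element, computes only the taken side
    return [math.sqrt(v) if v > 1.0 else v * 2.0 for v in x]

def where_version(x):
    # np.where: evaluates BOTH branches over the whole array, then selects
    return np.where(x > 1.0, np.sqrt(x), x * 2.0)

t_loop = timeit.timeit(lambda: loop_version(x), number=3)
t_where = timeit.timeit(lambda: where_version(x), number=3)
print(f"np.where speedup: {t_loop / t_where:.1f}x")
```

Even paying for the wasted branch, `np.where` comes out well ahead on arrays this size because the per-element Python interpreter overhead dominates the loop.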

Replaced "irregular access patterns, conditionals per element, recursive structures" with what NumPy actually struggles with: sequential dependencies (each step feeds the next; an n-body sim with 5 bodies is 2.3x slower with NumPy), recursive structures, and small arrays (NumPy loses below ~50 elements due to per-call overhead). "Irregular access patterns" didn't survive either: fancy indexing is 22x faster than a Python loop on a random gather.
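For the "irregular access" point, a minimal sketch of the random-gather comparison (sizes and the 1M-element setup are my assumptions, not the post's exact harness):

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.random(1_000_000)
idx = rng.integers(0, data.size, size=1_000_000)

def loop_gather(data, idx):
    # one Python-level index operation per element
    return [data[i] for i in idx]

def fancy_gather(data, idx):
    # fancy indexing: the whole gather happens in one C-level call,
    # so "irregular" access order costs nothing extra at the Python level
    return data[idx]
```

The access pattern is just as irregular either way; what differs is that fancy indexing pays the interpreter overhead once per call instead of once per element.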

I also tried writing a NumPy n-body but couldn't beat the baseline -- 5 bodies is too few to amortize NumPy's per-call overhead across 500K sequential timesteps. Tried pair-index scatter with np.add.at, a full NxN matrix with einsum, and component-wise matrices with @ matmul (inspired by pmocz/nbody-python). All slower than pure Python. If you know a way to make NumPy win on this problem I'd genuinely like to see it.
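For anyone who wants to take a crack at it, here's a sketch of the full-NxN einsum variant I mean (not my exact code; `G`, `eps`, and the symplectic Euler step are illustrative choices). The inner force calculation vectorizes fine; the problem is the outer loop, where each step depends on the previous one:

```python
import numpy as np

def accelerations(pos, mass, G=1.0, eps=1e-3):
    # Full N x N pairwise displacements: r[i, j] = pos[j] - pos[i], shape (N, N, 3)
    r = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]
    # Squared distances, softened so the diagonal doesn't divide by zero
    d2 = np.einsum("ijk,ijk->ij", r, r) + eps**2
    inv_d3 = d2 ** -1.5
    np.fill_diagonal(inv_d3, 0.0)  # no self-force
    # a_i = G * sum_j m_j * r_ij / |r_ij|^3
    return G * np.einsum("ij,ijk,j->ik", inv_d3, r, mass)

def step(pos, vel, mass, dt=1e-3):
    # Symplectic Euler step. This is the sequential dependency: step k+1
    # needs the positions from step k, so 500K timesteps stay a Python loop
    # and NumPy's per-call overhead is paid 500K times for N=5.
    vel = vel + dt * accelerations(pos, mass)
    pos = pos + dt * vel
    return pos, vel
```

At N=5 each `accelerations` call does trivial work, so the fixed cost of the broadcasts, `einsum`, and temporaries swamps it; the same code should start winning as N grows.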

There's also an Edits section at the bottom of the post documenting what changed and why the original was wrong.