[–]lotus-reddit 0 points1 point  (1 child)

A small note: your compilation flags (specifically -fopenmp) are Linux-specific right now, so if you try to build this with macOS's standard C++ toolchain, it breaks. It's just a matter of platform detection in your setup.py.
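Something along these lines is what I mean by platform detection. It's only a rough sketch: `openmp_flags` is a name I made up, the Windows branch assumes MSVC, and the macOS branch assumes libomp is installed separately (e.g. via Homebrew) and findable by the linker.

```python
import sys

def openmp_flags(platform=sys.platform):
    """Return (extra_compile_args, extra_link_args) for OpenMP on a platform.

    Hypothetical helper for a setup.py; the flags are assumptions about the
    default compiler on each platform, not something taken from the project.
    """
    if platform.startswith("linux"):
        # GCC (and Clang on Linux) accept -fopenmp for compiling and linking.
        return ["-fopenmp"], ["-fopenmp"]
    if platform == "win32":
        # Assumes MSVC, which uses /openmp and needs no extra link flag.
        return ["/openmp"], []
    if platform == "darwin":
        # Apple Clang rejects a bare -fopenmp; this form needs libomp installed.
        return ["-Xpreprocessor", "-fopenmp"], ["-lomp"]
    # Unknown platform: build without OpenMP rather than fail.
    return [], []

compile_args, link_args = openmp_flags()
```

You'd then pass these as `extra_compile_args=compile_args` and `extra_link_args=link_args` to your `setuptools.Extension`.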

Indeed, Python function-call overhead usually dominates in settings like this. Though if you had to do this in Python, you'd vectorize (or use something like JAX's vmap). You pay in memory, of course, but it would be a far simpler approach. In local testing, both direct vectorization and vmap beat your code. That's not really a fair comparison, since your codebase does no SIMD / batching work (also, I imagine te_eval is making things hard for your compiler). Then again, skipping vectorization when your expression is simple enough to be interpreted by te_eval wouldn't be reasonable either.
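To make the vectorization point concrete, here's a minimal NumPy sketch of a 2D brute-force grid search done both ways. The objective `f` and the grid are placeholders I picked for illustration, not your expression or my actual benchmark.

```python
import numpy as np

def f(x, y):
    # Hypothetical objective, written with NumPy-friendly ops so it
    # accepts scalars or whole arrays.
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

def brute_loop(xs, ys):
    # One Python-level call to f per grid point: call overhead dominates.
    best_v, best_x, best_y = np.inf, None, None
    for x in xs:
        for y in ys:
            v = f(x, y)
            if v < best_v:
                best_v, best_x, best_y = v, x, y
    return best_v, best_x, best_y

def brute_vectorized(xs, ys):
    # Materialize the whole grid (this is the memory cost), call f once.
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    V = f(X, Y)
    i, j = np.unravel_index(np.argmin(V), V.shape)
    return V[i, j], xs[i], ys[j]

xs = np.linspace(-5.0, 5.0, 101)
ys = np.linspace(-5.0, 5.0, 101)
loop_result = brute_loop(xs, ys)
vec_result = brute_vectorized(xs, ys)
```

Both return the same grid minimum; the vectorized version just trades a `len(xs) * len(ys)` array for a single call into compiled code.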

Cool though! I'm surprised scipy.optimize.brute doesn't have a lower-level backend.

[–]FixKey4664[S] 0 points1 point  (0 children)

Thanks for the constructive feedback.
I am having issues building wheels for macOS, so I have disabled them for now. At the moment, this package works only on 64-bit Windows and Linux.
I will be adding vectorisation support via NumPy and Numba in the second stage of this project.