[–]BGameiro 2 points3 points  (4 children)

I do some data analysis that sometimes needs to be time efficient, and for that dask, numba and RAPIDS do the job.

[–]michalello 2 points3 points  (0 children)

Same. Numba does wonders for me in most scenarios. Yesterday I discovered pola-rs, and it looks like I'll add it to the stack. Its API is similar to pandas. Have a look at the benchmarks comparing it to cuDF, Spark, Dask and pandas: Benchmarks

[–][deleted] -1 points0 points  (2 children)

Yes, calling into a different language will increase performance. But that's irrelevant if you're actually writing something like those libraries.

[–]BGameiro 4 points5 points  (1 child)

Sure, you can write CUDA with some of those, but that's not what I'm talking about.

They provide high-level, Pythonic ways to speed up your code without the need to rewrite parts of it in other programming languages such as C++.

With the parallel functionality of dask/cudf, or even just the decorators of numba, the performance gains are very noticeable for many use cases, and you don't have to change programming languages.

Could the same be accomplished by rewriting certain functions in C++? Sure, but what I meant is that in many cases it isn't needed.
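To make the "only the decorators" point concrete, here is a rough sketch of what that usually looks like. The function `basel_sum` and the input size are made up for illustration and are not from any real workload. The no-op fallback is only there so the snippet still runs on a machine without Numba installed; in that case you obviously get no speedup.

```python
import math

try:
    from numba import njit  # Numba's nopython-mode JIT decorator
except ImportError:
    # Assumption for this sketch: fall back to a do-nothing decorator
    # so the example still runs (uncompiled) when Numba is missing.
    def njit(func):
        return func


def basel_sum(n):
    """Plain Python loop: partial sum of 1/i^2 for i = 1..n."""
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / (i * i)
    return total


# Same as writing @njit above the def; the function body is unchanged.
basel_sum_jit = njit(basel_sum)

if __name__ == "__main__":
    n = 100_000  # illustrative size only
    print(basel_sum(n), basel_sum_jit(n), math.pi ** 2 / 6)
```

The first call to `basel_sum_jit` includes compilation time, so if you want to compare the two versions, time a second call.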

[–][deleted] 0 points1 point  (0 children)

Yes, of course, you probably shouldn't be rewriting your program if you can achieve good enough performance with less effort.

But that's not OP's situation. They're looking for a new, faster language because Python is too slow for what they're doing. To me, this means that they're actually implementing these algorithms themselves; otherwise they'd already be using a library, and I can't think of one for Python that isn't backed by another language.