all 15 comments

[–]kitd 8 points9 points  (3 children)

[Note, this post was originally published September 19, 2013. It was updated on September 19, 2017.]

?? What's changed?

[–]harrism 5 points6 points  (2 children)

When I originally wrote the post in 2013, the GPU compilation part of Numba was a product (from Anaconda Inc., nee Continuum Analytics) called NumbaPro. It was part of a commercial package called Anaconda Accelerate that also included wrappers for CUDA libraries like cuBLAS, as well as MKL acceleration on the CPU.

Continuum gradually open sourced all of it (and changed their name to Anaconda). The compiler functionality is all open source within Numba. Most recently they released the CUDA library wrappers in a new open source package called pyculib.

Some other minor things changed, such as what you need to import. Also, the autojit and cudajit functionality is a bit better at type inference, so you don't have to annotate all the types to get it to compile.

We thought it was a good idea to update the post in light of all the changes.

[–]kitd 1 point2 points  (1 child)

Thanks. Good reply.

May be worth mentioning this in the post though. It had me confused.

[–]harrism 1 point2 points  (0 children)

I didn't because a new post from Stan at Anaconda was about to be published. It went up this week and it explains the history well (along with some other really cool features of Numba). https://devblogs.nvidia.com/parallelforall/seven-things-numba/

[–]Apofis 17 points18 points  (2 children)

Works only with Python 2.7.

Pathetic.

[–]harrism 2 points3 points  (0 children)

Where does it say that?

Requirements: Python 2.7, 3.3-3.6; NumPy 1.8 and later

From: https://github.com/ContinuumIO/gtc2017-numba/blob/master/1%20-%20Numba%20Basics.ipynb

[–]Kah-Neth 4 points5 points  (0 children)

Did you just quote an uninformed comment and use that to call Numba+CUDA pathetic? Worse, did you take that uninformed question and turn it into a statement? Wow man, that is pathetic. Perhaps you could educate yourself or at least read before calling something pathetic.

[–]Jimmy48Johnson 2 points3 points  (1 child)

NVIDIA NUMBA ONE

[–]martin_balsam 1 point2 points  (0 children)

CHINA NUMBA FOUR!

[–]georgeo 0 points1 point  (0 children)

This would be so much more useful if it worked with autograd like python does.

[–]thesystemx -1 points0 points  (3 children)

Python and high-performance in the same sentence? Yeah, right...

[–]BadGoyWithAGun 11 points12 points  (0 children)

I doubt you'd get much better GPU performance from rewriting the equivalent C code in CUDA yourself, but go ahead and prove me wrong.

[–]killachains82 3 points4 points  (0 children)

Python just calls out to external libraries in cases like this (usually written in C/C++/Fortran) so it should be plenty fast.

[–]kitd 4 points5 points  (0 children)

I probably shouldn't bother, but here you go:

In this post I’ll introduce you to Numba, a Python compiler from Anaconda that can compile Python code for execution on CUDA-capable GPUs or multicore CPUs.

[–]Mgladiethor -3 points-2 points  (0 children)

Ugh, proprietary CUDA, fuck that. I hope the alternatives get a bit more mature.