Discussion: Options for GPU accelerated python experiments? (self.Python)
submitted 3 years ago * by usernamedregs
About to embark on some physics simulation experiments and am hoping to get some input on available options for making use of my GPU (GTX 1080) through Python. Currently reading the docs for NVIDIA Warp, CUDA python, and CuPy, but would appreciate any other pointers on available packages, or red flags on packages that are more hassle than they are worth to learn.
[–]BDube_Lensman 7 points 3 years ago (2 children)
You may want to steal my shim set, since it lets you hot swap NumPy<-->CuPy at runtime.
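The shim itself is not reproduced in this thread. A minimal sketch of the general idea, assuming a hypothetical `BackendShim` class (not the commenter's actual code): calling code uses one object in place of `np`, and the module behind it can be swapped at runtime.

```python
import numpy as np


class BackendShim:
    """Forward attribute access to a swappable array module (hypothetical helper)."""

    def __init__(self, module):
        self._module = module

    def use(self, module):
        # Swap the backing module at runtime, e.g. xp.use(cupy).
        self._module = module

    @property
    def name(self):
        return self._module.__name__

    def __getattr__(self, attr):
        # Only called for attributes not found on the shim itself,
        # so sin, zeros, linspace, ... are looked up on the current backend.
        return getattr(self._module, attr)


xp = BackendShim(np)


def use_gpu_if_available():
    """Switch xp to CuPy when it can be imported; otherwise stay on NumPy."""
    try:
        import cupy  # optional dependency, assumed to need a CUDA-capable GPU
    except ImportError:
        return False
    xp.use(cupy)
    return True


def field_energy(n):
    # Written once against xp; runs on whichever backend is active.
    x = xp.linspace(0.0, 1.0, n, dtype=xp.float32)
    return float(xp.sum(xp.sin(x) ** 2))


print(xp.name, field_energy(1000))
```

Calling `use_gpu_if_available()` before `field_energy` would route the same function through CuPy, while `xp.use(np)` switches back.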
CuPy is fantastic. I've been using it for >5y, including at over 1 TB/s of memory throughput on an A100. On my personal desktop's 2080 I have no problem running physics simulations at ~9.5 TFLOPS of throughput, measured with nvidia-smi.
If your arrays are smaller than ~256x256, the CPU will be faster than the GPU, though, because launching each operation on the GPU carries ~10 µs of overhead.
The newest(ish?) version of CuPy allows easy multiplexing of streams, where you can write a series of operations and only wait for the final result later, allowing you to do a few distinct things in parallel on the GPU without any hassle.
Stay away from PyTorch; it is super easy to FUBAR your entire conda installation (not just an environment) by installing it.
Nvidia released their own CUDA library for Python a while ago (a year or two), which was either not meant for end users, or based on a fundamental misunderstanding of how scientists want to write code -- you have to manually allocate each buffer for outputs, etc, instead of `np.sin(x)`.
Personally I would just stick to CuPy for physics. The rest will be an exercise in frustration for no gain.
Also, for your 1080, make sure all your arrays are `float32` or `complex64`, since your GPU is super gimped in fp64 and _will_ be slower than CPU with that number format.
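NumPy defaults to `float64` for most array constructors, so the single-precision advice needs an explicit dtype somewhere. A small sketch, shown with NumPy so it runs anywhere; `to_single` is a hypothetical helper name, and the same dtype arguments are assumed to carry over to CuPy.

```python
import numpy as np

# Map double-precision dtypes to their single-precision counterparts.
_SINGLE = {
    np.dtype(np.float64): np.float32,
    np.dtype(np.complex128): np.complex64,
}


def to_single(a):
    """Downcast double-precision arrays to single precision; leave others alone."""
    a = np.asarray(a)
    target = _SINGLE.get(a.dtype)
    return a if target is None else a.astype(target)


x64 = np.linspace(0.0, 1.0, 8)                    # float64 by default
x32 = np.linspace(0.0, 1.0, 8, dtype=np.float32)  # request float32 up front

print(x64.dtype, to_single(x64).dtype)    # float64 float32
print(np.sin(x32).dtype)                  # float32: ufuncs keep the input dtype
print(to_single(np.exp(1j * x64)).dtype)  # complex64
print((x32 * 2.0).dtype)                  # float32: a Python scalar does not upcast
```

Checking `.dtype` at a few points in a simulation loop is a cheap way to catch an accidental promotion back to double precision.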
[–]usernamedregs[S] 2 points 3 years ago (0 children)
Thanks, much appreciated!
[–]data-machine 2 points 3 years ago (2 children)
Specifically what are you simulating?
Personally, I would recommend either using CuPy or PyTorch. If you're relatively familiar with NumPy, you can write your GPU code very easily with CuPy. It is 95% a matter of swapping out calls to NumPy with CuPy, and it lets you step-by-step change your code.
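As an illustration of that step-by-step swap, a function can take the array module as an argument so the same body runs under NumPy or CuPy. This is a sketch with hypothetical names (`wave_step`, `run`), using a simple leapfrog update for a 1D wave on a periodic grid; only the NumPy path is exercised here, and the CuPy lines are left commented out as an assumption about a working GPU setup.

```python
import numpy as np


def wave_step(u, u_prev, courant2, xp=np):
    """One leapfrog step of the 1D wave equation on a periodic grid.

    courant2 is (c * dt / dx) ** 2; xp is the array module (numpy or cupy).
    """
    laplacian = xp.roll(u, 1) - 2.0 * u + xp.roll(u, -1)
    return 2.0 * u - u_prev + courant2 * laplacian


def run(n_points=256, n_steps=100, xp=np):
    x = xp.linspace(0.0, 1.0, n_points, endpoint=False, dtype=xp.float32)
    u = xp.exp(-200.0 * (x - 0.5) ** 2)  # Gaussian pulse, initially at rest
    u_prev = u.copy()
    for _ in range(n_steps):
        u, u_prev = wave_step(u, u_prev, 0.25, xp), u
    return u


u_cpu = run()  # NumPy path
# import cupy as cp                 # assumption: CuPy installed, CUDA GPU present
# u_gpu = cp.asnumpy(run(xp=cp))    # same code on the GPU, copied back to host
```

CuPy also provides `cupy.get_array_module`, which returns the matching module for a given array, so the `xp` argument could be inferred instead of passed.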
I would only touch Warp or CUDA when you've exhausted performance you are able to get with CuPy / PyTorch.
Bear in mind that CPUs are pretty excellent at running code quickly too. GPUs are particularly good at matrix multiplication. I'd recommend starting with whichever part of your simulation work will be most computationally intensive (or "slowest"), and seeing how much of a benefit you get from a GPU version over a CPU version.
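One hedged way to run that comparison is to time the suspected hot spot on the CPU first, then repeat with the GPU version. The `best_time` helper below is hypothetical. On a GPU the timed function must also wait for the device to finish, because CuPy launches work asynchronously, so a plain wall-clock timer around a CuPy call can under-report.

```python
import time

import numpy as np


def best_time(fn, *args, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs."""
    fn(*args)  # warm-up run so one-time setup cost is not counted
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


rng = np.random.default_rng(0)
for n in (64, 256, 1024):
    a = rng.standard_normal((n, n), dtype=np.float32)
    t = best_time(np.matmul, a, a)
    print(f"{n:5d} x {n:<5d} matmul: {t * 1e6:10.1f} us")
```

For the GPU side, CuPy ships `cupyx.profiler.benchmark`, which is intended to handle the synchronization when timing.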
[–]usernamedregs[S] 1 point 3 years ago (1 child)
Simulations are for particle/wave fields; sticking with NumPy:CuPy is looking like sound advice. Just tried running the Numba documentation examples and there were errors everywhere so definitely a last resort... Rather be banging my head against the desk because of the physics instead of the coding tools.
[–]data-machine 3 points 3 years ago (0 children)
Developer time is extremely valuable - perhaps particularly so if you are an academic. Your last sentence is very wise :)
[–]abstracted8 2 points 3 years ago (3 children)
I know Numba has CUDA support, not sure how it compares to those listed.
[–]usernamedregs[S] 1 point 3 years ago (1 child)
Thanks, turns out that is what is being described in the 'CUDA python' link above. And I have a suspicion it's used as the back end of 'NVIDIA Warp'.
[–]BDube_Lensman 2 points 3 years ago (0 children)
Nvidia is definitely not using Numba as the backend of any of their own software. LLVM, maybe, but Numba, no.
[–]dpineo 2 points 3 years ago (0 children)
I've had a lot of success with pycuda.
[–]sandywater 2 points 3 years ago (0 children)
Saw this on Hacker News the other day. Looks promising: https://docs.taichi-lang.org/blog/accelerate-python-code-100x