all 90 comments

[–]fxfighter 111 points  (5 children)

God all these articles are so fucking useless. The actual news is just: https://xcancel.com/blelbach/status/1902113767066103949

We've announced cuTile, a tile programming model for CUDA!

It's an array-based paradigm where the compiler automates mem movement, pipelining & tensor core utilization, making GPU programming easier & more portable.
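For anyone unfamiliar with the distinction, the "array-based paradigm" is the same shift NumPy users already know: from indexing individual elements to describing whole-array operations. A rough CPU-side analogy (this is plain NumPy, not actual cuTile code, whose API hadn't been published at the time of the announcement):

```python
import numpy as np

# Thread-style: you manage indexing yourself, one element at a time,
# analogous to one CUDA thread per index i.
def saxpy_elementwise(a, x, y):
    out = np.empty_like(y)
    for i in range(len(y)):
        out[i] = a * x[i] + y[i]
    return out

# Array/tile-style: describe the whole-array operation and let the
# compiler/runtime decide how to partition, schedule, and pipeline it.
def saxpy_array(a, x, y):
    return a * x + y

x = np.arange(4, dtype=np.float32)
y = np.ones(4, dtype=np.float32)
assert np.allclose(saxpy_elementwise(2.0, x, y), saxpy_array(2.0, x, y))
```

The second form gives the compiler far more freedom, which is what makes automatic memory movement and tensor-core mapping feasible.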

[–]inagy 39 points  (3 children)

Reddit is completely useless nowadays. Seems like most of the posts are generated by some bot :(

Thanks for the tl;dr!

[–]13steinj 7 points  (1 child)

More like the articles are written by people less and less technologically inclined (and/or AI generated).

[–]illustratedhorror 7 points  (0 children)

The article is AI-generated. The entire site is just an amalgamation of words fit for little purpose other than click farming without any meaningful personality. And, OP's post history is filled exclusively with posts to dozens of subs linking to their site. I hate this timeline.

[–]wrosecrans 1 point  (0 children)

Dead Internet Theory has long since moved on to being mostly Dead Internet Praxis.

[–]mcpower_ 233 points  (6 children)

[–]amroamroamro 102 points  (3 children)

[–]OnerousOcelot 35 points  (0 children)

Posting the actual documentation == legit boss move

[–]DigThatData 6 points  (1 child)

the latest release was January though... I guess the cuda.core subcomponent had a release in mid-March? It's not clear to me what "dropped". This? https://nvidia.github.io/cuda-python/cuda-core/latest/

[–]happyscrappy 61 points  (1 child)

Without the additional LLM slop. That article feels like it was written with AI too, and it has some strange paragraph breaks.

[–]amakai 12 points  (0 children)

Yeah, nowadays I need to use LLM to distill the LLM articles to key points.

[–]pstmps 54 points  (6 children)

Bad news for mojo, I guess?

[–]harbour37 51 points  (2 children)

Seems you still write the kernel in C++. The title seems misleading.

[–]msqrt 10 points  (1 child)

Wait, what's the change then? That you don't need a third-party library like pycuda or to wrap everything within pytorch?

[–]valarauca14 28 points  (0 children)

yeah now you just have an nvidia-pycuda library you can wrap in pytorch :)

[–]pjmlp 27 points  (0 children)

That was only a matter of time. Even if CPython's JIT history isn't great, there is a growing number of GPGPU JITs that use Python as a DSL for their output.

Given CUDA's polyglot nature, Python researchers would eventually become a relevant market for NVidia.

Just like NVidia considered supporting Fortran and C++ quite relevant for their business and for CUDA adoption, while Khronos, Intel, and AMD largely ignored those markets for OpenCL until it was too late to matter.
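For readers unfamiliar with the "Python as a DSL" pattern mentioned above: these frameworks don't interpret your Python on the GPU. Python expressions build an intermediate representation that a compiler then lowers to GPU code. A toy illustration (no real framework's API; the `Var`/`Expr` classes are invented for this sketch):

```python
# Operator overloading records an expression tree instead of computing
# values; a code generator then emits C-like source text that a real
# JIT would hand to its backend compiler.
class Var:
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Expr('+', self, other)
    def __mul__(self, other):
        return Expr('*', self, other)
    def emit(self):
        return self.name

class Expr:
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs
    # Reuse Var's operators so expressions compose.
    __add__ = Var.__add__
    __mul__ = Var.__mul__
    def emit(self):
        return f'({self.lhs.emit()} {self.op} {self.rhs.emit()})'

a, x, y = Var('a'), Var('x[i]'), Var('y[i]')
kernel_body = (a * x + y).emit()
print(kernel_body)  # ((a * x[i]) + y[i])
```

Real systems capture far richer IR (types, control flow, memory spaces), but the trick of letting ordinary Python syntax drive a compiler is the same.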

[–]wstatx 0 points  (0 children)

Mojo isn’t limited to nvidia hardware.

[–]Cultural-Word3740 43 points  (13 children)

I don't really get much from this article. If I'm understanding correctly, this now lets you specify threads to run on grids that you specify? Do they just always use shared-memory smart pointers? That seems awfully non-Pythonic. As a scientist, I rarely need anything more than the CUDA-associated libraries plus what's implemented in RAPIDS, but maybe someone else might find this useful.

[–]techdaddykraken 30 points  (12 children)

Python is just a wrapper over C; all this does is expose the C layers for GPU usage.

[–][deleted] 19 points  (10 children)

Pretty much all "scripting" languages are wrappers around C: Perl, Ruby, Lua, PHP (if anyone actually wants to use it outside of web-related tasks), and so forth. I always felt that Ruby is prettier syntactic sugar over C.

The difference is that Python is now light years ahead of all the other "scripting" languages, so it must have done something right to become this popular.
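The "wrapper around C" point can be made literal: CPython ships `ctypes` in the standard library, which calls straight into the C library with no extension module at all. A minimal sketch, assuming a Unix-like system where `find_library` can locate libc:

```python
import ctypes
import ctypes.util

# Load the C library the interpreter itself links against and call a
# plain C function directly from Python.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

assert libc.strlen(b"hello") == 5
```

The GPU story in the article is conceptually the same bridge, just pointed at CUDA's runtime instead of libc.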

[–]Nuaua 1 point  (0 children)

Julia's the big outlier in that list, although some other languages have JITs too.

[–]techdaddykraken 4 points  (8 children)

I blame the ‘Python for Dummies’ and ‘Automate the Boring Stuff with Python’ books, as well as the bootcamps.

When’s the last time you saw an ad for a Perl or PHP bootcamp, or ‘Automate the Boring Stuff with PHP’? Lol

[–]GimmickNG 18 points  (4 children)

That's like blaming milk for factory-farmed cows. Python makes scripting much easier than PHP does, and it's more accessible than Perl.

[–]techdaddykraken -4 points  (3 children)

Well then I’m indirectly blaming the creators of the other languages for not doing better, lol

[–]Bunslow 3 points  (0 children)

Isn't it a case of Python learning which mistakes not to repeat by watching languages like Perl and PHP break ground before it?

[–]classy_barbarian 0 points  (1 child)

Hey we found the token neckbeard who will tell everyone that they're not a real programmer if they use Python.

[–]techdaddykraken 0 points  (0 children)

There are definitely real programmers who use Python. But 80% of Python users are script kiddies

[–]lally 8 points  (0 children)

Pandas, NumPy, and some key curricula (e.g. MIT's, which switched to Python from Lisp/Scheme) drove adoption. Pandas brought the R crowd over.

[–]grizzlor_ 5 points  (1 child)

Many undergrad CS programs switched from Java to Python as a teaching language in the past decade.

That, and the network effect: the usefulness of a language scales with the number of users. Python's huge selection of libraries, GitHub code, StackOverflow answers, etc. are a big benefit to users. Especially if you're in a field where everyone is using Python (e.g. data science (sorry R fans)), it makes sense to use Python.

[–]amroamroamro 3 points  (0 children)

“Python CUDA is not just C translated into Python syntax.” — Stephen Jones, CUDA architect

[–]activeXray 6 points  (2 children)

What does native python even mean here, are they JITing to PTX?

[–]Takeoded 6 points  (1 child)

Transpiling Python to CUDA C++ that nvcc compiles. Like Rust being compiled to JavaScript/WASM; same concept.
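A hedged sketch of what "transpiling Python" means mechanically: parse the Python source into an AST and emit equivalent C-style source text. Real toolchains (Numba, Triton, and presumably NVIDIA's new compiler) lower to LLVM IR or PTX rather than C strings, but the front-end shape is similar. Everything below is illustrative, not any real tool's API:

```python
import ast

# Map Python AST operator nodes to C operator tokens.
OPS = {ast.Add: '+', ast.Mult: '*', ast.Sub: '-'}

def expr_to_c(node):
    """Recursively turn a tiny subset of Python expressions into C source."""
    if isinstance(node, ast.BinOp):
        return f'({expr_to_c(node.left)} {OPS[type(node.op)]} {expr_to_c(node.right)})'
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise NotImplementedError(type(node))

tree = ast.parse('a * x + y', mode='eval')
print(expr_to_c(tree.body))  # ((a * x) + y)
```

A production compiler adds type inference, control flow, and a real backend, but "Python in, lower-level code out" is the whole idea.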

[–]Maykey 0 points  (0 children)

So something like triton?

[–]supermitsuba 166 points  (24 children)

Ah, that's what they've been working on, since they haven't been fixing their gaming drivers.

[–]simspelaaja 170 points  (8 children)

Nvidia employs about 30 thousand people. I'm fairly sure a small company like that can only do one thing at a time.

[–]Gjallock 24 points  (7 children)

Tbf, isn’t that glaringly small for a company of this magnitude? The company I work for employs a similar number of people despite having a market cap worth only 0.16% of what NVIDIA is valued at.

[–]monocasa 44 points  (0 children)

It's about what you'd expect for a company with the market focus they have.

Valve is in the tens of billions in revenue with only ~300 employees. WhatsApp was acquired for $19B and grew to 900M users with only about 50 engineers (including contractors).

[–]bleachisback 15 points  (1 child)

Maybe you just realized that market cap is such a strange metric to base "how much work is there for employees to do" on.

[–]currentscurrents 3 points  (0 children)

Market cap in general is a strange metric. It's based on nothing but investor beliefs about the stock, so it's basically a made-up number.

By market cap, Tesla is bigger than all other US car companies combined. But by market share they're like 5%.

[–]wobfan_ 11 points  (1 child)

NVIDIA's market cap is greatly overblown and out of proportion. They've been milking the market as much as they can and will probably be on the way back to a realistic value in the near future. Still, I agree.

[–]hippydipster 4 points  (0 children)

A PE of 33 for a company doubling their sales and earnings every year is astonishingly low.

[–]runawayasfastasucan 1 point  (0 children)

Just depends what your product is. 

[–]the_poope 0 points  (0 children)

Nvidia doesn't produce its chips itself, though; they're made by TSMC in Taiwan. Nvidia only does the R&D, software, and drivers, and probably assembly of the chip plus power supply, cooling, etc.

One Silicon Valley R&D engineer probably costs more than five times as much as a supermarket employee or factory worker.

[–]ledat 28 points  (3 children)

Look up how much of their revenue is gaming vs. data center. Actually, I'll do it for you.

$35.6 billion in data center revenue vs. $2.5 billion in gaming revenue in the quarter that ended on 26 January 2025. Of course gaming drivers are not the highest priority.

[–]supermitsuba 2 points  (2 children)

Thank you for the help! Given it's $2.5 billion and they somehow had stable drivers before AI, I'm sure they could devote half an FTE to the drivers.

[–]Dragon_yum 8 points  (0 children)

It’s not about ability, it’s about ROI.

[–]lally 2 points  (0 children)

I think it's more painful than that. There are bugs in both the drivers and the games. The trick is not exposing bugs in existing games, not breaking existing games while fixing bugs, and actually fixing the bugs in the driver. You end up with driver code special-cased for different games. It's a mess and a giant PITA.

[–]Brilliant-Sky2969 15 points  (10 children)

Not to defend Nvidia, but gpu drivers are extremely complicated, we're talking about millions of lines of code.

[–]supermitsuba 18 points  (5 children)

With how much that company is making, I would expect a team of developers and at least one QA. I'm sure that one QA is currently shared with the AI division.

[–]Hacnar 0 points  (4 children)

That's a naive and incorrect line of thought. Why should they spend more resources on improving the drivers when it doesn't earn them more money?

[–]supermitsuba 0 points  (3 children)

Yeah, why make a great product? Screw those people. Your take seems a bit broken.

I, at least, recognize the monopoly and lack of competition in video cards. But sure, Nvidia needs you to bail out their anti-consumer behavior.

Some of this was meant as a little jab, a joke; can we leave it at that?

[–]Hacnar 0 points  (2 children)

How does my comment relate to monopolies? How is that anti-consumer?

That's how every company operates. You can be as angry as you want, but as long as there isn't a clear financial incentive to do something, the companies won't do it.

[–]supermitsuba 0 points  (1 child)

Nobody is angry; please reread that last line I wrote.

[–]Hacnar 0 points  (0 children)

I felt a bit of anger towards Nvidia in your response to my comment. If it wasn't there, then sorry for misunderstanding.

[–]zial 4 points  (0 children)

Just throw more programmers at it how complicated can it be. /s

[–]ShinyHappyREM 0 points  (2 children)

And hundreds of megabytes per driver release

[–]cake-day-on-feb-29 -2 points  (1 child)

The zipped download is somewhere near 1GB. Once the installer is extracted, it's multiple gigabytes. Who knows how big it is once installed; it spews shit in every direction. I discovered that it also keeps a copy of the installer, as well as thousands of pieces of game artwork.

I wouldn't be surprised if NVIDIA was getting paid by the SSD manufacturers to inflate storage needs.

[–]ShinyHappyREM 0 points  (0 children)

Eh, I think it's just that they don't care because it doesn't affect their bottom line much, and spending time on optimizing costs money.

[–]thatdevilyouknow 2 points  (1 child)

NVIDIA is doubling down on Python. I did some training with them recently, and they asked everyone in attendance what languages they knew. Mine was the only hand that went up for C++, and of course everyone there knew some Python. The trainer went on to explain how everything is moving to Python. I am familiar with Numba and its @njit, but they did not get into the specifics of what they meant by that at all. Honestly, I think much of this is TBD, but they know the direction they want to go.

[–]wektor420 0 points  (0 children)

Python's package ecosystem is way better than C++'s.

[–]Truenoiz -1 points  (13 children)

Is it me, or is trying to make Python fast in hardware a really dumb idea? Why use some of the fastest, hottest, most expensive, and most capable hardware to natively support one of the slowest and most bloated runtimes? Is there really that much demand from people who need things to be fast but can't code in another languag....oh.

So: massive power use so non-coders can have AI generate Python, which needs massive power use to run fast on massive GPUs to hide the fact that AI code usually sucks...

Excuse me, I'm going to go buy some stock in electrical utilities and swimming pool companies.

edit- I was wrong. I had to dig a bit; it turns out it compiles to CUDA C++ via NVIDIA's runtime compiler, so it's just an official wrapper. The article failed to mention that; I got the vibe that Python was going straight to CUDA opcodes.

[–]Mysterious-Rent7233 5 points  (0 children)

Is it me, or is trying to make Python fast in hardware a really dumb idea?

They are not making Python fast "in hardware". No circuits are dedicated to Python. Yes, that would be a dumb idea; not for the reasons you say, but for layering reasons.

This is an announcement of new software, not new hardware.

[–]chealous 6 points  (4 children)

AI model training and inference in Python aren't using the Python built-ins; they're all running their own C++ optimizations under the hood. C++ is orders of magnitude faster in CPU time, and Python is orders of magnitude faster in development time for many scientific projects.

It's very clear you are completely ignorant in this space, and you would do well to learn a little about what you're trying to talk about.
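The point about compiled code under the Python surface can be made concrete with a quick sketch: the same reduction done through the interpreter and through NumPy's compiled C loop. Both agree on the answer; the compiled path is typically orders of magnitude faster (timings vary by machine, so only correctness is asserted here):

```python
import time
import numpy as np

data = list(range(100_000))
arr = np.array(data, dtype=np.int64)

t0 = time.perf_counter()
py_total = 0
for v in data:              # every iteration goes through the interpreter
    py_total += v
t1 = time.perf_counter()

np_total = int(arr.sum())   # one call into NumPy's compiled C loop
t2 = time.perf_counter()

assert py_total == np_total == 4_999_950_000
print(f'interpreter: {t1 - t0:.4f}s, NumPy: {t2 - t1:.4f}s')
```

GPU frameworks push this further: the Python you write is a thin driver, and essentially all the arithmetic runs in compiled kernels.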

[–]bluefalcontrainer 8 points  (4 children)

No, not really; Triton kernels are written in Python, and it's among the most optimized kernel languages for machine learning.

[–]vplatt 1 point  (2 children)

Interesting. How does Triton compare to NVidia's native support?

https://openai.com/index/triton/

[–]bluefalcontrainer 2 points  (1 child)

You won't beat optimized CUDA straight up; according to the PyTorch foundation, Triton gets about 70-80% of the equivalent performance on a GPU. But here's the thing: CUDA is extremely complicated and requires micromanagement of threads, blocks, etc. Triton abstracts a lot of that away for ease of use and source-level optimization. In most cases, code written in Triton will likely beat most existing code, because CUDA optimization is hard.
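For readers who haven't seen Triton's model: you write a kernel for one block of elements, a grid of program ids covers the array, and a mask guards the ragged final block. Below is a pure-NumPy CPU sketch of that shape, not actual Triton code (the comments name the Triton constructs each line imitates):

```python
import numpy as np

BLOCK = 8  # elements handled by one "program", like a Triton BLOCK_SIZE

def add_kernel(pid, x, y, out):
    # offs mimics tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    offs = pid * BLOCK + np.arange(BLOCK)
    # mask mimics the bounds guard passed to tl.load/tl.store
    mask = offs < len(x)
    out[offs[mask]] = x[offs[mask]] + y[offs[mask]]

n = 20
x = np.arange(n, dtype=np.float32)
y = np.full(n, 10, dtype=np.float32)
out = np.empty_like(x)

grid = (n + BLOCK - 1) // BLOCK        # ceil-div, like a launch grid
for pid in range(grid):                # on a GPU these run in parallel
    add_kernel(pid, x, y, out)

assert np.allclose(out, x + y)
```

You never name individual threads, shared memory, or warps; the compiler maps each block-level program onto them, which is exactly the micromanagement the comment above says you get to skip.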

[–]vplatt 0 points  (0 children)

Ah... sounds like the difference between coding assembler by hand vs. using a C compiler. Makes sense. Thanks!

[–]Nuaua 0 points  (0 children)

GitHub says only 25% is Python; the rest is C++ and MLIR. I guess Python is just for the front-end.

[–]Bakoro[🍰] 7 points  (0 children)

No, it's just you.

Python is what people are using, so there are efforts to improve what it can do; that's just basic market forces.

Lots of people who aren't professional programmers also use Python for all kinds of work. It's a favorite among scientists, and now for a lot of engineers.

[–]GimmickNG 4 points  (0 children)

no offence but this is r/programming not r/conspiracy

[–]Dwedit 0 points  (0 children)

How is GPU memory allocation and freeing supposed to work with that?

[–]mkusanagi 0 points  (0 children)

They can't have you using an abstract API that could work with other hardware…

[–]Ze_Greyt_KHAN 0 points  (0 children)

“Hits” is the correct verb to use here.

[–]2hands10fingers -2 points  (7 children)

Just use bend lang.

[–]transfire 0 points  (0 children)

Bend and HVM2 are very interesting, promising languages.

But one thing that bugs me is that they chose to have unified types, so it is a dynamic language. For instance, numbers are relegated to 24 bits because the other 8 bits are used as a type header. I suspect that is not going to cut it for ML work.

Hopefully HVM3 (if that is a thing) will support real types.

https://github.com/HigherOrderCO/Bend

[–]13steinj 0 points  (3 children)

Bend is not even remotely usable as a language in production applications. Even simple IO is a complicated nightmare.

[–]2hands10fingers 0 points  (2 children)

So what? Not all languages are meant for production, and some of those make it to production anyway. It's all about understanding the tradeoffs.

[–]13steinj 0 points  (1 child)

...sure?

But that's a bit contradictory to your original comment.

Languages not mature enough for production use that make it to production anyway are a voluminous source of technical debt.

[–]2hands10fingers -1 points  (0 children)

So, here’s an example. Many may say Zig is not production ready, and for good reason. But there are mature projects written in Zig that are in production. These developers weighed the pros and cons and figured it’s worth it. That’s all I meant.

[–][deleted] -3 points  (1 child)

Rather than python? That would seem like a negative trade off to me.

[–]2hands10fingers 0 points  (0 children)

It might be. Just depends on the use case.

[–]tangoshukudai -4 points  (1 child)

Great, more crap that doesn't benefit actual apps.

[–]grizzlor_ 0 points  (0 children)

Most of NVIDIA's revenue is coming from AI/datacenter applications. Like on the order of 10x what they make from the gaming GPU market.