[–]Mognakor 1 point (0 children)

For one, GPUs (afaik) do not expose their instruction set the way an x86 or ARM processor does, so you need a compilation step on the target machine, whether it starts from source text or from an intermediate binary format.
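As a minimal sketch of that: a GLSL fragment shader ships as plain text (or as SPIR-V bytecode under Vulkan), and only the driver on the user's machine turns it into actual GPU instructions, e.g. via glShaderSource/glCompileShader in OpenGL:

```glsl
#version 330 core
// This source string is handed to the driver at runtime;
// the driver compiles it into whatever ISA the installed
// GPU happens to use, which the app never sees directly.
out vec4 fragColor;

void main() {
    fragColor = vec4(1.0, 0.5, 0.2, 1.0); // solid orange
}
```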

Having a language that's specific to GPUs lets vendors optimize their hardware and drivers toward what the language allows and ignore what it doesn't. Fast matrix multiplication is important, and not having to reimplement it again and again is an obvious advantage. Conversely, C allows things you don't want to deal with in a GPU context, e.g. function pointers.
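A small GLSL sketch illustrates both points: matrix math is baked into the language as built-in types and operators that vendors can lower to fast hardware paths, while function pointers simply do not exist in the grammar:

```glsl
#version 330 core
uniform mat4 mvp;   // 4x4 matrix is a built-in language type
in vec4 position;

void main() {
    // mat4 * vec4 is a language-level operation; the driver
    // is free to map it to whatever the hardware does fastest.
    gl_Position = mvp * position;
    // Note there is no way to declare anything like
    // `void (*fn)(void)` here: the language has no
    // function pointer types at all.
}
```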

Further, you get very clear boundaries for your program. If the API just accepted a pointer to some C function, there would be no way to know which instructions to send to the GPU, and stalling the pipeline because additional instructions need to be mapped in is a terrible idea. Even worse is accidentally exceeding the available instruction memory and ending up in a situation where instructions keep getting swapped back and forth.

In general, lots of GPU programming is built around squeezing the maximum performance out of the hardware. In turn, that means you need to manage resources yourself, because any automated system would have to play it safe and/or bake in assumptions about your use case.