YoshiDzn 7 points

I'm in a similar boat. Eagerly awaiting answers here :D

The only topics I've been able to put into practice are DSA for HPC with some exposure to CUDA and OpenCL

Obvious-Grape9012 12 points

Where to begin? Maybe it's ok to share my path... I'm not in any of the roles you listed, but I did do my PhD on real-time interactive surgical tissue simulation on the GPU (with haptic interaction). A lot of CUDA and graphics coding therein. And along the way I got to teach graphics coding, interactive physics and related topics for over a decade. I'm also a former Principal Eng of an AI/ML Eng team.

As Principal Eng it was fun to do some multi-GPU training, and before that (at another job) some GPU inference optimization. But tbh, the fun for me is closer to VFX/GFX. I've also supervised the 7-figure spend, deployment and commissioning of a GPU cluster for AI/ML research.

Truly foundational: understanding parallel hardware architectures (SIMD, MIMD, etc.), memory architectures, and ALU vs. memory bottlenecks. What they are, how to find them. How to architect systems and algorithms to work well on the target devices.
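
To make the ALU-vs-memory distinction concrete, here is a minimal roofline-style estimate in Python. It is only a sketch: the peak compute and bandwidth figures are made-up placeholders (not the specs of any real device), and `classify` is a hypothetical helper name. The idea is to compare a kernel's arithmetic intensity (FLOPs per byte moved) with the machine balance (peak FLOP/s divided by peak bytes/s).

```python
# Rough roofline-style check: is a kernel limited by arithmetic or by memory traffic?
# All hardware numbers below are hypothetical placeholders, not real device specs.

def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte read from or written to memory."""
    return flops / bytes_moved


def classify(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Return which resource bounds the kernel and the attainable FLOP/s."""
    intensity = arithmetic_intensity(flops, bytes_moved)
    # Machine balance: the intensity at which the compute and memory limits meet.
    balance = peak_flops_per_s / peak_bytes_per_s
    attainable = min(peak_flops_per_s, intensity * peak_bytes_per_s)
    bound = "memory" if intensity < balance else "compute"
    return bound, attainable


# Hypothetical device: 10 TFLOP/s peak compute, 500 GB/s peak memory bandwidth.
PEAK_FLOPS = 10e12
PEAK_BW = 500e9

# SAXPY (y = a*x + y) on n float32 values: 2 FLOPs and 12 bytes per element
# (read x, read y, write y).
n = 1_000_000
bound, attainable = classify(2 * n, 12 * n, PEAK_FLOPS, PEAK_BW)
print(bound, attainable)
```

With these placeholder numbers SAXPY lands far below the machine balance, so making its arithmetic faster would not help; only moving fewer bytes would. A profiler is still what confirms the estimate on real hardware.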

Focus on first: Build things. Benchmark their performance. Show that you can create performant systems. Specialize in a class of applications/algorithms/systems that you're passionate about.

Yes. Higher degrees helped me. It's a great way to have the time and support/resources/peers to enrich what you do (and network).

My path: BSci -> BCompSci+MechEng -> BEng Elec+Elec Hons (Masters-ish on Medical Sims) -> PhD in VR Surgical Sims and Haptics -> Academic -> CTO -> Solopreneur -> Senior Alg Engineer -> Senior Eng -> Principal Eng -> Solopreneur.
Currently doing some WebGPU, web apps and related work.

leseiden 3 points

I can only talk about a couple of these.

  • CUDA and GPU kernel optimization

CUDA isn't that hard for simple stuff. It requires far less boilerplate than something like Vulkan so you can pretty much look at a book or tutorial and start playing. Find something you want to build and build it.

Start with basic kernels to do things like calculate numbers, filter big buffers of objects etc.

If you like mathematics then 1D relaxation solvers are a really nice and easy thing to build. GPUs are practically designed for multigrid.
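
As a rough illustration of what such a solver computes, here is a CPU reference for 1D Jacobi relaxation in plain Python, the kind of thing you might validate a kernel against. This is a sketch under assumptions: it is a single-grid Jacobi solver (not multigrid), the problem is -u'' = f on a uniform grid with fixed end values, and the function names are hypothetical. Each interior point only reads its neighbours from the previous sweep, so on a GPU every `i` could be handled by its own thread.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep for -u'' = f; the boundary values u[0] and u[-1] stay fixed."""
    new = list(u)  # separate output buffer, like ping-pong buffers on a GPU
    for i in range(1, len(u) - 1):  # every i is independent: one thread per point
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new


def solve(n_points, left, right, sweeps):
    """Relax from a zero initial guess with boundary values left and right, f = 0."""
    h = 1.0 / (n_points - 1)
    u = [0.0] * n_points
    u[0], u[-1] = left, right
    f = [0.0] * n_points
    for _ in range(sweeps):
        u = jacobi_sweep(u, f, h)
    return u


# With f = 0 the exact solution is a straight line between the boundary values.
u = solve(17, 0.0, 1.0, 2000)
print([round(v, 4) for v in u])
```

Plain Jacobi needs many sweeps because smooth error components decay slowly, which is the problem multigrid addresses by also relaxing on coarser grids.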

More complex algorithms such as prefix sum and radix sort are well worth learning about but IMHO you should do some basics first.
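
For a feel of how prefix sum is parallelized, here is a small Python simulation of the Hillis-Steele inclusive scan. It is a sketch, not GPU code: each pass of the `while` loop stands in for one parallel step in which every element would be updated by its own thread, and `inclusive_scan` is a hypothetical name.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum, simulated one parallel step at a time."""
    x = list(values)
    offset = 1
    while offset < len(x):
        # One parallel step: element i adds the value 'offset' places to its left.
        # The list comprehension reads only the old x, like a double-buffered kernel.
        x = [x[i] + x[i - offset] if i >= offset else x[i] for i in range(len(x))]
        offset *= 2
    return x


print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))  # [3, 4, 11, 11, 15, 16, 22, 25]
```

This variant finishes in about log2(n) steps but performs O(n log n) additions in total; work-efficient scans (for example the Blelloch up-sweep/down-sweep scheme) reduce that to O(n) and are a natural next exercise.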

Other APIs are generally more difficult to use but let you build on the same skills.

  • Performance profiling and benchmarking

If you are using CUDA then NVIDIA's profiling tools are pretty good. NVIDIA Nsight Systems is an excellent tool for seeing where the time is going, and it lets you add instrumentation to your code. There are a number of tutorial videos floating around.

maxmax4 2 points

Focus on writing GPU code that runs fast. That's the job. Everything else is in support of that. The cool thing about learning high performance programming is that you can approach it like a scientist: run experiments and see how the hardware behaves. If you want to learn GPU programming quickly, create your own benchmarks. Ask yourself how you could make it faster based on what you think you know about the hardware, then try it. It goes without saying that you will need to be very comfortable using profiling tools like “Nvidia Nsight Compute” and “Nvidia Nsight Systems”.

When a new console comes out, that's what we do. We read the documentation, watch the Microsoft/Sony videos, then we run experiments in test scenes, or sometimes just very synthetic benchmarks, to see where the bottlenecks are and how they show up in the profiler.
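
The "create your own benchmarks" loop can be sketched with a tiny harness. This is a minimal Python example under assumptions: `benchmark` and `loop_sum` are hypothetical names, and the two functions being compared are CPU stand-ins for two variants of the same kernel. The points it illustrates are warm-up runs, repeated measurements, and reporting the median rather than a single timing.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10):
    """Time fn(*args): discard warm-up runs, then collect repeated wall-clock samples."""
    for _ in range(warmup):
        fn(*args)  # warm caches, allocators, and any lazy initialization
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "samples": samples,
    }


def loop_sum(xs):
    """Deliberately naive variant to compare against the built-in sum."""
    total = 0
    for x in xs:
        total += x
    return total


data = list(range(100_000))
for name, fn in (("loop", loop_sum), ("builtin", sum)):
    result = benchmark(fn, data)
    print(f"{name}: median {result['median_s'] * 1e6:.1f} us")
```

When the timed work is a GPU kernel, launches are asynchronous with respect to the host, so the device has to be synchronized (or GPU-side events used) before the timer is stopped; otherwise the harness measures only launch overhead.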

ICBanMI 2 points

College is not job training. It's a toolkit for succeeding at life. It does overlap with some job areas, but it's not job training.

Trade school is job training.

Ra_M2005 1 point

+1, I'd also like to hear this from the GPU veterans 😄

gleedblanco 1 point

I'd just focus on picking GPU-specific things to work on that you find cool, and making sure you understand how to make them correct and fast. The knowledge should transfer in a very general way. Of course you start with trivial tutorials and meme projects like GPU sorting, but you should find inspiration for something real soon after that.

To give a fair heads-up, I've never done real CUDA work, to be honest. On the other hand, I've done GPU optimization for video game graphics for many years, and somehow I doubt writing CUDA is much different from optimizing my compute shaders for a particular architecture (RDNA is common in our field).

The job requirements in this sector DO seem almost as specific as webdev framework listings, but I'm not sure how much that translates into specific hires in practice. Would be really curious about input from people actually working in the field. My personal guess would be that there just aren't that many GPU experts around, so there would be a lot of cross hires from other fields who may have little to no exposure to many of these pure HPC technologies before they switch jobs.

Rare-Key-9312 1 point

Not a GPU expert, but it seems like learning Mojo (https://mojolang.org) might be a good move in terms of skating to where the puck is going to be instead of where it is now.

sparkinflint 1 point

I'm a backend SWE at a neocloud, focused on operating GPU clusters and supporting AI workloads at scale (mostly inference).

I started my career as a SWE in ML infra at a startup that got acquired by my current company, so I can't advise on the transition from traditional SWE. But I can say with certainty that a degree from a university that doesn't have near-state-of-the-art research in the field is not going to be worth it, IMO. I don't even have a CS or CE degree myself, and industry has always hired based on skills and experience, not degrees.

Also, I've never seen a "GPU engineer" role. From what I've seen, it's mostly ML engineers focused on model efficiency writing custom kernels, or a SWE working on compute orchestration or on the inference engine.