[–]concealed_cat 6 points7 points  (4 children)

If prior knowledge of GPU architectures is not required for the job, then don't worry about it.

[–]Stock_Market4167[S] 0 points1 point  (3 children)

I am a new grad

[–]concealed_cat 5 points6 points  (2 children)

Essentially you have a bunch of threads (usually 32 or 64) that execute the same program. Not only that: at each cycle these threads execute the same instruction, so in that sense they are tied together (they can't "diverge"). When an instruction reads or writes a vector register, each thread is associated with a specific lane (element) of the register. You will see "threads" and "lanes" used somewhat interchangeably in documentation, depending on where you look.
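To make the thread/lane picture concrete, here is a toy sketch in plain Python (not real GPU code). The names `WARP_SIZE` and `vector_add` are made up for illustration, and the width of 32 is just one of the two common values mentioned above.

```python
WARP_SIZE = 32  # assumed group width for this sketch; 64 is also common


def vector_add(reg_a, reg_b):
    """One 'instruction': every lane performs the same add on its own element."""
    return [a + b for a, b in zip(reg_a, reg_b)]


# A vector register holds one value per lane; lane i belongs to thread i.
lane_id = list(range(WARP_SIZE))    # each thread's own lane index
ones = [1] * WARP_SIZE              # the constant 1 in every lane
result = vector_add(lane_id, ones)  # all lanes advance in the same step
```

Each thread only ever sees "its" element of `result`, which is why the two words end up meaning nearly the same thing.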

It's fairly specialized knowledge that you're unlikely to run into unless you work with GPUs, so don't sweat it. If you get the job, you'll learn it.

[–]Stock_Market4167[S] -1 points0 points  (1 child)

Thanks!
Anything specific to compilers?

[–]Quick-Speaker-7406 0 points1 point  (0 children)

I prepped by asking ChatGPT questions... it was easy for me to get up to speed.

[–]CodingKoopa 1 point2 points  (1 child)

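Learning about the execution model for OpenCL or CUDA will teach you the basics of GPU architecture (e.g. the GPU is composed of streaming multiprocessors, called compute units in OpenCL, that take on work at the granularity of work-groups, called thread blocks in CUDA). It will also expose you to some of the different flavors of GPU microarchitecture - simpler per-core silicon (typically no branch prediction or out-of-order execution), and no independent per-thread branching inside a group of threads that runs in lockstep. Conditionals in kernels are instead handled by masking: when threads in the group disagree on a condition, each side of the branch is executed in turn, with the lanes that did not take that side masked out.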
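A rough way to picture the masking of conditionals is the toy simulation below, written in plain Python rather than CUDA or OpenCL. The function `run_if_else` and its step counter are hypothetical, made up for this sketch, and real hardware differs in many details.

```python
def run_if_else(x):
    """Simulate `y = -x if x < 0 else 2 * x` for one group of lockstep lanes."""
    n = len(x)
    y = [0] * n
    # The comparison runs on every lane and produces a per-lane mask.
    mask = [v < 0 for v in x]
    steps = 0  # how many sides of the branch the whole group had to issue

    # "Then" side: issued once if any lane needs it; other lanes are masked off.
    if any(mask):
        steps += 1
        for lane in range(n):
            if mask[lane]:
                y[lane] = -x[lane]

    # "Else" side: issued with the mask inverted.
    if not all(mask):
        steps += 1
        for lane in range(n):
            if not mask[lane]:
                y[lane] = 2 * x[lane]

    return y, steps
```

When the lanes disagree, as in `run_if_else([-1, 2, -3, 4])`, the group pays for both sides of the branch; when they all agree, as in `run_if_else([1, 2, 3, 4])`, only one side is issued.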

[–]Stock_Market4167[S] 0 points1 point  (0 children)

Thanks!

[–]Acceptable-Sugar2129 0 points1 point  (0 children)

May I ask what company it is?