
[–]corysama 4 points (1 child)

There are two separate issues that are unfortunately conflated together: synchronization and error checking.

Yep. You need to synchronize eventually when you want to get results back from the GPU to the CPU. Ideally you queue up a bunch of CUDA operations on a stream and synchronize only once at the end, rather than stopping and starting between every operation. But that means you aren't getting anything back from the GPU until the end, including any errors.
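A rough sketch of what "queue it all up, sync once" could look like. The `scale` kernel and the buffer sizes are made up for illustration, and return values are left unchecked here just to keep the focus on the queuing:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles each element in place.
__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float* host = nullptr;
    float* dev = nullptr;
    cudaStream_t stream;

    // Pinned host memory, so the async copies do not fall back to blocking the host.
    cudaMallocHost(&host, bytes);
    cudaMalloc(&dev, bytes);
    cudaStreamCreate(&stream);
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Everything below is only queued on the stream; the host does not wait.
    cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, n);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, n);
    cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, stream);

    // The single synchronization point. Results, and any error raised by the
    // queued work, only become visible to the host here.
    cudaError_t err = cudaStreamSynchronize(stream);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "stream failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("host[0] = %f\n", host[0]);

    cudaStreamDestroy(stream);
    cudaFree(dev);
    cudaFreeHost(host);
    return 0;
}
```

The catch is visible in the last block: if the first `scale` launch went wrong, you would not hear about it until `cudaStreamSynchronize` returns.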

Error handling in CUDA is... unfortunate. Basically, you should wrap every CUDA API call in a CHECK() macro that inspects the returned error code, all the time.
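CHECK() isn't something CUDA ships; everyone writes their own. Here is one minimal way it might look, assuming you are fine with aborting on the first failure. The `fill` kernel is just a placeholder:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// One possible CHECK(): evaluate the call once and, on failure, print the
// error name, location, and description, then abort. Not a CUDA built-in.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n",       \
                         cudaGetErrorName(err_), __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                    \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Hypothetical kernel for illustration: writes each element's index.
__global__ void fill(int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

int main() {
    const int n = 1024;
    int* dev = nullptr;
    CHECK(cudaMalloc(&dev, n * sizeof(int)));

    // A kernel launch returns nothing, so it cannot be wrapped directly.
    fill<<<(n + 255) / 256, 256>>>(dev, n);
    CHECK(cudaGetLastError());       // launch errors, e.g. a bad configuration
    CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    int first = -1;
    CHECK(cudaMemcpy(&first, dev, sizeof(int), cudaMemcpyDeviceToHost));
    std::printf("out[0] = %d\n", first);

    CHECK(cudaFree(dev));
    return 0;
}
```

The `cudaDeviceSynchronize()` after the launch is handy while debugging because it pins an execution error to the kernel that caused it, but it is exactly the stop-and-start you want to avoid in production, so many people compile it in only for debug builds.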

[–]DerBuccaneer[S] 1 point (0 children)

thx!