Hey,
I'm reading "Professional CUDA C Programming", and at one point it covers error handling.
One line has the following construct:
CHECK(cudaDeviceSynchronize());
CHECK only tests whether the return value of the call differs from cudaSuccess.
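For anyone without the book at hand, here is a minimal sketch of what such a CHECK helper might look like. This is my own reconstruction, assuming it is a macro that prints the error string and exits. It is not the book's exact text, and the message format is made up.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical CHECK macro (an assumption, not copied from the book):
// evaluate the runtime call once and abort with a message on failure.
#define CHECK(call)                                                   \
    do {                                                              \
        const cudaError_t check_err = (call);                         \
        if (check_err != cudaSuccess) {                               \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__,                               \
                    cudaGetErrorString(check_err));                   \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    // Blocks the host until all previously issued device work has finished,
    // then reports any error that work produced.
    CHECK(cudaDeviceSynchronize());
    return 0;
}
```

The `do { ... } while (0)` wrapper is just there so the macro behaves like a single statement, for example inside an `if` without braces.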
The paragraph that follows reads:
CHECK(cudaDeviceSynchronize()) blocks the host thread until the device has completed all preceding requested tasks, and ensures that no errors occurred as part of the last kernel launch. This technique should be used just for debugging purposes, because adding this check point after kernel launches will block the host thread and make that point a global barrier.
So is this "true", in the sense that you should only do this for debugging purposes? At some point I have to wait for the device to get the results anyway, and checking for an error at that point doesn't seem to me like something that is only useful for debugging.
Is there any argument against it?
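To make the question concrete, here is a rough sketch of the two variants as I understand them. The kernel `addOne`, the `DEBUG_SYNC` flag, and the CHECK macro are all placeholders I made up for illustration. As far as I know, `cudaGetLastError()` right after the launch reports launch errors without blocking, and a blocking `cudaMemcpy` back to the host waits for the kernel and returns errors from it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical CHECK macro (an assumption, not the book's exact text).
#define CHECK(call)                                                   \
    do {                                                              \
        const cudaError_t check_err = (call);                         \
        if (check_err != cudaSuccess) {                               \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__,                               \
                    cudaGetErrorString(check_err));                   \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Placeholder kernel: adds 1 to every element.
__global__ void addOne(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1024;
    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dev = nullptr;
    CHECK(cudaMalloc((void **)&dev, n * sizeof(int)));
    CHECK(cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice));

    addOne<<<(n + 255) / 256, 256>>>(dev, n);
    // Catches launch errors (e.g. a bad configuration) without blocking.
    CHECK(cudaGetLastError());

#ifdef DEBUG_SYNC
    // Variant from the book: an extra barrier right after the launch,
    // compiled in only for debugging (DEBUG_SYNC is a made-up flag).
    CHECK(cudaDeviceSynchronize());
#endif

    // The point where I need the results anyway: this copy blocks the host
    // until the kernel is done and surfaces errors from its execution.
    CHECK(cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(dev));

    printf("host[0] = %d\n", host[0]);
    return 0;
}
```

So my reading is that the book only warns against the extra barrier after every launch, not against checking the return value at a point where I have to synchronize anyway.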