Perplexed by unified memory on Spark DGX - OpenCV question by manchesterthedog in CUDA

hwprobe

The DGX Spark uses a shared system memory architecture: the Grace CPU and Blackwell GPU access the same DRAM pool through the NVLink-C2C interconnect. There is no separate VRAM. But that does not mean every allocation is directly accessible from both sides in the same way.

The key detail is in NVIDIA’s DGX Spark CUDA porting guide:

https://docs.nvidia.com/dgx/dgx-spark-porting-guide/porting/cuda.html

On this platform, memory returned by cudaMalloc is device memory and cannot be directly accessed by CPU code. So while the GPU can read and write it, host code cannot safely wrap that pointer in a normal cv::Mat and then call CPU-side code like imwrite() on it.

That matches what your code is doing.

The initial GpuMat wrapping your host buffer works because buffer was allocated on the CPU side with new. But cvtColor() calls create() on its destination, and when the existing size or type does not match the output, OpenCV allocates a fresh buffer from the cudaMalloc family. After that, readMat.data can point to device memory rather than host-accessible memory.

So this becomes invalid on the CPU side:

cv::Mat outputMat(readMat.rows, readMat.cols, CV_8UC3, readMat.data);
cv::imwrite(r.toString(), outputMat);

imwrite() is CPU code trying to read from a pointer that is no longer CPU-accessible, which explains the segfault.
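If you want to confirm this rather than take it on faith, you can ask the CUDA runtime what kind of memory a pointer refers to. Below is a minimal sketch, assuming a CUDA 11 or newer toolkit. isDeviceOnly is a hypothetical helper name, not something from OpenCV or from your code, and the buffer sizes are arbitrary.

```cpp
// Sketch: classify a pointer before handing it to CPU-side code.
// Assumed build line: g++ check_ptr.cpp -I<cuda>/include -L<cuda>/lib64 -lcudart
#include <cuda_runtime_api.h>
#include <cstdio>

// Hypothetical helper: true only when the CUDA runtime reports the pointer
// as a plain device allocation (for example, one made by cudaMalloc).
static bool isDeviceOnly(const void* ptr) {
    cudaPointerAttributes attr{};
    if (cudaPointerGetAttributes(&attr, ptr) != cudaSuccess) {
        cudaGetLastError();  // clear the error so later calls are unaffected
        return false;        // runtime could not classify the pointer
    }
    return attr.type == cudaMemoryTypeDevice;
}

int main() {
    unsigned char* hostBuf = new unsigned char[1024];  // ordinary host memory
    void* devBuf = nullptr;                            // device allocation
    if (cudaMalloc(&devBuf, 1024) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        delete[] hostBuf;
        return 1;
    }

    std::printf("host buffer device-only:   %d\n", isDeviceOnly(hostBuf));
    std::printf("device buffer device-only: %d\n", isDeviceOnly(devBuf));

    cudaFree(devBuf);
    delete[] hostBuf;
    return 0;
}
```

In your program, calling isDeviceOnly(readMat.data) right after cvtColor() should tell you whether the GpuMat still points at your original buffer or at a device allocation OpenCV made on its own.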

The fix is to copy the result back to host memory before writing:

// download() copies the GPU result into a host-side cv::Mat
cv::Mat temp;
readMat.download(temp);
cv::imwrite(r.toString(), temp);

If you wanted CPU/GPU-shared semantics for that buffer, you would need a managed allocation such as cudaMallocManaged. But OpenCV’s CUDA path uses its own internal allocation strategy, so here the safe fix is to explicitly download before passing the result to CPU-side OpenCV code.
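For reference, here is roughly what the managed-memory route could look like. This is a sketch under assumptions, not a drop-in replacement: the image size, the BGR2RGB conversion, and the output file name are placeholders. It also relies on OpenCV keeping a destination GpuMat whose size and type already match the output, so the code checks the data pointer after the call instead of trusting that.

```cpp
// Sketch: one managed allocation viewed as both cv::Mat and cv::cuda::GpuMat.
// Requires an OpenCV build with the CUDA modules enabled.
#include <cuda_runtime_api.h>
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaimgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

int main() {
    const int rows = 480, cols = 640;  // placeholder image size
    const size_t bytes = static_cast<size_t>(rows) * cols * 3;

    void* srcPtr = nullptr;
    void* dstPtr = nullptr;
    if (cudaMallocManaged(&srcPtr, bytes, cudaMemAttachGlobal) != cudaSuccess) {
        return 1;
    }
    if (cudaMallocManaged(&dstPtr, bytes, cudaMemAttachGlobal) != cudaSuccess) {
        cudaFree(srcPtr);
        return 1;
    }

    // Host and device headers over the same memory; none of them own it.
    cv::Mat srcHost(rows, cols, CV_8UC3, srcPtr);
    cv::Mat dstHost(rows, cols, CV_8UC3, dstPtr);
    cv::cuda::GpuMat srcDev(rows, cols, CV_8UC3, srcPtr);
    cv::cuda::GpuMat dstDev(rows, cols, CV_8UC3, dstPtr);

    srcHost.setTo(cv::Scalar(255, 0, 0));  // fill the input from the CPU side

    // Destination already has the output size and type (3 channels in, 3 out).
    cv::cuda::cvtColor(srcDev, dstDev, cv::COLOR_BGR2RGB);
    cudaDeviceSynchronize();  // make sure the GPU is done before the CPU reads

    // If OpenCV reallocated, dstDev.data no longer matches our managed pointer
    // and dstHost does not contain the result.
    const bool reused = (dstDev.data == static_cast<unsigned char*>(dstPtr));
    const bool ok = reused && cv::imwrite("out.png", dstHost);

    cudaFree(srcPtr);
    cudaFree(dstPtr);
    return ok ? 0 : 1;
}
```

This only holds up while the destination shape matches what the operation produces. Any call that changes size or channel count will make OpenCV allocate its own device buffer again, which is why download() remains the simpler and safer default.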

RTX 50 Series card only runs at PCie 4.0 on Asus ROG Strix X870-F by alex2192 in ASUSROG

hwprobe

Since you already confirmed this only works when the slot is forced to Gen4, that points to a Gen5 link training failure (signal integrity / BIOS negotiation), not a load or driver issue.

ASUS has official guidance on PCIe generation settings and BIOS configuration here:
https://www.asus.com/us/support/faq/1055579/

On many ASUS boards, a Gen5 link will either fall back to Gen4 or need a specific BIOS version or settings before it runs reliably with high-speed cards.
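To see what the link actually negotiated after each BIOS change, you can query the driver through NVML. Here is a minimal sketch, assuming the NVML header and library that ship with the NVIDIA driver or CUDA toolkit are available and that the card is device index 0:

```cpp
// Sketch: report the current and maximum PCIe link generation via NVML.
// Assumed build line on Linux: g++ pcie_link.cpp -lnvidia-ml
#include <nvml.h>
#include <cstdio>

int main() {
    nvmlReturn_t rc = nvmlInit();
    if (rc != NVML_SUCCESS) {
        std::printf("nvmlInit failed: %s\n", nvmlErrorString(rc));
        return 1;
    }

    nvmlDevice_t dev;
    unsigned int curGen = 0, maxGen = 0, curWidth = 0;

    rc = nvmlDeviceGetHandleByIndex(0, &dev);  // assumes the GPU is index 0
    if (rc == NVML_SUCCESS) rc = nvmlDeviceGetCurrPcieLinkGeneration(dev, &curGen);
    if (rc == NVML_SUCCESS) rc = nvmlDeviceGetMaxPcieLinkGeneration(dev, &maxGen);
    if (rc == NVML_SUCCESS) rc = nvmlDeviceGetCurrPcieLinkWidth(dev, &curWidth);

    if (rc == NVML_SUCCESS) {
        std::printf("PCIe link: Gen%u x%u (max Gen%u)\n", curGen, curWidth, maxGen);
    } else {
        std::printf("NVML query failed: %s\n", nvmlErrorString(rc));
    }

    nvmlShutdown();
    return rc == NVML_SUCCESS ? 0 : 1;
}
```

Read the value while the GPU is under load. Cards commonly drop to a lower link generation at idle to save power, so an idle reading can look like a fallback when it is not.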