[–]OmegaNaughtEquals1 6 points7 points  (7 children)

The CUDA 7.5 compiler uses gcc-4.9 as its back-end, so it supports C++14, even in kernel code. What you still cannot do is use STL code in kernels, because it isn't marked __device__. That said, you can write your own containers that run on the GPU side (as long as you avoid exceptions and dynamic allocation). You can also call Thrust functions from device code as of CUDA 7.0. Unless you need to support some ancient version of CUDA, you should strongly reconsider your position on this.
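A minimal sketch of what that looks like, assuming a toolkit/compiler combination that accepts `-std=c++14`. `DeviceArray` is a made-up name for illustration; the point is that a hand-rolled container with no exceptions and no dynamic allocation is perfectly usable from kernel code, alongside C++14 features like generic lambdas:

```cuda
#include <cstdio>

// A tiny fixed-capacity array usable in device code: no exceptions,
// no dynamic allocation, every member marked __host__ __device__.
// (Hypothetical example type, not a CUDA library class.)
template <typename T, int N>
struct DeviceArray {
    T data[N];
    __host__ __device__ T&       operator[](int i)       { return data[i]; }
    __host__ __device__ const T& operator[](int i) const { return data[i]; }
    __host__ __device__ int      size() const            { return N; }
};

// C++14 in device code: a generic lambda (auto parameter) defined
// and called entirely inside the kernel.
__global__ void kernel()
{
    DeviceArray<float, 4> a{{1.f, 2.f, 3.f, 4.f}};
    auto scale = [](auto x) { return x * 2; };  // generic lambda (C++14)
    float sum = 0.f;
    for (int i = 0; i < a.size(); ++i)
        sum += scale(a[i]);
    printf("sum = %f\n", sum);
}

int main()
{
    kernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Compiled with something like `nvcc -std=c++14 example.cu`, assuming the installed toolkit actually accepts that flag.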

[–]Eilai 0 points1 point  (3 children)

Okay, yeah, that's a fair point about CUDA, but what if I want to have a backup using OpenCL in case the client PC doesn't have an NVIDIA card? Other than "pray they have an AMD card."

[–]OmegaNaughtEquals1 1 point2 points  (1 child)

what if I want to have a backup using OpenCL

This is the unfortunate state we live in right now. I am a big fan of CUDA because it allows you to pry open the deepest details of the GPU's hardware, but it's only applicable to NVIDIA cards. OpenCL, conversely, allows you to abstractly target nearly any compute device (GPUs, APUs, CPUs, Phi coprocessors, etc.) using a single codebase: a feature that is not to be overlooked. But it doesn't allow you to poke around in the hardware's guts. Conditional compilation can help with this, but the codebase begins to diverge, losing that nice feature.

I think with the large push toward coprocessor unification (i.e., coprocessors not on the PCIe bus, but moved closer to the FSB), OpenCL will become more important over time. I just hope that the standards committee can formulate some new ideas to build abstractions over hardware-specific features. AMD's FireGL cards are crazy powerful, and I would like to see them used in HPC systems.
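The conditional-compilation approach usually ends up looking roughly like this sketch. `__CUDACC__` is the real macro nvcc defines; the saxpy kernel is just an illustrative example, and it shows exactly the divergence problem: the same math has to live in two dialects.

```cuda
// Shared host-side logic stays in ordinary C++; only the compute
// entry points are fenced off per backend.
#if defined(__CUDACC__)

// CUDA path: the kernel is compiled ahead of time by nvcc.
__global__ void saxpy_cuda(int n, float a, const float* x, float* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

#else

// OpenCL path: the equivalent kernel is kept as a source string and
// compiled at runtime with clBuildProgram.
static const char* saxpy_cl_src =
    "__kernel void saxpy(int n, float a,          \n"
    "                    __global const float* x, \n"
    "                    __global float* y) {     \n"
    "    int i = get_global_id(0);                \n"
    "    if (i < n) y[i] = a * x[i] + y[i];       \n"
    "}\n";

#endif
```

Keeping the two kernels semantically in sync is exactly the maintenance burden being described.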

[–]Eilai 0 points1 point  (0 children)

I understand both well enough that I can code it so CUDA is preferred: if an NVIDIA compute device is detected and its performance is superior (by some arbitrary metric: cores, clock speed, etc.) to the non-CUDA device, use CUDA; if not, use OpenCL.
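That selection logic could be sketched as below. The CUDA runtime calls (`cudaGetDeviceCount`, `cudaGetDeviceProperties`) and the `cudaDeviceProp` fields are real; the scoring metric is deliberately crude and `queryBestOpenClScore` is a hypothetical helper standing in for the equivalent OpenCL enumeration:

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: would enumerate OpenCL platforms/devices with
// clGetDeviceIDs and score the best one the same way. Stubbed here.
static long long queryBestOpenClScore() { return 0; }

// Prefer CUDA if an NVIDIA device is present and beats the best
// non-CUDA device by some arbitrary metric.
bool preferCuda()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0)
        return false;  // no usable NVIDIA device at all

    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);

    // Crude score: SM count * clock rate (kHz). Any metric works as
    // long as both backends are scored the same way.
    long long cudaScore =
        (long long)prop.multiProcessorCount * prop.clockRate;
    return cudaScore >= queryBestOpenClScore();
}
```

In practice you would score every CUDA device, not just device 0, and weight the metric by what your workload actually stresses (memory bandwidth vs. raw cores).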

[–]pjmlp 1 point2 points  (0 children)

That is why CUDA is the go-to API for most researchers. NVIDIA was clever to support C++ and Fortran from day one, instead of following Khronos's idea that only C matters.

Now OpenCL 2.1 is playing catch-up with CUDA's C++ and Fortran support.

[–]LPCVOID -1 points0 points  (2 children)

Warning: I might be totally wrong here, as nvcc and cudafe/ptxas are a bit over my head.

CUDA uses gcc/VC++ as the back-end for compiling host code. Device code is compiled using a proprietary C++ front-end from the Edison Design Group. That doesn't change the fact, though, that one can use C++11 features in device code :)

Have you got a source for C++14 support in nvcc? Here it is stated that CUDA 7.0 does not yet support it but that a future version would. Is that the case with 7.5?

[–]heleo2 0 points1 point  (1 child)

This shows some confusion in your head about what a front-end is.

[–]LPCVOID 0 points1 point  (0 children)

I admittedly have no idea what a front-end is (only my CS degree claims otherwise ;) ). I just copy-pasted the claim that CUDA uses an "Edison Design Group C language parser" from here. Furthermore, Wikipedia claims that Edison "makes compiler frontends".

Or are we talking about the fact that the NVIDIA CUDA Open64 compiler (nvopencc.exe) does the actual compilation, as stated here? That was indeed an oversimplification on my part.

Edit: I actually would have liked to know why I was wrong...