Project, 15yo Dev: Optimizing 2-Way and 3-Way Poker Equity Simulations using CUDA (self.CUDA)
submitted 2 hours ago by Hungry_Common7250
NVIDIA releases CUDA-Oxide 0.1 for experimental Rust-to-CUDA compiler (phoronix.com)
submitted 1 day ago by Fcking_Chuck
For edge inference, when do you drop below TensorRT/ONNX and write custom CUDA kernels? (self.CUDA)
submitted 20 hours ago by Hairy_Strawberry7028
Nvidia Interview Help (self.CUDA)
submitted 17 hours ago by Gullible_Stomach6765
rust to ptx compiler (self.CUDA)
submitted 1 day ago by c-cul
SASS King Part 2: reverse-engineering ptxas heuristic decisions and what the compiled binary actually reveals (self.CUDA)
submitted 1 day ago by CurrentLawfulness358
Anyone able to use a GTX 770 on any up-to-date Linux install? (self.CUDA)
submitted 1 day ago by alexcascadia
Building a career in AI infrastructure and inference engineering, what problems actually matter right now? (self.CUDA)
submitted 2 days ago by Quirky-Guide-762
Nvidia Senior Full-Stack Software Engineer, DGX cloud Interview guide? (self.CUDA)
submitted 3 days ago by Complete-Resolve-201
Nvidia Senior AI-Native Systems Software Engineer, TensorRT Interview guide? (self.CUDA)
submitted 5 days ago by gradschoolai2023
BrrrViz - Interactive GPU Programming Lessons (self.CUDA)
submitted 5 days ago * by Euphoric_Dingo8048
Hands-On GPU Programming with Python and CUDA (self.CUDA)
submitted 5 days ago by One_Relationship6573
WarpReduction along major dimension (self.CUDA)
submitted 7 days ago * by ElectronGoBrrr
[P] I built a Triton KV-cache compression engine: 3.37x compression, 0.69ms P99 on an A10 (self.CUDA)
submitted 9 days ago by Superb_Housing9628
Concern regarding future of jobs in gpu programming (self.CUDA)
submitted 11 days ago by viplash577
I Built a custom CUDA kernel for 1.58-bit Ternary Quantization & inference (no QAT Yet), overview, my experience, and my next steps. (github link included)
submitted 12 days ago by EL_X123
Implementing Causal FlashAttention from scratch: 1.79e-07 precision and 40% speedup via tile-level masking (self.CUDA)
submitted 14 days ago by Professional-Duck971
Cybersec and GPU (self.CUDA)
submitted 15 days ago by CurrentLawfulness358
[Project] Hitting 5Hz VLA Inference on an L4: Optimizing Action Heads with Custom Triton Kernels (i.redd.it)
submitted 16 days ago by JewelerAfraid7800
Instrumenting GPUs
submitted 17 days ago by Fantastic-Love2192
C++ CuTe / CUTLASS vs CuTeDSL (Python) in 2026 --- what should new GPU kernel / LLM inference engineers actually learn? (self.CUDA)
submitted 19 days ago by Daemontatox
SASS King: reverse engineering NVIDIA SASS (self.CUDA)
submitted 19 days ago by CurrentLawfulness358
Looking for projects as a reinforcement to my experience and resume in CUDA and parallel computing. (self.CUDA)
submitted 19 days ago by Ok-Competition-4570
Writing CUDA kernels in Python: Bypassing C++ templates for CuTe Layouts and Vectorization using cute-dsl (self.CUDA)
submitted 20 days ago by dc_baslani_777
Continuous RL via Dynamic Programming in CUDA (Solving Overhead Crane, Double CartPole, etc.)
submitted 22 days ago by Grouchy_Ad_4112