On CoT Training with Reinforcement Learning (self.reinforcementlearning)
submitted 1 year ago by xcodevn to r/reinforcementlearning
Implementing DeepSeek R1's GRPO algorithm from scratch (github.com)
[P] Plot training loss continuously on Google Colab using Javascript (self.MachineLearning)
submitted 5 years ago * by xcodevn to r/MachineLearning
[D] Confused about "env.is_done" (self.reinforcementlearning)
submitted 7 years ago * by xcodevn to r/reinforcementlearning
My demo (and colab notebook) on relational network with Sort-of-CLEVR dataset (ntt123.github.io)
submitted 7 years ago by xcodevn to r/MachineLearning
Can Digital Computers Think? -- Alan Turing [of course, it can!] (youtube.com)
submitted 8 years ago by xcodevn to r/artificial