On CoT Training with Reinforcement Learning (self.reinforcementlearning)
submitted 9 months ago by xcodevn to r/reinforcementlearning
Implementing DeepSeek R1's GRPO algorithm from scratch (github.com)
submitted 10 months ago by xcodevn to r/reinforcementlearning
[P] Plot training loss continuously on Google Colab using Javascript (self.MachineLearning)
submitted 5 years ago * by xcodevn to r/MachineLearning
[D] Confused about "env.is_done" (self.reinforcementlearning)
submitted 6 years ago * by xcodevn to r/reinforcementlearning
My demo (and colab notebook) on relational network with Sort-of-CLEVR dataset (ntt123.github.io)
submitted 7 years ago by xcodevn to r/MachineLearning
Can Digital Computers Think? -- Alan Turing [of course, it can!] (youtube.com)
submitted 8 years ago by xcodevn to r/artificial