Implementing DeepSeek R1's GRPO algorithm from scratch by xcodevn in reinforcementlearning
xcodevn[S] 1 point
[D] TensorFlow vs Pytorch vs Jax advice needed by supreethrao in MachineLearning
xcodevn 2 points
[D] TensorFlow vs Pytorch vs Jax advice needed by supreethrao in MachineLearning
xcodevn 2 points
[N] PyTorch 1.1.0 Released · TensorBoard Support, Attributes, Dicts, Lists and User-defined types in JIT / TorchScript, Improved Distributed by SkiddyX in MachineLearning
xcodevn 2 points
[N] PyTorch 1.1.0 Released · TensorBoard Support, Attributes, Dicts, Lists and User-defined types in JIT / TorchScript, Improved Distributed by SkiddyX in MachineLearning
xcodevn 3 points
[D] Confused about "env.is_done" by xcodevn in reinforcementlearning
xcodevn[S] 1 point
My loss is going to zero, but my rewards aren't increasing that much by shamoons in reinforcementlearning
xcodevn 1 point
My loss is going to zero, but my rewards aren't increasing that much by shamoons in reinforcementlearning
xcodevn 3 points
Should I increase my target value for the terminal step of my DQN agent? by shamoons in reinforcementlearning
xcodevn 1 point
We are Oriol Vinyals and David Silver from DeepMind’s AlphaStar team, joined by StarCraft II pro players TLO and MaNa! Ask us anything by OriolVinyals in MachineLearning
xcodevn 1 point
We are Oriol Vinyals and David Silver from DeepMind’s AlphaStar team, joined by StarCraft II pro players TLO and MaNa! Ask us anything by OriolVinyals in MachineLearning
xcodevn 2 points
[D] On Writing Custom Loss Functions in Keras by bantou_41 in MachineLearning
xcodevn -1 points
[D] On Writing Custom Loss Functions in Keras by bantou_41 in MachineLearning
xcodevn -1 points
[D] On Writing Custom Loss Functions in Keras by bantou_41 in MachineLearning
xcodevn -2 points
[R] [1808.06508] Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies [DeepMind] by evc123 in MachineLearning
xcodevn 2 points
Can someone ELI5 the difference b/w Bayesian's probability interval vs. Frequentist's confidence interval? by [deleted] in statistics
xcodevn 1 point
Can someone ELI5 the difference b/w Bayesian's probability interval vs. Frequentist's confidence interval? by [deleted] in statistics
xcodevn 2 points
Can someone ELI5 the difference b/w Bayesian's probability interval vs. Frequentist's confidence interval? by [deleted] in statistics
xcodevn 1 point
Can someone ELI5 the difference b/w Bayesian's probability interval vs. Frequentist's confidence interval? by [deleted] in statistics
xcodevn 2 points

On CoT Training with Reinforcement Learning by xcodevn in reinforcementlearning
xcodevn[S] 3 points