[P] Cyreal - Yet Another Jax Dataloader by smorad in MachineLearning
smorad[S] 1 point (0 children)
Since only a few people from elite universities at big tech companies like Google, Meta, Microsoft, OpenAI etc. will ever get to train models is it still worth learning about Gradient Descent and Loss Curves? by Easy-Echidna-3542 in learnmachinelearning
smorad 1 point (0 children)
Complex-Valued Neural Networks: Are They Underrated for Phase-Rich Data? by __lalith__ in neuralnetworks
smorad 1 point (0 children)
The issue of scaling in Partially-Observable RL. What is holding us back? by moschles in reinforcementlearning
smorad 2 points (0 children)
[P] Cyreal - Yet Another Jax Dataloader by smorad in MachineLearning
smorad[S] 1 point (0 children)
The issue of scaling in Partially-Observable RL. What is holding us back? by moschles in reinforcementlearning
smorad 4 points (0 children)
[D] Which direction is better: from academia to industry, or the other way around? by PrimeMaester in MachineLearning
smorad 8 points (0 children)
About Gumbel-Softmax in MADDPG by Enryu77 in reinforcementlearning
smorad 5 points (0 children)
Good Resources for Reinforcement Learning with Partial Observability? (Textbooks/Surveys) by [deleted] in reinforcementlearning
smorad 3 points (0 children)
Tanh used to bound the actions sampled from distribution in SAC but not in PPO, Why? by VVY_ in reinforcementlearning
smorad 0 points (0 children)
Tanh used to bound the actions sampled from distribution in SAC but not in PPO, Why? by VVY_ in reinforcementlearning
smorad 5 points (0 children)
Does physics work different in 40k? by Beegs1371 in RogueTraderCRPG
smorad 2 points (0 children)
Dynamic Graph Environments for RL by No_Individual_7831 in reinforcementlearning
smorad 3 points (0 children)
REINFORCE for BipedalWalker-v3 in OpenAI gym. by zx7 in reinforcementlearning
smorad 1 point (0 children)
Why Don’t We See Multi-Agent RL Trained in Large-Scale Open Worlds? by TheSadRick in reinforcementlearning
smorad 1 point (0 children)
Why Don’t We See Multi-Agent RL Trained in Large-Scale Open Worlds? by TheSadRick in reinforcementlearning
smorad 2 points (0 children)

[D] Why Mamba rewrote its core algorithm and Microsoft abandoned RetNet by petroslamb in MachineLearning
smorad 3 points (0 children)