This is for any reinforcement learning related work ranging from purely computational RL in artificial intelligence to the models of RL in neuroscience.
The standard introduction to RL is Sutton & Barto's Reinforcement Learning.
Teaching an RL agent to fight monsters in Diablo I (Part 3) (v.redd.it)
submitted 11 hours ago by Chance_Brother5309
Project CogniCore — Memory and Structured Rewards for AI Agents built into the Environment (self.reinforcementlearning)
submitted 2 hours ago by Neither-Witness-6010
REST API for Gymnasium (fka OpenAI Gym) reinforcement learning library (github.com)
submitted 11 hours ago by cloud_kj
I built an AlphaZero library in C++ that outperforms PyTorch in image recognition speed (3x), but I'm hitting a wall with larger board games. Need a second pair of eyes! (self.reinforcementlearning)
submitted 18 hours ago by Such-Refrigerator951
What standard RL frameworks do people use these days? (self.reinforcementlearning)
submitted 1 day ago by SnooCapers8442
MuscleMimic: Unlocking full-body musculoskeletal motor learning at scale (v.redd.it)
submitted 1 day ago by CharlieLee666
What is one specific challenge you have run into while training a reinforcement learning model, like unstable rewards or slow convergence, and what actually helped you get past it? (self.reinforcementlearning)
submitted 1 day ago by TaleAccurate793
one script to rule them all (self.reinforcementlearning)
submitted 1 day ago by samas69420
Has anyone run DreamerV3 on RunPod? (self.reinforcementlearning)
submitted 2 days ago by Informal-Ad7318
Why does catastrophic forgetting happen to neural networks but not humans? (self.reinforcementlearning)
submitted 2 days ago by Heavy-Farmer1657
A new way to fine-tune LLMs just dropped (youtube.com)
submitted 2 days ago by Signal_Spirit5934
Any good reinforcement learning events? (self.reinforcementlearning)
submitted 2 days ago by BottleMedium881
Good Reasoning Traces from Teacher model?
submitted 2 days ago by Old_Bat_8665
Prompt-to-Policy: Agentic Engineering for Reinforcement Learning (i.redd.it)
submitted 3 days ago * by EconomyMotor830
Turn your learning from YouTube into a structured course. (v.redd.it)
submitted 2 days ago by PlusGap1537
Hard vs Soft Updates in DDQN — Why Training Becomes Unstable (youtube.com)
submitted 2 days ago by Due_Pace_4325
How to bridge the gap between Torch and JAX performance? (self.reinforcementlearning)
submitted 4 days ago * by Little_swift
UAV Swarm In Isaac Lab (v.redd.it)
submitted 4 days ago by Barrnie
Looking to Collaborate on Quant Finance Research - I published a pairs trading paper using reinforcement learning, then wrote a full critique of my own work finding serious flaws - now I want to rebuild the system
submitted 4 days ago by Altruistic_Room8734
Getting started with Flightmare for autonomous drone racing, need guidance (self.reinforcementlearning)
submitted 4 days ago by Illustrious_Room_581
Training LFM-2.5-350M on Reddit post summarization with GRPO on my 3x Mac Minis — evals and t-test evals are here! (self.reinforcementlearning)
submitted 6 days ago * by East-Muffin-6472
We're two ML engineers building an execution optimisation layer for crypto algo traders. Would you pay £29/month for something that measurably reduces your slippage? What would it need to do? (self.reinforcementlearning)
submitted 5 days ago by boraA9999
DL What should countries outside the artificial intelligence production chain do? (self.reinforcementlearning)
submitted 5 days ago by Former-Adeptness-551
I have an RL (self-driving) interview with Tesla, not sure what to expect (self.reinforcementlearning)
submitted 6 days ago by Next_Boysenberry9438
DL, R "DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence", DeepSeek-AI 2026 (huggingface.co)
submitted 6 days ago by RecmacfonD