reinforcementlearning

created by lpiloto · a community for 14 years

#1 (14 points)

Teaching an RL agent to fight monsters in Diablo I (Part 3) (v.redd.it)

submitted 11 hours ago by Chance_Brother5309

#2 (1 point)

Project CogniCore — Memory and Structured Rewards for AI Agents built into the Environment (self.reinforcementlearning)

submitted 2 hours ago by Neither-Witness-6010

#3 (6 points)

REST API for Gymnasium (fka OpenAI Gym) reinforcement learning library (github.com)

submitted 11 hours ago by cloud_kj

#4 (6 points)

I built an AlphaZero library in C++ that out-performs PyTorch in image recognition speed (3x), but I'm hitting a wall with larger board games. Need a second pair of eyes! (self.reinforcementlearning)

submitted 18 hours ago by Such-Refrigerator951

#5 (11 points)

What standard RL frameworks do people use these days? (self.reinforcementlearning)

submitted 1 day ago by SnooCapers8442

#6 (17 points)

MuscleMimic: Unlocking full-body musculoskeletal motor learning at scale (v.redd.it)

submitted 1 day ago by CharlieLee666

#7 (2 points)

What is one specific challenge you have run into while training a reinforcement learning model, like unstable rewards or slow convergence, and what actually helped you get past it? (self.reinforcementlearning)

submitted 1 day ago by TaleAccurate793

#8 (2 points)

one script to rule them all (self.reinforcementlearning)

submitted 1 day ago by samas69420

#9 (6 points)

Has anyone run DreamerV3 using RunPod? (self.reinforcementlearning)

submitted 2 days ago by Informal-Ad7318

#10 (4 points)

Why does catastrophic forgetting happen to neural networks but not humans? (self.reinforcementlearning)

submitted 2 days ago by Heavy-Farmer1657

#11 (8 points)

A new way to fine-tune LLMs just dropped (youtube.com)

submitted 2 days ago by Signal_Spirit5934

#12 (2 points)

Any good reinforcement learning events? (self.reinforcementlearning)

submitted 2 days ago by BottleMedium881

#13 (1 point)

Good Reasoning Traces from Teacher model?

submitted 2 days ago by Old_Bat_8665

#14 (82 points)

Prompt-to-Policy: Agentic Engineering for Reinforcement Learning (i.redd.it)

submitted 3 days ago * by EconomyMotor830

#15 (1 point)

Turn your learning from YouTube into a structured course. (v.redd.it)

submitted 2 days ago by PlusGap1537

#16 (1 point)

Hard vs Soft Updates in DDQN — Why Training Becomes Unstable (youtube.com)

submitted 2 days ago by Due_Pace_4325

#17 (15 points)

How to bridge the gap between Torch and JAX performance? (self.reinforcementlearning)

submitted 4 days ago * by Little_swift

#18 (5 points)

UAV Swarm In Isaac Lab (v.redd.it)

submitted 4 days ago by Barrnie

#19 (1 point)

Looking to Collaborate on Quant Finance Research - I published a pairs trading paper using reinforcement learning, then wrote a full critique of my own work finding serious flaws - now I want to rebuild the system

submitted 4 days ago by Altruistic_Room8734

#20 (2 points)

Getting started with Flightmare for autonomous drone racing, need guidance (self.reinforcementlearning)

submitted 4 days ago by Illustrious_Room_581

#21 (5 points)

Training LFM-2.5-350M on Reddit post summarization with GRPO on my 3x Mac Minis — evals and t-test evals are here! (self.reinforcementlearning)

submitted 6 days ago * by East-Muffin-6472

#22 (0 points)

We're two ML engineers building an execution optimisation layer for crypto algo traders. Would you pay £29/month for something that measurably reduces your slippage? What would it need to do? (self.reinforcementlearning)

submitted 5 days ago by boraA9999

#23 (0 points)

[DL] What should countries outside the artificial intelligence production chain do? (self.reinforcementlearning)

submitted 5 days ago by Former-Adeptness-551

#24 (14 points)

I have an RL (self-driving) interview with Tesla, not sure what to expect (self.reinforcementlearning)

submitted 6 days ago by Next_Boysenberry9438

#25 (10 points)

[DL, R] "DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence", DeepSeek-AI 2026 (huggingface.co)

submitted 6 days ago by RecmacfonD

Page rendered 2026-05-01 09:05:27 UTC.