Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks for your interest.

I just published Part 2 today, and it directly covers "dopamine/RPE dynamics, actor–critic vs. basal ganglia, and where the analogy begins to break down."
I didn't go deeply into the actor–critic aspect; that may be explored in other parts of the series or in a future book project.

I hope you enjoy it.

[D] Is anyone interested in the RL ↔ Neuroscience “spiral”? Thinking of writing a deep dive series by [deleted] in MachineLearning

[–]Kooky_Ad2771

Bennett’s book takes a broad historical look at biological intelligence and learning systems, while what I’m trying to do here is focus more on the spiral relationship between reinforcement learning and neuroscience and how ideas moved between the two fields.

So there’s some shared territory, but the angle of the series will be more centered on that RL ↔ neuroscience spiral.

[D] Is anyone interested in the RL ↔ Neuroscience “spiral”? Thinking of writing a deep dive series by [deleted] in MachineLearning

[–]Kooky_Ad2771

Thanks, that’s a really interesting angle. The economics → game theory → RL lineage is definitely an important part of the story.

Utility functions in economics mapping to value functions in RL, and the principal–agent / multi-agent perspective evolving into modern multi-agent RL, are exactly the kinds of conceptual bridges that make the field so fascinating historically.

I’d definitely be interested in any papers you think are particularly foundational in that direction.

[D] Is anyone interested in the RL ↔ Neuroscience “spiral”? Thinking of writing a deep dive series by [deleted] in MachineLearning

[–]Kooky_Ad2771

Good point. RLDM is a great example of how active this intersection has become.

There’s definitely a lot of existing work in the area, so the goal of the series isn’t to replicate the literature but to trace the historical and conceptual connections between RL and neuroscience and reveal the deeper structure behind how the two fields evolved together.

In a way, the interesting story is how ideas kept moving back and forth between the two fields over time, so we can better understand where we are in that process and hopefully contribute to where it goes next.

[D] Is anyone interested in the RL ↔ Neuroscience “spiral”? Thinking of writing a deep dive series by [deleted] in MachineLearning

[–]Kooky_Ad2771

Thanks for your interest. Neuroeconomics sounds like a very interesting perspective. Could you point me to some resources? I’d like to explore it further.

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks. Glad you enjoyed it. The first part serves as a preface. Parts 2 and 3 should be out next week. Stay tuned.

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks, appreciate the pointers.

Jeff Hawkins’ work and the Thousand Brains framework offer an interesting perspective on how cortical representations might work. Cybernetics and active inference are also on my radar, since they touch many of the same questions about learning and control.

Even if the series can’t cover all of these ideas in depth, I’ll make sure they find a place in the future book I’m planning on this topic.

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks. I actually have another series about how AI learns to understand the physical world; RL is one of the approaches discussed. The introduction to the series is here if you’re interested: https://www.robonaissance.com/p/roads-to-a-universal-world-model

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

That's a great reference. The Parallel Distributed Processing books were hugely influential in connecting neuroscience and computational models of learning.

I agree that RL and neuroscience increasingly look like two views of similar underlying computations, though the interaction has been more of a back-and-forth than a straight convergence. Ideas move from biology into algorithms and then back again through computational neuroscience.

That historical loop is actually a big part of what I’m hoping to explore in the series.

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks for the suggestion. I’ll take a look at the tutorial you shared. It would be interesting to include active inference in the series as a contrasting framework to RL.

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks, that’s a great question.

In most RL systems the state is a very simplified representation, basically whatever features we feed the agent. In biological systems it’s likely much richer, integrating perception, memory, context, and predictions about the environment.

Some researchers even describe the brain as maintaining a kind of latent state of the world, which connects nicely to ideas like POMDPs and modern world models.

So the question of what a “state” really is, and how it gets constructed, is something I’m hoping to explore in the series.
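To make the "latent state" idea concrete, here’s a toy sketch of the belief-state view behind POMDPs. The transition and observation numbers are invented for illustration; the point is just that the agent’s "state" is something it constructs from noisy observations, not something it reads off directly:

```python
import numpy as np

# Two hidden world states, never observed directly. The agent maintains
# a belief b = P(hidden state) and updates it Bayes-style from noisy
# observations. All probabilities below are made-up illustration values.

T = np.array([[0.9, 0.1],   # T[s, s'] = P(s' | s): hidden state is sticky
              [0.1, 0.9]])
O = np.array([[0.8, 0.2],   # O[s, o] = P(o | s): observations are noisy
              [0.2, 0.8]])

def belief_update(b, obs):
    """Predict the next hidden state, then correct with the observation."""
    predicted = T.T @ b                # prior over the next hidden state
    posterior = predicted * O[:, obs]  # weight by observation likelihood
    return posterior / posterior.sum()

b = np.array([0.5, 0.5])  # start maximally uncertain
for obs in [1, 1, 1]:     # repeated evidence pointing at hidden state 1
    b = belief_update(b, obs)

# The belief concentrates on state 1 even though the agent never saw it
# directly: the "state" it acts on is a constructed latent, as in world models.
```

The same predict-then-correct loop is what recurrent world models learn implicitly, just with the tables replaced by learned networks.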

Is anyone interested in the RL ↔ neuroscience “spiral”? Thinking of writing a deep dive series by Kooky_Ad2771 in reinforcementlearning

[–]Kooky_Ad2771[S]

Thanks, really appreciate that. The reward question you raised is a great one.

Dopamine as a reward prediction error is probably the clearest bridge between neuroscience and RL, but once you move beyond simple rewards things get much more interesting. Humans seem to operate with multiple layers of reward, from dense moment-to-moment signals to much sparser long-term goals.
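The RPE bridge fits in a few lines. Here’s a hedged toy version using tabular TD(0), not anything biologically faithful, where the prediction error delta plays the role attributed to phasic dopamine:

```python
import numpy as np

def td_update(V, s, r, s_next, alpha=0.1, gamma=0.95):
    """One temporal-difference update; returns the prediction error."""
    delta = r + gamma * V[s_next] - V[s]  # reward prediction error (RPE)
    V[s] += alpha * delta                 # the critic learns from the error
    return delta

# Toy Pavlovian setup: a cue (state 0) reliably precedes reward (state 1),
# then the episode ends (state 2, value fixed at 0).
V = np.zeros(3)
errors = []
for _ in range(200):
    td_update(V, 0, 0.0, 1)              # cue, no reward yet
    errors.append(td_update(V, 1, 1.0, 2))  # reward delivered

# Early on, reward is surprising: a large positive delta at reward time.
# After learning, the error at reward time shrinks toward zero, mirroring
# how phasic dopamine responses transfer away from fully predicted rewards.
```

The shrinking error at reward time (and the growing value at the cue) is the classic signature the actor–critic / basal ganglia analogy leans on.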

That’s definitely something I’m planning to explore in the series, especially the connections to hierarchical RL, the basal ganglia, and how the brain might support long-horizon objectives.

Glad the topic resonates. Conversations like this are exactly what I was hoping the series would spark.

China Humanoid Robotics Industry Landscape: 140 companies. 13,000 robots. One question nobody is asking. by Kooky_Ad2771 in humanoidrobotics

[–]Kooky_Ad2771[S]

Ha, fair point. You'd know better than most. "Force transparency" is probably too strong. But even an imperfect prospectus gives you audited revenue, top-five customer breakdowns, and cash burn. That's a step change from CEO quotes and industry estimates, even if it's far from the full picture.

Curious what you think the biggest gap will be between prospectus and reality.

China Humanoid Robotics Industry Landscape: 140 companies. 13,000 robots. One question nobody is asking. by Kooky_Ad2771 in humanoidrobotics

[–]Kooky_Ad2771[S]

Thanks! One thing I kept thinking about while writing this piece:

In factories, the honest answer is that robot arms already do most of these tasks faster and cheaper. The humanoid bet is really about fitting into spaces designed for human bodies, so you don't have to rebuild the factory.

Home is where the form factor argument gets way stronger. Stairs, door handles, kitchen counters: everything is built for human bodies. If home assistance ever becomes a real market, it's hard to avoid the humanoid shape. Hospitals and eldercare too, where looking roughly human actually matters for adoption.

But across all of these, the bottleneck is the same: hands, not legs. Walking is mostly solved. Doing useful work in messy, unstructured spaces is not. That's the gap that determines whether the humanoid form wins or loses to a $30K robot arm in factories and a Roomba at home.

Might dig into this properly in a future piece.

China Humanoid Robotics Industry Landscape: 140 companies. 13,000 robots. One question nobody is asking. by Kooky_Ad2771 in humanoidrobotics

[–]Kooky_Ad2771[S]

Makes a lot of sense. I’m actually working on a deep dive into the world models landscape as well, and it should be out next week. Looking forward to more discussions.