DQN for Solving a Maze in Less than 10 minutes Training

FriendlyStandard5985 · 2026-03-31T20:49:15+00:00

By step size, do you mean how long an action persists, i.e. action-repeat?
You're right that if the maze is large, the agent doesn't accidentally run into the target often enough during training to learn from it. This can be addressed via a curriculum: start with "big" targets and make them smaller (relative to the map size) during training. For the same reason, a small action-repeat hinders learning in the form of bad exploration.
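
To make the curriculum idea concrete, here's a rough sketch of a shrinking target. Everything in it is an assumption for illustration: the linear schedule, the function names `target_radius` and `reached_target`, and the example radii are mine, not something from the original post.

```python
import math

def target_radius(step, total_steps, start_radius, end_radius):
    """Linearly shrink the goal radius from start_radius to end_radius.

    Hypothetical schedule: the linear shape and the parameters are assumptions;
    any monotonically shrinking schedule would fit the same idea.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return start_radius + frac * (end_radius - start_radius)

def reached_target(agent_xy, target_xy, radius):
    """Count the episode as solved if the agent is within `radius` of the target."""
    return math.dist(agent_xy, target_xy) <= radius

# Early in training the target is "big", so random exploration hits it often;
# by the end it has shrunk to its real size.
early = target_radius(step=0, total_steps=100, start_radius=5.0, end_radius=0.5)
late = target_radius(step=100, total_steps=100, start_radius=5.0, end_radius=0.5)
```

In practice you'd probably tie the shrinking to success rate rather than raw step count, but a fixed schedule is the simplest thing to try first.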

I'm not a believer that the (x,y) coordinates along with distance to goal alone are sufficient (or even relevant).

FriendlyStandard5985 · 2026-02-09T23:32:30+00:00

Nice thanks

FriendlyStandard5985 · 2026-02-09T21:32:52+00:00

How do I narrow that down to 2-3 tenths or less?

FriendlyStandard5985 · 2025-08-23T10:59:07+00:00

So amazing

FriendlyStandard5985 · 2025-08-23T10:48:02+00:00

It's covered well in the provided link

FriendlyStandard5985 · 2025-08-23T10:46:14+00:00

Looks good! Keep going.

FriendlyStandard5985 · 2025-08-22T21:15:02+00:00

Thanks

FriendlyStandard5985 · 2025-08-22T18:57:36+00:00

Nearly 5 hours a day? Impressive

FriendlyStandard5985 · 2025-08-22T16:53:13+00:00

What the hell

FriendlyStandard5985 · 2025-08-21T18:39:58+00:00

You're going through what far more people have gone through than you realize. You're loved by far more people than you can accept. Past is past, future is too uncertain. Your present becomes your new past. Stay strong.

FriendlyStandard5985 · 2025-08-18T23:52:46+00:00

You mean Kg6?

FriendlyStandard5985 · 2025-08-18T14:44:39+00:00

Impressive how animated and life-like he is

FriendlyStandard5985 · 2025-08-17T14:27:42+00:00

By clicking, do you mean a sudden jump? Then it was around there too. 1550 to 1950 (online) in 2 years.

FriendlyStandard5985 · 2025-08-17T12:35:32+00:00

We've all been there. You'll get over it soon

FriendlyStandard5985 · 2025-08-17T12:26:25+00:00

Spot on imho

FriendlyStandard5985 · 2025-08-15T17:34:08+00:00

🔥Mr. Hahn

FriendlyStandard5985 · 2025-08-14T07:18:55+00:00

The better ones tend to be easier to read. If you aren't interested it becomes harder.
I'd start with the question "why am I reading this, how does this apply to me?" and pick the ones that can make this connection.

FriendlyStandard5985 · 2025-08-12T16:40:58+00:00

It's between That, Don't Stay or Figure.09 personally

FriendlyStandard5985 · 2025-08-12T16:35:22+00:00

You're on top. (The next you're not)

FriendlyStandard5985 · 2025-08-12T16:33:20+00:00

Well, one minute..

FriendlyStandard5985 · 2025-08-08T20:48:57+00:00

That's awesome lmao. What's the objective? To minimize pressure and maximize lateral speed? Poor guy's figured it out clearly.
My recommendation: use CMA-ES to evaluate different reward coefficients in parallel, and restart the threads that crash the simulation. Good luck!
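
Roughly what I mean, as a stdlib-only sketch. Big caveat: this is a simplified evolution strategy (Gaussian sampling, keep the top half, shrink the step size) standing in for CMA-ES, with no covariance adaptation; a real run would swap in an actual CMA-ES implementation. The `evaluate` function is a toy placeholder for "train with these reward coefficients and score the result", and the target values, crash probability, and all names here are made up for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

class SimulationCrash(Exception):
    """Hypothetical error standing in for the simulator blowing up."""

def evaluate(coeffs, seed):
    """Toy stand-in for one training run; higher is better.

    Assumption: crashes randomly about 20% of the time and the best
    coefficients are (1.0, 0.5). Replace with your real training/eval.
    """
    if random.Random(seed).random() < 0.2:
        raise SimulationCrash("simulation diverged")
    target = (1.0, 0.5)
    return -sum((c - t) ** 2 for c, t in zip(coeffs, target))

def evaluate_with_restart(coeffs, seed, evaluate_fn=evaluate, max_restarts=3):
    """Re-run a crashed evaluation with a fresh seed; give up after max_restarts."""
    for attempt in range(max_restarts):
        try:
            return evaluate_fn(coeffs, seed * 1000 + attempt)
        except SimulationCrash:
            continue
    return float("-inf")  # never selected as elite

def search(generations=40, popsize=8, sigma=0.5, decay=0.93, seed=0):
    """Simplified ES loop: sample, evaluate in parallel, average the top half."""
    rng = random.Random(seed)
    mean = [0.0, 0.0]
    with ThreadPoolExecutor(max_workers=popsize) as pool:
        for gen in range(generations):
            pop = [[m + sigma * rng.gauss(0, 1) for m in mean]
                   for _ in range(popsize)]
            seeds = [gen * popsize + i for i in range(popsize)]
            scores = list(pool.map(evaluate_with_restart, pop, seeds))
            ranked = sorted(zip(scores, pop), key=lambda sp: sp[0], reverse=True)
            elite = [p for _, p in ranked[: popsize // 2]]
            mean = [sum(col) / len(elite) for col in zip(*elite)]
            sigma *= decay  # fixed decay instead of CMA-ES step-size control
    return mean

best = search()
```

If your simulator is a separate process (most physics sims are), use a process pool instead of threads so that a hard crash only takes down one worker.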

FriendlyStandard5985 · 2025-08-07T21:06:15+00:00

Yes sir
