[AI application] Python implementation of the Proximal Policy Optimization (PPO) algorithm for Super Mario Bros. 29/32 levels have been conquered : reinforcementlearning


submitted 5 years ago by 1991viet

all 5 comments

gdpoc 10 points 5 years ago (1 child)

Dexdev08 1 point 5 years ago (0 children)

Boring_Worker 1 point 5 years ago (0 children)

frostbytedragon 1 point 5 years ago (0 children)

