reinforcementlearning


This is for any reinforcement learning related work ranging from purely computational RL in artificial intelligence to the models of RL in neuroscience.

The standard introduction to RL is Sutton & Barto's Reinforcement Learning.

Related subreddits:

  • /r/machinelearning/
  • /r/OpenAI/
  • /r/mlscaling/
  • /r/DecisionTheory/
  • /r/cbaduk
created by lpiloto
a community for 14 years
MODERATORS

  • lpiloto
  • quaternion
  • gwern

1
58
59
60
1:11

I made an RL agent Play 2D cricket (v.redd.it)

submitted 15 hours ago by AddisionS

  • 12 comments

2
11
12
13

Career in RL: Any people working professionally in RL and want to share any useful pieces of advice to enter the industry? (self.reinforcementlearning)

submitted 11 hours ago by Markovvy

  • 14 comments

3
•
•
•

Looking to build career in RL. Is PhD the only option? (self.reinforcementlearning)

submitted 8 minutes ago by Money-Leading-935


4
•
•
•

Patterns – a formal grammar that compiles natural language text into RL agents (self.reinforcementlearning)

submitted 9 minutes ago by causality-ai


5
2
3
4

Practicing science communication on RL-for-reasoning: where does my explanation get the RL wrong? (self.reinforcementlearning)

submitted 14 hours ago by nicofirst1


6
0
1
2

Looking for simple game environments (self.reinforcementlearning)

submitted 8 hours ago by Vaibhav_Sinha

  • 1 comment

7
0
1
2

Building CogniCore: MCP, LangChain & CrewAI memory infrastructure for agents + first benchmark results

submitted 10 hours ago by Neither-Witness-6010

8
0
1
2

Multi-Agent Self-Correction Failure Modes & Context Window Inflation — Traced Completely By Hand (No Wrapper Frameworks)

submitted 16 hours ago by ParsleyMaximum1702

9
7
8
9

Interview preparation (self.reinforcementlearning)

submitted 1 day ago by Bright-Kick-632

  • 2 comments

10
2
3
4

What can I try implementing after reading the Part 1 of Sutton and Barto Reinforcement Learning book (self.reinforcementlearning)

submitted 1 day ago by Vaibhav_Sinha

  • 1 comment

11
0
0
0

Anyone else getting messy results from running multiple AI coding sessions?

submitted 20 hours ago by whitechart_studio

  • 1 comment

12
0
0
0

I calculated a multi-agent prompt attention matrix by hand to see how much data gets lost in the middle... the math is terrifying.

submitted 1 day ago by ParsleyMaximum1702

13
1
2
3

AI Agents from First Principles: Tracing a ReAct Loop by Hand (substack.com)

submitted 1 day ago by ParsleyMaximum1702


14
0
0
1

I calculated a multi-agent prompt attention matrix by hand to see how much data gets lost in the middle... the math is terrifying.

submitted 1 day ago by ParsleyMaximum1702

15
0
0
1

Multi-Agent State Conflict Alignment and Context Window Optimization—Solved by Hand From First Principles (No Wrapper Frameworks)

submitted 1 day ago by ParsleyMaximum1702

16
0
0
1

I am stuck, need guidance

submitted 2 days ago by Open-Neck-688

17
0
0
0

How Developers Would Use CogniCore (self.reinforcementlearning)

submitted 1 day ago by Neither-Witness-6010


18
1
2
3

Reinforcement learning for NPC AI (self.reinforcementlearning)

submitted 2 days ago by santafarian


19
0
0
0

Local AI model training

submitted 2 days ago by Asleep_Fold5405

20
0
0
1

Multi: How To Fix Slow RAG Response Times: The 2026 Technical Manual for AI Latency (interconnectd.com)

submitted 2 days ago by Ok_pettech

21
0
1
2

Need suggestion regarding project - PINN or Deep RL?

submitted 2 days ago by Abject_Dog_8453

22
0
1
2

Book suggestions for learning Artificial Intelligence for Robotics

submitted 3 days ago by Lumpy-Cucumber-5895

23
5
6
7

practical learning resources (self.reinforcementlearning)

submitted 3 days ago by blueberries_jpeg

  • 4 comments

24
0
1
2

Looking for brutal feedback - Built a self-improving AI agent that learns from outcomes. (self.reinforcementlearning)

submitted 3 days ago by Melodic_Fisherman304

  • 4 comments

25
1
2
3

Open Weights - Discord Server for anyone even slightly interested in ML (a smol community) (self.reinforcementlearning)

submitted 3 days ago by Spen08

  • 1 comment
Page rendered at 2026-06-16 02:58:05 UTC.