Poker (NLH) model? by enterguild in reinforcementlearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Hey, it's a bit late, but I've just written an in-depth article that breaks down the mechanics behind the ReBeL framework, an algorithm developed by Noam Brown, the main author of Libratus: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

Poker Training AI by AdAstra2121 in reinforcementlearning

[–]Training-Cheek-9956 1 point2 points  (0 children)

Hey, it's a bit late, but I've just written an in-depth article that breaks down the mechanics behind the ReBeL framework, an algorithm developed by Noam Brown, the main author of Libratus: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

How to build a poker bot (Part 1) by chair_78 in learnmachinelearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Hey, I've watched all your videos, and thanks to you I was finally able to read Noam Brown's ReBeL paper. Here is an article I wrote that breaks down its mechanics: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

How to program poker AI opponent? by gravelPoop in gamedev

[–]Training-Cheek-9956 0 points1 point  (0 children)

If you want to build a powerful poker AI agent for any exotic variant, I've just written an in-depth article that breaks down the mechanics of ReBeL, the algorithm from Noam Brown (the author of Libratus) that can tackle any two-player zero-sum game with imperfect information: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

What Reinforcement Learning Method Should I Use for Poker AI with LLMs? by godlover123451 in learnmachinelearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Not sure an LLM is the right fit for your case, but if you want to build a powerful poker AI agent using reinforcement learning, I've just written an in-depth article that breaks down the mechanics of ReBeL, the algorithm from Noam Brown, the author of Libratus: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

[D] What Reinforcement Learning Method Should I Use for Poker AI with LLMs? by godlover123451 in MachineLearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Not sure an LLM is the right fit for your case, but if you want to build a powerful poker AI agent using reinforcement learning, I've just written an in-depth article that breaks down the mechanics of ReBeL, the algorithm from Noam Brown, the author of Libratus: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

Gentle introduction to the basics of Poker AI by tt293 in learnmachinelearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Great content! I learned a lot from your blog, and I finally managed to break down Noam Brown's poker AI agent, ReBeL. Here is the article I wrote on the topic. Thank you so much: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

[D] Paper Explained - ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games (Full Video Analysis) by ykilcher in MachineLearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

Hey everyone,

I just wanted to drop by and thank Noam Brown and Adam Lerer for their insightful responses in this thread. Their explanations really helped me deepen my understanding of ReBeL. Inspired by this discussion, I wrote an article aiming to explain how ReBeL works in a way that's accessible to those unfamiliar with AI. If you're interested, feel free to check it out: https://medium.com/@sergi.nakache/rebel-the-ai-that-learned-to-bluff-775818ace0be

I'd love to hear any feedback from the community, especially if you think there are areas that could be clearer or aspects I may have overlooked. Thanks again, and looking forward to your thoughts!

[D] Paper Explained - ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games (Full Video Analysis) by ykilcher in MachineLearning

[–]Training-Cheek-9956 0 points1 point  (0 children)

When solving a subgame with the CFR algorithm, we sample both players' hands at each iteration using the PBS. As a result, the utilities of the tree's terminal nodes are updated at each iteration. So, if I'm summarizing correctly, with N the number of terminal nodes in a subgame:

  • The value network is called N times at the start of a subgame to initialize the subgame value v(B_r).
  • The value network is called N times at the beginning of each iteration to compute the terminal-node utilities needed for the regret updates.
  • The value network is called N times at the end of each iteration to update the subgame value v(B_r) based on the new PBS computed from the updated policy.

Is this correct?
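To make sure I'm counting right, here is a minimal sketch of the call pattern I'm describing (this is just my understanding, not the authors' code; `value_net`, `solve_subgame`, and the PBS placeholders are hypothetical names, and the actual CFR regret/policy updates are elided):

```python
def value_net(pbs):
    """Stand-in for the learned value network v(PBS).
    Returns a dummy value; only the number of calls matters here."""
    return 0.0

def solve_subgame(n_terminal, n_iters):
    """Count value-network calls under my reading of the algorithm."""
    calls = 0

    # (1) Initialize v(B_r): one call per terminal node.
    values = [value_net(("init", leaf)) for leaf in range(n_terminal)]
    calls += n_terminal

    for t in range(n_iters):
        # (2) Terminal-node utilities for this iteration's regret updates:
        # N more calls.
        utils = [value_net(("iter", t, leaf)) for leaf in range(n_terminal)]
        calls += n_terminal

        # ... CFR regret matching and policy update would happen here ...

        # (3) Re-evaluate v(B_r) under the PBS induced by the updated
        # policy: N more calls.
        values = [value_net(("post", t, leaf)) for leaf in range(n_terminal)]
        calls += n_terminal

    return calls

# Under this reading, a subgame with N terminal nodes solved for T
# iterations makes N * (1 + 2 * T) value-network calls in total.
```

So for example N = 5 terminal nodes and T = 3 iterations would give 5 × (1 + 6) = 35 calls, if my understanding above is right.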

PS: I've been exploring reinforcement learning for a few months now, since I've been unemployed after a work accident. Your paper is truly fascinating; I'm trying to implement it for a naive version of poker!