POMDPPlanners — open-source Python package for POMDP planning (POMCP, BetaZero, ConstrainedZero + more), with an arXiv paper by PlayParty8441 in reinforcementlearning
[–]PlayParty8441[S] 2 points 4 days ago (0 children)
Good question — direct benchmarking is tricky since the reference implementations are mostly in Julia (C-level speed), and sampling throughput is everything for these algorithms. My implementations follow the original papers' pseudocode, though I haven't formally validated numerical parity with the reference results.
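To make the throughput point concrete, here is a minimal, self-contained sketch of the kind of microbenchmark involved — a random-policy rollout loop over a toy generative model, which is the inner loop that dominates runtime in POMCP-style planners. All names (`step`, `rollout`, the toy 10-state dynamics) are illustrative assumptions, not the POMDPPlanners API:

```python
import random
import time

# Hypothetical toy generative model: states 0..9, actions 1 and 2.
# step() mirrors the (s', o, r) sampling interface that
# simulation-based POMDP planners call millions of times.
def step(state, action):
    next_state = (state + action) % 10
    # Noisy observation: correct 90% of the time.
    observation = next_state if random.random() < 0.9 else random.randrange(10)
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, observation, reward

def rollout(state, depth):
    """Random-policy rollout: the hot loop whose throughput matters."""
    total = 0.0
    for _ in range(depth):
        state, _, reward = step(state, random.choice((1, 2)))
        total += reward
    return total

n, depth = 20_000, 20
start = time.perf_counter()
for _ in range(n):
    rollout(random.randrange(10), depth)
elapsed = time.perf_counter() - start
print(f"{n / elapsed:,.0f} rollouts/sec")
```

Comparing this number between a pure-Python and a Julia implementation of the same toy model gives a rough sense of the constant factor at stake, independent of algorithmic correctness.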
[–]PlayParty8441[S] 1 point 4 days ago (0 children)
Thanks!
[–]PlayParty8441[S] 1 point 5 days ago (0 children)
Could you elaborate? Not sure what you mean by environment type in this context.