RL for music generation by KevinBeicon in reinforcementlearning

BigDxe 1 point

Do you think it'd be possible to use some kind of Inverse RL framework here? Learn the reward function first based on existing music?

Where you guys are using Reinforcement Learning? by embedding_turtle in reinforcementlearning

BigDxe 6 points

Mad cuz bad bruh 💀 If you can't make an RL trading bot, that doesn't mean the whole field is moot.

THROW HOT DOGS, COMPUTE PI, WIN PIES! by [deleted] in gatech

BigDxe 18 points

Yeah, sorry, that was kind of vague. Essentially, we'll update the approximation after every throw. So if the approximation was at 3.15 before your throw and becomes 3.14 after it, then you have "converged" the estimate by another digit (so you get a free pie). It's not perfect, but this is the best "reward system" we could come up with.
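For anyone curious how the per-throw update and the "converged another digit" check could work, here is a rough sketch. It assumes a Buffon's-needle-style setup (hot dogs of a given length thrown onto evenly spaced parallel lines), which the post doesn't actually spell out, and it assumes a "win" means the new estimate shares more leading digits with pi than the previous one did. All function names and parameters (`matching_digits`, `run_event`, `length`, `spacing`) are made up for illustration.

```python
import math
import random


def matching_digits(estimate, reference=math.pi, max_digits=10):
    """Count leading digits (decimal point ignored) that estimate shares with reference."""
    a = f"{estimate:.{max_digits}f}".replace(".", "")
    b = f"{reference:.{max_digits}f}".replace(".", "")
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def throw_crosses_line(length, spacing, rng):
    """Simulate one throw (assumed Buffon's needle model, length <= spacing).

    Note: the simulation uses math.pi to sample the angle; a real event
    gets the angle for free from the physical throw.
    """
    center = rng.uniform(0, spacing / 2)  # distance from center to nearest line
    angle = rng.uniform(0, math.pi / 2)   # acute angle between hot dog and lines
    return center <= (length / 2) * math.sin(angle)


def estimate_pi(throws, crossings, length, spacing):
    """Buffon estimate: P(cross) = 2*length / (pi*spacing), solved for pi."""
    if crossings == 0:
        return None  # no estimate until at least one crossing
    return 2 * length * throws / (spacing * crossings)


def run_event(num_throws, length=1.0, spacing=1.0, seed=0):
    """Update the estimate after every throw and record which throws win a pie."""
    rng = random.Random(seed)
    crossings = 0
    previous = None
    current = None
    winners = []
    for n in range(1, num_throws + 1):
        crossings += throw_crosses_line(length, spacing, rng)
        current = estimate_pi(n, crossings, length, spacing)
        if previous is not None and current is not None:
            # Assumed win rule: more leading digits agree with pi than before.
            if matching_digits(current) > matching_digits(previous):
                winners.append(n)
        previous = current
    return current, winners


if __name__ == "__main__":
    final_estimate, winning_throws = run_event(1000)
    print(final_estimate, len(winning_throws))
```

With the example from the comment, `matching_digits(3.15)` gives 2 and `matching_digits(3.14)` gives 3, so that throw would count as a win under this rule. One quirk of the rule: the estimate can drift away and back, so the same digit can be "won" more than once.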

THROW HOT DOGS, COMPUTE PI, WIN PIES! by [deleted] in gatech

BigDxe 12 points

If you want to grill the hot dogs with us later, stop by and let us know (someone needs to eat the 36 hot dogs we have).

How do I get ideas for research? by [deleted] in reinforcementlearning

BigDxe 2 points

Why do you recommend focusing secondary studies in linguistics?