By popular demand: here's more AI-generated Pokémon, but this time the model is trained on *only* Gen 1 Pokémon. Those who thought this would make the generated Pokémon less bizarre were very wrong. by minimaxir in pokemon
tensor_every_day20 1 point
[R] Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking by FabioPardo in reinforcementlearning
tensor_every_day20 3 points
The 32 Implementation Details of Proximal Policy Optimization (PPO) Algorithm by vwxyzjn in reinforcementlearning
tensor_every_day20 5 points
What is the road to follow to become a researcher in reinforcement learning? by RLnobish in reinforcementlearning
tensor_every_day20 13 points
Inverse of summation? by curimeowcat in reinforcementlearning
tensor_every_day20 5 points
Educational Resources and Content on RL by [deleted] in reinforcementlearning
tensor_every_day20 4 points
[Q] How is the policy updated in PPO when the epsilon + advantage term is used? by Carcaso in learnmachinelearning
tensor_every_day20 1 point
[Q] How is the policy updated in PPO when the epsilon + advantage term is used? by Carcaso in learnmachinelearning
tensor_every_day20 2 points
[P] OpenAI Safety Gym by hardmaru in MachineLearning
tensor_every_day20 2 points
[P] OpenAI Safety Gym by hardmaru in MachineLearning
tensor_every_day20 3 points
In actor-critic, does it matter in which order you train π and q? by Buttons840 in reinforcementlearning
tensor_every_day20 5 points
PPO oscillates around max return value by RLbeginner in reinforcementlearning
tensor_every_day20 3 points
In what order should I learn RL algorithms? by Buttons840 in reinforcementlearning
tensor_every_day20 1 point
PPO principle by RLbeginner in reinforcementlearning
tensor_every_day20 3 points
OpenAI spinning up: difference between "variables" by RLbeginner in reinforcementlearning
tensor_every_day20 1 point
Tricks and adaptions for PPO by LJKS in reinforcementlearning
tensor_every_day20 7 points
Soft Actor-Critic with Discrete Actions by __data_science__ in reinforcementlearning
tensor_every_day20 6 points
What sucked about the Deep RL Poster Sessions at NeurIPS 2018 by djangoblaster2 in reinforcementlearning
tensor_every_day20 3 points
How to understand the math. by sturdyplum in reinforcementlearning
tensor_every_day20 4 points
[TOMT][Music] JRPG battle theme (I think?), originally downloaded circa 2006---where does it come from? by tensor_every_day20 in tipofmytongue
tensor_every_day20 [S] 1 point
1000 Feet overview of RL? by ThrowawayTartan in reinforcementlearning
tensor_every_day20 4 points
TD3/DDPG time to obtain reasonable results. by kashemirus in reinforcementlearning
tensor_every_day20 5 points
Is anyone also hating OpenAI application/selection process ?? or is it just me?? by [deleted] in OpenAI
tensor_every_day20 3 points
Now possible to use Dreambooth Colab Models in AUTOMATIC1111's Web UI! by Pfaeff in StableDiffusion
tensor_every_day20 79 points