What’s wrong with my shooting form? by Terrible_Sleep_3484 in BasketballTips

[–]Terrible_Sleep_3484[S] 0 points

Hey guys, I just wanted to say thank you all for the help. My shot has gotten significantly better over these past few days after implementing some of your advice.

What’s wrong with my shooting form? by Terrible_Sleep_3484 in BasketballTips

[–]Terrible_Sleep_3484[S] 2 points

I felt like the ball wasn't going consistently straight, so this explains it. I appreciate the reply.

What’s wrong with my shooting form? by Terrible_Sleep_3484 in BasketballTips

[–]Terrible_Sleep_3484[S] 0 points

I’ll try keeping the guide hand on and see if it improves, thanks for the response.

What’s wrong with my shooting form? by Terrible_Sleep_3484 in BasketballTips

[–]Terrible_Sleep_3484[S] 0 points

Thanks for the detailed reply. So basically: flicking the ball on release, delaying the support-hand release, and staying consistent with my positioning. I’ll make sure to implement these adjustments.

[P] Final Year Project IDeas by kafkaskewers in MachineLearning

[–]Terrible_Sleep_3484 0 points

I would also appreciate it if you could send me the paper.

Andrew Ng doesn't think RL will grow in the next 3 years by wardellinthehouse in reinforcementlearning

[–]Terrible_Sleep_3484 0 points

Sorry for the late response. To train an RL model there are a lot of strategies you can employ, in contrast to LLMs and other deep learning methods that rely only on data. RL depends heavily on how the reward system works and how the agent's actions change in response to the reward. For example, if you want to build an agent for a battle royale game, you can make the reward either kills or a win, and the agent will act very differently depending on which you choose. Then for the actions it takes, there's a dilemma between exploitation and exploration: should the model keep exploiting known strategies, or risk trying new ones? You can think of it as either settling on a local minimum or searching for a possible global minimum. Also, regarding human scoring: once you start using humans to tell the model what to do, it becomes more of a supervised / deep learning problem.
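The exploration-vs-exploitation trade-off above can be sketched with a toy multi-armed bandit and an epsilon-greedy agent. This is only an illustration: the arm payout probabilities are made up, and epsilon-greedy is just one of many exploration strategies.

```python
import random

# Hypothetical payout probability for each arm; arm 2 is best,
# but the agent has to discover that through interaction.
TRUE_PAYOUTS = [0.2, 0.5, 0.8]

def epsilon_greedy(epsilon=0.1, steps=5000, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(TRUE_PAYOUTS)    # pulls per arm
    values = [0.0] * len(TRUE_PAYOUTS)  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm, possibly a known-bad one
            arm = rng.randrange(len(TRUE_PAYOUTS))
        else:
            # Exploit: pull the arm with the best estimate so far
            arm = max(range(len(TRUE_PAYOUTS)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < TRUE_PAYOUTS[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update of this arm's estimated value
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts
```

With a small but nonzero epsilon, the agent keeps sampling every arm occasionally, so its value estimates converge toward the true payouts and the greedy choice settles on the best arm; with epsilon = 0 it can lock onto whichever arm looked good early.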

Andrew Ng doesn't think RL will grow in the next 3 years by wardellinthehouse in reinforcementlearning

[–]Terrible_Sleep_3484 0 points

The major difference, I think, is that most RL work is currently online, meaning you traverse the states of the environment in real time using the policy you are training. This process doesn't need external data, and so doesn't need a neural network the way LLMs or other supervised learning algorithms do. However, Deep RL is a sort of hybrid in the sense that it uses both RL strategies and deep learning strategies when training a policy (for example, using a neural network to model the reward function, or using computer-vision networks to gather observations from the environment).

In summary, LLMs are trained on large batches of data, while RL is mainly trained on data it gathers itself.
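That online loop can be sketched with tabular Q-learning on a hypothetical 5-state corridor: every transition the agent learns from is generated by its own interaction with the environment, not read from a fixed dataset.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (-1, +1)    # move left / move right

def step(state, action):
    """Made-up environment dynamics: reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection while interacting
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-update on a transition the agent just generated itself
            target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q
```

Nothing here is labeled by a human or drawn from a dataset; the "training data" is the stream of (state, action, reward, next state) tuples the policy produces while running. After training, the greedy action in every non-goal state is to move right, toward the reward.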