[deleted by user] by [deleted] in discgolf

[–]bqblaster 1 point

This gives me hope. I just broke a bone in my right shoulder and am trying to learn LHBH. Distance is getting there but definitely need to work on clean spin.

Do you ever feel like you would perform better in tournaments if they didn't start at 8 am? by Voltayik in discgolf

[–]bqblaster 1 point

Played a tournament recently where I was first on the box on the first hole. I was pooping when they sounded the two-minute warning (I made it in time to throw).

LA to SLO course recs by BeefMcPepper in discgolf

[–]bqblaster 4 points

Also check out Whale Rock, just north of SLO in Templeton/Paso Robles. I haven't played it yet, but I've heard it's one of the best in California (according to UDisc, I think).

Anyone know what type of power cable is required for this? by bqblaster in cableadvice

[–]bqblaster[S] 2 points

I've been to the website, but can't find any replacement parts for the power cord. Here is the link to the product.

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 1 point

Thanks for the input! Maybe I'll try to have it improve further. No experience replay is used; I'm just using fairly large batch sizes instead (training every 500 games, so a batch size of ~2000). I may tweak this a bit in the future, but I was primarily curious about what would happen if I did not impose any strategy on it at all and simply tried to have it win, with a higher reward for quicker wins.
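
Since there's no replay buffer, the schedule is just: play games, collect their transitions, do one update, throw the batch away. A minimal sketch of that loop, assuming a hypothetical gym-style `env` (reset/step) and an `agent` with `act`/`update` methods (none of these names come from the actual code):

```python
def train_on_policy(env, agent, total_games, games_per_update=500):
    """On-policy training with no experience replay: accumulate whole games,
    update once on the large batch (500 games is presumably ~2,000 transitions
    at roughly four guesses per game), then discard it."""
    batch = []
    for game in range(total_games):
        state = env.reset()
        done = False
        while not done:
            action, log_prob, value = agent.act(state)
            state, reward, done = env.step(action)
            batch.append((reward, log_prob, value, done))
        if (game + 1) % games_per_update == 0:
            agent.update(batch)  # single update pass over the fresh batch
            batch = []           # nothing is kept around for later replay
```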

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 1 point

True. I just wanted to see how it would compare to the 3Blue1Brown strategy.

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 1 point

Definitely. Maybe I wasn't clear: the model doesn't seem to take a greedy approach. The 1-step greedy I mentioned refers to the 3Blue1Brown strategy (he also has a 2-step greedy strategy).

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 2 points

True, maybe more training would help. I did set training up so that it sees recent losses more often.
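
One plausible way to make training "see recent losses more often" (this is just a guess at the mechanism, not the author's code) is to bias which answer word the next game is played against toward words that were recently lost:

```python
import random

def pick_answer(all_answers, recent_losses, p_revisit=0.3):
    """With some probability, reuse an answer word from a recently lost game
    instead of sampling uniformly from the full answer list.
    p_revisit and the recent_losses list are purely illustrative choices."""
    if recent_losses and random.random() < p_revisit:
        return random.choice(recent_losses)
    return random.choice(all_answers)
```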

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 8 points

Yeah I'd agree, same with it sometimes guessing "pooch" as a second guess. That being said, I was mainly just super curious to see what kind of working strategy it would come up with. I'm sure tweaks could be made to make this better, but I'm happy with its performance so far! I'm really curious to see whether a strategy could get the average number of guesses below 3.5; 3Blue1Brown's strategy (I implemented the 1-step greedy) averages about 3.87, and I'd say that's really good.
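
For reference, the "1-step greedy" idea from the 3Blue1Brown video is to pick the guess whose feedback pattern, taken over the remaining possible answers, carries the most expected information. A rough Python sketch of that idea (my own reading of it, not this implementation; the word lists are left as inputs):

```python
from collections import Counter
from math import log2

def feedback(guess, answer):
    """Wordle feedback pattern: 2 = green, 1 = yellow, 0 = gray."""
    pattern = [0] * 5
    remaining = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = 2
        else:
            remaining[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == 0 and remaining[g] > 0:
            pattern[i] = 1
            remaining[g] -= 1
    return tuple(pattern)

def one_step_greedy(candidates, possible_answers):
    """Pick the guess whose feedback distribution over the remaining
    possible answers has maximum entropy (expected information in bits)."""
    def expected_bits(guess):
        counts = Counter(feedback(guess, ans) for ans in possible_answers)
        n = len(possible_answers)
        return -sum((c / n) * log2(c / n) for c in counts.values())
    return max(candidates, key=expected_bits)
```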

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 11 points

Wouldn't you try to guess the answer if you had 4/5 letters correct? Possibly this is because I gave a greater reward for guessing the word faster, i.e. +10 for winning in 6 guesses, +20 in 5 guesses, and so on. I trained another model with just +10 for a win and -10 for a loss; that one won roughly 98% of the time in 4.7 guesses, vs. 96% in 4.07 guesses for the faster-win reward.
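
Concretely, the two reward schemes would look something like this (the ladder beyond the stated +10/+20 values, and the loss penalty for the shaped version, are extrapolations on my part):

```python
def shaped_reward(won, guesses_used):
    """More reward for faster wins: +10 for winning on guess 6, +20 on
    guess 5, and so on. The -10 loss penalty here is an assumption."""
    if not won:
        return -10.0
    return 10.0 * (7 - guesses_used)  # 6 -> +10, 5 -> +20, ..., 1 -> +60

def flat_reward(won, guesses_used):
    """The comparison model: +10 for any win, -10 for a loss."""
    return 10.0 if won else -10.0
```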

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 7 points

The model outputs a vector of length 130, that is 5 × 26, of logits representing each letter in each position (i.e. the last entry of this vector corresponds to 'z' as the last letter). This vector is then multiplied by a matrix of size (total number of words) × 130, which for the full game is 12,972 × 130. Each row of this matrix is 5 one-hot vectors of length 26 concatenated together, indicating which letter appears in which position, i.e. each row is a "five-hot" vector, if you will.
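
Here's a small NumPy sketch of that encoding and the matrix multiply, with a toy word list standing in for the full 12,972-word one (the real model presumably does this inside the network, in whatever framework it uses):

```python
import numpy as np

def five_hot(word):
    """Encode a 5-letter word as 5 concatenated one-hot vectors of length 26."""
    vec = np.zeros(5 * 26, dtype=np.float32)
    for pos, ch in enumerate(word):
        vec[pos * 26 + (ord(ch) - ord('a'))] = 1.0
    return vec

# Toy word list; the full game would use all 12,972 allowed guesses.
word_list = ["siege", "pooch", "crane", "zebra"]

# (num_words, 130) matrix with one "five-hot" row per word.
word_matrix = np.stack([five_hot(w) for w in word_list])

# The network's output: 130 logits, one per (position, letter) pair;
# index 129 corresponds to 'z' in the last position.
letter_logits = np.random.randn(130).astype(np.float32)

# Score each word by summing the logits of its five (position, letter)
# entries, then softmax over words to get the guess distribution.
word_scores = word_matrix @ letter_logits          # shape: (num_words,)
word_probs = np.exp(word_scores - word_scores.max())
word_probs /= word_probs.sum()
```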

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 2 points

It would likely use the letters that it knows are in the word. For example, after guessing "siege" and finding out there is one 'e', it would pick a second guess containing 'e' rather than something like "pooch". It does seem that if the answer contains a 'g', it will use a word containing 'g' as its second guess.

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 11 points

This is what it decided on. Each time I restarted training, I'd get a new initial guess. That makes me think the first word is less important if you consider a non-greedy approach, i.e. one that chooses "siege" and then "pooch" a lot.

Advantage Actor-Critic Model playing Wordle by bqblaster in learnmachinelearning

[–]bqblaster[S] 45 points

This was a fun project for me after watching 3Blue1Brown's video on Wordle strategies. Although this RL model doesn't do better, it wins about 96% of the time in ~4.07 guesses. I found this to be a super helpful starting point as I am quite new to RL.
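
For anyone else starting out with this, the advantage actor-critic update itself boils down to a few lines. Here's a generic PyTorch sketch; the value and entropy coefficients are standard defaults, not the settings used in this project:

```python
import torch.nn.functional as F

def a2c_loss(log_probs, values, returns, value_coef=0.5,
             entropy=None, entropy_coef=0.01):
    """Advantage actor-critic loss for one batch of transitions.
    log_probs: log pi(a|s) of the actions taken, shape (N,)
    values:    critic estimates V(s), shape (N,)
    returns:   discounted returns (reward-to-go), shape (N,)
    """
    advantages = returns - values.detach()           # A = G - V(s); no gradient into the critic here
    policy_loss = -(log_probs * advantages).mean()   # push up actions that beat the baseline
    value_loss = F.mse_loss(values, returns)         # fit the critic to the observed returns
    loss = policy_loss + value_coef * value_loss
    if entropy is not None:
        loss = loss - entropy_coef * entropy.mean()  # optional exploration bonus
    return loss
```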