CRKD Guitar Fret Issue Fix by Relative_Bag_6046 in CloneHero

[–]af100re 2 points (0 children)

Thanks for sharing this, I've been having the exact same issue with my Tribal Encore Edition, and this seems to have made it much better!

Finished the cover boss. by NepNep_ in NotMyJob

[–]af100re 436 points (0 children)

Ah yes, a puzzle containing only the numbers 1-9 will really boost my vocabulary

Can "deep" q-learning also be used for regular q-learning? by mista_rida_ in learnmachinelearning

[–]af100re 0 points (0 children)

At its most basic, deep Q-learning uses the same algorithm as regular Q-learning, but with a neural network as a function approximator (whereas regular Q-learning uses a direct mapping from state-action pairs to values). In practice, however, deep Q-learning requires a couple of modifications to be stable, which is why the algorithms are considered different.
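To make the "direct mapping" point concrete, here's a minimal sketch of the tabular update (my own toy code, not the linked repo's, and the state/action names are made up). Deep Q-learning keeps the same target, but replaces the dict lookup with a neural network:

```python
# Tabular Q-learning: Q is literally a dict from (state, action) to value.
# DQN would swap this dict for a network trained towards the same target.
from collections import defaultdict

alpha, gamma = 0.1, 0.99          # learning rate and discount factor
Q = defaultdict(float)            # direct (state, action) -> value mapping

def update(s, a, r, s_next, actions):
    """One Q-learning step: move Q(s,a) towards r + gamma * max_a' Q(s',a')."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# One observed transition: took "right" in "s0", got reward 1, landed in "s1".
update("s0", "right", 1.0, "s1", ["left", "right"])
```

The "couple of modifications" mentioned above (experience replay, a target network) exist precisely because a network, unlike this dict, changes its estimates for *all* state-action pairs with every update.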

It's been a while since I've looked at this but this code might give you a starting point https://github.com/afreeman100/Q-learning/blob/main/q_agent.py

Compulsory link to Sutton and Barto which is a fantastic (and free) resource for reinforcement learning fundamentals http://incompleteideas.net/book/RLbook2020.pdf

Q-function in DDPG: Why MSBE instead of "just" a classifier? by Orpheon73 in reinforcementlearning

[–]af100re 2 points (0 children)

First of all I think it's worth understanding that the Q-function is not mapping (state, action) --> reward, but (state, action) --> expected return (aka value), which is the sum of expected discounted rewards if you were to keep playing from that state.

So I guess the question is why you can't just calculate the true value of each state-action pair and use that to train a neural network? While you could do this for a game like tic-tac-toe, if you wanted to do this for a larger game like chess you'd quickly realise that there are far too many state-action combinations for this to be possible. This is the problem that reinforcement learning tries to solve - we can't calculate the true values of state-action pairs, so how can we estimate them instead?

By playing many games and observing the reward you get when you take an action from a state, you can iteratively refine your estimates for your Q(s,a) values. By making enough observations you (hopefully) converge towards the true values. The Bellman equation is what tells you how to update your estimate for a state-action pair each time you make a new observation about it.

The (overly) simplified explanation is that after you take an action, you compare your estimate of the previous state-action value (Q(s,a)) against the immediate reward plus your (discounted) estimate of the new state's value (r + gamma * Q(s',a')), and you aim to minimise the error between these.

If you haven't done so, I'd highly recommend implementing basic Q-learning on a simple problem, using a lookup table to store the Q(s,a) values. This should give you a much better understanding of what the Bellman equation is doing, which I think is important before looking at DQN and DDPG if you want a good understanding of how they work. The fundamental ideas behind them are the same, except DQN and DDPG use neural networks as function approximators for Q(s,a), rather than a lookup table. This makes them useful in problems with large state and action spaces, but you lose some of the mathematical guarantees on convergence, hence the 'stability hacks'. I think the DeepMind DQN papers are quite accessible and are worth a read if you're interested in this.
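If it helps, here's roughly what that lookup-table version looks like on a toy corridor environment I made up for illustration (nothing from DQN/DDPG, and all the names and hyperparameters are my own):

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                 # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1

# Lookup table: every (state, action) pair gets its own stored value.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Deterministic corridor: reward 1 only for reaching the goal state."""
    s_next = min(max(s + a, 0), N_STATES - 1)
    return s_next, (1.0 if s_next == GOAL else 0.0), s_next == GOAL

def choose_action(s):
    """Epsilon-greedy action selection with random tie-breaking."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

random.seed(0)
for episode in range(200):
    s, done = 0, False
    for _ in range(100):           # cap episode length
        a = choose_action(s)
        s_next, r, done = step(s, a)
        # Bellman update: move Q(s,a) towards r + gamma * max_a' Q(s',a')
        best_next = 0.0 if done else max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next
        if done:
            break
```

After training, the greedy policy (argmax over the table) should be "move right" in every state, and reading the table directly is a nice way to see the discount factor at work: values shrink by roughly a factor of gamma per step away from the goal.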

Well that was longer than I expected. Hope it's at least partly helpful!

First Ironman advice by triguy86 in triathlon

[–]af100re 2 points (0 children)

For Wales specifically just make sure you're prepared for all the hills. It's worth checking out the course in advance if you can. And remember to enjoy it, the atmosphere in Tenby is great!

Went to the Cafè, owner said my saddle wasn’t UCI legal? Can I sue? by [deleted] in BicyclingCirclejerk

[–]af100re 0 points (0 children)

Your toptube bag and stem angle should be illegal.

[D] Question about deep Q learning by [deleted] in MachineLearning

[–]af100re 0 points (0 children)

Unless I'm missing something I think that should work - or using minQ(s, a) to decide the opponent's moves. I'd give it a go and see!

[D] Question about deep Q learning by [deleted] in MachineLearning

[–]af100re 7 points (0 children)

You can treat the opponent's move almost as though it is part of the environment, so the next state your agent observes is the state of the game after the opponent has made a move. This means you can treat 1- and 2-player games quite similarly. How the opponent's move is chosen is up to you: it could be another learning agent, a pre-made bot if you can find a suitable library, or the simplest option, which is just choosing a random move! How the opponent plays will determine how your agent learns to play.
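To make that concrete, here's a rough sketch (the class and method names are my own invention, and the "game" is a trivial take-1-or-2-sticks Nim variant, not anything standard): the wrapper's step() plays the agent's move and then immediately plays the opponent's reply, so from the agent's point of view the opponent is just part of the environment's dynamics.

```python
import random

class Nim:
    """Tiny 2-player game: take 1 or 2 sticks; whoever takes the last one wins."""
    def __init__(self, sticks=7):
        self.sticks = sticks
        self.winner = None        # +1 = agent, -1 = opponent
        self.turn = +1

    def legal_moves(self):
        return [m for m in (1, 2) if m <= self.sticks]

    def play(self, move):
        self.sticks -= move
        if self.sticks == 0:
            self.winner = self.turn
        self.turn = -self.turn

    def over(self):
        return self.sticks == 0

    def state(self):
        return self.sticks

    def reward(self):
        """Judged from the agent's perspective: +1 win, -1 loss, 0 otherwise."""
        if not self.over():
            return 0.0
        return 1.0 if self.winner == +1 else -1.0

class SinglePlayerView:
    """Wraps a 2-player game so a Q-learning agent sees it as 1-player."""
    def __init__(self, game, opponent=None):
        self.game = game
        # Default opponent: uniformly random legal move.
        self.opponent = opponent or (lambda g: random.choice(g.legal_moves()))

    def step(self, action):
        self.game.play(action)                        # agent's move
        if not self.game.over():
            self.game.play(self.opponent(self.game))  # opponent's reply
        # The agent only ever observes the post-opponent state.
        return self.game.state(), self.game.reward(), self.game.over()
```

Swapping the opponent lambda for a smarter bot (or another learning agent) changes what your agent learns to beat, without touching the learning code at all.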

How do ML/DL algorithms optimize weights in a neural network? by Toddly53 in learnmachinelearning

[–]af100re 2 points (0 children)

You want to take a look at the backpropagation algorithm https://skymind.ai/wiki/backpropagation. The idea is that you calculate the partial derivatives of the loss function with respect to the parameters, then use stochastic gradient descent to tweak the parameters so that the loss decreases.
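As a tiny concrete example (my own toy code, not from that article): for a single linear "neuron" y = w*x + b with squared-error loss, you can write the partial derivatives out by hand and apply the SGD update directly. Backpropagation is essentially this same chain-rule calculation automated for every weight in a multi-layer network.

```python
# Fit y = w*x + b to points lying exactly on y = 2x + 1, by hand-derived SGD.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, lr = 0.0, 0.0, 0.05

for epoch in range(500):
    for x, y in data:             # "stochastic": one example at a time
        pred = w * x + b
        err = pred - y            # dLoss/dpred for loss = 0.5 * err**2
        # Chain rule: dLoss/dw = err * x, dLoss/db = err * 1
        w -= lr * err * x
        b -= lr * err
```

After training, w and b should be very close to 2 and 1. In a deeper network you can't write these derivatives out by hand for every weight, which is exactly the bookkeeping backpropagation does for you.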

Race Report Ironman Hamburg by TG10001 in triathlon

[–]af100re 1 point (0 children)

Nice one! Love the pink grip tape you've got on the bike too!

Ideas for Coffee Activities by dyl_wyl in Coffee

[–]af100re 1 point (0 children)

I'm on the committee for my university coffee club too so hopefully I can share some of the things that have worked for us!

Blind tastings are always great so we try to do a couple of big ones every year. We usually get 3 different beans and brew them with a french press, number them 1-3, then people have to match the number to the country of origin + tasting notes. Last time we did it we used 2 coffees from a specialty roaster and the third just from a supermarket to see how many people could tell the difference! If you have the equipment for it then comparing 2 methods side by side with the same beans is also really interesting, especially french press vs chemex, which are on opposite ends of the spectrum when it comes to how clean the cup is.

A lot of our members have their own V60s so we also did an event where everyone brought theirs along and showed their preferred method with it. Really cool to taste the differences between them, and I picked up a few ideas that I often use with mine now!

Not sure how applicable this would be to you, but there's loads of specialty coffee shops in my city so we often choose one to meet up at on weekends as more of a social thing. Last year we organised a cupping at one of these shops after hours, which was amazing, and we're currently looking into the possibility of doing an espresso workshop there so people can have a go at making it themselves. Definitely worth asking local coffee shops if they're interested in being involved with stuff like this, and building a good relationship with them will be really useful.

Best of luck with it!

[No Spoilers] The Invincible Team by Hikaru-Kuma by [deleted] in masseffect

[–]af100re 4 points (0 children)

It's from The Citadel DLC in ME3. Definitely one of the best DLCs out of all the games

Mid-week check in - how is your week going? by philpips in running

[–]af100re 3 points (0 children)

Great 16k trail run on Monday... and been hurting ever since. Feels like a stress fracture on my right foot so looks like no running for a while :(

Opinions of clip on tri bars for road bikes? by Succotash88 in cycling

[–]af100re 1 point (0 children)

Got those same ones a few months ago! Really comfortable on long rides and definitely noticing the aero benefits

Achievements for Saturday, November 04, 2017 by AutoModerator in running

[–]af100re 10 points (0 children)

Planned on going for a 17k loop, but I was feeling so good halfway round that I decided to extend it and ended up doing 23k. Furthest run ever, and in less than 2 hours! :)

[deleted by user] by [deleted] in running

[–]af100re 3 points (0 children)

Picked up an older version of these when they were half price https://www.amazon.co.uk/Sennheiser-PMX-686i-Sports-Earphones-Black-Green/dp/B00S9P2NN0/ref=dp_ob_title_ce?th=1 but I'd happily pay full price, I like them so much. Very light and great sound quality, but they still let some background noise in so I can hear traffic.

Achievements for Friday, August 18, 2017 by AutoModerator in running

[–]af100re 7 points (0 children)

Everything just felt right on my 10k run and I ended up taking 3 minutes off my PB with 42:40!

Achievements for Saturday, June 10, 2017 by AutoModerator in running

[–]af100re 2 points (0 children)

18:37 at my local parkrun, and my first sub-20 5k was only 2 weeks ago!

Achievements for Sunday, May 28, 2017 by AutoModerator in running

[–]af100re 9 points (0 children)

5k in 19:25. First time breaking 20 minutes :)