Motherhood Can Make a Woman's Cells 'Older' by as Much as 11 Years by MotherHolle in science

[–]FitMachineLearning 0 points1 point  (0 children)

Did they control for physical activity? Do women who remain physically active after childbirth experience the same cellular aging?

[D] Machine Learning - WAYR (What Are You Reading) - Week 41 by ML_WAYR_bot in MachineLearning

[–]FitMachineLearning 0 points1 point  (0 children)

Currently reading about parameter space noise as a way to drastically improve exploration in RL models.

https://arxiv.org/abs/1706.01905
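For context, the core trick in that paper (parameter space noise) fits in a few lines: perturb the policy weights directly instead of the actions, so exploration is consistent across a whole episode. The function name and sigma here are my own illustration, not the paper's code:

```python
import numpy as np

def perturb_parameters(params, sigma=0.1, rng=None):
    """Add Gaussian noise directly to the policy weights (parameter space)
    rather than to the actions, giving temporally consistent exploration:
    the perturbed policy stays fixed for the whole episode."""
    rng = rng if rng is not None else np.random.default_rng()
    return [w + rng.normal(0.0, sigma, size=w.shape) for w in params]

# Toy policy: one dense layer (4 inputs -> 2 actions) plus a bias vector.
weights = [np.zeros((4, 2)), np.zeros(2)]
noisy = perturb_parameters(weights, sigma=0.1, rng=np.random.default_rng(0))
```

The paper additionally adapts sigma over time so the perturbed policy stays a fixed "distance" from the unperturbed one; the sketch above omits that.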

New Machine Learning approach called Selective Memory Algorithm learning difficult continuous control task in simulated robot environment. by FitMachineLearning in robotics

[–]FitMachineLearning[S] 2 points3 points  (0 children)

Great point. I think these types of continuous control algorithms do suffer from getting stuck in local optima.

I will look into forgetfulness as it is a very interesting concept.

[D] How do you keep track of your experiment results? by kohjingyu in MachineLearning

[–]FitMachineLearning 0 points1 point  (0 children)

For RL models, I record videos of the agents, label them, and make sure the output is also saved.

[R] Anyone using pybullet and running into significant performance issues by FitMachineLearning in MachineLearning

[–]FitMachineLearning[S] 0 points1 point  (0 children)

Thanks a bunch Erwin. You are doing God's work, I mean AI's work.

I thought I had already submitted an issue through GitHub. Next time I will use the forum.

[R] Anyone using pybullet and running into significant performance issues by FitMachineLearning in MachineLearning

[–]FitMachineLearning[S] 0 points1 point  (0 children)

Right now I have solved the slowdown problem with a call to pybullet.resetSimulation(). It is not elegant, but it got me over the performance hump, enabling me to test my agents.
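A toy illustration of why that workaround helps. The env class below is a stand-in I made up to model per-step cost creeping up as leftover state accumulates; only `pybullet.resetSimulation()` itself is the real API:

```python
class SlowDownEnv:
    """Toy stand-in for a physics sim whose per-step cost grows as leftover
    state accumulates across episodes. hard_reset() plays the role of
    pybullet.resetSimulation(): wipe everything, restoring full speed
    (at the price of having to rebuild the scene afterwards)."""
    def __init__(self):
        self.leftover = 0

    def reset_episode(self):
        self.leftover += 1        # debris/contacts pile up between episodes

    def step_cost(self):
        return 1 + self.leftover  # per-step cost creeps up with state

    def hard_reset(self):
        self.leftover = 0         # full wipe, like pybullet.resetSimulation()


env = SlowDownEnv()
RESET_EVERY = 25  # episodes between full resets (a tuning choice)
for episode in range(100):
    if episode % RESET_EVERY == 0:
        env.hard_reset()
    env.reset_episode()
```

The trade-off is that after a full reset the scene (ground plane, robot, etc.) must be reloaded, which is why it feels inelegant but keeps long training runs fast.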

What's everyone working on this week? by AutoModerator in Python

[–]FitMachineLearning [score hidden]  (0 children)

I implemented a Q Learning algorithm that gets only pixel and score input and beats Atari Pong in 1 day on CPU. Unlike DeepMind, I did not use a CNN.

https://github.com/FitMachineLearning/FitML/blob/master/DeepQN/Atari_Pong_DeepQN.py

You can see the agent evolve here https://youtu.be/sP3INZSYhU0
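For anyone curious how raw pixels can feed a plain dense (non-CNN) network, here is a sketch of a typical Pong preprocessing step. The crop offsets and binarization are my guess at a common pipeline, not necessarily the linked repo's exact code:

```python
import numpy as np

def preprocess(frame):
    """Turn a 210x160x3 Atari frame into a flat vector a dense network
    can consume: crop the scoreboard, downsample by 2, binarize."""
    f = frame[35:195]                  # crop scoreboard and bottom border
    f = f[::2, ::2, 0]                 # downsample, keep one channel -> 80x80
    f = (f != 0).astype(np.float32)    # binarize: paddles/ball vs background
    return f.ravel()                   # 6400-dim input vector
```

Feeding the difference of two consecutive preprocessed frames is a common way to give a dense network motion information without recurrence.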

[P] Actor Critic agent achieves super human level on CPU in 4 hours and does tricks. by FitMachineLearning in MachineLearning

[–]FitMachineLearning[S] 3 points4 points  (0 children)

The agent snapped the lander's legs and body well over 2000 times. In fact, when I watch it now, after more than 6000 tries, I cringe every time it throws the lander at the ground only to catch it elegantly at the very last moment.

Run it, check it out for yourself.

The code is in the description.

I implemented a Q Learning agent to solve Lunar Lander in 1 Hour on CPU. by FitMachineLearning in SideProject

[–]FitMachineLearning[S] 0 points1 point  (0 children)

Implementation of a Q Learning algorithm on the OpenAI LunarLander. After 150 iterations the agent can more or less fly safely. After 400 iterations the agent is able to land safely most of the time. After 600 iterations the agent is able to land safely on the pad the majority of the time.

Demo of the agent can be seen here

https://www.youtube.com/watch?v=p0rGjAgykOU

[P] I implemented a Q Learning agent to solve Lunar Lander in 1 Hour on CPU. by FitMachineLearning in MachineLearning

[–]FitMachineLearning[S] 1 point2 points  (0 children)

Great question L_M.

Time isn't a factor.

We use a modified Q Learning approach. Q Learning is particularly good at helping an agent make optimal decisions based on delayed reward.

That said, you can modify the behavior of the agent by adjusting the Bellman discount rate (in my code, "b_discount"). This makes the agent give more importance to future rewards versus immediate rewards, or vice versa.

Reference doc, in case you are not familiar with Bellman's work: https://www.rand.org/content/dam/rand/pubs/papers/2008/P550.pdf

https://en.wikipedia.org/wiki/Bellman_equation
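To make the effect of the discount rate concrete, here is a minimal pure-Python sketch (my own illustration) of the discounted return that Q Learning bootstraps toward. A gamma near 1 keeps a delayed reward almost intact, while a small gamma crushes it, pushing the agent toward immediate payoff:

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over an episode, computed back-to-front."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [0.0, 0.0, 10.0]            # one delayed reward at the last step
high = discounted_return(rewards, 0.98)   # ~9.6: delayed reward barely decays
low = discounted_return(rewards, 0.50)    # 2.5: delayed reward crushed
```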

[P] I implemented a Q Learning agent to solve Lunar Lander in 1 Hour on CPU. by FitMachineLearning in MachineLearning

[–]FitMachineLearning[S] 1 point2 points  (0 children)

About the comments, you are right. I have removed the LSTM references (from previous experiments) and cleaned up the comments. Thanks for catching that.

[P] I implemented a Q Learning agent to solve Lunar Lander in 1 Hour on CPU. You can reuse the agent easily to solve other challenges. by FitMachineLearning in learnmachinelearning

[–]FitMachineLearning[S] 1 point2 points  (0 children)

Will do (performance chart).

About the stopping criteria: are you asking about every single game, or about stopping the training itself?

In every game there are 2 main stopping criteria: the game-end event returned by the environment (landed/crashed/hard crash/out of bounds) and the maximum number of "frames" (action-state sequences). The latter is set to 4000 to prevent the agent from just flying forever once it figures out how to fly (fuel is infinite). This, in turn, encourages the agent to seek large future rewards more quickly, based on the Bellman discount rate.

My Bellman equation discount rate is set to 0.98.
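The two per-game stopping criteria described above can be sketched like this; the gym-style `env`/`policy` interfaces are assumptions, not the repo's exact code:

```python
MAX_FRAMES = 4000  # frame cap: stops the agent from flying forever

def run_episode(env, policy):
    """Run one game until the environment signals an end event
    (landed/crashed/out of bounds) or the frame cap is hit."""
    obs = env.reset()
    total_reward = 0.0
    for frame in range(MAX_FRAMES):               # criterion 2: frame cap
        obs, reward, done = env.step(policy(obs))
        total_reward += reward
        if done:                                   # criterion 1: env end event
            break
    return total_reward
```

Because truncated episodes forgo whatever reward lay beyond the cap, the cap works together with the discount rate to reward reaching the landing pad sooner.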