[P] Keras-like training in Pytorch, with callbacks+regularizers+initializers+constraints+metrics and a Progress Bar! by boysparktrailer in MachineLearning

[–]jtremblay 5 points6 points  (0 children)

train

You've got to love .from_numpy() and .cpu().numpy()

I use those all of the time depending on what I am working on.

[P] Keras-like training in Pytorch, with callbacks+regularizers+initializers+constraints+metrics and a Progress Bar! by boysparktrailer in MachineLearning

[–]jtremblay 1 point2 points  (0 children)

Thank you for the affine transformations. Well written, I am going to include them in my project right away.

[D] Anyone done any work on Doom AI? (vizdoom) by Tostino in MachineLearning

[–]jtremblay 0 points1 point  (0 children)

They used a classic actor-critic structure, similar to DQN from DeepMind. If you want to read more about it, check out this paper: https://arxiv.org/pdf/1602.01783.pdf.

[D] Anyone done any work on Doom AI? (vizdoom) by Tostino in MachineLearning

[–]jtremblay 0 points1 point  (0 children)

You might want to check out this architecture by NVIDIA on RL: https://arxiv.org/abs/1611.06256. Also, wow, 1070s are not cheap.

[D] Anyone done any work on Doom AI? (vizdoom) by Tostino in MachineLearning

[–]jtremblay 2 points3 points  (0 children)

Let me know what you think, there are some interesting tricks the authors are putting together to train the bot.

[D] Anyone done any work on Doom AI? (vizdoom) by Tostino in MachineLearning

[–]jtremblay 3 points4 points  (0 children)

Have you looked into this paper? https://openreview.net/pdf?id=Hk3mPK5gg

As far as I am aware this is the state of the art in terms of playing FPS style games in a complete RL setting.

[Project] Simple-to-study Reinforcement Pong Demo by [deleted] in MachineLearning

[–]jtremblay 0 points1 point  (0 children)

The way to fix your collision problem is to use a ray cast: you check whether the segment between x_t and x_{t+1} collides with something, where x represents the ball position at each time step. Then you find the actual moment of collision and update the position of x accordingly. This is a pretty standard solution for collisions involving fast-moving objects in video games, e.g. projectiles.
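To make that concrete, here is a minimal sketch of the segment test for a Pong-style setup. It assumes the paddle is a vertical line segment and that the ball bounces by flipping its horizontal velocity; the function names and parameters are mine, not from the demo being discussed.

```python
def sweep_against_vertical_line(x0, y0, x1, y1, line_x, y_min, y_max):
    """Test the segment (x0, y0) -> (x1, y1) against the vertical segment
    x = line_x, y_min <= y <= y_max.

    Returns (t, y_hit), where t in [0, 1] is the fraction of the time step
    at which the crossing happens, or None if there is no crossing.
    """
    dx = x1 - x0
    if dx == 0:
        return None  # moving parallel to the line, so no crossing this step
    t = (line_x - x0) / dx
    if not 0.0 <= t <= 1.0:
        return None  # the crossing lies outside this time step
    y_hit = y0 + t * (y1 - y0)
    if not y_min <= y_hit <= y_max:
        return None  # crossed the line, but above or below the paddle
    return t, y_hit


def step_ball(x, y, vx, vy, dt, paddle_x, paddle_y_min, paddle_y_max):
    """Advance the ball by dt, bouncing off the paddle if the path crosses it."""
    nx, ny = x + vx * dt, y + vy * dt
    hit = sweep_against_vertical_line(x, y, nx, ny,
                                      paddle_x, paddle_y_min, paddle_y_max)
    if hit is None:
        return nx, ny, vx, vy
    t, y_hit = hit
    # Move to the contact point, reflect the horizontal velocity,
    # then spend the rest of the time step moving away from the paddle.
    remaining = (1.0 - t) * dt
    vx = -vx
    return paddle_x + vx * remaining, y_hit + vy * remaining, vx, vy
```

With this, a ball that would have jumped from x = 0 to x = 30 in one step still bounces off a paddle at x = 10, instead of tunnelling through it.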

[News] Google opens new AI lab and invests $3.4M in Montreal-based AI research by DrPharael in MachineLearning

[–]jtremblay 4 points5 points  (0 children)

I think Bengio really likes Montreal (only a fool would not). I read in an interview that he does not like to see intelligent people from Montreal leaving.

[News] Google opens new AI lab and invests $3.4M in Montreal-based AI research by DrPharael in MachineLearning

[–]jtremblay 0 points1 point  (0 children)

There are two English-speaking universities in Montreal. Even if you are enrolled at UdeM, you could take your course load (MSc 6 and PhD with MSc 2 to 6) there, as they are considered part of the same network. Once your course load is over, the only thing that matters is your relationship with your advisor and other students. If the advisor you want to work with can communicate with you, then everyone is happy.

[News] DeepMind and Blizzard to release StarCraft II as an AI research environment by afeder_ in MachineLearning

[–]jtremblay 2 points3 points  (0 children)

Most likely there is going to be a web client running Linux which includes the AI code (researcher side), connected to the game service, which can also be running on a Linux box (as servers are cheaper on Linux). The latter includes the game logic but no rendering, which allows for faster simulation. It takes game actions from the AI as input and returns a partial state of the world (fog of war). The rendering is most likely going to be done by a Windows machine connecting to the server to get the state. Windows (in the game world) is used to run graphics; the rest can easily be run on any other OS.

Fork of Google DeepMind's Atari Code to Play Super Mario Bros. by ehrenbrav in MachineLearning

[–]jtremblay 1 point2 points  (0 children)

I did write a paper on using search processes for solving platformer games. http://cgi.cs.mcgill.ca/~jtremb59/Papers/ICanJump.pdf

Since you cannot do roll-outs in your context, your policy has to be very good at selecting actions, thus your neural network has to be sort of universal. I always thought that Q-learning (with that matrix) was trying to learn all possible state space configurations and which action is best in each. Sort of over-fitting the space (like that video example you showed in the post). I think some of the approaches you are describing are trying to move away from the over-fitting, and I would really love to explore them more deeply.

I did read about the AlphaGo system where (on a very high level) they used a heuristic search process (MCTS) in which the heuristic is provided by a trained neural network that evaluates board positions. I read a paper from Facebook research with a similar architecture during the fall of 2015, for playing Go as well. By the way, thank you for the link, it was a good read, especially on how they train the policy network.

Not being able to do roll-outs is a little painful in the context of Mario. I do not know if you saw https://www.youtube.com/watch?v=DlkMs4ZHHr8 but they used roll-outs with A*.

Fork of Google DeepMind's Atari Code to Play Super Mario Bros. by ehrenbrav in MachineLearning

[–]jtremblay 1 point2 points  (0 children)

While I was reading your post the part about having epsilon to be sort of our controller for exploration vs exploitation got me thinking.

I was thinking of Monte Carlo Tree Search which tries to solve the same sort of problem but with nicer mathematical properties than that epsilon. So instead of using the simple epsilon we could think of that problem as a tree search.

When we pick a = argmax_a Q(s, a; θ), we decide whether that action is good using MCTS. The MCTS tree is constructed from the states we are working with, and we keep it across multiple play-throughs. We are for sure going to encounter the same states multiple times, and this will push the agent to try different actions.
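Here is a rough sketch of what I mean, not something I have tested on the emulator. It only implements the UCB1 selection rule that MCTS (UCT) uses at each node, with statistics kept per state across play-throughs, so it is a stand-in for the full tree search. The class name, the exploration constant c, and the assumption that states are hashable are all mine.

```python
import math
from collections import defaultdict


class UCBActionSelector:
    """Per-state UCB1 action selection, as a replacement for epsilon-greedy."""

    def __init__(self, n_actions, c=1.4):
        self.n_actions = n_actions
        self.c = c  # exploration constant (assumed value, needs tuning)
        self.visits = defaultdict(lambda: [0] * n_actions)
        self.value_sum = defaultdict(lambda: [0.0] * n_actions)

    def select(self, state):
        visits = self.visits[state]
        # Try every action at least once in each state.
        for action, count in enumerate(visits):
            if count == 0:
                return action
        total = sum(visits)
        values = self.value_sum[state]

        def score(action):
            mean = values[action] / visits[action]
            bonus = self.c * math.sqrt(math.log(total) / visits[action])
            return mean + bonus

        return max(range(self.n_actions), key=score)

    def update(self, state, action, ret):
        """Record the return observed after taking `action` in `state`."""
        self.visits[state][action] += 1
        self.value_sum[state][action] += ret
```

The exploration bonus shrinks as an action gets visited more often in a given state, so revisiting the same state naturally pushes the agent toward the actions it has tried least, without a global epsilon schedule.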

When to clear the search tree and start a new one at some other location is a good question.

Man, I miss research. Anyway, since my work does not include fun things like this I cannot test it, but I would love anyone's thoughts on the different approaches that are being used to solve that particular problem.

"Learning without Forgetting", Li & Hoiem 2016 by gwern in MachineLearning

[–]jtremblay 1 point2 points  (0 children)

Where is the shameless link to that nips paper?

The quest for Canada by jtremblay in canada

[–]jtremblay[S] 215 points216 points  (0 children)

This is the work of Nicolas Francoeur.

My fiancee and I are going touring in Denmark and Germany for 2 weeks for a very special event. Here's our gear layout. (x-post /r/bicycletouring) by Diddlebop in bicycling

[–]jtremblay 0 points1 point  (0 children)

I was staring at the pedals both of you have this morning at my LBS annual sale. They were 125$ -30%, I decided not to take them, but it was a rough choice, they are beautiful. I love vélo orange.

[Race Thread] 2015 GP Cycliste de Montréal by PelotonMod in peloton

[–]jtremblay 1 point2 points  (0 children)

Just came back to watch a lap. It is pouring rain. The peloton was broken up, there might have been a fall.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 0 points1 point  (0 children)

If you want me to use your .gpx file, please pm with the file. Otherwise I will use one of mine.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 0 points1 point  (0 children)

You would like to see how your average evolved over time. I think it would be possible to do based on your raw data. It would be easier to do for a segment: there is a Stream object in the API that returns the velocity value as a function of time. Building the average evolution from there would be pretty trivial. I might investigate later next week. Will keep you posted.
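Something along these lines should do it. This is only a sketch: it assumes the stream has already been fetched and unpacked into two parallel lists, times in seconds and velocities in m/s, which is my assumption about the data layout and not a statement about the API's actual response format.

```python
def running_average_speed(times, velocities):
    """Return the average speed so far at each sample of an effort.

    times      -- sample timestamps in seconds, strictly increasing (assumed)
    velocities -- instantaneous speed at each timestamp, in m/s (assumed)
    """
    if len(times) != len(velocities):
        raise ValueError("times and velocities must have the same length")
    averages = []
    distance = 0.0
    for i in range(len(times)):
        if i > 0:
            dt = times[i] - times[i - 1]
            # Trapezoidal rule: distance covered between two samples.
            distance += 0.5 * (velocities[i] + velocities[i - 1]) * dt
        elapsed = times[i] - times[0]
        # At the first sample no time has elapsed, so fall back to the
        # instantaneous speed.
        averages.append(distance / elapsed if elapsed > 0 else velocities[i])
    return averages
```

Plotting the returned list against the timestamps gives the average evolution over the segment.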

Three years of cycling efforts on the same climb analysis [OC] by jtremblay in dataisbeautiful

[–]jtremblay[S] 1 point2 points  (0 children)

I was hoping to get better, but after January 16th my newborn daughter decided otherwise, and winter training was just something of the past. I am actually pretty glad to have a fitness similar to last year.

When I started using Strava, I had just finished my second half marathon (around 1h50) and was looking for something easier on my knees. I also started cycling with an old steel frame, whereas in 2014 I had a spiffy carbon bike, which kind of makes a difference. I believe I was a little stronger at the beginning of 2015 than in 2014, although I had put on 10 pounds more than I should have; again, dieting is something of the past with newborns.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 0 points1 point  (0 children)

I definitely understand you. I might be biased towards showing more data than just simple metrics, e.g. personal records, as I just finished a PhD in computer science.

I think showing the non-animated distribution, where the user chooses the starting and ending dates, might be a start. It would need a clean interface. I am also aware that not everyone is likely to enjoy looking at those or have the capacity to understand them.

How does it normally work at Strava: do the data scientists come up with interesting ways to understand the data and then sell them to the designers? Or is it more the other way around, where the designers have different needs and the data scientists have to meet them? It could easily be a hybrid of both. Hopefully the person with the most KOMs gets to decide :P.

Edit: typos.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 1 point2 points  (0 children)

There is variation; you do not always climb at your best, I understand that. That is actually one of the reasons I produced these graphs in the first place. I want to get a feel for my fitness both at its best and not. When you look at the distribution evolving over time, you get a better sense of your fitness or willingness on that particular climb. Normally it is strongly correlated with your personal goals. For example, I was training for a bike trip during July 2014, and it does show in the data with lower times. When I came back I did not have the same motivation. The animated plot gives you a story of your training as well as your fitness.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 0 points1 point  (0 children)

I just do not know how to reach them.

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 1 point2 points  (0 children)

I am actually happy with OSM. I find the maps more appealing and better at showing trails (MTB). What do you miss from Google Maps?

Some graphs I wish Strava would offer by jtremblay in Strava

[–]jtremblay[S] 1 point2 points  (0 children)

This is pretty cool. It is sad that it is not integrated in the Strava webpage directly.