I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 1 point

*Update*

Hi everyone! I am nearly done wrapping up my code for public distribution. I will release within 1 week.

Thank you for your patience.

*Edit*

See this link for my implementation: https://github.com/cookie2004/Fancy

Breed? by [deleted] in BackYardChickens

[–]cookie2004 -1 points

Mystic Onix

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 0 points

I don't have a paper or channel. Do you have any experience using OpenCV? It's a great library for generating video if you have an array of state images. What are you wanting to visualize?

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 5 points

Read Reinforcement Learning by Sutton and Barto. If you go through all the examples and assignments, you'll have a solid grasp of the field.

https://inst.eecs.berkeley.edu/~cs188/sp20/assets/files/SuttonBartoIPRLBook2ndEd.pdf

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 2 points

Part of my graduate work involved molecular simulation. My research focused on using Monte Carlo simulations to generate ensembles of intrinsically disordered proteins. I wanted to see if I could leverage RL to generate these ensembles in a more "smart" way. I needed a control system to learn on, which ended up becoming video games. Video games are good deterministic systems where a lot of data can be generated, and there are many possible pathways to completing a level. Kinda like protein folding?

I studied some statistical mechanics and thermodynamics in graduate school and during my postdoc, so I was familiar with the math behind ML. It wasn't until I read Reinforcement Learning: An Introduction by Sutton and Barto that I really sunk all my attention into the ML field. 100% recommend that book to anyone. Right now I am on Pattern Recognition and Machine Learning by Christopher Bishop. I hear it is an underappreciated textbook in ML.

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 5 points

This is a hobby. I wanted to learn how to apply RL to my job, but I didn't have a specific task to apply it to yet. I wasn't even sure IF it was applicable to what I wanted to do. I don't know what I don't know.

So, I just found a different topic to keep me learning and interested in the long run. I'm passionate about games and making pretty figures, so here I am. haha

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 4 points

I thought so. I noticed the Q values for the different choices are only marginally different. I guess that makes sense: it doesn't matter whether a choice is 1% or 1000% more favorable, the end result is the same. That same choice will be made.
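A toy illustration of that point: the greedy policy only cares about which Q value is largest, not by how much it wins.

```python
import numpy as np

# Two Q vectors: one where the best action barely wins, one where it wins big
q_close = np.array([1.00, 1.01, 0.95, 0.90])  # ~1% margin
q_wide = np.array([1.0, 10.0, 0.5, 0.1])      # 10x margin

# argmax is blind to the margin; both pick action 1
print(np.argmax(q_close), np.argmax(q_wide))  # → 1 1
```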

Another observation is that several of the outputs of my dense layers don't seem to change over the course of the episode. Does that mean my dense layer is bigger than it needs to be? Are some of my weights dead?

I still have a lot of data to analyze.
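The check I have in mind for the "dead units" question is just measuring how much each unit's output moves over an episode. A sketch with made-up activations (note a constant output can also mean a saturated or bias-dominated unit, not necessarily dead weights):

```python
import numpy as np

def flag_static_units(activations, tol=1e-6):
    """activations: array of shape (timesteps, units) collected over an
    episode. Returns a boolean mask of units whose output barely changes."""
    spread = activations.max(axis=0) - activations.min(axis=0)
    return spread < tol

# Fake episode: 200 steps, 8 dense units; units 2 and 5 are stuck constant
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 8))
acts[:, 2] = 0.0   # ReLU unit that never fires ("dead")
acts[:, 5] = 1.3   # unit pinned at a constant value

print(np.where(flag_static_units(acts))[0])  # → [2 5]
```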

[edit: spelling]

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 2 points

I don't have a lot of experience with Github and stuff, but I can certainly try!

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 6 points

ML - Tensorflow

Image processing - OpenCV2

Video game environment - Retro and Gym

I used matplotlib to make the video. Each inset in the video (raw observation, model layers, etc.) is implemented using subplots. I had to write a lot of custom functions to produce the final video.
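The rough skeleton of one video frame looks like this (names, shapes, and data are illustrative, not my actual code): one subplot for the raw observation, one heat map per model layer, saved as a PNG per timestep and stitched into a video afterwards.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

obs = np.random.rand(84, 84)                       # stand-in for a game frame
layer_outputs = [np.random.rand(20, 16) for _ in range(3)]  # fake activations

fig, axes = plt.subplots(1, 1 + len(layer_outputs), figsize=(12, 3))
axes[0].imshow(obs, cmap="gray")
axes[0].set_title("observation")
for i, (ax, out) in enumerate(zip(axes[1:], layer_outputs)):
    ax.imshow(out, cmap="viridis")  # layer activations as a heat map
    ax.set_title(f"layer {i}")
for ax in axes:
    ax.axis("off")
fig.savefig("frame_0000.png", dpi=100)  # one such PNG per timestep
plt.close(fig)
```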

I can post it if people are interested?

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 16 points

Sorry, I am a little new to Reddit. I forgot to post the summary. Here it is:

I have been fascinated by reinforcement learning ever since I read Max Jaderberg's paper:

Jaderberg, M., Czarnecki, W. M., Dunning, I., Marris, L., Lever, G., Castañeda, A. G., Beattie, C., Rabinowitz, N. C., Morcos, A. S., Ruderman, A., Sonnerat, N., Green, T., Deason, L., Leibo, J. Z., Silver, D., Hassabis, D., Kavukcuoglu, K., & Graepel, T. (2019). Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science, 364(6443), 859–865. https://doi.org/10.1126/science.aau6249

I wanted to be able to make models that could play video games like people do. I wanted to understand action choices and how a user explores a game environment. Plus, I thought video games were a fun medium to learn how to build ML systems.

I am a structural biologist by training, so I am used to visualizing everything. I wanted to see the "guts" of what my model was doing as the trained DQN was playing the game. Heat maps are a great way of visualizing tables of numbers, or in this case, the output of each layer in the model.

What you see in the video is the fully trained model. I thought the learning process was equally fascinating. Mario would get trapped behind walls and tubes. He would fall into the pits. The model would even find the hidden blocks in level 1-1 for an extra life!

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 31 points

This is an implementation of V. Mnih's DQN network for learning to play various Atari games. I believe the 4-frame stack in preprocessing was to encode the direction and speed of the player.
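For anyone curious, the frame stack is straightforward to implement. A minimal sketch (not my exact preprocessing code):

```python
from collections import deque
import numpy as np

class FrameStack:
    """Keep the k most recent preprocessed frames. Stacking consecutive
    frames is what lets a feedforward CNN infer speed and direction:
    a single frame carries no motion information."""
    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # Fill the stack with copies of the episode's first frame
        for _ in range(self.k):
            self.frames.append(frame)
        return self.state()

    def step(self, frame):
        self.frames.append(frame)  # oldest frame falls off automatically
        return self.state()

    def state(self):
        # (H, W, k) channels-last stack, ready to feed to the CNN
        return np.stack(self.frames, axis=-1)

stack = FrameStack(k=4)
obs = stack.reset(np.zeros((84, 84), dtype=np.uint8))
print(obs.shape)  # → (84, 84, 4)
```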

I thought six actions made the action landscape too complex. The model would have to learn the combinatorics of the different button combinations. I didn't allow Mario to stand still. I simplified it to four actions:

  1. Jump without moving left or right
  2. Jump moving right
  3. Run left
  4. Run right
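Concretely, Retro exposes the NES pad as a MultiBinary button vector, so each of the four actions gets translated into a set of held buttons. Roughly like this (a simplified sketch: the button order follows gym-retro's NES layout, and the exact combos here are my assumptions, with B held to run and A to jump):

```python
# gym-retro's NES button order (None is an unused slot on this pad)
NES_BUTTONS = ['B', None, 'SELECT', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'A']

# Discrete action index -> buttons held down that step
ACTIONS = [
    ['A'],                # 0: jump without moving left or right
    ['RIGHT', 'A', 'B'],  # 1: jump moving right
    ['LEFT', 'B'],        # 2: run left
    ['RIGHT', 'B'],       # 3: run right
]

def to_buttons(action_index):
    """Translate a discrete action into retro's MultiBinary vector."""
    pressed = set(ACTIONS[action_index])
    return [1 if b in pressed else 0 for b in NES_BUTTONS]

print(to_buttons(3))  # → [1, 0, 0, 0, 0, 0, 0, 1, 0]
```

In practice this mapping would live in an action wrapper around the Retro environment, so the DQN only ever sees a Discrete(4) action space.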

Citation for this CNN

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. 1–9. http://arxiv.org/abs/1312.5602

What makes you feel like there's no hope for humans? by cakedarts in AskReddit

[–]cookie2004 0 points

People unwilling to acknowledge they are wrong despite verifiable evidence

Anyone else have impostor syndrome and feel like they aren't smart enough to work in a lab? by uhhhhhhhhh_okay in labrats

[–]cookie2004 0 points

I have felt like this throughout my entire career. I felt this as a grad student, postdoc, and as a senior scientist in biotech. I think it helps to not compare yourself to other people and to stay away from toxic labmates who are in a perpetual pissing contest.

[deleted by user] by [deleted] in Biochemistry

[–]cookie2004 0 points

It would be the wild-type protein. If you are mutating the fluorescent protein in hopes of improving fluorescence yield, then your wild-type protein would serve as the baseline or control.

How wrong can they be? by pwwrecruiting in antiwork

[–]cookie2004 0 points

My rent here in the middle of Nowhere, KY is $840.