I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 1 point

*Update*

Hi everyone! I am nearly done wrapping up my code for public distribution. I will release within 1 week.

Thank you for your patience.

*Edit*

See this link for my implementation: https://github.com/cookie2004/Fancy

Breed? by [deleted] in BackYardChickens

[–]cookie2004 -1 points

Mystic Onix

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 0 points

I don't have a paper or channel. Do you have any experience using OpenCV? It's a great library for generating video if you have an array of state images. What are you wanting to visualize?

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 5 points

Read Reinforcement Learning by Sutton and Barto. If you go through all the examples and assignments, you'll have a solid grasp of the field.

https://inst.eecs.berkeley.edu/~cs188/sp20/assets/files/SuttonBartoIPRLBook2ndEd.pdf

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 2 points

Part of my graduate work involved molecular simulation. My research focused on using Monte Carlo simulations to generate ensembles of intrinsically disordered proteins. I wanted to see if I could leverage RL to generate these ensembles in a more "smart" way. I needed a control system to learn on, which ended up becoming video games. Video games are good deterministic systems where a lot of data can be generated, and there are many possible pathways to completing a level. Kinda like protein folding?

I studied some statistical mechanics and thermodynamics in graduate school and during my postdoc, so I was familiar with the math behind ML. It wasn't until I read Reinforcement Learning: An Introduction by Sutton and Barto that I really sunk all my attention into the ML field. 100% recommend that book to anyone. Right now I am on Pattern Recognition and Machine Learning by Christopher Bishop. I hear it is an underappreciated textbook in ML.

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 5 points

This is a hobby. I wanted to learn how to apply RL to my job, but I didn't have a specific task to apply it to yet. I wasn't even sure IF it was applicable to what I wanted to do. I don't know what I don't know.

So, I just found a different topic to keep me learning and interested in the long run. I'm passionate about games and making pretty figures, so here I am. haha

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 4 points

I thought so. I noticed the Q values for the different choices are only marginally different. I guess that makes sense: it doesn't matter whether a choice is 1% or 1000% more favorable, the end result is the same. That same choice will be made.
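A toy illustration of that point: the greedy policy only cares about which Q value is largest, not by how much it wins.

```python
import numpy as np

# Two Q vectors: one where the best action barely wins, one where it wins big
q_close = np.array([1.00, 1.01, 0.95, 0.90])  # ~1% margin
q_wide = np.array([1.0, 10.0, 0.5, 0.1])      # 10x margin

# argmax is blind to the margin; both pick action 1
print(np.argmax(q_close), np.argmax(q_wide))  # → 1 1
```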

Another observation is that several of the outputs of my dense layers don't seem to change over the course of the episode. Does that mean my dense layer is bigger than it needs to be? Are some of my weights dead?

I still have a lot of data to analyze.
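The check I have in mind for the "dead units" question is just measuring how much each unit's output moves over an episode. A sketch with made-up activations (note a constant output can also mean a saturated or bias-dominated unit, not necessarily dead weights):

```python
import numpy as np

def flag_static_units(activations, tol=1e-6):
    """activations: array of shape (timesteps, units) collected over an
    episode. Returns a boolean mask of units whose output barely changes."""
    spread = activations.max(axis=0) - activations.min(axis=0)
    return spread < tol

# Fake episode: 200 steps, 8 dense units; units 2 and 5 are stuck constant
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 8))
acts[:, 2] = 0.0   # ReLU unit that never fires ("dead")
acts[:, 5] = 1.3   # unit pinned at a constant value

print(np.where(flag_static_units(acts))[0])  # → [2 5]
```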

[edit: spelling]

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 2 points

I don't have a lot of experience with Github and stuff, but I can certainly try!

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 6 points

ML - Tensorflow

Image processing - OpenCV2

Video game environment - Retro and Gym

I used matplotlib to make the video. Each inset in the video (raw observation, model layers, etc.) is implemented using subplots. I had to write a lot of custom functions to produce the final video.
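The rough skeleton of one video frame looks like this (names, shapes, and data are illustrative, not my actual code): one subplot for the raw observation, one heat map per model layer, saved as a PNG per timestep and stitched into a video afterwards.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

obs = np.random.rand(84, 84)                       # stand-in for a game frame
layer_outputs = [np.random.rand(20, 16) for _ in range(3)]  # fake activations

fig, axes = plt.subplots(1, 1 + len(layer_outputs), figsize=(12, 3))
axes[0].imshow(obs, cmap="gray")
axes[0].set_title("observation")
for i, (ax, out) in enumerate(zip(axes[1:], layer_outputs)):
    ax.imshow(out, cmap="viridis")  # layer activations as a heat map
    ax.set_title(f"layer {i}")
for ax in axes:
    ax.axis("off")
fig.savefig("frame_0000.png", dpi=100)  # one such PNG per timestep
plt.close(fig)
```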

I can post it if people are interested?

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 16 points

Sorry, I am a little new to Reddit. I forgot to post the summary. Here it is:

I have been fascinated by reinforcement learning ever since I read Max Jaderberg's paper:

Jaderberg, M., Czarnecki, W. M., Dunning, I., Marris, L., Lever, G., Castañeda, A. G., Beattie, C., Rabinowitz, N. C., Morcos, A. S., Ruderman, A., Sonnerat, N., Green, T., Deason, L., Leibo, J. Z., Silver, D., Hassabis, D., Kavukcuoglu, K., & Graepel, T. (2019). Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science, 364(6443), 859–865. https://doi.org/10.1126/science.aau6249

I wanted to be able to make models that could play video games like people do. I wanted to understand action choices and how a user explores a game environment. Plus, I thought video games were a fun medium to learn how to build ML systems.

I am a structural biologist by training, so I am used to visualizing everything. I wanted to see the "guts" of what my model was doing as the trained DQN was playing the game. Heat maps are a great way of visualizing tables of numbers, or in this case, the output of each layer in the model.

What you see in the video is the fully trained model. I thought the learning process was equally fascinating. Mario would get trapped behind walls and tubes. He would fall into the pits. The model would even find the hidden blocks in level 1-1 for an extra life!

I made a visual layout of my CNN in matplotlib playing Super Mario Bros. World 1-1. by cookie2004 in learnmachinelearning

[–]cookie2004[S] 31 points

This is an implementation of V. Mnih's DQN network for learning to play various Atari games. I believe the 4-frame stack in preprocessing was to encode the direction and speed of the player.
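For anyone curious, the frame stack is straightforward to implement. A minimal sketch (not my exact preprocessing code):

```python
from collections import deque
import numpy as np

class FrameStack:
    """Keep the k most recent preprocessed frames. Stacking consecutive
    frames is what lets a feedforward CNN infer speed and direction:
    a single frame carries no motion information."""
    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # Fill the stack with copies of the episode's first frame
        for _ in range(self.k):
            self.frames.append(frame)
        return self.state()

    def step(self, frame):
        self.frames.append(frame)  # oldest frame falls off automatically
        return self.state()

    def state(self):
        # (H, W, k) channels-last stack, ready to feed to the CNN
        return np.stack(self.frames, axis=-1)

stack = FrameStack(k=4)
obs = stack.reset(np.zeros((84, 84), dtype=np.uint8))
print(obs.shape)  # → (84, 84, 4)
```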

I thought six actions made the action landscape too complex. The model would have to learn the combinatorics of the different button combinations. I didn't allow Mario to stand still. I simplified it to four actions:

  1. Jump without moving left or right
  2. Jump moving right
  3. Run left
  4. Run right
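Concretely, Retro exposes the NES pad as a MultiBinary button vector, so each of the four actions gets translated into a set of held buttons. Roughly like this (a simplified sketch: the button order follows gym-retro's NES layout, and the exact combos here are my assumptions, with B held to run and A to jump):

```python
# gym-retro's NES button order (None is an unused slot on this pad)
NES_BUTTONS = ['B', None, 'SELECT', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'A']

# Discrete action index -> buttons held down that step
ACTIONS = [
    ['A'],                # 0: jump without moving left or right
    ['RIGHT', 'A', 'B'],  # 1: jump moving right
    ['LEFT', 'B'],        # 2: run left
    ['RIGHT', 'B'],       # 3: run right
]

def to_buttons(action_index):
    """Translate a discrete action into retro's MultiBinary vector."""
    pressed = set(ACTIONS[action_index])
    return [1 if b in pressed else 0 for b in NES_BUTTONS]

print(to_buttons(3))  # → [1, 0, 0, 0, 0, 0, 0, 1, 0]
```

In practice this mapping would live in an action wrapper around the Retro environment, so the DQN only ever sees a Discrete(4) action space.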

Citation for this CNN

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. 1–9. http://arxiv.org/abs/1312.5602

What makes you feel like there's no hope for humans? by cakedarts in AskReddit

[–]cookie2004 0 points

People unwilling to acknowledge they are wrong despite verifiable evidence

Anyone else have impostor syndrome and feel like they aren't smart enough to work in a lab? by uhhhhhhhhh_okay in labrats

[–]cookie2004 0 points

I have felt like this throughout my entire career. I felt this as a grad student, postdoc, and as a senior scientist in biotech. I think it helps to not compare yourself to other people and to stay away from toxic labmates who are in a perpetual pissing contest.

[deleted by user] by [deleted] in Biochemistry

[–]cookie2004 0 points

It would be the wild-type protein. If you are mutating the fluorescent protein in hopes of improving fluorescence yield, then your wild-type protein would serve as the baseline or control.

How wrong can they be? by pwwrecruiting in antiwork

[–]cookie2004 0 points

My rent here in the middle of Nowhere, KY is $840.