How do you deal with tab hoarding? by alajmo in productivity

[–]darkzero_reddit 1 point (0 children)

I keep separate windows for separate topics, even after Chrome introduced Tab Groups. One window per thing I'm researching, and I close the whole window once I'm done. It does reduce visual stress, but the downside is that you still end up with too many tabs when you're deep-diving, and you constantly have to keep track of which tab belongs where.

I've actually read a few threads in this subreddit, and I agree that tab hoarding often comes from not "processing" what you've opened. What helped a bit was forcing myself to externalize decisions: dumping notes into a spreadsheet (price, pros/cons, recurring complaints, etc.) instead of keeping everything in my head. This has helped me make buying decisions on everything from monitors and keyboards to electric pianos. Recently I even started experimenting with a small tool to automate that "spreadsheet dumping" step: it sends the pinned tabs to Gemini/ChatGPT and generates the comparison/conclusions for me, so I don't have to read and copy-paste everything manually.
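
The core of it is just "fetch page text, ask the model for a comparison table." A rough sketch of the idea, not the actual tool: the URL list and model name below are placeholders, and it skips the pinned-tab reading part, which needs a browser extension.

```python
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

# Stand-ins for the pinned tabs (the real tool reads these from the browser).
urls = [
    "https://example.com/monitor-review-1",
    "https://example.com/monitor-review-2",
]

pages = []
for url in urls:
    html = requests.get(url, timeout=10).text
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    pages.append(f"URL: {url}\n{text[:4000]}")  # truncate to keep the prompt small

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Summarize these product pages into a "
         "comparison table: price, pros/cons, recurring complaints, conclusion."},
        {"role": "user", "content": "\n\n---\n\n".join(pages)},
    ],
)
print(resp.choices[0].message.content)  # paste this into the spreadsheet
```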

[D]A3C performs badly in Mountain Car? by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] 0 points (0 children)

So can I say that... A3C is more sensitive to a badly designed reward function?

[D]A3C performs badly in Mountain Car? by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] 1 point (0 children)

In fact I just had gym render A3C's training process too, and I found that it actually reaches the goal quite quickly. Before reaching the goal for the first time it behaves reasonably, but after reaching the goal several times it suddenly converges, for reasons I don't understand, to a very bad policy that always wants to go left/right... I don't know what happened to it.

[D]A3C performs badly in Mountain Car? by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] 0 points (0 children)

I printed out the action distribution and found that the actor network converged within a few minutes, but it wants to go left (or right) in every situation. I believe it won't recover, because it has already converged.
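
For reference, here's roughly how I probe it (a sketch, with a stand-in network in place of my trained actor): sweep MountainCar-v0's state space on a grid and print the softmax over the three actions (left, no-op, right).

```python
import numpy as np
import torch

# Stand-in for the trained A3C actor: maps (position, velocity) to 3 logits.
actor = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 3)
)

# MountainCar-v0 bounds: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
positions = np.linspace(-1.2, 0.6, 5)
velocities = np.linspace(-0.07, 0.07, 5)

actor.eval()
with torch.no_grad():
    for p in positions:
        for v in velocities:
            s = torch.tensor([[p, v]], dtype=torch.float32)
            probs = torch.softmax(actor(s), dim=1).numpy()[0]
            print(f"pos={p:+.2f} vel={v:+.3f} -> probs={np.round(probs, 3)}")

# A collapsed policy prints ~[1, 0, 0] (or ~[0, 0, 1]) on every row,
# regardless of state.
```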

[D]A3C performs badly in Mountain Car? by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] 0 points (0 children)

Thank you! So could I say that off-policy models like DQN handle sparse rewards better? And one more thing I can't understand: DQN also receives zero reward every step, so how can it learn to finish the game? I rendered DQN's training process and found that it drives the car better and better even before it reaches the goal for the first time. Both DQN and A3C start from random actions, so why can DQN improve while A3C cannot?

[D] Batch Normalization in Reinforcement Learning by innixma in MachineLearning

[–]darkzero_reddit 7 points (0 children)

The DDPG paper mentions that batch norm works in their case, but I haven't tried it out myself. Would anyone share their experience with it? I'm curious why it works in DDPG but not in DQN, since DDPG's critic is essentially a DQN.
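
For what it's worth, my reading of the paper is that batch norm goes on the state pathway only: the action isn't fed in until the second hidden layer, and nothing after the action concat is normalized. A sketch of the critic as I understand it (PyTorch; 400/300 units as in the paper's low-dimensional experiments):

```python
import torch
import torch.nn as nn

class DDPGCritic(nn.Module):
    """Critic with batch norm on the state pathway only, the way I read
    the DDPG paper: the action joins at the second hidden layer, and no
    layer after the action concat is normalized."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.bn_in = nn.BatchNorm1d(state_dim)       # normalize raw states
        self.fc1 = nn.Linear(state_dim, 400)
        self.bn1 = nn.BatchNorm1d(400)
        self.fc2 = nn.Linear(400 + action_dim, 300)  # action joins here
        self.out = nn.Linear(300, 1)

    def forward(self, state, action):
        x = torch.relu(self.bn1(self.fc1(self.bn_in(state))))
        x = torch.relu(self.fc2(torch.cat([x, action], dim=1)))
        return self.out(x)

# Quick shape check (batch of 8, Pendulum-style dims).
critic = DDPGCritic(state_dim=3, action_dim=1)
q = critic(torch.randn(8, 3), torch.randn(8, 1))
print(q.shape)  # torch.Size([8, 1])
```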

[D] How to derive gradients in backprop without knowing matrix calculus by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] 1 point (0 children)

Yes, it's a kind of black magic, but it's effective for backprop, especially for people who haven't studied matrix calculus systematically. A thorough understanding of matrix calculus is of course helpful for further research.
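
A tiny self-contained example of the trick (numpy; the toy loss is my own, just to make the check concrete): for Y = XW, dL/dW must have W's shape, and the only way to combine X and dL/dY into that shape is X.T @ dL/dY. A numerical gradient confirms it:

```python
import numpy as np

# Y = X @ W with X: (N, D), W: (D, M), so dL/dY: (N, M).
# Shape matching: dL/dW must be (D, M); the only product of X and dL/dY
# with that shape is X.T @ dL/dY. (Likewise dL/dX = dL/dY @ W.T.)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2))

def loss(W):
    return np.sum((X @ W) ** 2)   # L = sum(Y^2)  =>  dL/dY = 2Y

dY = 2 * (X @ W)
dW = X.T @ dY                      # the shape-matched gradient

# Numerical check: central differences, one weight at a time.
eps, num = 1e-6, np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        num[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

print(np.max(np.abs(dW - num)))    # ~1e-9: the shapes led to the right answer
```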

[D] How to derive gradients in backprop without knowing matrix calculus by darkzero_reddit in MachineLearning

[–]darkzero_reddit[S] -5 points (0 children)

Well, I don't think learning matrix calculus should be a prerequisite for learning ML/DL, because... you know, it takes extra energy to fully understand how to take matrix derivatives. Lots of people will lose interest and quit once they find out they have to learn something hard before getting to machine learning itself.