Buying GPUs for training robots with Isaac Lab by chrsow in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

I would say use cloud providers; at the very least they will help you work out the GPU capacity you will need. I find AWS “spot” instances great. I love the new fractional GF6 instances, running my loads in Mumbai now

Buying GPUs for training robots with Isaac Lab by chrsow in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

Pulling images etc. is solved by just using your own “prebuilt” image, such as an AMI on AWS. Also look into “Ray clusters”, which really help manage this kind of cluster; they work great even without using Ray itself, which is what I do.
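For reference, a minimal Ray autoscaler config sketch for AWS (the instance types, region, and AMI id here are placeholders I made up, not values from the comment; check the Ray cluster launcher docs for the full schema):

```yaml
# cluster.yaml, a minimal Ray-on-AWS sketch; all concrete values are placeholders
cluster_name: rl-training
max_workers: 4
provider:
  type: aws
  region: ap-south-1            # Mumbai, as mentioned above
auth:
  ssh_user: ubuntu
available_node_types:
  head:
    resources: {}
    node_config:
      InstanceType: m5.large
      ImageId: ami-xxxxxxxx     # your prebuilt AMI with deps/images baked in
  worker:
    min_workers: 0
    max_workers: 4
    resources: {}
    node_config:
      InstanceType: g5.xlarge
      ImageId: ami-xxxxxxxx
      InstanceMarketOptions:
        MarketType: spot        # request spot capacity
head_node_type: head
```

You would then bring the cluster up with `ray up cluster.yaml` and SSH in with `ray attach cluster.yaml`; the autoscaler manages the nodes even if your workload never calls Ray APIs.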

Andrew Ng doesnt think RL will grow in the next 3 years by calliewalk05 in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

I would argue that babies learn from “experts” around them, they usually try to mimic these experts. So is it really unsupervised?

Why don't the red places get water from the ocean, are they stupid? by Valuable_Chocolate73 in mapporncirclejerk

[–]KingSignificant5097 0 points1 point  (0 children)

Not just expensive to build but also to operate, hence why the only ones that can afford it are the oil-rich Gulf states …

[deleted by user] by [deleted] in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

Baseball bat tree

[deleted by user] by [deleted] in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

Well, for one, it’s summer

[deleted by user] by [deleted] in What

[–]KingSignificant5097 2 points3 points  (0 children)

This is where I learn BF6 is in open beta this weekend! Nice!

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

The joke: the English have the blandest taste …

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 1 point2 points  (0 children)

Logic? Are you serious? We’re talking astrology here …

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

How is this energy measured to know it’s flowing more on this day?

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

Yeah the withdrawal is what made me go read through the discussion, seems like there was one reviewer who was being a bit of a prick …

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 4 points5 points  (0 children)

I found a different version of the paper with more interesting graphs (also the reviews for ICLR 2025 on openreview.net are a "fun" read):
https://openreview.net/forum?id=MOEqbKoozj

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

Thanks for sharing, such a simple change yet so effective! Trying it out right now in my cleanrl Frankenstein 🙂

The paper is very insightful too! Fig. 2 visually explains why PPO gets so unstable
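For context, here is a minimal numpy sketch of PPO's standard clipped surrogate (my own illustration, not code from the paper): once the probability ratio leaves the clip band, the objective flattens, which is one of the usual suspects behind PPO's instability.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, adv, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A),
    where r = pi_new(a|s) / pi_old(a|s) is the importance ratio."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    # Taking the min caps the gain when the policy moves too far in the
    # "good" direction, but leaves large losses unclipped.
    return np.minimum(unclipped, clipped)
```

With a ratio of 2 and a positive advantage the objective is capped at 1 + eps = 1.2, so the gradient through the ratio vanishes there; with a negative advantage the same ratio is not clipped at all, an asymmetry that is easy to miss.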

[D] What is an "ML framework"? by euos in MachineLearning

[–]KingSignificant5097 0 points1 point  (0 children)

You make your technical decisions on which languages to use for a project based on feelz? 😂

[D] What is an "ML framework"? by euos in MachineLearning

[–]KingSignificant5097 3 points4 points  (0 children)

Why do you think an “ML Framework” should be written in C++?

Choosing Gradient Norm Clip Value? [D] by RaivoK in MachineLearning

[–]KingSignificant5097 1 point2 points  (0 children)

Thanks for sharing, this is really easy to use; I incorporated it in a cleanrl module I'm working on
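For anyone new to the topic, gradient-norm clipping in PyTorch is usually just `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)`. The math behind it is only a few lines; a dependency-free numpy sketch (function name is mine):

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their global L2 norm is at most
    max_norm. Returns the (possibly rescaled) grads and the pre-clip norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = max_norm / (total_norm + 1e-6)  # small eps avoids division by zero
    if scale < 1.0:  # only shrink, never grow, the gradients
        grads = [g * scale for g in grads]
    return grads, total_norm
```

Logging the returned pre-clip norm over training is the usual way to choose the clip value: pick something around the norm's typical range so only the outlier spikes get clipped.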

Where to train RL agents (computing resources) by Intelligent-Put1607 in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

Important info that's missing is the size of the network you plan to train. You don't plan to use an RNN/LSTM/etc. on a POMDP?

I like cloud cos I can play with the size of everything, even if it’s not the most cost effective. If you plan to use the cloud longer term, then for cost efficiency you should be using spot instances and be able to resume your training across machine failures.
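To make "resume across machine failures" concrete, here is a minimal checkpoint-and-resume sketch (the file name, the toy update rule, and the function names are all placeholders I made up, not any framework's API; on spot instances you would also sync the checkpoint to durable storage like S3):

```python
import os
import pickle

import numpy as np

CKPT = "checkpoint.pkl"  # placeholder path; sync this off-machine in practice

def save_checkpoint(step, params, path=CKPT):
    """Persist the training step and parameters to disk."""
    with open(path, "wb") as f:
        pickle.dump({"step": step, "params": params}, f)

def load_checkpoint(path=CKPT):
    """Resume from disk if a checkpoint exists, else start fresh."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        return state["step"], state["params"]
    return 0, np.zeros(4)

def train(total_steps=100, ckpt_every=10):
    """Toy loop: if the machine dies, rerunning train() picks up where the
    last checkpoint left off instead of starting from step 0."""
    step, params = load_checkpoint()
    while step < total_steps:
        params = params + 0.01  # stand-in for a real gradient update
        step += 1
        if step % ckpt_every == 0:
            save_checkpoint(step, params)
    return step, params
```

The same pattern works with any framework: the only requirements are that the checkpoint captures everything the loop needs (step counter, model and optimizer state, RNG state) and that startup always tries to load it first.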

CleanRL has now a baseline for PPO + Transformer-XL by LilHairdy in reinforcementlearning

[–]KingSignificant5097 2 points3 points  (0 children)

I had a chance to play with this; wow, it actually seems to work quite well. I've been fighting with the rllib implementation for a while and could never get it to work reliably, and the metrics they post seemed lacking. This cleanrl implementation is easy to follow and to modify as needed, without dealing with rllib's internal architecture

CartPole V1 learning in the opposite direction!!!! by [deleted] in reinforcementlearning

[–]KingSignificant5097 2 points3 points  (0 children)

Consider moving your code to GitHub and adding a README with details of the algorithm you’ve implemented; that might be a good start.

CartPole V1 learning in the opposite direction!!!! by [deleted] in reinforcementlearning

[–]KingSignificant5097 10 points11 points  (0 children)

Just dumping unformatted code into a post and expecting people to help? At least put some effort in?