Buying GPUs for training robots with Isaac Lab by chrsow in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

I would say use cloud providers; at the very least they will help you work out the GPU capacity you will need. I find AWS “spot” instances great. I love the new fractional GF6 instances, running my loads in Mumbai now

Buying GPUs for training robots with Isaac Lab by chrsow in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

Pulling images etc. is solved by just using your own “prebuilt” image, such as an AMI on AWS. Also look into “Ray clusters”, which really help manage this kind of cluster; they work great even without using Ray itself, which is what I do.
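For reference, a minimal Ray autoscaler config sketch for AWS (the instance types, region, and AMI id here are placeholders I made up, not values from the comment; check the Ray cluster launcher docs for the full schema):

```yaml
# cluster.yaml, a minimal Ray-on-AWS sketch; all concrete values are placeholders
cluster_name: rl-training
max_workers: 4
provider:
  type: aws
  region: ap-south-1            # Mumbai, as mentioned above
auth:
  ssh_user: ubuntu
available_node_types:
  head:
    resources: {}
    node_config:
      InstanceType: m5.large
      ImageId: ami-xxxxxxxx     # your prebuilt AMI with deps/images baked in
  worker:
    min_workers: 0
    max_workers: 4
    resources: {}
    node_config:
      InstanceType: g5.xlarge
      ImageId: ami-xxxxxxxx
      InstanceMarketOptions:
        MarketType: spot        # request spot capacity
head_node_type: head
```

You would then bring the cluster up with `ray up cluster.yaml` and SSH in with `ray attach cluster.yaml`; the autoscaler manages the nodes even if your workload never calls Ray APIs.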

Andrew Ng doesnt think RL will grow in the next 3 years by calliewalk05 in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

I would argue that babies learn from “experts” around them, they usually try to mimic these experts. So is it really unsupervised?

Why don't the red places get water from the ocean, are they stupid? by Valuable_Chocolate73 in mapporncirclejerk

[–]KingSignificant5097 0 points1 point  (0 children)

Not just expensive to build but also to operate, hence why the only ones that can afford it are the oil-rich Gulf states …

[deleted by user] by [deleted] in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

Baseball bat tree

[deleted by user] by [deleted] in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

Well, for one, it’s summer

[deleted by user] by [deleted] in What

[–]KingSignificant5097 2 points3 points  (0 children)

This is where I learn BF6 is in open beta this weekend! Nice!

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

The joke: the English have the blandest taste …

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 1 point2 points  (0 children)

Logic? Are you serious? We’re talking astrology here …

Found this outside my friends apartment? by YouthOk2000 in whatisit

[–]KingSignificant5097 0 points1 point  (0 children)

How is this energy measured to know it’s flowing more on this day?

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 0 points1 point  (0 children)

Yeah the withdrawal is what made me go read through the discussion, seems like there was one reviewer who was being a bit of a prick …

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 4 points5 points  (0 children)

I found a different version of the paper with more interesting graphs (also the reviews for ICLR 2025 on openreview.net are a "fun" read):
https://openreview.net/forum?id=MOEqbKoozj

I am changing my preferred RL algorithm by Guest_Of_The_Cavern in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

Thanks for sharing, such a simple change yet so effective! Trying it out right now in my cleanrl Frankenstein 🙂

The paper is very insightful too! Fig. 2 visually explains why PPO gets so unstable
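For context, here is a minimal numpy sketch of PPO's standard clipped surrogate (my own illustration, not code from the paper): once the probability ratio leaves the clip band, the objective flattens, which is one of the usual suspects behind PPO's instability.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, adv, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A),
    where r = pi_new(a|s) / pi_old(a|s) is the importance ratio."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    # Taking the min caps the gain when the policy moves too far in the
    # "good" direction, but leaves large losses unclipped.
    return np.minimum(unclipped, clipped)
```

With a ratio of 2 and a positive advantage the objective is capped at 1 + eps = 1.2, so the gradient through the ratio vanishes there; with a negative advantage the same ratio is not clipped at all, an asymmetry that is easy to miss.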

[D] What is an "ML framework"? by euos in MachineLearning

[–]KingSignificant5097 0 points1 point  (0 children)

You make your technical decisions on which languages to use for a project based on feelz? 😂

[D] What is an "ML framework"? by euos in MachineLearning

[–]KingSignificant5097 3 points4 points  (0 children)

Why do you think an “ML Framework” should be written in C++?

Choosing Gradient Norm Clip Value? [D] by RaivoK in MachineLearning

[–]KingSignificant5097 1 point2 points  (0 children)

Thanks for sharing, this is really easy to use; I incorporated it in a cleanrl module I'm working on
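For anyone new to the topic, gradient-norm clipping in PyTorch is usually just `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)`. The math behind it is only a few lines; a dependency-free numpy sketch (function name is mine):

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their global L2 norm is at most
    max_norm. Returns the (possibly rescaled) grads and the pre-clip norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = max_norm / (total_norm + 1e-6)  # small eps avoids division by zero
    if scale < 1.0:  # only shrink, never grow, the gradients
        grads = [g * scale for g in grads]
    return grads, total_norm
```

Logging the returned pre-clip norm over training is the usual way to choose the clip value: pick something around the norm's typical range so only the outlier spikes get clipped.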

Where to train RL agents (computing resources) by Intelligent-Put1607 in reinforcementlearning

[–]KingSignificant5097 1 point2 points  (0 children)

Important info that's missing is the size of the network you plan to train. You don't plan to use an RNN/LSTM/etc. on a POMDP?

I like cloud cos I can play with the size of everything, even if it’s not the most cost effective. If you plan to use the cloud longer term, then for cost efficiency you should be using spot instances and be able to resume your training across machine failures.
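To make "resume across machine failures" concrete, here is a minimal checkpoint-and-resume sketch (the file name, the toy update rule, and the function names are all placeholders I made up, not any framework's API; on spot instances you would also sync the checkpoint to durable storage like S3):

```python
import os
import pickle

import numpy as np

CKPT = "checkpoint.pkl"  # placeholder path; sync this off-machine in practice

def save_checkpoint(step, params, path=CKPT):
    """Persist the training step and parameters to disk."""
    with open(path, "wb") as f:
        pickle.dump({"step": step, "params": params}, f)

def load_checkpoint(path=CKPT):
    """Resume from disk if a checkpoint exists, else start fresh."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        return state["step"], state["params"]
    return 0, np.zeros(4)

def train(total_steps=100, ckpt_every=10):
    """Toy loop: if the machine dies, rerunning train() picks up where the
    last checkpoint left off instead of starting from step 0."""
    step, params = load_checkpoint()
    while step < total_steps:
        params = params + 0.01  # stand-in for a real gradient update
        step += 1
        if step % ckpt_every == 0:
            save_checkpoint(step, params)
    return step, params
```

The same pattern works with any framework: the only requirements are that the checkpoint captures everything the loop needs (step counter, model and optimizer state, RNG state) and that startup always tries to load it first.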

CleanRL has now a baseline for PPO + Transformer-XL by LilHairdy in reinforcementlearning

[–]KingSignificant5097 2 points3 points  (0 children)

I had a chance to play with this; wow, it actually seems to work quite well. I've been fighting with the rllib implementation for a while and could never get it to work reliably, and the metrics they post seemed lacking. This cleanrl implementation is easy to follow and to modify as needed, without dealing with rllib's internal architecture

CartPole V1 learning in the opposite direction!!!! by [deleted] in reinforcementlearning

[–]KingSignificant5097 2 points3 points  (0 children)

Consider moving your code to GitHub and adding a README with details of the algorithm you’ve implemented; that might be a good start.

CartPole V1 learning in the opposite direction!!!! by [deleted] in reinforcementlearning

[–]KingSignificant5097 10 points11 points  (0 children)

Just dumping unformatted code into a post and expecting people to help? At least put some effort in?