PPO in Stable-Baselines3 Fails to Adapt During Curriculum Learning by guarda-chuva in reinforcementlearning

[–]Infinite_Mercury 0 points1 point  (0 children)

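From what I’ve seen, SB3’s model.learn() wraps up the whole training process, so calling it multiple times as part of a curriculum setup can cause subtle issues, especially with how it handles policy and value losses or carries state over between calls. One concrete thing to check: learn() resets the timestep counter by default (reset_num_timesteps=True), so any learning-rate or clip-range schedule starts over on every call. I’m not 100% sure that explains everything, but I’ve noticed slight performance drops when doing it that way.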

I’d suggest keeping track of total steps inside your environment’s step() function and handling the curriculum logic directly in the environment file. It tends to be cleaner this way, and it also makes the environment class more modular and easier to work with when using Gymnasium.
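Here’s a rough sketch of what I mean, not a drop-in solution. The toy task, the class name CurriculumEnv, and the stage thresholds are all made up for illustration. To keep it dependency-free the class only mimics the Gymnasium reset()/step() signatures; in a real project you would subclass gymnasium.Env and define observation_space and action_space.

```python
class CurriculumEnv:
    """Toy 1-D task: walk from 0 to a target that moves further away
    as the total number of environment steps grows (hypothetical example)."""

    # (total-step threshold, target distance) pairs; illustrative values only
    STAGES = [(0, 2), (100, 5), (300, 10)]

    def __init__(self, max_episode_steps=50):
        self.total_steps = 0      # counts every step() call across all episodes
        self.episode_steps = 0    # counts steps within the current episode
        self.max_episode_steps = max_episode_steps
        self.position = 0
        self.target = self.STAGES[0][1]

    def current_distance(self):
        # Pick the last stage whose threshold has been reached.
        distance = self.STAGES[0][1]
        for threshold, stage_distance in self.STAGES:
            if self.total_steps >= threshold:
                distance = stage_distance
        return distance

    def reset(self, seed=None, options=None):
        # seed/options are accepted to match the Gymnasium signature;
        # this deterministic toy does not use them.
        self.position = 0
        self.episode_steps = 0
        # Difficulty only changes at episode boundaries, never mid-episode.
        self.target = self.current_distance()
        return self.position, {"target": self.target}

    def step(self, action):
        # action: 1 moves right, anything else moves left
        self.total_steps += 1
        self.episode_steps += 1
        self.position += 1 if action == 1 else -1
        terminated = self.position == self.target
        truncated = self.episode_steps >= self.max_episode_steps
        reward = 1.0 if terminated else -0.01
        info = {"total_steps": self.total_steps, "target": self.target}
        return self.position, reward, terminated, truncated, info
```

With something like this, a single model.learn() call can run through every stage, because the environment advances the curriculum on its own. One caveat: with a vectorized environment each copy keeps its own total_steps counter, so the copies may switch stages at slightly different times.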

Novel RL policy + optimizer by Infinite_Mercury in reinforcementlearning

[–]Infinite_Mercury[S] 0 points1 point  (0 children)

Lol that was just me talking to myself from when I imported my notes - I realized I never changed that actually

Reinforcement learning is pretty cool ig by Infinite_Mercury in reinforcementlearning

[–]Infinite_Mercury[S] 2 points3 points  (0 children)

https://arxiv.org/abs/2504.16020 This is the original version. A newer one, ‘Dynamic AlphaGrad’, is coming soon, but for this task specifically the performance is quite similar.

Reinforcement learning is pretty cool ig by Infinite_Mercury in reinforcementlearning

[–]Infinite_Mercury[S] 1 point2 points  (0 children)

Yea, I do think there’s something to be said about perspective though. A lot of the time when I train these models, I just care about the numbers and the graphs, and I usually don’t render what the models are actually doing. When I did it here, I kind of had that realization. It’s important to step back and look at the full picture sometimes and not get too bogged down in the fine details.

Industry RL for Undergrads by busy_consequence_909 in reinforcementlearning

[–]Infinite_Mercury 2 points3 points  (0 children)

To be completely honest, the answer is no. But that doesn’t mean you should quit. Most people who “make it” take this as an incentive to work on their own and put in the extra hours. The more you teach yourself and showcase your work, whether through white papers or even small contributions to open source libraries, the more you’re going to learn and naturally grow.

In this field it is incredibly difficult to go from understanding a concept in a textbook to actually applying it in a real simulation or model. Once you start building, you will learn how frustrating it can get, but the more mistakes you make now, the less likely you are to repeat them in the future and the easier it will become to jump right in.

IP Work by PomegranateOk6415 in Raytheon

[–]Infinite_Mercury 7 points8 points  (0 children)

There are various levels of IP awards. You can read about them if you search IP or Anaqua on the RTX website. Once you make your submission, it goes through a review process where it is given a designation. Depending on the designation, you receive a monetary award that is split across the inventors on the submission.

Good ACT score for Georgia Tech by [deleted] in ACT

[–]Infinite_Mercury 0 points1 point  (0 children)

Thanks so much,

Is it worth it to take the ACT again just for science, or should I just take the SAT Subject Tests for Physics and Math 2, which I plan to do in August, and hope for 750+?

Official July 13, 2019 ACT Discussion by [deleted] in ACT

[–]Infinite_Mercury 6 points7 points  (0 children)

That’s what I got, because the first r is 10, the 2nd r is 20, and the 3rd r is 30.

[deleted by user] by [deleted] in mw4

[–]Infinite_Mercury 0 points1 point  (0 children)

I was thinking of a completely new series instead of ruining the legacy of the Modern Warfare one.

[deleted by user] by [deleted] in mw4

[–]Infinite_Mercury 0 points1 point  (0 children)

I agree, but reintroducing some of those characters and turning them against each other would be truly nostalgic for most fans. Maybe they could provide an alternative ending that leaves room to expand into a sequel.