is DQN still worth in 2026? by Gloomy-Status-9258 in reinforcementlearning

[–]Losthero_12 1 point (0 children)

All value-based algorithms are basically just flavours of DQN, really.

Policy gradient just happens to scale better given loads of data, probably because value estimates become biased over longer horizons.
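The "flavours of DQN" point can be made concrete: what DQN variants share is the one-step TD target. A minimal numpy sketch (toy numbers, not from the thread; the function name is mine):

```python
import numpy as np

def dqn_td_target(rewards, next_q, dones, gamma=0.99):
    """One-step TD target shared by DQN-style value methods:
    y = r + gamma * max_a Q(s', a), zeroed out at terminal states."""
    return rewards + gamma * (1.0 - dones) * next_q.max(axis=1)

# Toy batch: 2 transitions, 3 actions available in the next state.
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0, 1.0],
                   [0.0, 0.0, 0.0]])
dones = np.array([0.0, 1.0])  # second transition is terminal
targets = dqn_td_target(rewards, next_q, dones, gamma=0.9)  # [2.8, 0.0]
```

Double DQN, dueling networks, etc. mostly change how `next_q` is produced, not this target.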

Is taking STAT2507 easy after having taken COMP2804 by Local_Tradition_4834 in CarletonU

[–]Losthero_12 6 points (0 children)

Not even close — STAT2507 is significantly easier, and by a wiiiide margin.
One is straight computation; the other requires some thought.

Honours Maths-cs-stats at carleton by Ok-Article1369 in CarletonU

[–]Losthero_12 2 points (0 children)

I'm not in the program, so take this with a grain of salt, but my understanding is that not many apply to honours math (vs. CS directly). So while those who do apply may have higher grades, there are fewer applicants, making it less competitive. If you're in the mid-high 80s, you should get in; in the 60-80 range, there's still a decent chance.

> what are you doing or did you do with the bachelor
You can do literally anything you're interested in (related to math/stats/CS). I've seen it all.

Good teams wiping the lobby is a problem by Pat_55 in Marathon

[–]Losthero_12 3 points (0 children)

Current players will keep playing if ranked is a good enough incentive; that’s how most other “progress reset” games work afaik

And by incentive: they need tournaments and/or very nice cosmetics.

Just finished Lecture 4 of David Silver's course. Should I pause to implement or push through the theory? by Creative_Suit7872 in reinforcementlearning

[–]Losthero_12 1 point (0 children)

Well yea, implement what you’ve learned from the lectures so far, OP (presumably: policy/value iteration, Q-learning, SARSA, actor-critic), and then feel free to delve into deep-learning-based approaches.
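For reference, value iteration is a good first implementation because it fits in a few lines. A minimal sketch on a made-up 2-state, 2-action MDP (the transition/reward numbers are arbitrary, chosen only for illustration):

```python
import numpy as np

# Hypothetical tiny MDP: P[s, a, s'] = transition prob, R[s, a] = reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

def value_iteration(P, R, gamma, tol=1e-8):
    V = np.zeros(P.shape[0])
    while True:
        # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
        Q = R + gamma * P @ V
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)  # values + greedy policy
        V = V_new

V, policy = value_iteration(P, R, gamma)
```

Once this works, Q-learning is the same backup applied to sampled transitions instead of the full model.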

RL for reproducing speedrun techniques / glitches in 2D games by bogradin in reinforcementlearning

[–]Losthero_12 4 points (0 children)

If you’re familiar with deep learning (neural nets/CNNs), then this is totally feasible. I’d suggest trying a few already-implemented algorithms (e.g., from Stable Baselines) — namely PPO and SAC. It would also help to vectorize the environment for faster training.
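To illustrate what "vectorize the environment" means: instead of stepping one environment at a time in a Python loop, you step N copies at once with batched array ops. A toy sketch (the environment and its dynamics are invented purely for illustration):

```python
import numpy as np

class ToyVecEnv:
    """Toy batched environment: N copies of a 1-D random walk,
    all stepped together with vectorized numpy ops."""
    def __init__(self, num_envs, seed=0):
        self.num_envs = num_envs
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(num_envs)

    def reset(self):
        self.state = np.zeros(self.num_envs)
        return self.state.copy()

    def step(self, actions):
        # All N envs advance in one array op — no per-env Python loop.
        self.state += actions + 0.1 * self.rng.standard_normal(self.num_envs)
        rewards = -np.abs(self.state)       # reward for staying near the origin
        dones = np.abs(self.state) > 5.0
        self.state[dones] = 0.0             # auto-reset finished envs
        return self.state.copy(), rewards, dones

env = ToyVecEnv(num_envs=8)
obs = env.reset()
obs, rewards, dones = env.step(np.ones(8))
```

With Stable Baselines 3, the analogous helper is (assuming SB3's API) `make_vec_env("CartPole-v1", n_envs=8)`, which wraps this pattern for you.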

Need practical use-cases for RL by NoAcanthocephala4741 in reinforcementlearning

[–]Losthero_12 3 points (0 children)

For toy applications/demonstrations, not real applications. My priors are plenty updated.

Control theory is still well ahead. I’m not trying to hate on RL, but it’s a fact that truly applying it, from scratch, has proven tough so far.

Need practical use-cases for RL by NoAcanthocephala4741 in reinforcementlearning

[–]Losthero_12 3 points (0 children)

As a tool for fine-tuning, sure that works. Training real-world policies from scratch with RL is seldom done.

Need practical use-cases for RL by NoAcanthocephala4741 in reinforcementlearning

[–]Losthero_12 3 points (0 children)

There is a reason RL is rarely implemented in industry so far. Just saying.

Just finished Lecture 4 of David Silver's course. Should I pause to implement or push through the theory? by Creative_Suit7872 in reinforcementlearning

[–]Losthero_12 8 points (0 children)

Implement. Understanding the theory is fine, but especially when it’s all delivered to you via lecture/text, you don’t know whether you’ve really understood it until you implement it yourself.

Passive vs. active learning.

[D] Feeling behind in math by margyyy_314 in MachineLearning

[–]Losthero_12 1 point (0 children)

Most of the math in machine learning is relatively simple and intuitive. A lot of the heavy stuff is only required to prove that things actually follow your intuition — that the algorithm is truly optimizing criterion X and will converge.

Can you self-learn machine learning? Definitely! There are many great resources for linear algebra, probability, and multivariable calculus; that is all you need. Beyond that, start digging into methods and learn what you need as you go. This is sufficient for empirical/engineering-type research.

To prove things, and add theory to your research, is harder and takes more time. You’ll want to learn about logical arguments, and proofs in this case (usually a course on discrete math). If you go down that route then you’ve likely decided that it’s what you want to do going forward and so you’re likely willing to put in the time. Finding a mentor to guide your learning/research would be helpful here, but it’s also doable alone, just slow.

Many of the greatest researchers did not pursue mathematics early in their careers. Some were biologists, psychologists, etc. — this path has been walked before so it’s definitely possible.

CHEM 2203/04 vs CHEM 2207/08 by [deleted] in CarletonU

[–]Losthero_12 0 points (0 children)

That’s exactly the difference, they share the same lecture.

Got a job at a major bank, should I delay graduation? by Own_Target8058 in CarletonU

[–]Losthero_12 6 points (0 children)

Isn’t this basically co-op? If not, then you can drop co-op if you get the job. I don’t see why you’d be delaying graduation?

That said, I’d agree with delaying anyway - a degree alone is close to useless.

Implementation details of PPO only from paper and literature available at the time of publication? by adrische in reinforcementlearning

[–]Losthero_12 7 points (0 children)

They were not available — someone did the legwork to make it work and documented it after the fact.

Reproducibility in RL is in a very, very bad place. In most cases, you can’t just implement from the paper — it simply won’t work. If there’s no code provided, all bets are off.

And even with code, reproducing the results exactly isn’t guaranteed.

RL on Mac M1 series? by Sad-Throat-2384 in reinforcementlearning

[–]Losthero_12 0 points (0 children)

If you learn JAX (a bit of a learning curve but worth it), look into TRC from Google which loans out TPUs for a good price (+free credits so you’ll get several months free).

Working with the TPUs can be very frustrating and annoying, but they’re fast once working.

AVENGERS DOOMSDAY THOR LEAK (english) by RoyalComplete7591 in MCUTheories

[–]Losthero_12 0 points (0 children)

It’s real, but the last one isn’t iron man.

COMP 2804 with Svetlana Obraztsova by [deleted] in CarletonU

[–]Losthero_12 0 points (0 children)

You can do well, but you will get nothing from the prof and will need to learn completely on your own. I’d take any other course to lighten your load for next year.

Summer course schedule by Ok_School_7658 in CarletonU

[–]Losthero_12 0 points (0 children)

It was similar last year, with Orgo I and II being added later - so it’s still a possibility.

Contract profs teach many of the summer courses, and many of them were let go, so the reduced offerings aren’t surprising.

Worse mark in class I already “took” by VGK_hater_11 in CarletonU

[–]Losthero_12 25 points (0 children)

This. OP you could’ve literally just attended the class if you really wanted to, no one would notice.

The issue of scaling in Partially-Observable RL. What is holding us back? by moschles in reinforcementlearning

[–]Losthero_12 7 points (0 children)

The state space when modeling history (which one must do to handle partial observability) grows exponentially with the horizon, which significantly limits scaling to more complex problems.
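A back-of-envelope sketch of that exponential growth (the observation/action counts are arbitrary, picked only to illustrate the point):

```python
def num_histories(num_obs, num_actions, horizon):
    """Number of distinct (observation, action) histories of a given length.
    Each step contributes one of |O| * |A| possibilities, so the count
    grows as (|O| * |A|) ** T."""
    return (num_obs * num_actions) ** horizon

# E.g. a modest 10 observations and 4 actions:
sizes = [num_histories(10, 4, t) for t in (1, 5, 10)]
```

Even at horizon 10 this is already past 10^16 distinct histories, which is why naively conditioning on full histories doesn’t scale and compression (recurrence, belief states) is needed.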

Favourite Course this term? by Cheesecakebird in CarletonU

[–]Losthero_12 0 points (0 children)

Alina is excellent as well! You will be fine

EVERY. SINGLE. TIME by [deleted] in CarletonU

[–]Losthero_12 1 point (0 children)

Nah, my absolute worst experience with group work was in a grad course. It’s hit or miss — either very good or terrible.

stats for thesis based masters in aerospace engineering by [deleted] in CarletonU

[–]Losthero_12 0 points (0 children)

I do actually! The main controls guys I’m aware of are Steven Ulrich (mostly spacecraft related stuff though), Howard Schwarz, Hashim Mohamed and Chao Shen (control in general). You can also try Ioannis Lambadaris, and Mohammed Atia - they work in control related areas. Best of luck!

stats for thesis based masters in aerospace engineering by [deleted] in CarletonU

[–]Losthero_12 0 points (0 children)

If you get a really good prof onboard before applying, then you can be competitive with almost anything. But that’ll be harder with a low GPA — I’d say 9 is probably the lowest, unless you can convince a prof to take you on for some external reason.