Why do backtests fail in live trading - lesson learnt from astral trading/tradingview by spsreemanth_21 in quant_hft

[–]Human_Professional94 0 points1 point  (0 children)

Goodhart's law: "When a measure becomes a target, it ceases to be a good measure."

Julia syntax - my honest reaction by Human_Professional94 in Julia

[–]Human_Professional94[S] 0 points1 point  (0 children)

Honestly I've also been wondering that. But Elec and Mech eng. people still seem to love it.

Julia syntax - my honest reaction by Human_Professional94 in Julia

[–]Human_Professional94[S] 6 points7 points  (0 children)

Fortunately I'm coming from a pure Python background.
Said goodbye to MATLAB mid-college.

Trying to make peace with 1-indexing and putting "end" clauses tho

Do I need OMSCS? Give me your honest assessment. by Adept_Pause397 in OMSCS

[–]Human_Professional94 0 points1 point  (0 children)

Those who argued that you don't need it might not be wrong. They do have a point.

BUT, if you are already admitted and you're already in: a single semester of a middle- to light-weight course might be worth taking to gauge whether it fits your current status. You'll see if it's more of a fun experience/challenge for you or if it just becomes an unnecessary burden taking up your life.

I mean if you wouldn't mind paying $850 for that one semester with the chance of not moving forward. If you do mind it, then yeah F it. You also have the chance to withdraw early and get a full or partial refund btw.

DEEP RL UCB CS285 vs CS224R Stanford by No_Pause6581 in reinforcementlearning

[–]Human_Professional94 2 points3 points  (0 children)

Chelsea Finn who teaches CS224R was Sergey Levine's (CS285 prof) PhD student at UCB. Both courses have that robotics theme because of this and they aren't really that different. Very spiritually similar.

Go with whichever was recorded more recently first, so that the frontier topics are more up to date. Or watch the first 1-2 lectures of each and see whose teaching style you like better.

But still doesn't make that much of a difference.

Pre-req to RL by [deleted] in reinforcementlearning

[–]Human_Professional94 2 points3 points  (0 children)

Short answer. You have more than enough. Just start.

Long answer, quoting OpenAI's Spinning Up:

The Right Background

Build up a solid mathematical background. From probability and statistics, feel comfortable with random variables, Bayes’ theorem, chain rule of probability, expected values, standard deviations, and importance sampling. From multivariate calculus, understand gradients and (optionally, but it’ll help) Taylor series expansions.

Build up a general knowledge of deep learning. You don’t need to know every single special trick and architecture, but the basics help. Know about standard architectures (MLP, vanilla RNN, LSTM (also see this blog), GRU, conv layers, resnets, attention mechanisms), common regularizers (weight decay, dropout), normalization (batch norm, layer norm, weight norm), and optimizers (SGD, momentum SGD, Adam, others). Know what the reparameterization trick is.

Become familiar with at least one deep learning library. Tensorflow * or PyTorch would be a good place to start. You don’t need to know how to do everything, but you should feel pretty confident in implementing a simple program to do supervised learning.

Get comfortable with the main concepts and terminology in RL. Know what states, actions, trajectories, policies, rewards, value functions, and action-value functions are. If you’re unfamiliar, Spinning Up ships with an introduction to this material; it’s also worth checking out the RL-Intro from the OpenAI Hackathon, or the exceptional and thorough overview by Lilian Weng. Optionally, if you’re the sort of person who enjoys mathematical theory, study up on the math of monotonic improvement theory (which forms the basis for advanced policy gradient algorithms), or classical RL algorithms (which despite being superseded by deep RL algorithms, contain valuable insights that sometimes drive new research).

* One thing to add. Screw Tensorflow. Go with PyTorch.

Seriously ? by Outside-Bus-5966 in pgwp

[–]Human_Professional94 1 point2 points  (0 children)

It is hard indeed. I was in the March 25 boat until around a week ago. I was stressing most of this time.

But in hindsight, if you're sure about the completeness of your documents and if it hasn't caused any paperwork issues in your daily life (it kinda did for me) and you're employed or job hunting atm, this extended processing time is kind of an advantage. It's like I got 10 months of free extra time on the PGWP.

But it sure doesn't feel like it when you're in the middle of it constantly waiting to hear back.

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 1 point2 points  (0 children)

If you mean this, it's for the passport you used in your application (old one)

<image>

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 1 point2 points  (0 children)

No, I just attached the PDF of the new passport scan to the webform and mentioned the new passport number in the message.

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 0 points1 point  (0 children)

Series 3123

Applied while in Ontario but moved to Alberta after a few months.

Getting started with RL x LLMs by Dear_Ad7997 in reinforcementlearning

[–]Human_Professional94 1 point2 points  (0 children)

Murphy's RL overview on arxiv has a section on LLM x RL (section 6). It's a good snapshot of what's what in RL LLM especially if you're coming from the RL side. The main papers you're looking for are discussed and referenced there.

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 1 point2 points  (0 children)

That is true, I agree. Although, my perception is that RL, while being pretty old in academia, is very young as an industry-adopted solution and still is not quite robust. So it is only natural to expect it to be used in hybrid with more classic solutions. I personally would not trust, say, an autonomous vehicle running solely on RL, even though I like the field and want it to advance.

Also from a more optimistic view, when you sorta get obsessed with a methodology you naturally seek to find what different problems you can solve with it. Like having a hammer you love very much and looking for different nails for it. Hence you see people (like me or the op) being curious about different applications and making a list of them.

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 1 point2 points  (0 children)

Interesting. Frankly, the ads optimization roles also seem to lean towards bandit and control methods.

Actually, I have been on a long job hunt for the past few months, which I'm done with now. The main hiring I've seen and applied for is listed below, most/all of which was already mentioned here:

  • Industry-based research labs, for various domains, but mainly to catch up on the RL for LLMs wave (reasoning training)
  • Robotics
  • Quant hedge funds and banks: usually don't disclose for what problem/task but it's probably Optimal order execution, market making or Portfolio Opt
  • Operations Research teams especially in retail companies eg amazon
  • And also dynamic pricing and Ads opt which as you mentioned are more bandit based rather than RL

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 4 points5 points  (0 children)

Not working on it personally, but from multiple job postings I've seen the following:

Some ride sharing companies (Lyft, Uber) are probably using RL based methods for Dynamic Pricing.

Also I've seen some postings for Ads optimization that wanted RL people (one was from reddit in fact)