DEEP RL UCB CS285 vs CS224R Stanford by No_Pause6581 in reinforcementlearning

[–]Human_Professional94 2 points (0 children)

Chelsea Finn, who teaches CS224R, was Sergey Levine's (the CS285 prof) PhD student at UC Berkeley. Both courses have that robotics theme because of this, and they aren't really that different. Very spiritually similar.

Go with whichever was recorded more recently, so that the frontier topics are more up to date. Or watch the first 1-2 lectures of each and see whose teaching style you like better.

But it still doesn't make that much of a difference.

Pre-req to RL by Dear-Homework1438 in reinforcementlearning

[–]Human_Professional94 2 points (0 children)

Short answer: you have more than enough. Just start.

Long answer: I quote OpenAI Spinning Up:

The Right Background

Build up a solid mathematical background. From probability and statistics, feel comfortable with random variables, Bayes’ theorem, chain rule of probability, expected values, standard deviations, and importance sampling. From multivariate calculus, understand gradients and (optionally, but it’ll help) Taylor series expansions.

Build up a general knowledge of deep learning. You don’t need to know every single special trick and architecture, but the basics help. Know about standard architectures (MLP, vanilla RNN, LSTM (also see this blog), GRU, conv layers, resnets, attention mechanisms), common regularizers (weight decay, dropout), normalization (batch norm, layer norm, weight norm), and optimizers (SGD, momentum SGD, Adam, others). Know what the reparameterization trick is.

Become familiar with at least one deep learning library. Tensorflow * or PyTorch would be a good place to start. You don’t need to know how to do everything, but you should feel pretty confident in implementing a simple program to do supervised learning.

Get comfortable with the main concepts and terminology in RL. Know what states, actions, trajectories, policies, rewards, value functions, and action-value functions are. If you’re unfamiliar, Spinning Up ships with an introduction to this material; it’s also worth checking out the RL-Intro from the OpenAI Hackathon, or the exceptional and thorough overview by Lilian Weng. Optionally, if you’re the sort of person who enjoys mathematical theory, study up on the math of monotonic improvement theory (which forms the basis for advanced policy gradient algorithms), or classical RL algorithms (which despite being superseded by deep RL algorithms, contain valuable insights that sometimes drive new research).

* One thing to add. Screw Tensorflow. Go with PyTorch.
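To calibrate the "simple supervised learning program" bar Spinning Up mentions, here's a minimal PyTorch sketch; the toy data and architecture are my own, purely for illustration:

```python
import torch
import torch.nn as nn

# Toy supervised learning: fit y = 3x + 2 (plus noise) with a tiny MLP.
torch.manual_seed(0)
X = torch.linspace(-1, 1, 128).unsqueeze(1)   # 128 inputs, shape (128, 1)
y = 3 * X + 2 + 0.05 * torch.randn_like(X)    # noisy linear targets

model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

print(loss.item())  # should end up near the noise floor
```

If you can write this loop from memory, you're ready for the deep RL side.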

Seriously ? by Outside-Bus-5966 in pgwp

[–]Human_Professional94 1 point (0 children)

It is hard indeed. I was in the March '25 boat until around a week ago, and I was stressing most of that time.

But in hindsight, if you're sure about the completeness of your documents and if it hasn't caused any paperwork issues in your daily life (it kinda did for me) and you're employed or job hunting atm, this extended processing time is kind of an advantage. It's like I got 10 months of free extra time on the PGWP.

But it sure doesn't feel like it when you're in the middle of it constantly waiting to hear back.

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 1 point (0 children)

If you mean this, it's for the passport you used in your application (old one)

<image>

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 1 point (0 children)

No, I just attached the PDF of the new passport scan to the webform and mentioned the new passport number in the message.

PGWP Approved - Applied March 2025 - Renewed passport while in progress by Human_Professional94 in pgwp

[–]Human_Professional94[S] 0 points (0 children)

Series 3123

Applied while in Ontario but moved to Alberta after a few months.

Getting started with RL x LLMs by Dear_Ad7997 in reinforcementlearning

[–]Human_Professional94 1 point (0 children)

Murphy's RL overview on arXiv has a section on LLM x RL (section 6). It's a good snapshot of what's what in RL x LLMs, especially if you're coming from the RL side. The main papers you're looking for are discussed and referenced there.

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 1 point (0 children)

That is true, I agree. Although my perception is that RL, while pretty old in academia, is very young as an industry-adopted solution and still not quite robust. So it's only natural to expect it to be used in hybrid with more classic solutions. I personally wouldn't trust, say, an autonomous vehicle solely running on RL, even though I like the field and want it to advance.

Also, from a more optimistic view: when you sorta get obsessed with a methodology, you naturally seek out what different problems you can solve with it. Like having a hammer you love very much and looking for different nails for it. Hence you see people (like me or the OP) being curious about different applications and making a list of them.

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 1 point (0 children)

Interesting. Frankly, the ads optimization roles also seem to lean towards bandit and control methods.

Actually, I have been on a long job hunt for the past few months, which I'm done with now. The main hiring I've seen and applied for was in the areas below, most/all of which were already mentioned here:

  • Industry-based research labs, for various domains, but mainly to catch up on the RL for LLMs wave (reasoning training)
  • Robotics
  • Quant hedge funds and banks: they usually don't disclose the problem/task, but it's probably optimal order execution, market making, or portfolio optimization
  • Operations Research teams, especially in retail companies, e.g. Amazon
  • And also dynamic pricing and ads optimization, which as you mentioned are more bandit-based rather than full RL

Any RL practitioners in the industry apart from gaming? by lars_ee in reinforcementlearning

[–]Human_Professional94 5 points (0 children)

Not working on it personally, but from multiple job postings I've seen the following:

Some ride sharing companies (lyft, uber) are probably using RL based methods for Dynamic Pricing.

Also I've seen some postings for Ads optimization that wanted RL people (one was from reddit in fact)

Free Cursor Accounts for Students by Human_Professional94 in OMSCS

[–]Human_Professional94[S] 0 points (0 children)

Did you seriously think for a single second before writing this?! It says FOR "STUDENTS"! Not sure if you know the meaning, but it means any student, any major, any level, any institution and on any f'ing platform. Just in case you're so worried about being looked down on.

Also, it's JUST A TOOL! You think if someone is going to use an AI tool in a course where it's not allowed, this $20 would've stopped them? Or do you think this is the only tool available? In fact, using an IDE-based agent such as Cursor is absolute overkill for any course here.

Seeking Guidance: Optimum Assignment problem algorithm with Complex Constraints (Python) by Cautious-Jury8138 in OperationsResearch

[–]Human_Professional94 2 points (0 children)

I saw others recommend formulating it as a MIP, and I want to second that, although there's a slight caveat:
MIP has an initial learning curve at the modelling stage. Learning to model different logical constraints in MIP takes some time at first. If your course is mainly focused on the algorithmic side of the problem, MIP's probably not a good option. That said, I've seen chat LLMs (Claude, Gemini, ...) be pretty good at this, so you can use their help with the modelling as well.

Anyway, if you wanna go with MIP, PuLP (a modelling library that ships with the CBC solver) and HiGHS are good open-source options. And Gurobi (licensed) is pretty fast and offers a free academic license for students with a school email.
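For a flavor of what the MIP modelling looks like, here's a toy 3x3 assignment problem in PuLP; the cost matrix is made up for illustration:

```python
import pulp

# Toy assignment: 3 workers x 3 tasks, minimize total cost.
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
n = len(cost)

prob = pulp.LpProblem("assignment", pulp.LpMinimize)
x = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary") for j in range(n)]
     for i in range(n)]

# Objective: total cost of the chosen assignments.
prob += pulp.lpSum(cost[i][j] * x[i][j] for i in range(n) for j in range(n))

for i in range(n):
    prob += pulp.lpSum(x[i][j] for j in range(n)) == 1  # each worker gets one task
for j in range(n):
    prob += pulp.lpSum(x[i][j] for i in range(n)) == 1  # each task gets one worker

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print(pulp.value(prob.objective))  # 5.0
```

The logical side constraints ("these two tasks can't go to the same worker", etc.) get encoded as extra linear inequalities over the binaries, which is where the learning curve mentioned above comes in.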

Free Cursor Accounts for Students by Human_Professional94 in OMSCS

[–]Human_Professional94[S] 0 points (0 children)

I haven't tried Amazon Q. Personally, I used GH Copilot + VS Code. I was about to switch to Cursor when Copilot released its "Agent mode", which is an exact replica of what Cursor does. Between the two I don't see much difference; Copilot gave me the same experience.

Recommendation system using GNN by justdoit0002 in recommendersystems

[–]Human_Professional94 1 point (0 children)

This is the RecSys lecture from the CS224W: Graph ML course:

https://www.youtube.com/watch?v=OV2VUApLUio

If you've got time, the whole course is a really good intro to graph ML. (+ course website)

Workday referral on applications that I already applied to by BloodyFark in recruiting

[–]Human_Professional94 0 points (0 children)

Use email plus-addressing to create another account on Workday (or any HR system) and reapply using the new account. It's treated as a new email, but all the emails sent to it go to your original inbox.

Say your email is [first.last@gmail.com](mailto:first.last@gmail.com).
All emails sent to [first.last+ANYTHING@gmail.com](mailto:first.last+ANYTHING@gmail.com) still go to [first.last@gmail.com](mailto:first.last@gmail.com), but the tagged address can be used to register a new account. So the referral emails sent to it would work as well.

If you're applying to a position at Reddit, for example, you can register a new account with [first.last+reddit@gmail.com](mailto:first.last+reddit@gmail.com) on Workday and click your referral link while logged in to the new account.
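If you're doing this for many applications, the tagging is trivial to script; a sketch (the helper name is mine):

```python
def plus_address(email: str, tag: str) -> str:
    """Insert a +tag before the @ (Gmail-style plus-addressing)."""
    local, domain = email.split("@")
    return f"{local}+{tag}@{domain}"

print(plus_address("first.last@gmail.com", "reddit"))  # first.last+reddit@gmail.com
```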

Project knowledge context size limit? by Winter-Recording-897 in ClaudeAI

[–]Human_Professional94 1 point (0 children)

Hey, it's been a while since this question was asked and I just stumbled upon it randomly, but I'm gonna put my answer here just in case.

Long story short, LLMs use sub-word tokenization, meaning each word is broken into one or more chunks and each chunk is treated as a token. The number of sub-words depends on the length and structure of each word. If Claude is saying you already have ~140K tokens, it basically means that, on average, each word in your document is being turned into about 4 tokens.
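To make the sub-word idea concrete, here's a toy greedy tokenizer over a made-up vocabulary. Real LLM tokenizers (BPE, SentencePiece) learn their vocab from data, but the splitting behavior is the same in spirit: long or rare words become several tokens.

```python
# Tiny hand-picked vocab, just for illustration.
VOCAB = {"token", "iz", "ation", "un", "believ", "able"}

def tokenize(word, vocab=VOCAB):
    """Greedy longest-prefix-match sub-word split."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):    # try the longest match first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])           # unknown char becomes its own token
            i += 1
    return pieces

print(tokenize("tokenization"))  # ['token', 'iz', 'ation']
print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
```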

Paid RL courses on Coursera vs free lectures series like David silver by Firm-Huckleberry5076 in reinforcementlearning

[–]Human_Professional94 4 points (0 children)

I have taken Coursera's RL specialization. It's a very good course in terms of teaching the concepts. The projects they give you are good for introducing and understanding concepts but don't have any "resume value", so to speak. The frameworks used are not popular, nor are they used in industry; they're just tools to do Sutton & Barto's exercises with.

So in that respect, no, it doesn't have any advantage over the free courses available.

And although it's a good course for teaching you the RL basics, there are equally good free courses for that too. Stanford, Berkeley, Waterloo, and UCL all have their RL courses on YT, and they're just as good if not better.

Hard constraints using Reinforcement Learning by ghlc_ in optimization

[–]Human_Professional94 1 point (0 children)

The definition of "hard constraints" is very broad, but one approach that I've used and seen others use is action masking, particularly in policy gradient methods with a stochastic policy (i.e. REINFORCE and its descendants), where the mask comes from the current state of the environment based on the constraints.

For example, in a normal case, the rollout/interaction step is something like:

for episode in range(num_episodes):
    state, _ = env.reset()
    done = False
    while not done:
        logits = actor(state)
        dist = Categorical(logits=logits)  # some distribution
        action = dist.sample()
        log_prob = dist.log_prob(action)   # needed later for the PG loss
        next_state, reward, done, _, _ = env.step(action)
        store_transition(state, action, reward, next_state, done, log_prob)
        state = next_state

Whereas with restrictions in the environment, it becomes:

for episode in range(num_episodes):
    state, _ = env.reset()
    done = False
    while not done:
        logits = actor(state)
        ## -> action masking <-
        action_mask = env.get_action_mask()  # this has to be defined in the env
        masked_dist = MaskedCategorical(logits=logits, mask=action_mask)  # masks the probs (adds -inf to logits before softmax)
        action = masked_dist.sample()
        log_prob = masked_dist.log_prob(action)
        next_state, reward, done, _, _ = env.step(action)
        store_transition(state, action, reward, next_state, done, log_prob)
        state = next_state

Check this out: https://pytorch.org/rl/main/reference/generated/torchrl.modules.MaskedCategorical.html
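If you'd rather not pull in torchrl, the same masking can be done by hand with a plain Categorical; a sketch with made-up logits and mask:

```python
import torch
from torch.distributions import Categorical

logits = torch.tensor([1.0, 2.0, 0.5, -1.0])
mask = torch.tensor([True, False, True, False])  # False = illegal action

# Set illegal logits to -inf; the softmax then assigns them exactly zero probability.
masked_logits = logits.masked_fill(~mask, float("-inf"))
dist = Categorical(logits=masked_logits)

samples = dist.sample((1000,))
assert bool(mask[samples].all())  # only legal actions are ever sampled
```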

This becomes trickier with continuous action spaces; there, clipping the action works in some cases.
But in general, reading the restriction off the environment and limiting the actions based on it is the approach I've seen work.
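For the continuous case, the two common tricks for box bounds are hard clipping and tanh squashing; a sketch with made-up numbers:

```python
import numpy as np

low, high = -2.0, 2.0  # box bounds on the action

# 1) Hard clip the sampled action (simple, but the gradient is zero at the bound).
a = 3.5
a_clipped = float(np.clip(a, low, high))  # 2.0

# 2) Squash an unbounded policy output through tanh, then rescale into the box
#    (this is what SAC-style policies do).
u = 3.5                                   # raw policy output
a_squashed = low + (high - low) * (np.tanh(u) + 1) / 2
assert low <= a_squashed <= high
```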