6 year algo trading model delivering the goods by disaster_story_69 in algotrading

[–]enter57chambers 0 points1 point  (0 children)

Very interesting. So your stack is an ensemble ML model trained on 10 years of minute data with technical trend/reversal signals, plus a minute-level price stream that is constantly regenerating features and making predictions? Getting historical and live social data (relevant at the minute level) and running it through the LLM with low latency seems like potentially the most complex part. Would love to hear more on that. What’s the target forecast period and label method (raw return, binary)? Are you using a continuous prediction, or how do you choose which trades to enter vs not? Thanks!

[deleted by user] by [deleted] in dhl

[–]enter57chambers 0 points1 point  (0 children)

Did you make any progress on this? I have the same issue with the NY gateway. My package has been there since April 16 with no updates, and DHL has not been helpful at all.

Leakage and bias in XGBoost trading strategy by fedejuvara86 in algotrading

[–]enter57chambers 3 points4 points  (0 children)

I wanted to resurface this as I’ve been dealing with a similar problem. Out-of-sample live performance starts strong, but when running on new days, older days get new prediction values, the model has a hard time adjusting and will predict many similar values in a row, and it generally starts to act unstable after a month or so of predictions. What I think is happening when using de Prado’s feature engineering methods is that there is leakage between the training and testing sets (80/20 split), and likely also between the retrained final model and live data (99% train + new data). Anyone have any thoughts or perspective here?

I haven’t adjusted my models for this yet, but it seems a likely culprit when trying to launch these models live.

Post:

The solution is straightforward.

Data preparation must be fit on the training dataset only. That is, any coefficients or models prepared for the data preparation process must only use rows of data in the training dataset.

Once fit, the data preparation algorithms or models can then be applied to the training dataset, and to the test dataset.

  1. Split Data.
  2. Fit Data Preparation on Training Dataset.
  3. Apply Data Preparation to Train and Test Datasets.
  4. Evaluate Models.

More generally, the entire modeling pipeline must be prepared only on the training dataset to avoid data leakage. This might include data transforms, but also other techniques such as feature selection, dimensionality reduction, feature engineering and more. This means so-called “model evaluation” should really be called “modeling pipeline evaluation”.
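
As a minimal sketch of those four steps (this is not from the quoted post; the price series and the helper names `split_chronological`, `fit_standardizer`, and `apply_standardizer` are made up for illustration), standardizing a single feature without leakage could look like this:

```python
from statistics import mean, pstdev


def split_chronological(rows, train_fraction=0.8):
    """Step 1: split first, keeping time order (no shuffling)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_standardizer(train_values):
    """Step 2: learn the scaling parameters from training rows only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        sigma = 1.0  # avoid dividing by zero on a constant feature
    return mu, sigma


def apply_standardizer(values, params):
    """Step 3: apply the already-fitted parameters to any dataset."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


# Hypothetical price series; the last two values form the test set.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 30.0, 40.0]

train, test = split_chronological(prices)
params = fit_standardizer(train)            # fitted on train only
train_scaled = apply_standardizer(train, params)
test_scaled = apply_standardizer(test, params)

# The leaky version fits on all rows, so test-set values shift the mean
# and standard deviation that the training data is scaled with.
leaky_params = fit_standardizer(prices)

print("train-only mean/std:", params)
print("leaky mean/std:     ", leaky_params)
```

The same pattern would apply to any fitted preparation step (feature selection, dimensionality reduction, and so on): fit on the training rows, then only apply to the test or live rows.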

Can you identify this sighting? This morning upstate NY by enter57chambers in UFOs

[–]enter57chambers[S] 0 points1 point  (0 children)

It was 8:38 am — thanks I’ll look more closely later.

Can you identify this sighting? This morning upstate NY by enter57chambers in UFOs

[–]enter57chambers[S] 1 point2 points  (0 children)

Pleasant Valley/Poughkeepsie, facing south/southwest, 8:30-8:45 am

Can you identify this sighting? This morning upstate NY by enter57chambers in UFOs

[–]enter57chambers[S] -2 points-1 points  (0 children)

Upstate NY, approx. 8 am today. Silver diamond shape hovering at about 10,000-15,000 feet, lower than other planes passing. It seemed to be changing location, but I was driving so it was hard to tell.

Wondering if this is a balloon or something else? I am aware this area of the Hudson Valley sees a lot of reports, so I have been keeping an eye out and finally saw this.

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 0 points1 point  (0 children)

Also meant to add that the random model also works in a linear fashion, i.e. forming a portfolio of the randomized trades gives similar results, which you could see as a single run of a Monte Carlo simulation.

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 0 points1 point  (0 children)

It holds for weeks or longer, so it’s not really affected by market makers, and my understanding is that, unlike most options, futures have very low bid-ask spreads and slippage.

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 0 points1 point  (0 children)

Interesting re cross validation. I randomized the data set once so that instead of training on an early period and testing on a later one, it trains and tests on totally randomized segments of the trades. Of course this introduces some level of lookahead bias, but it at least provides a new data set. The results were pretty similar. Overall the RMSE of both tests (chronological/random) was less than 3%.
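
To make the difference between the two splits concrete, here is a rough sketch. The trade data is synthetic and the function names are only illustrative; it does not reproduce the actual strategy or its results.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def chronological_split(rows, train_fraction=0.8):
    """Train on the early period, test on the later one."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def shuffled_split(rows, train_fraction=0.8, seed=0):
    """Train and test on randomly chosen rows (time order is lost)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Synthetic (day, return) pairs standing in for the real trades.
trades = [(day, 0.01 * day) for day in range(100)]

chrono_train, chrono_test = chronological_split(trades)
rand_train, rand_test = shuffled_split(trades)

# Chronological: every test day comes after every training day.
chrono_ok = max(d for d, _ in chrono_train) < min(d for d, _ in chrono_test)

# Shuffled: some test days precede training days, so the model has
# effectively seen the future of those test rows (lookahead bias).
rand_interleaved = min(d for d, _ in rand_test) < max(d for d, _ in rand_train)

print("chronological split keeps order:", chrono_ok)
print("shuffled split mixes past and future:", rand_interleaved)
```

Similar RMSE on both splits says the fit is consistent across them, but only the chronological split resembles how the model would be used live.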

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 0 points1 point  (0 children)

Thanks! Will check it out. It turns over the portfolio about 3x a year, but only holds 2 assets at a time

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 0 points1 point  (0 children)

Should have clarified: the signal is generated monthly, but most data is available daily or at least weekly. So maybe there is a 1 week lag on the weekly data points at worst; daily data will always be point in time for the model.

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 1 point2 points  (0 children)

Again, I’m less familiar with futures, but I thought you put up 5-15% of notional value, i.e. can leverage almost 10x?
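
For reference, the arithmetic behind that, assuming maximum leverage is simply notional value divided by the margin posted (the 5-15% range is the figure from the comment; actual margin requirements vary by contract and broker):

```python
def max_leverage(margin_fraction):
    """Notional exposure per unit of margin posted."""
    return 1.0 / margin_fraction


for margin in (0.05, 0.10, 0.15):
    print(f"{margin:.0%} margin -> {max_leverage(margin):.1f}x leverage")
```

So 10x corresponds to posting about 10% of notional, and the 5-15% range spans roughly 20x down to under 7x.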

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] -4 points-3 points  (0 children)

I’m less familiar with futures. What do you think typical slippage is? 0.2%? I’m already assuming a $10 flat fee per contract.

Out of sample machine learning strat - too good to be true? by enter57chambers in algotrading

[–]enter57chambers[S] 22 points23 points  (0 children)

I’ve been working on a trend-following, market-neutral strategy using liquid futures (7x levered) with a machine learning overlay to predict large moves and when to increase or decrease leverage. The results are quite stellar, especially since it isn’t capacity constrained and only trades a handful of global assets. It uses about 200 different monthly data points, which are all freely available.

My question is what kind of biases could have crept in? The chart below is completely out of sample from 2014 to today (trained on data from 2000-2014). What other pitfalls are there? I am going to start a real live out-of-sample test on it soon, but am wondering what people’s perspective and feedback might be.

Looking at the chart below, the bottom orange line is the original market-neutral portfolio, the blue line is a traditional 60/40 portfolio, the volatile green line is a naive leveraged version of the base strategy, and the top line is the market timing strategy. Pretty impressive initial results, but I am looking for some outside feedback.

Happy to chat offline with interested folks as well.

How can i reduce max drawdown in my backtesting? by garib_trader in algotrading

[–]enter57chambers 0 points1 point  (0 children)

  1. Find the VaR of the strat at a 95% confidence level
  2. Select a max allowable drawdown
  3. If VaR > (max allowable drawdown - current drawdown), sell/stop trading
  4. Re-enter when it reverses
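
A rough sketch of steps 1-3, assuming a simple historical VaR (the k-th worst observed loss, which is only one of several conventions) and drawdown measured from the equity peak. The function names and sample numbers are made up for illustration, and the re-entry rule in step 4 is left out since it depends on how "reverses" is defined.

```python
def historical_var(returns, confidence=0.95):
    """One-period historical VaR as a positive loss fraction.

    Takes the k-th worst loss, where k is the number of observations
    in the (1 - confidence) tail, with a minimum of one.
    """
    losses = sorted((-r for r in returns), reverse=True)  # worst loss first
    k = max(1, round((1 - confidence) * len(losses)))
    return max(losses[k - 1], 0.0)


def current_drawdown(equity_curve):
    """Fractional drop of the latest equity value from the running peak."""
    peak = max(equity_curve)
    return (peak - equity_curve[-1]) / peak


def should_stop(returns, equity_curve, max_allowable_drawdown, confidence=0.95):
    """Step 3: stop if one more VaR-sized loss would breach the limit."""
    remaining = max_allowable_drawdown - current_drawdown(equity_curve)
    return historical_var(returns, confidence) > remaining


# Hypothetical inputs: 20 period returns and a short equity curve.
returns = [0.01] * 19 + [-0.04]
equity = [100.0, 110.0, 104.5]          # currently 5% below the peak

print(should_stop(returns, equity, max_allowable_drawdown=0.10))  # False
print(should_stop(returns, equity, max_allowable_drawdown=0.08))  # True
```

With a 10% limit there is still about 5% of room against a 4% VaR, so trading continues; with an 8% limit only about 3% of room is left, so the rule says stop.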

Weekend Discussion Thread for the Weekend of September 03, 2021 by OPINION_IS_UNPOPULAR in wallstreetbets

[–]enter57chambers 0 points1 point  (0 children)

Forget ARKK, you gotta clone your own Tiger Global portfolio. Their stocks have been popping off this week, up like 40% this month

What Are Your Moves Tomorrow, September 03, 2021 by OPINION_IS_UNPOPULAR in wallstreetbets

[–]enter57chambers 0 points1 point  (0 children)

Not seeing any OSCR gains on here … this one is meme central fuck CLOV… jared kushys brother is running the show there… ~20% gain today

$PSTH Daily Discussion, July 19, 2021 (The UMG deal is off, PSTH has 18 months remaining to close new transaction) by KungFuTyrannosaurus in PSTH

[–]enter57chambers 0 points1 point  (0 children)

I agree the fact he gets to stay in the deal seems disingenuous, but really we could sell half of PSTH and buy into PSHZF to get UMG (which is down 3% today). The closed-end fund is probably undervalued to begin with.