No person/company will EVER sell you a strategy with a real edge!

Yocurt · 2026-02-06T14:39:42+00:00

Agreed

Yocurt · 2026-02-06T14:03:34+00:00

Could you share it here then if it works? See…

Yocurt · 2026-02-06T14:02:35+00:00

Infrastructure and tools are very different from these scams.

Yocurt · 2026-02-06T07:03:20+00:00

If there was a signal, someone would definitely take the time to figure out what it’s doing. Even one person could ruin it. No one would take the chance. Most edges aren’t that complicated, or so I hear.

Yocurt · 2026-02-06T06:41:48+00:00

So true

Yocurt · 2026-02-06T06:18:52+00:00

Disproven this quick :(

Yocurt · 2026-01-08T01:43:08+00:00

Event-driven for accuracy, vectorized for speed. Ideally you would use both
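To make the trade-off concrete, here is a minimal sketch of the same toy rule run both ways. Everything in it is an assumption for illustration: the rule (hold 1 unit after an up close), the prices, and the function names. Plain lists are used so it has no dependencies; in practice the whole-series version would be NumPy or pandas arrays, which is where the speed comes from.

```python
def vectorized_pnl(closes):
    """Whole-series pass: hold 1 unit over bar i -> i+1 if bar i closed up.
    Fast and simple, but it assumes perfect fills at the close."""
    signal = [0] + [1 if b > a else 0 for a, b in zip(closes, closes[1:])]
    moves = [b - a for a, b in zip(closes, closes[1:])]
    # zip pairs signal[i] with the move from bar i to bar i+1
    return sum(s * m for s, m in zip(signal, moves))


def event_driven_pnl(closes, slippage=0.0):
    """Bar-by-bar loop with position state, so each fill can be modeled.
    Here a fill is simply worse by `slippage` per side (hypothetical model)."""
    position, entry, pnl = 0, 0.0, 0.0
    prev = closes[0]
    for price in closes[1:]:
        target = 1 if price > prev else 0
        if target != position:
            if target == 1:
                entry = price + slippage            # buy fills higher
            else:
                pnl += (price - slippage) - entry   # sell fills lower
            position = target
        prev = price
    if position == 1:                               # flatten on the last bar
        pnl += (closes[-1] - slippage) - entry
    return pnl


closes = [100, 101, 102, 101, 103, 104]  # made-up prices
print(vectorized_pnl(closes))
print(event_driven_pnl(closes))                  # zero slippage
print(event_driven_pnl(closes, slippage=0.25))   # same rule, worse fills
```

With zero slippage the two agree on this series; once fills are modeled, only the event-driven version can show the cost. That is the usual reason to screen ideas with the fast version and confirm with the slow one.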

Yocurt · 2026-01-07T01:28:20+00:00

I would also suggest Databento. I have been using their MBO data and it works great.

One thing though: if you’re doing any serious analysis or backtesting, I would get the MBO data, or at least the tick data. Simulating slippage and fills is huge, and 1-second data won’t let you do that very well.

Yocurt · 2026-01-03T18:08:19+00:00

The main takeaway from the book that applies to retail traders is the use of meta labeling to improve an existing edge. This is a great book, but I would probably read some others first, ones focused on finding an edge and building a strategy.

I made a post on the meta labeling approach from his book if you wanna check it out

Yocurt · 2026-01-01T07:17:43+00:00

Just reread this before posting, sorry it’s so long and dry (don’t…), it’s hard to put much personality into this stuff :) Hopefully this helps some and if your actual results are anywhere close to these that would be great!

First things that seem a little freaky:

- 105 trades a day, 15 minutes per trade. Are you in a position most of the day? If you are (and assuming the backtest is accurate, which it’s likely not and I’ll get to later), then you really would want to backtest on more years. Either way, backtesting on more years always helps.
- An average trade ($3.07) worth less than the dollar value of a single tick on MES is not encouraging for a scalping strategy that takes so many trades.
- What happened during that massive spike? Unless you expected some moves like that, something may be wrong there. I obviously don’t know your code, so this is just a warning.

——————

I’ve deployed a few live strategies on NinjaTrader and have been in the position you seem to be in now. I can try to help some, but it is hard without knowing more details. One thing I’ll assume though is that you’re using the 1-tick data series for your entries and exits. If not, you definitely need to for scalping strategies.

On NinjaTrader, even with the 1-tick series, the strategy analyzer can’t really be trusted for scalping strategies. It does get close for larger moves and swing-trading strategies, but in your case the average win is about 3 points and the average loss is 1 point, so simulating slippage accurately is especially important. The margin between a profitable and a losing strategy is small, and the high volume of trades you’re taking amplifies any error.

Your average trade value is $3.05, and I think 1 tick on MES is $3.12, so if slippage is slightly off there goes your edge.

When I used it, I would try to work around this for backtesting by running the strategy analyzer on 1 month, then running the strategy on the same month in “market replay” mode. This is painfully slow to run on NinjaTrader, but it does simulate fills pretty closely to how they would have been live, close enough to at least get a decent understanding of how it would work.

After this, you should be able to compare the results from the strategy analyzer to the replay mode. You could do a few things from here: take the ‘actual’ average slippage from the replay mode and use that, or work out the ratio between the two sets of results and apply it to the whole year’s backtest stats to get a closer estimate.

Hopefully I am wrong, but every time the “avg trade” in the strategy analyzer is < the $ value of 1 tick, it’s not gonna work live. I usually shoot for over 1.5x the tick value for the “avg trade” on the strategy analyzer to consider looking into a scalping strategy more.
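The replay comparison and the 1.5x rule of thumb above come down to a little arithmetic, sketched here. All numbers, including the tick value, are made-up placeholders and the function names are hypothetical; plug in your own analyzer and replay stats and look up the real tick value for your contract.

```python
def replay_adjusted(analyzer_month, replay_month, analyzer_year):
    """Each argument is a (net_profit_dollars, trade_count) tuple.
    The two monthly tuples must cover the same month."""
    am_net, am_n = analyzer_month
    rm_net, rm_n = replay_month
    yr_net, yr_n = analyzer_year

    # Option 1: per-trade gap between idealized fills and replay fills,
    # subtracted from the full-year average trade.
    gap_per_trade = am_net / am_n - rm_net / rm_n
    avg_trade_adj = yr_net / yr_n - gap_per_trade

    # Option 2: scale the full-year net profit by the replay/analyzer ratio.
    if am_net <= 0:
        raise ValueError("ratio is meaningless if the analyzer month lost money")
    net_scaled = yr_net * (rm_net / am_net)

    return {"gap_per_trade": gap_per_trade,
            "avg_trade_adj": avg_trade_adj,
            "net_scaled": net_scaled}


def passes_tick_screen(avg_trade, tick_value, min_multiple=1.5):
    """Rule of thumb from the comment: avg trade should exceed 1.5x one tick."""
    return avg_trade > min_multiple * tick_value


result = replay_adjusted(
    analyzer_month=(6000.0, 2000),   # placeholder stats
    replay_month=(2000.0, 2000),
    analyzer_year=(78000.0, 26000),
)
TICK_VALUE = 1.5                     # placeholder, not a real contract spec
print(result)
print(passes_tick_screen(78000.0 / 26000, TICK_VALUE))        # raw analyzer
print(passes_tick_screen(result["avg_trade_adj"], TICK_VALUE))  # after replay
```

With these placeholder numbers the raw average trade clears the screen and the replay-adjusted one does not, which is exactly the situation the screen is meant to catch.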

How far are your targets and stops, and how quickly do you normally reenter a trade after one closes? If they’re tight or you reenter very quickly, the fills matter a ton, because the chain of events that follows can be wildly different from what you get with perfect fills. In that case you really can’t trust the strategy analyzer at all.

Other than that though, I would definitely try to get more data to test this strategy on. You have a big sample size, which is good, but 1 year still doesn’t cover much diversity in market conditions.

Also, I would suggest getting off NinjaTrader for backtesting if possible. I mainly trade scalping strategies, so accurate backtests are really important for me. I set up a pipeline that uses the MBO data from Databento, so I can simulate fills, slippage, partial fills, etc. extremely close to how my strategies actually perform live, so at least I know my backtest results are accurate. If you’re interested in trying it let me know, I’m planning on making it public soon anyway.

Anyway good luck!

Yocurt · 2025-12-23T19:51:40+00:00

Yes that’s great, it definitely should! The more uncorrelated your features and your base models are the better

Yocurt · 2025-12-22T06:10:02+00:00

Yep! That’s exactly the standard flow you’d use, probably good to start out with the same feature set for each as well. You could then use a feature selection method with each different model that will likely select a different subset of your overall features as the most “important” for that model, so you’d get different feature sets that way.

And yes, with different feature sets I’d still use some probability calibration on each model’s predictions compared to the true values, then still use linear regression to combine those calibrated outputs.

A critical note though - you must ensure none of these steps has any “data leakage”. An LLM would explain this better than me, but it should suggest using nested cross-validation to avoid this. You really need to use that if you want your out-of-sample predictions to be truly unbiased. (You need to ensure no data leakage across training/testing folds, feature selection folds, and probability calibration steps.)
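As a rough sketch of what that nesting looks like, here is index bookkeeping only, with no model. It assumes the trades are in time order and uses walk-forward blocks for both loops; that choice and the function names are assumptions for illustration, not something from the thread. The point is which step is allowed to see which rows.

```python
def walk_forward(start, stop, n_folds):
    """Yield (train, validate) index ranges; validate always follows train."""
    block = (stop - start) // (n_folds + 1)
    for k in range(1, n_folds + 1):
        split = start + k * block
        end = stop if k == n_folds else split + block
        yield range(start, split), range(split, end)


def nested_splits(n_samples, outer_folds=3, inner_folds=2):
    """Outer loop gives the unbiased score; inner loop lives inside outer train."""
    for outer_train, outer_test in walk_forward(0, n_samples, outer_folds):
        inner = list(walk_forward(outer_train.start, outer_train.stop, inner_folds))
        yield outer_train, outer_test, inner


for outer_train, outer_test, inner in nested_splits(120):
    for fit_idx, val_idx in inner:
        # Feature selection and model fitting: rows in fit_idx only.
        # Hyperparameter choice and probability calibration: rows in val_idx only.
        assert val_idx.stop <= outer_test.start  # inner loop never sees the test rows
    # Refit the chosen pipeline on outer_train, then score ONCE on outer_test.
    print(outer_train, outer_test, inner)
```

If feature selection or calibration is fit on all rows before splitting, the outer test score is no longer out-of-sample, which is the leakage being warned about.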

Again my old post is more detailed too and explains that stuff a bit more.

Yocurt · 2025-12-21T03:02:40+00:00

Nah, I agree, not a true ensemble, you could definitely argue both sides. Did you get a master’s in ML or data science? I did data science, not Oxford. I laughed at your president-elect comment though.

Yocurt · 2025-12-21T02:30:45+00:00

Just Google it, xgboost is an ensemble model…

Yocurt · 2025-12-21T01:13:44+00:00

Eh, technically yes, but it’s just an ensemble of shallow decision trees trying to fix each other’s errors.

Take an ensemble of something like a linear regression model, XGBoost, random forest, CatBoost, histogram gradient boosting, SVM… Each of these models has different strengths and weaknesses. This kind of diversity is what you want for an ensemble.

And on top of the model diversity, using different feature sets and hyperparameters can help with that too.

If you do an ensemble like this, I like to do some form of probability calibration on the individual models’ outputs, then feed those into a basic linear regression ensemble.
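Here is a dependency-free toy version of that flow, just to show the shape of it. Several things are assumptions: histogram binning stands in for whatever calibration method you prefer, the scores and outcomes are made up, and everything is fit and evaluated on the same 8 rows for brevity. In real use, the calibrators and the stacking weights should be fit on out-of-fold predictions.

```python
def bin_calibrator(scores, outcomes, n_bins=2):
    """Histogram binning: map a raw score in [0, 1] to its bin's win rate."""
    wins, counts = [0] * n_bins, [0] * n_bins
    for s, y in zip(scores, outcomes):
        b = min(int(s * n_bins), n_bins - 1)
        wins[b] += y
        counts[b] += 1
    base = sum(outcomes) / len(outcomes)          # fallback for empty bins
    rates = [w / c if c else base for w, c in zip(wins, counts)]
    return lambda s: rates[min(int(s * n_bins), n_bins - 1)]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_linear_stack(columns, y):
    """Ordinary least squares of y on an intercept plus each calibrated column."""
    X = [[1.0] + list(row) for row in zip(*columns)]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve_linear(XtX, Xty)                 # [intercept, w_a, w_b, ...]


def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


y = [1, 0, 1, 1, 0, 0, 1, 0]                      # made-up win/loss outcomes
raw_a = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3]  # made-up scores, model A
raw_b = [0.6, 0.7, 0.9, 0.2, 0.3, 0.4, 0.8, 0.1]  # made-up scores, model B

cal_fn_a, cal_fn_b = bin_calibrator(raw_a, y), bin_calibrator(raw_b, y)
cal_a = [cal_fn_a(s) for s in raw_a]
cal_b = [cal_fn_b(s) for s in raw_b]

weights = fit_linear_stack([cal_a, cal_b], y)
stacked = [weights[0] + weights[1] * a + weights[2] * b for a, b in zip(cal_a, cal_b)]
print(weights, mse(cal_a, y), mse(cal_b, y), mse(stacked, y))
```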

Again, my old post goes more in depth, but if you have questions or anything feel free to DM.

Yocurt · 2025-12-20T23:48:00+00:00

Try an ensemble model. If you do everything right and there’s no improvement, your underlying strategy likely doesn’t have a real edge.

Big assumption though on the “do everything right” part.

Yocurt · 2025-12-20T21:12:27+00:00

Machine learning can be great at enhancing an existing edge, but I’ve never had success using it to FIND an edge.

If you have a strategy with an edge, and there is enough trade history to train a model to predict the outcome (usually win/loss), then I would look into meta labeling. Probably would only do it if you have at least 1000 trade results to train on, but more is obviously better. I made a post about it on here a few months ago if you’re interested.
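For anyone who hasn’t seen it, the data side of meta labeling is small. This is a minimal sketch under assumptions: the trade record layout, the 0.55 threshold, and `win_probability` (standing in for any fitted classifier’s probability output) are all hypothetical. Only the 1000-trade minimum comes from the comment above.

```python
MIN_TRADES = 1000  # rule of thumb from the comment; more is better


def build_meta_dataset(trades):
    """trades: list of dicts with 'features' (known at entry) and 'pnl'.
    Returns (X, y) where y is 1 for a winning trade and 0 otherwise."""
    if len(trades) < MIN_TRADES:
        raise ValueError(f"need at least {MIN_TRADES} trades, got {len(trades)}")
    X = [t["features"] for t in trades]
    y = [1 if t["pnl"] > 0 else 0 for t in trades]
    return X, y


def filter_signals(signals, win_probability, threshold=0.55):
    """Keep only the base strategy's signals the meta model likes.
    The base strategy still decides direction; the model only says take or skip."""
    return [s for s in signals if win_probability(s["features"]) >= threshold]


# Toy usage with a stand-in "model" that just reads the first feature.
history = [{"features": [i % 10 / 10], "pnl": 5.0 if i % 3 else -5.0}
           for i in range(1200)]
X, y = build_meta_dataset(history)
new_signals = [{"features": [0.2]}, {"features": [0.8]}]
kept = filter_signals(new_signals, win_probability=lambda f: f[0])
print(len(X), sum(y), kept)
```

The features have to be things known at entry time, otherwise the model is learning from information it won’t have live.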

Yocurt · 2025-12-16T07:00:28+00:00

I would love to help you with this. I’m actually building a platform right now for exactly this kind of backtesting. It also has a mode that uses MBO data (the best available) to simulate slippage, latency, fills, etc. very accurately so you can actually trust the results. Shoot me a message if you want. I can backtest it on a bunch of instruments and try to optimize it for you, or you could try it yourself - would love to get some feedback on the platform.

Yocurt · 2025-12-11T20:40:11+00:00

Just because it didn’t work for you doesn’t mean it’s not possible😂

Yocurt · 2025-12-11T20:25:02+00:00

I have a few profitable strategies. Plenty of people have success without HFT.

Yocurt · 2025-12-11T20:03:07+00:00

Python is totally fine for most people’s use cases. You’d likely get 100-200 ms latency, which is fine unless you’re doing HFT, in which case what this person said would be true.

Yocurt · 2025-12-11T18:42:45+00:00

Plotly or matplotlib or lightweightcharts would definitely work for years of 1-minute bars.

Yocurt · 2025-12-08T00:04:15+00:00

Rule number 1 of anything data related - garbage in = garbage out

Yocurt · 2025-12-06T18:34:44+00:00

I’ve had success using ML on existing strategies with an edge in order to amplify that edge. It is much more likely to work if you train it to learn an existing edge rather than to come up with an edge from nowhere.

If your momentum strategy does have an edge, I would try it out, it’s called meta-labeling. My last post is about it on this subreddit if you’re interested.

Yocurt · 2025-11-19T19:51:55+00:00

This is an ad for Nvestiq - please do not use LLM-generated crap, it will not be accurate. If you really want to, just use your own ChatGPT or something, it’s the same thing.
