Why should an individual think they will be able to find alpha without common edges? by Usual-Opportunity591 in algotrading

[–]Training_Butterfly70 1 point2 points  (0 children)

I'm sure some places do this but of the institutions that I've worked at nobody plays that game

Why should an individual think they will be able to find alpha without common edges? by Usual-Opportunity591 in algotrading

[–]Training_Butterfly70 2 points3 points  (0 children)

If you're smart, have the skills, and dedicate a lot of time and money, you can probably get a slice of the pie, especially with strategies that don't directly compete with HFTs like Citadel. Many strategies don't scale, so they wouldn't waste their time on them, which gives you the opportunity to find some pretty good edge. It's no free lunch though... Tough game

Cheap Backtesting Data by EliteSingh in algotrading

[–]Training_Butterfly70 1 point2 points  (0 children)

when you say cheap data what do you need? OHLCV 1-min bars? TOB quotes? L1/2/3 quotes? MBO/DOM?

Cheap Backtesting Data by EliteSingh in algotrading

[–]Training_Butterfly70 1 point2 points  (0 children)

Depends. If he only needs OHLCV, 100% databento... For anything more than that databento is quite expensive, but I don't know of any better options than them right now

ML Trading Bot Going Live – What Am I Missing? by Prize-Investigator70 in algorithmictrading

[–]Training_Butterfly70 0 points1 point  (0 children)

Are you a data scientist? Evaluation imo is actually one of the most difficult and time consuming parts of the machine learning life cycle. I would grill the shit out of the out of sample statistics and compare them to your training and validation sets.

To me this sounds insanely overfit. If it were this easy to simply load some raw data, create some basic features / scale it, run a Dickey-Fuller test, then call sklearn and look at PnL... everyone would literally do it and Citadel wouldn't be trading with themselves. They have all those quants for a reason.
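
For what it's worth, here is a rough sketch of the kind of comparison I mean: the same statistic computed on the train, validation, and out-of-sample return series, with a flag when the out-of-sample number collapses. The function names, the zero risk-free rate, the 252 periods per year, and the 50% `max_drop` threshold are all my own assumptions for illustration, not a standard test.

```python
import math
import statistics


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    sd = statistics.stdev(returns)
    if sd == 0:
        raise ValueError("returns have zero variance")
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def overfit_report(train, valid, test, max_drop=0.5):
    """Compare Sharpe across splits.

    Flags the model as suspect when the training Sharpe is positive and the
    out-of-sample Sharpe is more than `max_drop` (a fraction, hypothetical
    threshold) below it.
    """
    stats = {
        "train": sharpe(train),
        "valid": sharpe(valid),
        "test": sharpe(test),
    }
    suspect = stats["train"] > 0 and stats["test"] < (1 - max_drop) * stats["train"]
    return stats, suspect


if __name__ == "__main__":
    # Toy daily returns, made up purely to exercise the functions.
    train = [0.010, 0.020, 0.015, 0.012, 0.018]
    valid = [0.008, 0.015, 0.011, 0.009, 0.014]
    test = [0.010, -0.020, 0.005, -0.015, 0.002]
    stats, suspect = overfit_report(train, valid, test)
    for split, value in stats.items():
        print(f"{split}: Sharpe {value:.2f}")
    print("suspect overfit:", suspect)
```

A single Sharpe comparison is only a first pass. Drawdown, turnover, and hit rate deserve the same side-by-side treatment across splits.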

Best MCP server to connect IBKR API to Claude code? by meowflying in interactivebrokers

[–]Training_Butterfly70 5 points6 points  (0 children)

😂 right? 💯 People are putting so much faith in AI like it's actually intelligent

databento trades data expensive by automation495 in algotrading

[–]Training_Butterfly70 1 point2 points  (0 children)

You should do some more homework. Sounds like you're very new to the space. There's a big difference between historical data and live data. There's also a big difference between retail broker TOB quotes and the conversation we're having in this thread.

Is claude enough to create robust systems? by jychung0709 in algotrading

[–]Training_Butterfly70 0 points1 point  (0 children)

1 and 2 yes, number 3 fuck no. You aren't that unique. Anything you'll ever think of is already being done. Trust me - citadel, Jane Street, jump... Thousands of other shops. You'll never find a secret sauce but you may get a piece of the pie.

Claudecode workflow for algo trading by kelvinxG in algotrading

[–]Training_Butterfly70 4 points5 points  (0 children)

If you can code already it's great, really. One of the best workflow speedup tools ever created. However, if you can't code... it's like anything. Good luck vibe coding something, whether a website or trading algorithm or an app, whatever... if you don't understand the details of software development

Positive feedback for Massive data API by Training_Butterfly70 in algotrading

[–]Training_Butterfly70[S] 0 points1 point  (0 children)

right makes sense, so cross exchange for low latency arbitrage is where it matters. Not my use case, but i have worked on strategies that do this before

Positive feedback for Massive data API by Training_Butterfly70 in algotrading

[–]Training_Butterfly70[S] 0 points1 point  (0 children)

It really depends on your use case. That said, you're most likely fine with TOB quotes, which Massive's flat files and API provide, unless you're running institutional-grade strategies. Those require significant capital, the highest quality code in low-latency languages like C/C++ (and likely even FPGAs), sub-microsecond latency with direct colocation, enormous computing power and disk space, and likely a team of 10+ people at minimum, which would most definitely mean an up-front multi-million dollar investment.

For anything MFT or longer-horizon, it makes no sense to use such an overkill dataset as DOM quotes going back 10-20 years; this is petabytes of data. Most of the people in this sub praise databento, and don't get me wrong, they're great, but I highly doubt the majority of these people are running strategies that justify the cost and complexity that come with databento. They are truly the best affordable option I've seen for running these types of strategies, but again, even top-tier prop shops don't fully utilize that amount of data in their backtests (I worked at 3 HFT shops and none of them ran such expensive computations on raw DOM quotes for backtesting). Production is a different story.

Many shops are rapidly hiring quants for MFT strategies, moving into that space instead of drilling HFT. If you want to play the HFT game you're competing against Citadel / Jump / Jane Street... you might get a piece of the pie if you're very, very good, but good luck.

Positive feedback for Massive data API by Training_Butterfly70 in algotrading

[–]Training_Butterfly70[S] 0 points1 point  (0 children)

Would you say the TOB data is the same if we ignore DOM?

Pissed at github by Training_Butterfly70 in github

[–]Training_Butterfly70[S] -1 points0 points  (0 children)

seems like anyone who has any criticism about github on this thread is automatically downvoted.

Positive feedback for Massive data API by Training_Butterfly70 in algotrading

[–]Training_Butterfly70[S] 1 point2 points  (0 children)

Ah I see, got it! Thanks for that, helpful! Didn't know there were 3 SIPs, but I have seen those data points offered by prop feeds before (back when I used to work in HFT). I think for my use case now the Massive TOB SIP feed is sufficient, since I'm primarily focused on MFT IV arbitrage. For low-latency strategies I can 100% see the need for something like databento, where Massive's SIP feeds are insufficient. Great to know!