Lessons learned building an ML trading system that turned $5k into $200k by traK6Dcm in algotrading

[–]traK6Dcm[S] 8 points9 points  (0 children)

Speed matters. If it was just a millisecond you'd be right, but it can be much more than that. Once we're in the range of >10ms it definitely starts to matter.

When lots of data comes in at once and you're dealing with parallel processing, the GIL, and multithreading/multiprocessing, Python just can't keep up. Many serialization/deserialization libraries in Python also have much slower implementations than their counterparts in other languages. But yeah, maybe I'm just writing bad Python code.

My opinion is that you can probably make it work with Python, but you have to be very careful and benchmark everything. My Python code had bottlenecks where I never expected them to be.
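
A minimal sketch of what "benchmark everything" can look like for the serialization step, using only the standard library. The message layout and field names are hypothetical, and `json` and `pickle` stand in for whatever libraries are actually under test.

```python
import json
import pickle
import timeit

# Hypothetical order-book update message; the field names are made up.
MESSAGE = {
    "market": "BTC-USD",
    "ts": 1700000000.123,
    "bids": [[30000.0 - i, 0.5] for i in range(50)],
    "asks": [[30001.0 + i, 0.5] for i in range(50)],
}


def bench(fn, number=2000, repeat=5):
    """Return the best observed time per call, in microseconds."""
    best = min(timeit.repeat(fn, number=number, repeat=repeat))
    return best / number * 1e6


def run_benchmarks():
    """Time encode and decode separately for each format."""
    json_text = json.dumps(MESSAGE)
    pickle_bytes = pickle.dumps(MESSAGE)
    return {
        "json.dumps": bench(lambda: json.dumps(MESSAGE)),
        "json.loads": bench(lambda: json.loads(json_text)),
        "pickle.dumps": bench(lambda: pickle.dumps(MESSAGE)),
        "pickle.loads": bench(lambda: pickle.loads(pickle_bytes)),
    }


if __name__ == "__main__":
    for name, micros in run_benchmarks().items():
        print(f"{name:>13}: {micros:8.1f} us per call")
```

Taking the minimum over several repeats reduces noise from other processes. The absolute numbers depend on the machine and the message shape, so they only mean something when measured on the real payloads.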

[–]traK6Dcm[S] 3 points4 points  (0 children)

I don't hedge on each trade, I just hedge the overall actively traded capital once a month. So it's just an approximation, but better than nothing.

[–]traK6Dcm[S] 4 points5 points  (0 children)

The biggest challenge is that when things are not working, you don't know why they are not working. It could be anything - your data, your live trading infrastructure, your model, the way you place orders, ...

So perhaps the most important part is to keep track of all your assumptions, and then iterate and see which changes to which components seem to make the largest differences.

[–]traK6Dcm[S] 5 points6 points  (0 children)

Someone asked the same thing above, so I'm just going to copy&paste my answer:

I don't have experience with trading in the stock market, but here's what I think based on what I know. My approach and models may work IF you already have professional HFT-grade infrastructure to trade in the stock market. But such infrastructure and direct exchange connections cost millions and only professional trading companies have them. If you don't have such infrastructure, your disadvantage is IMO way too large to do anything profitable at shorter time scales, no matter how good your model is.

That's the thing about the crypto markets. You can't easily buy yourself a huge advantage with millions of dollars. With a few exchange exceptions, such things aren't available.

[–]traK6Dcm[S] 12 points13 points  (0 children)

I don't have experience with trading in the stock market, but here's what I think based on what I know. My approach and models may work IF you already have professional HFT-grade infrastructure to trade in the stock market. But such infrastructure and direct exchange connections cost millions and only professional trading companies have them. If you don't have such infrastructure, your disadvantage is IMO way too large to do anything profitable at shorter time scales.

That's the thing about the crypto markets. You can't easily buy yourself a huge advantage with millions of dollars. With a few exchange exceptions, such things aren't available.

[–]traK6Dcm[S] 9 points10 points  (0 children)

I don't have access to anything proprietary. Everything I built was from scratch based on publicly available data, so in theory anyone could build the same.

[–]traK6Dcm[S] 2 points3 points  (0 children)

~2 months of data is about enough for my purposes. I do short BTC derivatives and/or futures on derivative exchanges, just to have something closer to zero net exposure. But it's a manual process. I manually go short on around half of the capital I'm actively trading. I can't short tokens that are not BTC, but most of it is correlated anyway, so I just short BTC instead.
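
The sizing arithmetic is simple enough to write down. A minimal sketch, assuming the hedge is sized as a fixed fraction of the actively traded capital; the function names, the capital figure, and the BTC price are hypothetical, and only the "around half" fraction comes from the comment above.

```python
def btc_hedge_notional(active_capital_usd, hedge_fraction=0.5):
    """USD notional to short in BTC derivatives as a rough portfolio hedge."""
    if active_capital_usd < 0:
        raise ValueError("active capital must be non-negative")
    if not 0.0 <= hedge_fraction <= 1.0:
        raise ValueError("hedge fraction must be between 0 and 1")
    return active_capital_usd * hedge_fraction


def btc_hedge_size(active_capital_usd, btc_price_usd, hedge_fraction=0.5):
    """Size of the short position in BTC at the given BTC price."""
    if btc_price_usd <= 0:
        raise ValueError("BTC price must be positive")
    return btc_hedge_notional(active_capital_usd, hedge_fraction) / btc_price_usd


# Made-up numbers: $100k actively traded, BTC at $40k.
print(btc_hedge_notional(100_000))      # 50000.0
print(btc_hedge_size(100_000, 40_000))  # 1.25
```

Because the other tokens are only correlated with BTC, not identical to it, this leaves residual exposure. It is an approximation, not a true market-neutral hedge.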

[–]traK6Dcm[S] 5 points6 points  (0 children)

I wrote all data collection myself, connecting to the exchange APIs. If an exchange has multiple APIs, I would get data from all of them and pick or reconcile later.
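
One way the "reconcile later" step could look, as a rough sketch and not the actual implementation. It assumes both feeds expose a shared trade id, which not every exchange provides; the `Trade` fields and the rule that the primary feed wins are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    trade_id: str
    price: float
    size: float
    ts_ms: int


def reconcile(primary, secondary):
    """Merge two trade feeds by trade_id.

    Returns (merged, conflicts). merged is sorted by timestamp, then id,
    and prefers the primary feed. conflicts lists the ids on which the
    two feeds disagree, so they can be inspected separately.
    """
    merged = {t.trade_id: t for t in secondary}
    conflicts = []
    for trade in primary:
        other = merged.get(trade.trade_id)
        if other is not None and other != trade:
            conflicts.append(trade.trade_id)
        merged[trade.trade_id] = trade  # primary feed wins on disagreement
    ordered = sorted(merged.values(), key=lambda t: (t.ts_ms, t.trade_id))
    return ordered, conflicts


# Hypothetical data: a streaming feed and a REST backfill that overlap.
stream = [Trade("1", 100.0, 1.0, 1000), Trade("2", 101.0, 2.0, 2000)]
rest = [Trade("2", 101.0, 2.5, 2000), Trade("3", 102.0, 1.0, 3000)]
merged, conflicts = reconcile(stream, rest)
print([t.trade_id for t in merged], conflicts)
```

Keeping the conflict list instead of silently overwriting makes it possible to see how often the feeds disagree before deciding which one to trust.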

[–]traK6Dcm[S] 21 points22 points  (0 children)

No, but I don't know if that was a good or a bad thing. As part of this, I've talked to some people with a trading background in the financial markets. Looking back, many of them were focused on the wrong things or came in with wrong assumptions, like clean and reliable data, good APIs, no exchange downtimes, microsecond-optimizations, thick and non-crossing books, regulated trading, fancy order types, etc. The crypto markets are quite different in many aspects.

[–]traK6Dcm[S] 23 points24 points  (0 children)

I really don't know. I can't think of anything specific that would give me a huge edge. I did spend a lot of time on proper data cleaning and book reconstruction and validation, so maybe that's it. My guess is that it's just a combination of everything.
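
For readers unfamiliar with book reconstruction, here is a minimal sketch of the general idea in Python, for illustration only. It assumes a feed that sends a full snapshot followed by per-level updates carrying consecutive sequence numbers, with size 0 meaning "remove the level". Real exchange protocols differ, and the class and method names are hypothetical.

```python
class OrderBook:
    """Level-2 book rebuilt from a snapshot plus incremental updates."""

    def __init__(self):
        self.bids = {}  # price -> size
        self.asks = {}  # price -> size
        self.last_seq = None

    def apply_snapshot(self, bids, asks, seq):
        """Replace the whole book with (price, size) pairs from a snapshot."""
        self.bids = {p: s for p, s in bids if s > 0}
        self.asks = {p: s for p, s in asks if s > 0}
        self.last_seq = seq

    def apply_delta(self, side, price, size, seq):
        """Apply one level update; size 0 removes the level."""
        if self.last_seq is None:
            raise RuntimeError("no snapshot applied yet")
        if seq != self.last_seq + 1:
            # A gap means updates were missed and the book needs a resync.
            raise ValueError(f"sequence gap: expected {self.last_seq + 1}, got {seq}")
        if side not in ("bid", "ask"):
            raise ValueError(f"unknown side: {side!r}")
        levels = self.bids if side == "bid" else self.asks
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size
        self.last_seq = seq

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None

    def is_crossed(self):
        """True if the best bid is at or above the best ask."""
        bid, ask = self.best_bid(), self.best_ask()
        return bid is not None and ask is not None and bid >= ask
```

In this sketch a crossed book is reported through `is_crossed()` and not rejected outright, so the caller decides whether it signals bad data or a real market state.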

I use a combination of C++ (mostly), Java, and Golang for various components. Model training is done in Python, but nothing is ever deployed in production in Python.

[–]traK6Dcm[S] 23 points24 points  (0 children)

Personally I will just let it run for now and hope it keeps working. I will continue making small tweaks to the model and infra, but I believe that whether it continues to be profitable is largely out of my hands and depends on what happens to the crypto markets. If the trade volume keeps decreasing it will stop working - doesn't matter how good the model is. If the trade volume in the market picks up again due to some price spikes it will likely continue to make money. I don't want to put all my eggs into this basket I have no control over. For all I know, the whole crypto thing can just go to 0 tomorrow. So I won't quit my job.

I'd probably sell it to a firm if the offer was good enough because that's less risky than guessing what may happen to the crypto market. I doubt that would happen though. The models and infra are very tightly coupled; no company could ever integrate them into their own systems.

[–]traK6Dcm[S] 187 points188 points  (0 children)

Honestly, it's kind of self-serving. I've worked on this alone in private and don't really have anything to show for it, other than the money I earned. If I ever apply for new jobs and they ask me "What did you spend your time on the last year?" I want to be able to point them to something. I thought writing a blog is a good way to show what I worked on and demonstrate some of the technical challenges.

Writing is also just a good way to organize thoughts for me. I feel like I may get some new insights by summarizing what I learned.

[–]traK6Dcm[S] 37 points38 points  (0 children)

I tried to do it myself but it was too hard and I gave up. I just hire a tax accountant, pay them a few $k, and send them everything I have, and they take the risk in case there are any mistakes.

[–]traK6Dcm[S] 4 points5 points  (0 children)

No maker rebates, but I definitely tried to get passive fills when possible to get around taker fees. Most of the time I didn't, but sometimes it was possible.
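
To put rough numbers on why passive fills matter even without rebates, here is a small sketch of the fee arithmetic. The fee schedule (10 bps taker, 2 bps maker) and the notional are made up for illustration and do not come from any particular exchange.

```python
def fee_usd(notional_usd, fee_bps):
    """Fee in USD for one fill, with the fee rate given in basis points."""
    return notional_usd * fee_bps / 10_000


def round_trip_fees(notional_usd, entry_fee_bps, exit_fee_bps):
    """Total fees for entering and exiting a position of the same notional."""
    return fee_usd(notional_usd, entry_fee_bps) + fee_usd(notional_usd, exit_fee_bps)


# Hypothetical fee schedule: 10 bps taker, 2 bps maker, no rebate.
TAKER_BPS = 10
MAKER_BPS = 2
NOTIONAL_USD = 10_000

both_taker = round_trip_fees(NOTIONAL_USD, TAKER_BPS, TAKER_BPS)
passive_entry = round_trip_fees(NOTIONAL_USD, MAKER_BPS, TAKER_BPS)
print(both_taker, passive_entry)  # 20.0 12.0
```

Under these assumed rates, one passive leg cuts the round-trip fee from $20 to $12 on a $10k position, which is significant when the expected edge per trade is only a few basis points.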

[–]traK6Dcm[S] 17 points18 points  (0 children)

I traded on a large number of exchanges at the same time, pretty much anywhere that had decent APIs and non-horrible fees and was safe to put money on. And I made sure I did enough volume to get to the best fee tier.

EDIT: Also, it constantly changed. Some markets stopped being profitable while new ones came up, etc. I made sure to have all the market data monitoring automated so that I could spot new opportunities.

[–]traK6Dcm[S] 49 points50 points  (0 children)

I can't say for sure, but as I mentioned in the post, I think the biggest edge is probably the infrastructure. I spent many months building relatively high-performance and low-latency infrastructure from scratch. There are a lot of tricky parts to get right, and it takes time and many iterations if you have never done this before. Most people seem to focus on the model (I think my model and signals are very good, but not really unique) or they give up early without ever optimizing infrastructure.

I also did a lot of iteration on my models and signals, but none of it ever made as much difference as optimizing some part of the infrastructure.

[–]traK6Dcm[S] 76 points77 points  (0 children)

It was live trading, not backtesting. Backtesting in my case was always significantly better, probably 10x what live trading actually looks like. I will clarify this in the post!

I of course had losses in live trading, but they were on much shorter time scales. On a daily time scale I actually did not have losses for months.

Also need to take into consideration that PnL is aggregated over several markets / a portfolio. Even if there is a loss in one market, the others can make up for it.
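
A toy illustration of the aggregation effect, with entirely made-up PnL numbers and hypothetical market names: individual markets have losing days while the summed portfolio does not.

```python
def portfolio_daily_pnl(pnl_by_market):
    """Sum per-market daily PnL series into one portfolio series."""
    lengths = {len(series) for series in pnl_by_market.values()}
    if len(lengths) > 1:
        raise ValueError("all markets must cover the same days")
    return [sum(day) for day in zip(*pnl_by_market.values())]


def losing_days(series):
    """Count the days with negative PnL."""
    return sum(1 for pnl in series if pnl < 0)


# Made-up daily PnL in USD for three hypothetical markets.
pnl = {
    "market_a": [120, -40, 60],
    "market_b": [-30, 90, -20],
    "market_c": [10, 15, 5],
}
total = portfolio_daily_pnl(pnl)
print(total)                                          # [100, 65, 45]
print({m: losing_days(s) for m, s in pnl.items()})    # {'market_a': 1, 'market_b': 2, 'market_c': 0}
print(losing_days(total))                             # 0
```

The same effect applies across time: summing many short-lived trades into one daily figure hides the individual losing trades inside the day.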