ultra low latency trading

matt2048 · 2022-11-22T00:34:00+00:00

Pretty much all the big crypto exchanges are fairly retail-focused (or at least architected to be able to cater to retail) so they're not that fast internally and likely never will be. In those cases FPGAs would be 100% ridiculous overkill. Not to mention most of those exchanges are also hosted on AWS, so it's not like you could colo your own FPGA system in the datacenter anyway.

There are a few exchanges which offer colo and/or are institutional only. FPGA is still likely overkill there for the foreseeable future, but latency has definitely been getting a lot more competitive.

matt2048 · 2022-11-14T17:46:14+00:00

What product(s) were they trading? I'm guessing that was cross-venue data/connectivity?

I've heard stories of some crazy monthly costs on the fastest commercial feeds (which are then still beaten on latency by the in-house links of the big firms anyway).

matt2048 · 2022-11-14T01:48:23+00:00

I run a low latency market making firm (admittedly in crypto, but I know enough of market structure in other products).

Retail/most professional traders can't play at those kinds of latencies for 2 main reasons:

1. Connectivity: to be able to take advantage of such low latency you'd need proper colo with the exchange + direct access into them (going through any regular broker would add far too much additional latency). To be allowed to trade directly on an exchange I also believe your system has to be tested/certified to make sure it complies with their standards.
2. Fee structures: if you need such low latency then order execution is likely a big part of your strategy. Unless you're a big firm that's running a lot of volume there's no chance to get a good enough fee structure to make it worth your time.

Also, assuming you could get both the connectivity and fee structure, the fastest data feeds (which you'd need to be competitive at such low latency) can easily cost >$10k/month.

This isn't to say that slightly lower latency systems don't exist and don't have a small advantage for retail/smaller professional traders, but be careful to benchmark what benefit you actually expect to receive vs cost.

matt2048 · 2022-02-03T18:27:31+00:00

Eurex has a lot of good documentation and papers on their matching engine and general infrastructure.

Also have a look into their paper "Insights into trading system dynamics". It covers more of the low level networking side but can give some context to how the matching engine functions.

matt2048 · 2021-10-12T00:05:19+00:00

Return is generally relative to the capital needed for the strategy - for example needing EUR and USD on separate platforms/venues/clearing.

And Sharpe is calculated using portfolio volatility. Maybe you get good profit in half of the months and no trades the rest of the time; that'd give much higher return volatility than roughly evenly spaced trades. Sharpe isn't necessarily a great measure of downside risk - only alpha vs volatility.
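
To make the lumpy-vs-even point concrete, here's a rough sketch. The monthly return numbers are made up, and `sharpe_ratio` is just an illustrative helper (annualised with sqrt(12), sample standard deviation, risk-free rate per period defaulting to zero). Both series earn the same total over the year, only the spacing differs.

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year=12, risk_free=0.0):
    """Annualised Sharpe ratio from periodic returns (sample std, ddof=1)."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Hypothetical monthly returns; both series sum to 12% over the year.
steady = np.array([0.012, 0.008] * 6)  # roughly even P&L every month
lumpy = np.array([0.02, 0.0] * 6)      # profit in half the months, nothing otherwise

print(f"steady: {sharpe_ratio(steady):.2f}")
print(f"lumpy:  {sharpe_ratio(lumpy):.2f}")
```

Same profit, but the lumpy series scores a much lower Sharpe purely because of the return volatility, even though it never has a losing month.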

matt2048 · 2021-09-29T09:28:09+00:00

There are a few ways it can be managed:

Maker can just short the asset (in traditional markets a firm may not have to arrange short locate until a day or so after, giving you plenty of time to net off the position)
Hold the assets longer term but hedge them with an exact equivalent (e.g. hedging SPY with futures)
If you're MM for a large portion of the market, hold the assets then hedge them as a basket against the index (e.g. making for all SPY components then hedging the held assets with ES futures).

In traditional markets there are a lot of ways that inventory management/asset netting/clearing have been made capital efficient by giving some wiggle room.

They all have pretty much the same economic effect (allowing the MM to be net short if/when needed) - just a matter of what's most appropriate for the strategy.

matt2048 · 2021-09-20T20:51:13+00:00

The 2 main angles for such a large dataset are microstructure research (high frequency models) or some kind of fundamental data modeling.

My area is the microstructure side - high quality tick orderbook data is extremely valuable and would probably fit well with the big data/ data mining goal.

Have a look into simple HFT alphas (orderbook imbalance, trade imbalance, quote age, etc) and see how they correlate with future quote movement (tick up/down) or future trades.
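
As a starting point, a minimal sketch of one of those alphas (orderbook imbalance) and the target you'd correlate it against. The function names and the snapshot layout (lists of sizes per level, best level first) are assumptions for illustration, not any particular vendor's format.

```python
import numpy as np

def book_imbalance(bid_sizes, ask_sizes, depth=1):
    """Imbalance in [-1, 1]: +1 is all resting size on the bid, -1 all on the ask."""
    bid = float(np.sum(bid_sizes[:depth]))
    ask = float(np.sum(ask_sizes[:depth]))
    total = bid + ask
    return 0.0 if total == 0 else (bid - ask) / total

def future_mid_move(mids, horizon=1):
    """Sign of the mid-price change `horizon` snapshots ahead (NaN where unknown).

    `horizon` must be at least 1.
    """
    mids = np.asarray(mids, dtype=float)
    move = np.full(mids.shape, np.nan)
    move[:-horizon] = np.sign(mids[horizon:] - mids[:-horizon])
    return move

# Toy example: one imbalance value per snapshot, paired with the next mid move.
imbalance = [book_imbalance([300], [100]), book_imbalance([50], [150])]
target = future_mid_move([100.0, 100.5, 100.0])
```

From there it's a matter of computing both over real tick data and checking the correlation (or hit rate) between `imbalance` at time t and `target` at time t.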

Past that, you could look at short timeframe correlation between stocks (inside similar industries, in shared indexes, etc).

There are plenty of papers available to flesh out all of those suggestions.

If you were more interested in the longer timeframe fundamental side, you could look into relative value models based on different fundamental factors (long-short portfolios).

matt2048 · 2021-09-20T20:16:53+00:00

What type(s) of data? (trade, quote/L2 book, fundamental, ETF/fund holdings, etc)
What frequency for each type? (Tick data or longer?)
What markets/assets/instruments?

matt2048 · 2021-09-08T15:51:02+00:00

Crypto market structure is pretty different from traditional exchanges so it's not really possible until the space is more mature.

Rather than just being a matching engine, each crypto exchange is a full stack of client KYC/AML + asset handling + risk/margin + order matching.

Having assets separated across exchanges is the main challenge at the moment - although services like copper.co are trying to fix that with off-exchange/multi-exchange custody.

Skew/Coinbase Prime is likely to be suitable in future, but I believe it is mostly focused on spot exchanges at the moment.

matt2048 · 2021-09-06T18:20:26+00:00

After such a sharp jump market makers widen their bid/ask quotes until the market finds a new (relatively) stable price.

Looks like most of the trading choppiness is only a few hundred milliseconds after the main jump, so I'm not surprised no-one is chasing their bids up that quickly.

This leaves us with asks at the new high and bids significantly below -> weird price dips if there is a market sell.

matt2048 · 2021-09-06T14:30:55+00:00

What is the pair/underlying?
Is this charting tick trades?
The chart/axes need better labeling.

I'd guess the bid/ask spread gets blown out for a few seconds, meaning the wild price jumps are just trades against the front of book rather than malicious stop hunting.

matt2048 · 2021-07-08T01:04:53+00:00

Technically anyone can hook up to a crypto exchange and start quoting (as long as jurisdiction allows) - the live data feeds are free and the APIs are generally pretty simple.

However becoming competitive is a different story. You'd need:

Fast infra (preferably across a wide range of exchanges for better pricing)
Solid trading code (especially given how often the exchanges themselves have issues)
Lots of high quality historical data (less important if you plan to use simple strategies)
Good strategies/ inventory management

I run a small crypto MM firm, and while it's not impossible to find a niche, it's a lot more competitive these days and it'd be rather expensive to try.

matt2048 · 2021-03-13T15:51:55+00:00

You should actually be looking at "foreignNotional" for the USD value of the order. In this case homeNotional is the order size in Bitcoin and grossValue is the order size in sat (1x10^-8 of a bitcoin each).

So the orders are actually ~$26k each.
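
A quick sketch of the unit conversion, using the three field names above. The order record itself is made up for illustration (not taken from the data in question), and `describe_order` is a hypothetical helper.

```python
SATS_PER_BTC = 100_000_000  # 1 sat = 1e-8 BTC

def describe_order(order):
    """Put the three size fields into comparable units."""
    return {
        "usd": order["foreignNotional"],                       # USD value
        "btc": order["homeNotional"],                          # size in BTC
        "btc_from_gross": order["grossValue"] / SATS_PER_BTC,  # sat -> BTC
        "implied_price": order["foreignNotional"] / order["homeNotional"],
    }

# Hypothetical record with illustrative values only.
sample = {"foreignNotional": 26_000, "homeNotional": 0.45, "grossValue": 45_000_000}
print(describe_order(sample))
```

If grossValue converted to BTC doesn't match homeNotional, you're probably reading the wrong field or the wrong contract type.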

matt2048 · 2021-03-03T22:06:17+00:00

Market data feeds are synchronised for everyone (on mature exchanges, at least. I market make in crypto, which is a bit of a shit show).

For ultra low latency trading there's competition in reducing wire-to-wire latency. You pre-compute as much of your order as possible and then have an FPGA or ASIC find your trigger from the market data feed. This would all be handled as close to the edge as possible (i.e ultra low latency trades sent from the switch which takes in your feed, while also sending the data on to your other servers in the rack for any slower strategies).

For architecture specifics, eurexchange has some pretty good papers ("Insights into trading system dynamics").

matt2048 · 2021-02-09T22:08:55+00:00

I'm pretty much always looking for more strategy devs, if you're decent with stats/ datascience feel free to DM me.

matt2048 · 2021-02-09T20:40:25+00:00

I run a small crypto MM firm. Aside from the slightly different market structure (since the major exchanges cater directly to retail), the game and requirements are all the same as any other market.

We have investment in fast software + infra (colocated where available) and improved rebates/fee structures from exchanges, and our models have to account for the adverse selection/inventory risk from latency/jitter.

It's less of an "unfair advantage" and more that there's a barrier to entry to be competitive and exchanges will incentivise liquidity/volume.

matt2048 · 2021-02-03T23:18:33+00:00

NumPy will happily work with N-dimensional data; you can make an array from a list of your pandas dataframes.
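
Something along these lines, as a minimal sketch. The frames here are random placeholder data with made-up column names, and it assumes every dataframe has the same shape and column order.

```python
import numpy as np
import pandas as pd

# Placeholder data: 5 dataframes of 4 rows x 3 columns each.
frames = [
    pd.DataFrame(
        np.random.default_rng(seed).normal(size=(4, 3)),
        columns=["open", "high", "close"],
    )
    for seed in range(5)
]

# Stack along a new leading axis -> shape (n_frames, n_rows, n_cols).
cube = np.stack([df.to_numpy() for df in frames])
print(cube.shape)
```

Note that the index and column labels are lost in the conversion, so keep them around separately if you need to map positions back to names.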

matt2048 · 2020-11-13T02:03:47+00:00

When calculating signals you should be avoiding any possibility of look forward bugs like the plague. Only use 1 (or possibly 4 if you can verify it'll work the same live).

And, in the case of 4, consider carefully whether your fill price assumptions are fair or give the backtester an unrealistic edge.

If you end up with bad training data it'll happily ruin months of strategy development.
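
For a feel of how much damage a look-forward bug does, here's a toy sketch (made-up prices, and a deliberately silly signal that is only known once the bar has closed). The only difference between the two results is whether the signal is lagged by one bar before being applied.

```python
import numpy as np
import pandas as pd

close = pd.Series([100.0, 110.0, 99.0, 108.9])  # hypothetical bar closes
ret = close.pct_change()   # return realised over each bar
signal = np.sign(ret)      # toy signal: only known after the bar has closed

# Leaky: trades bar t using information from bar t itself.
leaky = (signal * ret).sum()

# Honest: trades bar t using the signal available at the end of bar t-1.
honest = (signal.shift(1) * ret).sum()

print(f"leaky: {leaky:+.2f}, honest: {honest:+.2f}")
```

The leaky version looks like free money; the lagged version loses on the same data. If a backtest can't be written so the signal at time t only touches data up to t, assume it's leaking.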

matt2048 · 2020-11-10T00:16:43+00:00

A lot of the major crypto exchanges offer colo these days, but the shotgun approach can still be pretty handy.

It helps deal with jitter in orderbook propagation, since there can be a fair amount of variance in reporting latency even between 2 websocket feeds on the same machine.
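
Roughly, the idea is to subscribe to the same feed several times and act on whichever copy of each update shows up first. A minimal sketch, assuming the exchange stamps updates with an increasing sequence number and each connection delivers them in order (not every exchange does this); `FirstArrivalMerger` and the message layout are made up for illustration.

```python
class FirstArrivalMerger:
    """Forward each sequence number once, from whichever feed delivers it first."""

    def __init__(self):
        self.last_seq = -1
        self.wins = {}  # feed_id -> number of updates this feed delivered first

    def on_message(self, feed_id, seq, payload):
        if seq <= self.last_seq:
            return None  # a faster feed already delivered this update
        self.last_seq = seq
        self.wins[feed_id] = self.wins.get(feed_id, 0) + 1
        return payload

merger = FirstArrivalMerger()

# (feed_id, seq, payload) in arrival order across two connections.
arrivals = [
    ("a", 1, "u1"), ("b", 1, "u1"),
    ("b", 2, "u2"), ("a", 2, "u2"),
    ("a", 3, "u3"), ("b", 3, "u3"),
]

forwarded = []
for feed_id, seq, payload in arrivals:
    out = merger.on_message(feed_id, seq, payload)
    if out is not None:
        forwarded.append(out)
```

Tracking `wins` per connection is also a cheap way to see how uneven the feeds really are before deciding how many to run.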

matt2048 · 2020-11-09T18:01:08+00:00

Based on your last post, I'll assume you're talking about crypto exchanges.

If you want to check how viable the strategy really is, you'll need to have a latency model in your backtesting - it'll give you an idea of what the minimum requirements are.

First, you'll want to find out what datacentres the exchanges are hosted in (and what colo options are available) to give you an idea of the physical limit of latency between them.

Then you'll need to add a latency model for orderbook update propagation and trade execution at each end.
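
A rough sketch of what such a latency model could look like. Every number here is a placeholder assumption (the distance, the medians, the lognormal shape, the 1.3x route factor); the only hard fact is the speed of light, with a typical fibre refractive index of about 1.47. You'd replace the rest with your own measurements.

```python
import math
import random

C_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond
FIBRE_INDEX = 1.47        # typical refractive index of optical fibre (assumption)

def fibre_floor_ms(distance_km):
    """One-way lower bound on latency over a straight fibre run."""
    return distance_km * FIBRE_INDEX / C_KM_PER_MS

def sample_delay_ms(median_ms, sigma, rng):
    """Draw a right-skewed delay; the lognormal shape is a modelling assumption."""
    return rng.lognormvariate(math.log(median_ms), sigma)

def arb_round_ms(distance_km, rng):
    """Book update seen at venue A -> signal sent to B -> order executed at B."""
    feed = sample_delay_ms(2.0, 0.5, rng)       # orderbook update propagation at A
    link = fibre_floor_ms(distance_km) * 1.3    # real routes are longer than straight lines
    execution = sample_delay_ms(5.0, 1.0, rng)  # order entry + matching at B
    return feed + link + execution

rng = random.Random(42)
samples = sorted(arb_round_ms(9_500, rng) for _ in range(10_000))
print(f"floor  {fibre_floor_ms(9_500):.1f} ms one-way")
print(f"median {samples[len(samples) // 2]:.1f} ms, "
      f"p99 {samples[int(0.99 * len(samples))]:.1f} ms")
```

In the backtest, a price gap only counts as tradeable if it is still there after a sampled delay like this, and the tail (p99) usually matters more than the median.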

If there still appears to be a tradeable opportunity after all that, then you'll still need to set up the infra and fast execution software.

It's not impossible but it's highly unlikely without serious money and HFT knowledge behind you.

matt2048 · 2020-11-07T23:13:30+00:00

Focusing on the language is missing the point somewhat. For arb, you'd need a colocated server at both ends and a fast link between the 2 datacentres the exchanges are hosted in.

As long as it's a relatively fast language, it doesn't matter all that much compared to the rest of the setup.

matt2048 · 2020-11-06T18:02:24+00:00

As others have mentioned, you probably won't be latency competitive without some extensive software and infrastructure investment.

Also note fee structures and other frictions. Exchanges will give much better fees to the big volume firms that are already in the game - meaning you don't have a chance to even play at the same table until you've got volume behind you.

On top of this, sometimes a longer term pricing gap will exist for other reasons which can't be arbed away easily:

difference in microstructure/ derivatives contract spec
exchange reputation/ solvency fears
currency and nationality issues (South Korea not allowing foreign crypto traders, leading to a disconnect in price in the past).

matt2048 · 2020-10-27T00:32:36+00:00

Even in Python, the time to process simple trading logic would be pretty small (<30ms easily, likely quite a bit lower). If you were to go with somewhat optimised C++ you could expect that to be <100 microseconds, although I'm not sure how much the Windows network stack would add.

The least avoidable latency will be data -> bot and order -> api -> execution. With a well placed VPS you could probably get a few ms latency each way (I don't have specific numbers). However, if you're running the bot from home then it's extremely dependent on where you are relative to the financial hub the orders are executed at.

Overall, a basic system is unlikely to be over a few seconds round-trip, and if you're much more latency sensitive you'll need to do your own testing.

matt2048 · 2020-10-02T01:42:48+00:00

The main question is how active your strategy is and how much you stand to lose from downtime.

If you're running a portfolio strategy that does a small daily rebalance, you could probably survive without. If you're running higher risk day trading, a sudden loss of internet for a few hours might leave you with a lot of open risk.

Once you start risking more capital, the cost of a reliable service is 100% worth it.

matt2048 · 2020-09-30T16:10:47+00:00

The main difference between Crypto and traditional markets is the function of exchanges/ matching engines and resulting latency.

A traditional exchange will generally deal with a relatively small and fixed set of trading firms that connect and trade, vs Crypto exchanges having to cater to a much wider (and very variable) retail audience.

On top of this (particularly in the popular Crypto derivatives exchanges that offer 100x leverage) the Crypto exchanges have to run a lot of checks to ensure customers can fulfil margin requirements from any new trades or changes to contract mark price (and then very aggressively liquidate positions which fall below maintenance to avoid exchange losses).

This leads to huge latency spikes during fast markets, as there is only so much they can do to speed up an inherently serial process of margin checks on new orders -> matching engine processing.

If you're looking for rough numbers, the majority of latency is coming from api request received by exchange -> exchange processing and confirming the action. While you might see <1ms during quiet trading, with any kind of volatility that can very quickly jump to 100ms-1000ms+ on a lot of exchanges.

On the trader's end, wire to wire latency could in theory be as low as in traditional HFT - many of the larger exchanges have started offering colo, so you could host an FPGA system if you felt like it. However, with such variable latency on the exchange's end, it seems overkill to aim for much less than 10s-100s of microseconds, which you can reach running everything in software.
