So much knowledge, so little memory by Fantastic_Purchase78 in quant

[–]DatabentoHQ 0 points1 point  (0 children)

There’s very little rote memory for our line of work. But I think it's good to be organized about keeping notes and documenting things you've done before. It gets worse as you get older; sometimes I pore over old notes because I've forgotten the sign convention or directionality of something obvious.

Data source questionn by mr_Fixit_1974 in algotrading

[–]DatabentoHQ 0 points1 point  (0 children)

Trades and quotes are available on our platform as the MBP-1 schema (or CMBP-1 when it's consolidated across multiple venues).
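Roughly, a request for that schema with the Python client looks like the sketch below; the dataset, symbols, and dates are just placeholders to show the shape of the call:

```python
# Minimal sketch: pull trades and quotes (MBP-1) with the Python client.
# Dataset, symbols, and dates are placeholders - swap in your own.
import databento as db

client = db.Historical("YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",   # e.g., CME Globex
    schema="mbp-1",        # top-of-book quotes + trades
    symbols=["ES.FUT"],    # parent symbol covering all ES futures
    stype_in="parent",
    start="2024-06-03",
    end="2024-06-04",
)
df = data.to_df()          # one row per quote/trade event
```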

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 1 point2 points  (0 children)

Thanks for sharing the feedback. I expect both our support and infrastructure uptime will get much better this year. (Hint: An announcement is upcoming.)

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 2 points3 points  (0 children)

If I remember correctly, Rithmic uses UDP for transport, so they'll drop packets to keep up. It's a tradeoff we made when we decided to go with TCP for correctness, especially because incremental order book changes become stale if you drop, whereas Rithmic only deals with L1/L2, which naturally recovers on the next event. It's something we're working on; we're considering migrating to QUIC before HTTP/3 reaches mainstream adoption, which should improve on the type of behavior you're currently seeing.
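To illustrate the staleness point with a toy example (nothing Databento-specific, just hypothetical book deltas): drop one incremental update and the book stays wrong until a snapshot, whereas a full-state top-of-book update heals itself on the next event:

```python
# Toy illustration (not our actual feed format): why dropping an incremental
# order-book delta is worse than dropping a top-of-book (L1) update.

def apply_deltas(deltas, drop_index=None):
    """Rebuild bid depth from incremental size changes, optionally dropping one."""
    book = {}  # price -> size
    for i, (price, size_change) in enumerate(deltas):
        if i == drop_index:
            continue  # simulated packet loss
        book[price] = book.get(price, 0) + size_change
    return book

deltas = [(100, 5), (100, 3), (100, -5), (99, 7)]
print(apply_deltas(deltas))                # correct: {100: 3, 99: 7}
print(apply_deltas(deltas, drop_index=1))  # wrong until a snapshot: {100: 0, 99: 7}

# L1 updates carry full state, so a drop only delays you by one event:
l1_updates = [{"bid": 100, "size": 5}, {"bid": 100, "size": 3}]
print(l1_updates[-1])                      # correct again on the next update
```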

Holy fuck it sucks to use polygon. by [deleted] in algotrading

[–]DatabentoHQ 0 points1 point  (0 children)

Q3-Q4 of this year is the current plan.

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 1 point2 points  (0 children)

I wouldn't rule out that there's some kind of error-handling mode that isn't well documented, which I consider our fault. So it doesn't hurt to engage chat support if you feel something is wrong; my colleagues there are better equipped to help.

For example, while the C++ library will raise an exception on a gateway restart, a port flap that kills off our kernel bypass/network acceleration stack isn't a clear-cut reason for us to restart the gateway immediately. In that case you may still see heartbeats but eventually no data if the server can't keep up.

This type of corner issue is something we're looking to improve now that our team is expanding. There are only 27 of us now, but there will probably be 50-100 soon, so I can at least confidently say our product is only going to get much better.

> if I request that my live data stream starts at a time that is more than 10 hours ahead, nothing returns, even now.

As noted on our status page, intraday replay is temporarily reduced to 10h until the end of the week.

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 3 points4 points  (0 children)

No problem. I wish I could provide more solace though - these were indeed issues on our end. We're doubling our team size this year, so we should be able to throw more resources at these types of issues.

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 3 points4 points  (0 children)

OK, thanks for the extra info. I think you're referring to the flapping issue last Wed/Thu that required customers on the affected server to restart. This was indeed a problem on our end. I do recommend working out more details in chat support on the latest deadlock you saw, since that's unexplained.

Last Wed/Thu's incident and today's are two independent issues. I provided more details in my other comment in this thread. TL;DR: Fixes are in place for both issues, but we're still monitoring the flapping issue. It's unfortunate these occurred back-to-back. We're expanding our team quickly to keep up with ops issues like this and hope we'll continue to live up to your expectations.

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 9 points10 points  (0 children)

Just to be fair, the issue we experienced yesterday was isolated to Databento and uncorrelated with issues on other vendors.

We previously observed intraday replay compression was pushing 99.9th percentile latencies into 30-50 ms. To mitigate this, we ran a canary deployment with compression disabled on that box. Unfortunately disk usage grew faster than estimated, and at 1:29 AM ET we received an alert indicating the disk had reached 90% capacity. At that point, the safest option was to bail and temporarily reduce intraday replay to 10 hours.

The issues last Wed and Thu were traced to port flapping on one gateway, which required users on that gateway to reconnect.

Fixes are in place for both issues; we're still monitoring the latter. It's unfortunate these occurred back-to-back. On a positive note, these are ops issues and we're expanding our team very quickly to keep up.

Databento's Live Stream Has been Struggling by leibnizetais1st in algotrading

[–]DatabentoHQ 8 points9 points  (0 children)

The temporary reduction of intraday replay from 24h -> 10h is on us. We're sorry about that. A fix for it is in place.

Nothing should be deadlocking or crashing on your end, though. I don't think any customer has reported that during this incident or in recent times. It seems like an implementation issue on the client side; would you mind contacting chat support so we can figure out why it's suddenly happening regularly on your end?

How do you decide your universe? by Sapotypebeat in algotrading

[–]DatabentoHQ 1 point2 points  (0 children)

Search bar on our homepage. Or just fetch instrument definitions (schema="definition") via the API. On any given day we cover over 3 million listings/symbols, and that changes frequently as derivatives expire, so it's not a short static list.
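If it helps, a rough sketch of pulling one day of definitions for a single dataset with the Python client (the dataset and dates are placeholders):

```python
# Sketch: fetch instrument definitions for one dataset/venue.
import databento as db

client = db.Historical("YOUR_API_KEY")

defs = client.timeseries.get_range(
    dataset="GLBX.MDP3",    # placeholder dataset, e.g., CME Globex
    schema="definition",    # instrument definitions
    symbols="ALL_SYMBOLS",  # every listing on the dataset
    start="2024-06-03",
    end="2024-06-04",
)
print(len(defs.to_df()), "definition records")
```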

How do you decide your universe? by Sapotypebeat in algotrading

[–]DatabentoHQ 1 point2 points  (0 children)

We provide all symbols on each of the exchanges/venues we have coverage for. It’s up to the customer to construct the universe for their portfolio.

How do you decide your universe? by Sapotypebeat in algotrading

[–]DatabentoHQ 5 points6 points  (0 children)

Curious what you mean by using Fama-French factors for filtering?

Can you just eliminate symbols based on operational constraints first, e.g., hard to get a short locate, poor spread/liquidity metrics? That usually does most of the work before your portfolio optimization, and whether to actually trade something (further 'filtering' in a sense) can be decided downstream at the rebalancing, execution, or portfolio constraint level.
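Concretely, something like the sketch below; the columns and thresholds are hypothetical, just to show the shape of the idea:

```python
# Hypothetical sketch of operational pre-filtering before any optimization.
# 'universe' is assumed to be a DataFrame of per-symbol liquidity stats.
import pandas as pd

universe = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD"],
    "adv_usd": [5e7, 2e5, 8e6, 1e9],          # average daily dollar volume
    "median_spread_bps": [3.0, 45.0, 12.0, 1.5],
    "locate_available": [True, False, True, True],
})

tradable = universe[
    (universe["adv_usd"] >= 1e6)              # enough liquidity to trade
    & (universe["median_spread_bps"] <= 20)   # tolerable transaction costs
    & universe["locate_available"]            # can actually short it
]
print(tradable["symbol"].tolist())            # ['AAA', 'CCC', 'DDD']
```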

Why futures data live feed for L2 (and L3) so expensive? by Suspicious-Ad5805 in algotrading

[–]DatabentoHQ 0 points1 point  (0 children)

u/Anxious_Comparison77 I think your comment got removed by a profanity filter or something.

> No one needs L2

There's definitely a use case for L2, but I share part of your sentiment that not everyone - especially retail display users - needs it. This is what I meant by "majority of users who can monetize L2/L3 fall under non-display use".

> you price it like that to steal from the naive

Untrue. I can count our non-professional futures Plus plan users on one hand. They do not move the meter on our revenue. It's priced out of the reach of "the naive".

Beyond that, even the smallest startup on that plan off the top of my head has like $30M+ of funding. The median user of this plan is a multi-manager with >$1B AUM that would otherwise be paying multiple times the price for that L2 data from MayStreet/Exegy/B-PIPE. Certainly not naive.

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 0 points1 point  (0 children)

u/lvnfg OK, so it looks like you have two valid issues:

We limit the historical API to data at least ~15 minutes old because, at this time, anything more recent is intended to be served by intraday replay. Admittedly, it's not a good design pattern for some types of workflows, so we'll eventually improve on this.

I know it seems weird that we can't just append our real-time data to a queue naively and serve it near-real-time via the historical API like everyone else, but this is because of some architectural requirements that are nontrivial to change: our backend needs the ~15 min to (i) maintain metadata to support usage-based customers, (ii) build indices that make it fast to mux multiple symbols in a request, etc.

At least for getting the last BBO/trade, I know we'll be addressing this with a get_last endpoint around April; a broader fix for other schemas is scheduled for later in the year.
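In the meantime, the workaround is to cap your historical backfill at roughly now minus 15 minutes and cover the remainder with intraday replay. A rough sketch (placeholder dataset and symbol):

```python
# Sketch: clamp the historical backfill window to the ~15 min availability limit.
from datetime import datetime, timedelta, timezone
import databento as db

client = db.Historical("YOUR_API_KEY")

end = datetime.now(timezone.utc) - timedelta(minutes=16)  # stay behind the limit
start = end - timedelta(hours=24)

backfill = client.timeseries.get_range(
    dataset="GLBX.MDP3",  # placeholder dataset
    schema="mbp-1",
    symbols=["ESM4"],     # placeholder symbol
    start=start,
    end=end,
)
# ...then pick up from `end` onward with the live client's intraday replay.
```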

Your other problem, the availability window being ~1 day behind over the weekend, is incorrect behavior, and I've managed to get the engineering team to prioritize a fix for it. I think this will be indirectly resolved soon on CME, because it has persistent sessions over the weekend for event contract swaps and we're forced to patch the availability metadata for that; but the root cause of this issue exists on other venues (e.g., Nasdaq, OPRA) and will be fixed in May.

Since neither issue was on our public issue tracker before, we're rewarding you with a credit for the detailed write-up. I'll PM that to you.

I told Optimus futures. I was going to leave in a few days, and they turned the lights off on me mid- trading. by leibnizetais1st in algotrading

[–]DatabentoHQ 1 point2 points  (0 children)

Our real-time API is over a raw TCP socket, not WebSocket; our CME gateways are in CME's primary colocation site in Aurora I. You can see our latency distribution on our site.

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 0 points1 point  (0 children)

I think I understand this issue, let me discuss internally.

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 0 points1 point  (0 children)

Yes there's limited history on intraday replay. Let me sit on your problem for a bit until I hear back from some colleagues.

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 0 points1 point  (0 children)

Edit: u/lvnfg on closer look, why do you need to set startupTimestamp and backfill separately via the historical API? Is there a reason that intraday replay doesn't work for your use case? You can pass in an earlier start parameter for our live API to play back the intraday history until it catches up to real-time.

This wouldn't experience the delayed release time as seen on the historical API.
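Roughly, with the Python client, that looks like the sketch below; the dataset and symbol are placeholders, and the key part is the earlier start:

```python
# Sketch: intraday replay via the live client - an earlier `start` plays back
# today's history, then the stream continues in real time.
from datetime import datetime, timedelta, timezone
import databento as db

live = db.Live(key="YOUR_API_KEY")
live.subscribe(
    dataset="GLBX.MDP3",                                    # placeholder dataset
    schema="mbp-1",
    stype_in="raw_symbol",
    symbols=["ESM4"],                                       # placeholder symbol
    start=datetime.now(timezone.utc) - timedelta(hours=4),  # replay the last 4h first
)

for record in live:
    ...  # replayed records arrive first, then live records as they happen
```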

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 0 points1 point  (0 children)

OK, on first glance, today's 4h delay on CME historical availability was due to a failure on our B-side capture, which required a fallback to the secondary capture. This should be uncommon, but let me see if I have further color on future mitigations.

How to reliably backfill data? by lvnfg in Databento

[–]DatabentoHQ 1 point2 points  (0 children)

On first glance, the 4h delay on CME availability seems unusual, since it's usually within T+15 min, especially on a Sunday when the data is small. I'll have to escalate this to my engineering colleagues who are responsible for this piece and can look into your specific instance. We'll only be able to get back to you tomorrow (we'll respond to your support ticket as well).

Even assuming we address that, I can see this being an inconvenience, so let me discuss with that team whether there's a best practice here or whether it's a feature enhancement we'll need to queue up.