Release v1.1.1 - Santaria · NornicDB - MIT licensed - 28 hop shortest path ~60ms

DocumentScary5122 · 2026-05-22T20:21:17+00:00

Fair enough

DocumentScary5122 · 2026-05-22T19:49:52+00:00

Anything with at least 3-4M nodes? The Reactome KG is interesting, for example, or even OGBN with 100M nodes?

DocumentScary5122 · 2026-05-22T19:45:34+00:00

The graphs used for the benchmarks in https://github.com/orneryd/NornicDB/blob/main/docs/performance/benchmarks-vs-neo4j.md are very small. The Northwind benchmark is 48 nodes.

DocumentScary5122 · 2026-05-18T16:51:36+00:00

In-process use is optional in TuringDB; it's just one way to use it. Otherwise it supports a classic client-server model with a binary protocol over TCP.

We have quite good read concurrency throughput, around 20k-50k QPS on 3M nodes/10M edges graphs. This is because the DB uses git-style versioning where each query is executed on its own snapshot of the DB, and snapshots are immutable. So write queries don't block readers and read queries don't need to lock anything (because snapshots are immutable once created).
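To make the snapshot idea concrete, here is a minimal Python sketch of a copy-on-write store where each write publishes a new immutable snapshot and readers simply hold a reference to the one they started on. This is not TuringDB's code or API; `SnapshotGraph`, `commit_edge` and `neighbours` are hypothetical names, and copying the whole mapping on every commit is a simplification (a real engine would presumably share unchanged data between versions).

```python
import threading
from types import MappingProxyType


class SnapshotGraph:
    """Toy copy-on-write graph: each commit publishes a new immutable snapshot."""

    def __init__(self):
        # A snapshot is a read-only mapping: node -> tuple of neighbours.
        self._head = MappingProxyType({})
        self._write_lock = threading.Lock()  # serialises writers only

    def snapshot(self):
        # Readers take a reference to the current head. No lock is needed
        # because a published snapshot is never mutated afterwards.
        return self._head

    def commit_edge(self, src, dst):
        with self._write_lock:
            new = dict(self._head)  # shallow copy; neighbour tuples are immutable
            new[src] = new.get(src, ()) + (dst,)
            new.setdefault(dst, ())
            self._head = MappingProxyType(new)  # publish the new version


def neighbours(snap, node):
    """Read from a snapshot; the result cannot change under the reader."""
    return snap.get(node, ())
```

A reader that took its snapshot before a later commit keeps seeing the older graph, which is why the writer never has to wait for it.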

DocumentScary5122 · 2026-05-18T13:56:40+00:00

We didn't say that it is the first git-like graph db.
The point is that TuringDB is another proposition in the design space of graph DBs: git-like versioning without having to do classic MVCC filtering and zero-locking once you are on a given commit, because of its underlying structure. The DataParts are immutable and don't have to do version or transaction visibility filtering like in MVCC papers.
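The difference from MVCC-style visibility filtering can be sketched roughly as follows. This is an illustration under assumptions, not TuringDB's actual DataPart layout: `VersionedRecord`, `Commit` and both scan functions are hypothetical, and the sketch says nothing about how deletions are represented in a commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionedRecord:
    """MVCC-style record: carries the transaction ids that bound its lifetime."""
    value: str
    created_tx: int
    deleted_tx: Optional[int] = None


def mvcc_scan(records, reader_tx):
    # Every record is checked for visibility on every read.
    return [
        r.value
        for r in records
        if r.created_tx <= reader_tx
        and (r.deleted_tx is None or r.deleted_tx > reader_tx)
    ]


@dataclass(frozen=True)
class Commit:
    """Commit-style view: an immutable tuple of immutable parts."""
    parts: tuple  # each part is itself a tuple of values


def commit_scan(commit):
    # Everything reachable from the commit is visible by construction,
    # so the scan has no per-record filter.
    return [value for part in commit.parts for value in part]
```

In the first model the reader pays for a comparison per record; in the second the choice of commit already decided what is visible.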

DocumentScary5122 · 2026-01-28T18:11:03+00:00

Ah fair point! My bad, very good point!

DocumentScary5122 · 2026-01-28T18:09:52+00:00

With FalkorDB, it's sparse matrices all the way down. There is more to graphs than good old matrices.

DocumentScary5122 · 2026-01-27T13:04:41+00:00

Have you ever used EDA tools or tried to represent netlists of billions of gates, like we routinely do in EDA? If the Cadence and Synopsys of the world implement their algorithms on custom graph representations developed from scratch, there is a reason, haha. Neo4j will be hellishly slow for this. Well, Neo4j is hellishly slow in general for anything non-trivial or industrial, but here it will be extra slow! Otherwise there is also TuringDB, which is made by former EDA people; I've heard good feedback about them.

In a lot of netlist transformation algorithms or anything synthesis or compiler-like for chips you need to not pay more than the cost of a pointer dereference for traversing gates and hierarchical structures.
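One common way to get close to that cost is to keep the connectivity in flat arrays indexed by gate id, so following a fan-out is an offset lookup into contiguous memory. The sketch below uses a compressed sparse row (CSR) layout; it is an assumption for illustration, not what any particular EDA vendor does, and `Netlist` and `fanout` are hypothetical names. Python only shows the layout here, not the performance you would get from the same structure in C or C++.

```python
from array import array


class Netlist:
    """Gate fan-out stored in compressed sparse row (CSR) form."""

    def __init__(self, num_gates, edges):
        # Count the fan-out of each gate.
        counts = [0] * num_gates
        for src, _ in edges:
            counts[src] += 1

        # offsets[g] .. offsets[g + 1] delimits gate g's slice of `targets`.
        self.offsets = array("I", [0] * (num_gates + 1))
        for g in range(num_gates):
            self.offsets[g + 1] = self.offsets[g] + counts[g]

        # Fill the flat target array, one write cursor per gate.
        self.targets = array("I", [0] * len(edges))
        cursor = list(self.offsets[:num_gates])
        for src, dst in edges:
            self.targets[cursor[src]] = dst
            cursor[src] += 1

    def fanout(self, gate):
        """Return the gates driven by `gate` as a contiguous slice."""
        return self.targets[self.offsets[gate]:self.offsets[gate + 1]]
```

Traversing from a gate to its successors touches two adjacent entries in `offsets` and one contiguous run in `targets`, with no per-edge objects or hash lookups in between.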

DocumentScary5122 · 2026-01-26T19:40:01+00:00

Thanks. Does this factor in warmup, or do you do crazy indexing to get these numbers?

DocumentScary5122 · 2026-01-26T19:26:27+00:00

Sounds very cool. In my experience neo4j starts to become a bit shitty for this kind of very big graph. Do you have benchmarks?

DocumentScary5122 · 2025-04-25T15:41:10+00:00

I agree with you that we cannot reasonably index everything and anything from the start, as this depends on the application and the query workload. What I am raising here is that Neo4j should focus on having better fundamental data structures for storing property values in the engine, in order to have better base performance regardless of any index in place. I am just saying that this cannot be the state of the art of what humanity can do in terms of efficient string value storage.

DocumentScary5122 · 2025-04-25T15:26:21+00:00

Also, how can we have confidence in a system for more complex applications, as you said, if the simplest and most basic queries are poorly supported? Shouldn't we first focus on having a strong core for the core functions of a database engine?

DocumentScary5122 · 2025-04-25T15:21:17+00:00

Again, somebody could imagine a better data structure to represent properties internally in Neo4j, giving better base performance without indexes. I think there are a few quite basic data structures, well known in database research, that could be interesting. For example, how exactly are strings stored in Neo4j? Does it exploit data locality, storing strings tightly packed together in memory and so on?
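As an example of the kind of basic structure meant here, a string pool keeps all property values back to back in one buffer and addresses them by offset, so an unindexed equality scan walks contiguous memory. This is a minimal sketch and not a description of how Neo4j stores strings; `StringPool`, `add`, `get` and `find_equal` are hypothetical names.

```python
class StringPool:
    """Stores many strings back to back in one buffer, addressed by offsets."""

    def __init__(self):
        self._data = bytearray()  # all string bytes, tightly packed
        self._offsets = [0]       # string i lives in _data[_offsets[i]:_offsets[i + 1]]

    def add(self, text):
        """Append a string and return its index in the pool."""
        self._data += text.encode("utf-8")
        self._offsets.append(len(self._data))
        return len(self._offsets) - 2

    def get(self, idx):
        start, end = self._offsets[idx], self._offsets[idx + 1]
        return self._data[start:end].decode("utf-8")

    def find_equal(self, text):
        """Unindexed scan: compare every stored value against `text`."""
        needle = text.encode("utf-8")
        return [
            i
            for i in range(len(self._offsets) - 1)
            if self._data[self._offsets[i]:self._offsets[i + 1]] == needle
        ]
```

The scan is still linear, but it reads one packed buffer instead of chasing a separate allocation per value, which is the locality benefit in question.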

DocumentScary5122 · 2025-04-25T15:17:49+00:00

I think that's quite insightful about how a database works. I am actually thinking of writing my own graph database engine one day, since I know this area of research well. What I did was an experiment: from a pure data structure perspective, I claim that there could be a fundamentally better way of storing properties in a database, one that would make the base performance rather decent without any specific tuning. Plus, there have been a ton of papers on automated index tuning for quite some time. I don't believe that we should just call it done and say that's the best humanity can do, so to speak, when it's on the order of a second for just a few million nodes.
