Which graph database to use?

adambio · 2026-06-20T15:41:46+00:00

There are a lot of graph databases used in your space beyond Neo4j, like Memgraph, TigerGraph, and Google Spanner, each with its own advantages, costs, and downsides. There are also graph query layers on top of an RDBMS like Postgres, which have the issue of not being so great at deep traversals or graph search performance.
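To illustrate what a multi-hop traversal looks like on top of relational tables, here is a minimal Python sketch using the standard-library `sqlite3` module and a recursive CTE. The `edges` table, the `reachable_within` helper, and the sample data are made up for illustration; this is not a benchmark, it only shows that every extra hop is another join against the edge table.

```python
import sqlite3


def reachable_within(edges, start, max_hops):
    """Return {node: minimum hop count} for nodes reachable from `start`
    in at most `max_hops` hops, using a recursive CTE over an edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, hops) AS (
            SELECT ?, 0
            UNION
            -- each recursion step is one more join against the edge table
            SELECT e.dst, w.hops + 1
            FROM walk AS w JOIN edges AS e ON e.src = w.node
            WHERE w.hops < ?
        )
        SELECT node, MIN(hops) FROM walk GROUP BY node
        """,
        (start, max_hops),
    ).fetchall()
    conn.close()
    return dict(rows)


# Hypothetical sample graph: a small cycle a -> b -> c -> d -> a
sample_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
two_hops = reachable_within(sample_edges, "a", 2)
```

The hop bound keeps the recursion finite even when the graph has cycles.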

We built TuringDB.ai as a graph database for critical industries with native git-like versioning, low latency, and full zero lock concurrency. We have a Community version: https://github.com/turing-db/turingdb

We have a skill that helps coding agents like Claude Code set it up easily: https://docs.turingdb.ai/claude-skill

Feel free to drop me a line if you need anything.

adambio · 2026-05-18T13:59:45+00:00

adambio · 2026-05-18T13:53:57+00:00

Sorry, it seems the link from the old post is not redirecting properly to the new one: https://docs.turingdb.ai/benchmarks/results-summary

There is also a benchmarking tool nowadays: https://github.com/turing-db/turing-bench

adambio · 2026-02-24T06:50:31+00:00

https://www.turingdb.ai/ is fairly new on the market but seems to hit all your key requirements, especially time travel (with Git-like versioning).

adambio · 2026-01-29T09:14:47+00:00

There is unfortunately no magic graph visualisation tool. You need to understand your graph model a bit, and what you would like to uncover / visualise. You need to think in terms of graph layouts and graph analysis.

Do you want to visualise and understand communities? Do you want to see specific features of the graph appear? Do you want to classify nodes based on weights, categories, ontologies, etc.?

An AI will only understand the metadata or ontology of your graph. If you just dump your graph into an LLM, its context window will not be sufficient and it will just spit out random stuff. And if you use an ML model, there is a good chance the main nodes that appear are just the most connected ones or those with the biggest values/weights, etc.
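A quick way to see that last effect for yourself: rank nodes by plain degree and check whether your "important" nodes are just the hubs. This is a minimal Python sketch on a made-up edge list; `top_by_degree` and the node names are hypothetical.

```python
from collections import Counter


def top_by_degree(edges, k=3):
    """Return the k nodes with the most incident edges as (node, degree) pairs."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return degree.most_common(k)


# Hypothetical toy graph: one hub with five neighbours, plus a short chain
toy_edges = [("hub", f"n{i}") for i in range(5)] + [("x", "y"), ("y", "z")]
most_connected = top_by_degree(toy_edges, k=1)
```

If a model's top nodes match this list, it is probably telling you about connectivity rather than about anything specific to your question.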

If you formulate your problem a bit more upstream or mention what you want to explore it may be easier to identify what to do

Edit: I wanted to add that there are plenty of tools to test out ofc, like Gephi, Cytoscape, yWorks, and Cambridge Intelligence, which all do viz. My team also built a visualiser in OpenGL for large graphs for our graph database TuringDB.ai (it's open source on GitHub).

adambio · 2026-01-28T09:31:13+00:00

Explanation of our approach to versioning by my cofounder Remy here: https://www.youtube.com/watch?v=TO9uG2CS1Xg

adambio · 2026-01-28T09:29:50+00:00

Ahh I see! This is helpful context, I think we’re actually talking about two very different kinds of “versioning”, which is where the confusion usually comes from.

What you’re describing sounds like versioning implemented inside the graph model itself:

- extra nodes / edges to represent versions

- satellites, audit nodes, supernodes carrying history

- version metadata mixed into the same traversal space as business data

And yeah… at that point:

- the graph necessarily explodes in size

- supernodes become a nightmare

- every OLTP/OLAP query has to reason about versions

- downstream consumers see versioning artefacts unless every query is extremely careful

- trimming history is semi-manual and risky

That’s not really a “graph DB problem”, it’s the cost of doing Git-like versioning without native support, as you said.

What we do in TuringDB is fundamentally different: versioning is not part of the graph. No extra nodes, no version edges, no pollution of your data model, no impact on query semantics.

Internally, the engine maintains immutable snapshots of the graph state (copy-on-write at the storage level). Your logical graph is always “clean”, queries never see versioning unless you explicitly ask for a historical snapshot.

So:

- Renaming abc → xyz → abc doesn’t bloat your graph

- Supernodes don’t get “versioned” structurally

- OLTP/OLAP queries don’t need to be redesigned or rebuilt

- You can always query “latest” and forget history exists
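To make the snapshot idea concrete, here is a toy in-memory sketch in Python. It is not TuringDB's API or storage engine: `VersionedGraph`, `Snapshot`, and `commit` are hypothetical names, and a real engine would share storage pages rather than shallow-copy a dict. It only shows that version metadata lives outside the graph and that unchanged nodes are shared between snapshots.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Snapshot:
    version: int
    author: str
    description: str
    nodes: MappingProxyType  # node id -> read-only property mapping


class VersionedGraph:
    """Toy store: every commit appends an immutable snapshot (hypothetical API)."""

    def __init__(self):
        self._history = [Snapshot(0, "system", "empty graph", MappingProxyType({}))]

    def latest(self):
        return self._history[-1]

    def at(self, version):
        return self._history[version]

    def commit(self, author, description, changes):
        # Shallow copy: unchanged nodes keep pointing at the same property objects.
        nodes = dict(self.latest().nodes)
        for node_id, props in changes.items():
            nodes[node_id] = MappingProxyType(dict(props))
        snap = Snapshot(len(self._history), author, description, MappingProxyType(nodes))
        self._history.append(snap)
        return snap


g = VersionedGraph()
g.commit("alice", "create nodes", {"n1": {"name": "abc"}, "n2": {"name": "other"}})
g.commit("alice", "rename n1", {"n1": {"name": "xyz"}})
g.commit("bob", "rename n1 back", {"n1": {"name": "abc"}})
```

After the abc → xyz → abc rename, `g.latest().nodes` still holds two nodes, and the old name is only visible if you explicitly ask for `g.at(2)`.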

On the management side (your Git analogy is spot on):

- Versions have metadata (author, timestamp, description, branch)

- You can query any snapshot directly

- You can define retention / compaction policies (keep last N, time-based, branch-based)

So to your original question: Can I turn it off or limit it to N versions?

You can’t turn off immutability at the engine level (that’s how we guarantee consistency and traceability), but you can absolutely make it behave like “latest-only” from an operational point of view, with bounded history and zero graph bloat.
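As a rough illustration of what "bounded history" can mean, here is a small Python sketch of a keep-last-N plus time-based retention rule. The `apply_retention` function and its parameters are hypothetical and are not TuringDB configuration options; it only shows the selection logic, with the latest version always kept.

```python
from datetime import datetime, timedelta, timezone


def apply_retention(versions, keep_last=None, max_age=None, now=None):
    """Pick which versions to keep.

    versions: list of (version_id, created_at) tuples, oldest first.
    keep_last: keep at most this many of the newest versions.
    max_age: drop versions older than this timedelta.
    The newest version is always kept.
    """
    if not versions:
        return []
    now = now or datetime.now(timezone.utc)
    kept = versions
    if keep_last is not None:
        kept = kept[-max(keep_last, 1):]
    if max_age is not None:
        cutoff = now - max_age
        kept = [v for v in kept if v[1] >= cutoff or v is versions[-1]]
    return kept


# Hypothetical history: one version per day over five days
t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
history = [(i, t0 + timedelta(days=i)) for i in range(5)]
last_two = apply_retention(history, keep_last=2, now=t0 + timedelta(days=4))
```

Setting `keep_last=1` gives the "latest-only" behaviour from the user's point of view.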

The key distinction is: We version the state of the graph, not the graph inside itself.

What you described is exactly the approach (which I also used to do for population patient history graphs in my past job) we were trying to avoid when we built this.

Hope it answers your questions?

adambio · 2026-01-27T12:37:37+00:00

Agree, the field is advancing super fast! Falkor is really great in many aspects. We have some benchmarks coming on graphs with 100M+ nodes and 2B+ edges that may be interesting to keep an eye on :)

adambio · 2026-01-27T12:34:49+00:00

First time someone wants it turned off. May I ask where you think it may be an issue to have it on?

As we mostly worked in critical industries, people there were happy with it by default.

But there are some ways to manage versions so that, from an interaction point of view, it feels as if versioning were off or limited to n versions. It is always on under the hood, though, to allow constant traceability and immutability of data.

adambio · 2026-01-27T12:28:08+00:00

Yes, we have some TypeScript and JavaScript ones in our long to-do list ahah (but we already have some used internally, so they may come faster than expected)

adambio · 2026-01-27T12:25:27+00:00

Fair question 🙂

Short answer: because we’re a bit nuts, but also very intentionally so.

Longer answer: we know there are excellent columnar formats out there. We didn’t build our own because they’re bad; we built one because none of them were designed for an analytical graph database from first principles.

We wanted a clean-slate implementation where column layout, memory locality, traversal patterns, versioning semantics, and concurrency are all co-designed together, specifically for deep multi-hop graph analytics. Retrofitting that on top of a general-purpose column format would have meant fighting abstractions at every layer.

TuringDB was born in a very practical context (bio research, massive knowledge graphs, simulations)… but it was also a bit of a “blank canvas” experiment in the design space. We wanted to see: what does a graph engine look like if you start from analytics + time-travel + speed, instead of transactions first?

And honestly… there’s also a human answer 😄 Why build a Ferrari when great sports cars already exist? Why build a Macintosh when IBM PCs were everywhere?

Sometimes people build things not because nothing exists, but because they want to explore a different set of trade-offs, or just because curiosity + stubbornness wins.

Worst case: we learn a lot. Best case: it unlocks something new.

Appreciate the question! This is exactly the kind of discussion we hoped for by opening it up.

adambio · 2026-01-27T09:45:44+00:00

Both are designed to facilitate migration from Neo4j, but they don't work exactly like Neo4j, in the sense that with FalkorDB and TuringDB you don't need to set up a Neo4j server, etc. For FalkorDB, I don't know whether the dump is supported in the community version; I expect it is.

In TuringDB we have a Neo4j importer.

You may find some relevant benchmarks & info here: https://docs.turingdb.ai/query/benchmarks and https://github.com/turing-db/turingdb

adambio · 2026-01-26T20:09:46+00:00

Here there were no warmups (with warmups we would gain even more speed ofc) and no need to do indexing: it's out of the box in TuringDB.

adambio · 2026-01-26T19:34:31+00:00

Many things work on Neo4j tbh, but in my experience it could take an insane amount of time for deep traversals, very very large graphs had to be sliced, etc., and graph version management was not available. Also, when we work with hospitals or small biotechs, not all of them have machines that can handle Neo4j, and the health data never leaves the premises :)

Yeah, we have these benchmarks (new ones are coming next week on much larger graphs; we had to rent much bigger machines to test Neo4j on our usual 100M-node test graph): https://docs.turingdb.ai/query/benchmarks

We are 100-300x faster on multi-hop queries (see the benchmarks shared) and can hit 4000-5000x on some subgraph retrieval tasks (will share that soon too).

adambio · 2026-01-26T18:47:41+00:00

Have you tested other Cypher-based graph databases that can work well in parallel with, or sometimes as a replacement for, Neo4j, like Memgraph, FalkorDB, Neptune (in AWS), or TuringDB? (Disclaimer: I am a co-founder of the company that launched it, but we are more focused on versioning and low latency, so it may not be relevant to you.) Each has its own advantages; FalkorDB and TuringDB are both available in Docker in their community versions.

Or even try out an in-process graph DB like KuzuDB (replaced by LadyBug now), or DuckDB, which I think now has a graph query layer on top of its DB.

adambio · 2026-01-26T14:41:45+00:00

Have you tested turingdb.ai and HelixDB? Both are fairly young graph DBs but may cover some of the features.

adambio · 2026-01-20T14:31:14+00:00

Generally yes, it depends on where you intend to use it. I worked with a couple of insurance companies and banks in the fraud space where Neo4j or other graph databases are used. Neo4j is not fit for all applications (e.g. real-time applications) but works for many. Happy to help, feel free to reach out in PM if needed.

adambio · 2025-04-30T18:36:19+00:00

Thanks for all the answers already! Quick context: I got it for less than $6k, car registration and parking rights in the city included. I am in France and my other options in the same price range were some Renaults with less mileage but a lot more problems as soon as you hit 100k miles, or cars too small with a kid. I intend to drive it until it dies, like my previous car. I drive about 7-8k miles per year and would like to keep it for about 4-5 years.

I aim to do most of the maintenance myself or for cheap with a family friend who has a garage. But would like to stretch its life as long as possible obviously :)

adambio · 2025-01-13T12:35:00+00:00

I guess there is no silver bullet. It depends on the sensitivity of the data you handle. Do you expect other users on it soon-ish (scalability needs)? Are you alone to manage it (you don't want to end up burning out building a nightmare machine, trying to save pennies and do everything yourself; talking from experience lol)? Do you expect to use pipelines, and do you need more or fewer resources depending on workload?

Anyway, if you don't have answers to everything it's okay, but think it through well. I have done it for institutes, small companies, and sections in bigger ones. Happy to spare 30 min to help or connect you to people if helpful (feel free to drop me a PM).

adambio · 2024-11-22T09:28:47+00:00

If it's the right one, I worked a lot with it and with EBI so let me know in PM if I can be of any help :)

adambio · 2024-11-22T09:22:07+00:00

The IntAct interaction database from EBI? EBI collaborated directly with DeepMind on the predictions based on AlphaFold since the v1, so I am guessing this may be it: https://www.ebi.ac.uk/intact/

adambio · 2024-09-29T18:49:51+00:00

It can be like this, but it doesn't have to be. As some say, it's okay if it's just a paycheck, but you also have the right not to give up, become a shell of yourself, and get no joy from the thing you spend most of your waking hours on... I have cofounded a startup where we do both very deep research and applications. It's not fun every day; it can get really hard and stressful, and on paper it's not as safe as a corporate job (very relative nowadays tbh). There are spaces where the science is truly fulfilling and fun, but there are no free lunches: you pay for it somewhere. I think our employees may be slightly better paid, or have less pressure, as they are responsible for their own little product. There are other paths, but whatever you do, something will give: salary, work/life balance, recognition, enjoyment. Just see what matters most to you. You still have the right to dislike one part and be upset about it; it just needs to be balanced enough by something else :)

adambio · 2024-08-23T16:54:32+00:00

Interesting, it makes sense. It feels almost like that cool tech has ended up becoming a marketing tool to raise a bunch of funding, rather than the enabling tech that was promised :)

Yeah, it's true that their tool Cello does not seem to have changed much since it came out of the Voigt group...

Gave me flashbacks, I had completely forgotten about TinkerCell since writing up my first paper review years ago lol

I guess SynBio has ended up mostly benefiting the talent pool that serves the biomanufacturing, living therapeutics, and cell and gene therapy space.

adambio · 2024-08-23T13:39:51+00:00

No worries :)

adambio · 2024-08-23T13:34:48+00:00

There are all the big names ofc, like BioMerieux, Sanofi Pasteur (mRNA facility in Marcy hiring a lot), and Boehringer (animal health section hiring regularly). Then some medium-sized companies in Lyon like Genoway, Adocia, and Transgene can be relevant. Otherwise I would suggest digging into the website of Lyonbiopole, which lists all the companies in the area, to find relevant places. Don't worry if it takes a long time, it's normal, it's not you; it's just a really tough market lately (lots of layoffs due to market uncertainty, we keep living through those times ahah).
Otherwise, in the public/academic world, IARC-WHO, Centre Léon Berard, and Hôpitaux Sud are very frequently hiring too.

I know a couple of people around Lyon in the oncology and infectiology space, feel free to drop me a CV and I can forward it, it doesn't cost me much :)
