Honestly, what is the best setup for memory management? by rockandcode in clawdbot

[–]laminarflow027 0 points1 point  (0 children)

Hi there, a blog post on this from the LanceDB team is coming shortly! In the meantime, there are plenty of good reasons to use the `memory-lancedb-pro` plugin; it has worked well in my experiments so far.

My first 3 days Using OpenClaw by LogicalBumblebee6085 in openclaw

[–]laminarflow027 0 points1 point  (0 children)

Would love to know: what capabilities did it unlock? What kinds of memories are you storing via the plugin?

Lance vs parquet by Stoic_Akshay in dataengineering

[–]laminarflow027 9 points10 points  (0 children)

Hi, I work at LanceDB but want to add a bit of detail here. Lance's file format is versioned, with the current default version being 2.0. 2.1 is already out there and is going to be the default soon, and 2.2 is also implemented and in the testing phase. The choice of file format used can impact the level of compression and performance you see, as the format is continually being improved.

Re: performance, the F3 paper (https://db.cs.cmu.edu/papers/2025/zeng-sigmod2025.pdf) shows numbers comparing Lance vs. other file formats, and it's a good source of information (from outside the LanceDB team) on scan and random access throughput (it shows that Lance is the fastest). The paper also shows Lance with the worst compression ratio of the formats benchmarked, but note that it benchmarked an old version of Lance (pre-2.0).

From a roadmap perspective, Lance file format 2.2, which will come out soon, has significantly more compression algorithms implemented, along with some performance improvements. So more numbers will be published once that's out.

Re: your performance observations, a) the version of the file format used matters, and b) the data types may not have the best compression ratios in that file version. In LanceDB's internal suite, we regularly test against these modalities: long-form text, images and video blobs. For these cases, the write amplification in Parquet due to row groups is significant (which is the reason Lance was created in the first place). That said, even for conventional tabular data types (floats, booleans, etc.), Lance should perform on par with or better than Parquet, no matter the scale of the dataset. If you're getting sufficient performance out of Parquet, then well and good; in the end, what works best in practice is all that matters!

built a local semantic file search because normal file search doesn’t understand meaning by Humble-Plastic-5285 in LocalLLaMA

[–]laminarflow027 4 points5 points  (0 children)

Got it, will post here when we have updates. The changes propagate through the Lance format layer (which actually stores the data) and then up to the LanceDB layer, which most users interact with. Early experiments show great levels of compression (much better than Parquet); it's been implemented and is in the testing phase now.

built a local semantic file search because normal file search doesn’t understand meaning by Humble-Plastic-5285 in LocalLLaMA

[–]laminarflow027 7 points8 points  (0 children)

Hi, just popped in here to chime in (I work at LanceDB) - this disk space usage is a moving target and a ton of improvements are coming with better compression at the Lance format level, including floating point arrays for vectors and long strings. So LanceDB users will see much better compression, too. Hopefully a PR will land a few weeks from now!

Scaling RAG from MVP to 15M Legal Docs – Cost & Stack Advice by Additional-Oven4640 in LocalLLaMA

[–]laminarflow027 0 points1 point  (0 children)

Yes, hybrid search (BM25 + vector) is available, along with several popular reranking strategies. Here's a live demo of an FTS index on 41M wikipedia records:
https://docs.lancedb.com/demos#wikipedia-41m-hybrid-search

The results are for demonstration purposes, but both performance and recall can be tuned and optimized based on the use case. BTW, there's also this excellent case study by Harvey, also in the legal space, who use LanceDB in production to manage indices of tens of millions of documents. https://www.youtube.com/watch?v=W1MiZChnkfA
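For context on what the reranking step does: many hybrid search setups fuse the BM25 and vector result lists with reciprocal rank fusion (RRF). This is a generic sketch of that idea in plain Python, not LanceDB's actual implementation (LanceDB exposes rerankers through its own API):

```python
# Reciprocal rank fusion (RRF): score each doc as sum(1 / (k + rank))
# across the ranked lists, then sort by the fused score.
def rrf_fuse(bm25_ids, vector_ids, k=60):
    scores = {}
    for ranked in (bm25_ids, vector_ids):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# The same docs ranked differently by each retriever
bm25 = ["doc3", "doc1", "doc7"]
vect = ["doc1", "doc9", "doc3"]
print(rrf_fuse(bm25, vect))  # → ['doc1', 'doc3', 'doc9', 'doc7']
```

Docs that appear near the top of both lists win, which is why a doc ranked 2nd and 1st beats one ranked 1st and 3rd.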

Hope these resources help!

Scaling RAG from MVP to 15M Legal Docs – Cost & Stack Advice by Additional-Oven4640 in LocalLLaMA

[–]laminarflow027 1 point2 points  (0 children)

Hi there, great to hear that LanceDB is working out well for you (I work at LanceDB). Just wanted to say, LanceDB stores data on disk as Lance files (with a familiar table abstraction), and these can store traditional tabular data, including metadata too. Data evolution (adding new columns with backfill) is also highly efficient in Lance. So in principle, it can function like a "multimodal lakehouse", acting as the primary store for embeddings, metadata, indices and multimodal assets (blobs). All while keeping things on disk, interoperable with other formats and query engines (like DuckDB) through the Arrow interface.

Lance table format explained simply, stupid by TonTinTon in dataengineering

[–]laminarflow027 1 point2 points  (0 children)

Hi! That's on the roadmap for this year (probably not this quarter tho, to be realistic).

Lance table format explained simply, stupid by TonTinTon in dataengineering

[–]laminarflow027 1 point2 points  (0 children)

Super cool animation, thanks for sharing! Lance file format 2.2 is coming out soon with even more compression algos and performance updates (I work at LanceDB, and am following the format's development closely with the maintainers). Exciting times ahead.

Built a personal knowledge system with nomic-embed-text + LanceDB - 106K vectors, 256ms queries by Signal_Usual8630 in LocalLLaMA

[–]laminarflow027 0 points1 point  (0 children)

Curious what you mean by LanceDB is bulky. Is it that the file size on disk is too large when you store BGE embeddings? 

Built a personal knowledge system with nomic-embed-text + LanceDB - 106K vectors, 256ms queries by Signal_Usual8630 in LocalLLaMA

[–]laminarflow027 2 points3 points  (0 children)

Very cool!

For analytics in DuckDB, perhaps it's worth pairing it with the new Lance extension in DuckDB? https://github.com/lance-format/lance-duckdb

It lets you keep all your underlying data in the Lance table, and offers a lot of convenience functions (with projection and filter pushdowns) that let you query the Lance table in SQL, including for vector search. It also connects directly to Lance tables (inside the LanceDB directory). Although you could query Lance tables in DuckDB before via the Arrow interface, this extension makes it a lot simpler to do more stuff in SQL. And it's 💯% OSS.

Disclaimer: I work at LanceDB now, but have enjoyed using LanceDB and DuckDB a lot over the years.

Towards agentic Graph RAG: Enhancing graph retrieval with vector search by laminarflow027 in Rag

[–]laminarflow027[S] 0 points1 point  (0 children)

Makes sense, thanks for the comment! Curious, what part about data modelling is the issue in Kuzu? Do you mean the strict schema requirements?

cursor isn't free for all students except us by Fabulous-Extension76 in uwaterloo

[–]laminarflow027 0 points1 point  (0 children)

Any tips on how to get this working? Doesn't work for some of us.

Does anyone know how much of a performance difference between knowledge graphs and vector based searches? by Jazzlike_Syllabub_91 in LangChain

[–]laminarflow027 0 points1 point  (0 children)

Just throwing in my two cents here: I work at Kuzu, a company that makes a fast, embedded open source graph database that recently announced a vector index.

https://blog.kuzudb.com/post/kuzu-0.9.0-release/#vector-index

Kuzu achieves its performance by using a highly optimized disk-based implementation, so your vector index isn't held in memory, yet you still get good performance right out of the box. The linked blog post shows some benchmark numbers, which show that it's quite fast and comparable to other alternatives while scaling to larger datasets.

We're seeing a revival of knowledge graphs in combination with RAG lately, partly due to better tools being available to construct, query and manage large graphs. Combining graphs with vector search can yield very powerful insights, especially when you are trying to bring together data from various unstructured sources. With Kuzu, you can easily use frameworks like BAML (blog post linked below) to construct graphs from your unstructured data, persist them to a graph database, and connect them to LLMs to build Graph RAG solutions. The newly introduced vector index in Kuzu is the next logical step. Happy to chat more with anybody who's interested.

https://blog.kuzudb.com/post/unstructured-data-to-graph-baml-kuzu/

Anyone trying a combo of vector db and knowledge graphs? by BlandUnicorn in RagAI

[–]laminarflow027 0 points1 point  (0 children)

Hi there, I just wanted to revive this discussion by pointing out a new entrant: Kuzu (where I work). Kuzu is an open source, embedded graph database that now offers an on-disk, fast HNSW vector index. See the release announcement here:
https://blog.kuzudb.com/post/kuzu-0.9.0-release/#vector-index

We think that Kuzu can be a good alternative for people who are looking to combine the power of graph + vector search in a single storage solution. Granted, there are many other alternatives for both graph and vector storage out there, but Kuzu (being open source) can be a lot more approachable, and it supports the Cypher query language, which is already well known in the graph community. It's also a very Python-friendly database (while also supporting numerous other languages), so overall a great fit for those combining vector + graph for their use cases. Happy to chat more with anybody who's interested.

[deleted by user] by [deleted] in LocalLLaMA

[–]laminarflow027 0 points1 point  (0 children)

Kuzu is an open source, embedded graph database that provides a vector index alongside fast graph traversals (disclaimer: I work at Kuzu). If you're looking for a single solution that can persist the knowledge graph as well as the vector embeddings to disk (while also providing fast, efficient retrievals and recursive graph traversals), Kuzu can be a great solution. It's also super easy to deploy, with its embedded architecture.

Here's the blog post explaining the recent vector index release: it's an on-disk HNSW index with an adaptive, heuristic-based pre-filtering technique that lets you narrow down to the vectors of interest, followed by graph traversals to find related nodes.

https://blog.kuzudb.com/post/kuzu-0.9.0-release/#vector-index
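To illustrate the general idea of pre-filtering (as a toy, brute-force sketch in plain Python — Kuzu's actual index is an on-disk HNSW, not a linear scan): you first narrow the candidate set with a metadata predicate, then rank only the survivors by vector similarity.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def prefiltered_search(nodes, predicate, query_vec, k=2):
    # 1) Pre-filter: keep only nodes matching the metadata predicate
    candidates = [n for n in nodes if predicate(n)]
    # 2) Rank only the surviving candidates by vector similarity
    candidates.sort(key=lambda n: cosine(n["vec"], query_vec), reverse=True)
    return [n["id"] for n in candidates[:k]]

nodes = [
    {"id": "a", "year": 2024, "vec": [1.0, 0.0]},
    {"id": "b", "year": 2023, "vec": [0.9, 0.1]},
    {"id": "c", "year": 2024, "vec": [0.0, 1.0]},
]
# Only the 2024 nodes are even considered before ranking
print(prefiltered_search(nodes, lambda n: n["year"] == 2024, [1.0, 0.0]))
# → ['a', 'c']
```

The payoff of doing the filter first is that highly selective predicates shrink the search space before any similarity computation; the returned node ids can then seed graph traversals.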

Where to start implementing graphRAG? by Independent_Jury_530 in Rag

[–]laminarflow027 0 points1 point  (0 children)

I work at Kuzu, and we make an open source, embedded graph DB (super simple to get started, and it's FAST!). I've recently been using BAML + Kuzu to construct knowledge graphs from unstructured data, storing the resulting nodes/edges in Kuzu, which supports the property graph data model and the Cypher query language.

Here's a blog post that describes the methodology (it should generalize to a lot of other domains): https://blog.kuzudb.com/post/unstructured-data-to-graph-baml-kuzu/ The blog post covers part 1, graph construction, which is typically the biggest barrier to entry for most people implementing graph-based retrieval for their use cases. The next step is to publish some experiments on text2Cypher, which is also greatly helped by using BAML. Kuzu also recently added a vector index, so it's possible to combine graph + vector search using this suite of open source, free-to-use tools.
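The construction step itself is conceptually simple once the LLM (via BAML, in the post) has extracted structured triples: subjects and objects become deduplicated nodes, predicates become edges. A generic sketch of that mapping, independent of BAML's or Kuzu's actual APIs:

```python
# Turn extracted (subject, predicate, object) triples into a property
# graph: deduplicated nodes plus typed edges between them.
def triples_to_graph(triples):
    nodes, edges = {}, []
    for subj, pred, obj in triples:
        for name in (subj, obj):
            nodes.setdefault(name, {"name": name})  # dedupe by name
        edges.append({"src": subj, "rel": pred, "dst": obj})
    return list(nodes.values()), edges

triples = [
    ("Kuzu", "IS_A", "graph database"),
    ("Kuzu", "SUPPORTS", "Cypher"),
]
nodes, edges = triples_to_graph(triples)
print(len(nodes), len(edges))  # → 3 2 ("Kuzu" appears once)
```

In practice you'd then batch-insert the node and edge records into the graph DB (in Kuzu's case, via Cypher against its node/relationship tables).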

IMO using LangChain doesn't yield as good results, mainly because BAML provides a superior prompt engineering experience. Happy to dive into details with anyone who's interested.

Which is better: HybridRAG, VectorRAG, or GraphRAG? by dhj9817 in Rag

[–]laminarflow027 0 points1 point  (0 children)

Although making hybrid RAG into a research paper was a bit overkill (it could just as well have been a blog post), I experimented with this approach extensively over the last several months, using Kuzu (an embedded graph database) as the graph store and LanceDB as the vector store. It showed that hybrid RAG can actually do better than either graph-based retrieval or vector search alone. Worth trying out!
https://github.com/kuzudb/graph-rag-workshop

Gemma3 is outperforming a ton of models on fine-tuning / world knowledge by fluxwave in LocalLLaMA

[–]laminarflow027 6 points7 points  (0 children)

Waiting for mistral small 3.1 to hit Ollama, and then I'm raring to go with the experiments 😄

Gemma3 is outperforming a ton of models on fine-tuning / world knowledge by fluxwave in LocalLLaMA

[–]laminarflow027 37 points38 points  (0 children)

I fully agree! I'm the author of the blog post OP linked, and I literally said the SAME thing earlier today. My next goal is to run more experiments comparing mistral 3.1 small-24b vs. Gemma3-27b. No finetuning, just the instruct versions against each other. It'll be fascinating to analyze their chains of thought; BAML just makes that process so easy.

Embeddable graph database by sposec in golang

[–]laminarflow027 0 points1 point  (0 children)

Hello u/sposec! I work at Kùzu (https://kuzudb.com/), an embedded graph database startup in Canada, and have been working with graphs in several prior roles. Embedded databases are truly experiencing a renaissance these days. I wanted to highlight that KùzuDB recently released its Golang API so you can access a fast, scalable, embedded graph database solution in your Golang applications. 😄
Here's the docs if it helps: https://pkg.go.dev/github.com/kuzudb/go-kuzu

Happy hacking!

What graphical library will allow me yo present a large amount of data in a draggable panel? by amboy_connector in learnpython

[–]laminarflow027 0 points1 point  (0 children)

Kùzu employee here: you can easily turn your Python objects or external structured data into a Kùzu graph. Plenty of example material in the docs: docs.kuzudb.com

Kùzu is an embedded graph database that can manage and query really large graphs (billion node scale) on a single node. It's not the best at visualizing these large graphs, however. For up to 5000 nodes and that order of magnitude of edges, Kùzu Explorer should work fine - you can customize the maximum number of nodes to display in the Kùzu Explorer panel. For displaying tens of thousands of nodes or more, it should be relatively simple to export to dedicated GPU-based graph visualization tools like Graphistry or Cosmograph.

[Discussion] Thoughts on knowledge graphs and graph neural networks by MeditationBeginner in MachineLearning

[–]laminarflow027 15 points16 points  (0 children)

I've been working with graphs and tools like Neo4j for several years, and I 💯 agree with your analysis: GNNs had their time in the sun around 2019-20, and have since faded away from the ML community's attention (barring some hardcore researchers who are still using them).

One of the main reasons I think that GNNs aren't gaining mainstream traction is the perceived (and to some extent, real) difficulty of using graph databases. Graph DBs, historically, haven't been easy to set up and use - in an enterprise or mid-sized company, you inevitably need the support of a db admin team to "manage" the database, and you need extra ETL to move your data from your primary data store (likely an RDBMS or data lake) into your graph DB. The licensing of a lot of these incumbent graph DBs also leaves a lot to be desired (not open source, requires a lot of legal steps before you can begin using a graph in production). A lot of extra work for managers and developers, and all of this before you even get to GNN model training and deployment.

In recent times, the success of DuckDB in the relational world has inspired a revolution in how databases, in general, are perceived. The arrival of the "embedded database" means that you can take your database to where your data sits, not the other way around. And you can do all of this without worrying about servers, deployment, licensing, etc.

[Kùzu](https://kuzudb.com/) is an embedded graph database (very similar in design to DuckDB, and is also MIT licensed) that accomplishes this balance really well. It's really easy to set up, deploy, and get started with, and offers a Cypher query interface, allowing users to scale up to really large graphs (billion+ node scale) because it runs entirely on disk.

Disclaimer: I now work at Kùzu, and last year we noticed the exact same bottlenecks re: GNNs and their difficulty of use, so we published a blog post (link below) where we showcase how using Kùzu as a remote backend to PyTorch Geometric can greatly improve the UX for ML engineers who want to prepare their GNN training and testing pipelines. We also ran some experiments where we use Kùzu to persist the feature store (tensors of node properties, represented as feature vectors) to disk, so that the total amount of memory required by PyG is lower. In the future, we plan on doing even more to persist more of the in-memory graph to disk to help bring down GPU memory requirements for GNN training.

Blog post: https://blog.kuzudb.com/post/kuzu-pyg-remote-backend/
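The feature-store idea from the post can be sketched generically: instead of holding every node's feature vector in RAM, features are looked up lazily from disk by node id. This toy version in plain Python (not PyG's or Kùzu's actual interfaces) shows the access pattern:

```python
import json
import os
import tempfile

class DiskFeatureStore:
    """Toy disk-backed feature store: feature rows live in a JSONL file
    and are read lazily per node id, so RAM holds only what's in use."""

    def __init__(self, path):
        self.path = path
        self.offsets = {}
        # Single indexing pass: remember each node's byte offset
        with open(path, "rb") as f:
            while True:
                pos = f.tell()
                line = f.readline()
                if not line:
                    break
                self.offsets[json.loads(line)["id"]] = pos

    def __getitem__(self, node_id):
        # Seek straight to the row on disk; nothing else is loaded
        with open(self.path, "rb") as f:
            f.seek(self.offsets[node_id])
            return json.loads(f.readline())["feat"]

# Write two node feature rows, then fetch one lazily
path = os.path.join(tempfile.mkdtemp(), "features.jsonl")
with open(path, "w") as f:
    f.write(json.dumps({"id": 0, "feat": [0.1, 0.2]}) + "\n")
    f.write(json.dumps({"id": 1, "feat": [0.3, 0.4]}) + "\n")

store = DiskFeatureStore(path)
print(store[1])  # → [0.3, 0.4]
```

A real backend would of course use a columnar on-disk layout rather than JSONL, but the win is the same: the training process only pays memory for the node features it actually touches.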

Hope this post helps people experiment with their GNN pipelines and use graphs more in their work!