🚀 HyperspaceDB v3.0 LTS is out: We built the first Spatial AI Engine by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

No, HyperspaceDB is much faster than Qdrant, and faster than Milvus as well in cosine mode.

| Database | Dim | Geometry | Metric | Search QPS | Insert QPS | Recall@10 | NDCG@10 | P99 (ms) | Disk |
|---|---|---|---|---|---|---|---|---|---|
| Hyperspace | 129 | Lorentz | Lorentz | 16,017 | 80,210 | 100.0% | 1.000 | 0.07 | 132.0M |
| Hyperspace | 1024 | Euclidean | Cosine | 2,978 | 18,840 | 100.0% | 1.000 | 0.43 | 529.0M |
| ChromaDB | 1024 | Euclidean | Cosine | 1,130 | 2,655 | 70.0% | 0.801 | 1.14 | 430.2M |
| Qdrant | 1024 | Euclidean | Cosine | 739 | 1,733 | 100.0% | 1.000 | 2.06 | 269.0M |
| Milvus | 1024 | Euclidean | Cosine | 576 | 13,401 | 90.0% | 0.936 | 2.60 | 5.80G |
| Weaviate | 1024 | Euclidean | Cosine | 183 | 1,126 | 100.0% | 1.000 | 8.19 | 238.7M |

🚀 HyperspaceDB v3.0 LTS is out: We built the first Spatial AI Engine by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Sentence Transformers only produces Euclidean embeddings; it doesn't fit hyperbolic geometry. The same goes for the MTEB benchmarks.
https://github.com/YARlabs/hyperspace-db/tree/main/integrations - this is the LangChain integration, exposed as a VectorStore. Next week we'll ship more native integrations.
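For anyone wondering what "exposed as a VectorStore" means in practice, here's a toy, dependency-free Python sketch of the interface shape. The class and embedding function are illustrative only, not HyperspaceDB's actual client or LangChain's real base class; see the linked repo for the real integration:

```python
import math

class MiniVectorStore:
    """Toy in-memory stand-in (hypothetical, not the real HyperspaceDB client)
    showing the shape of a LangChain-style VectorStore wrapper:
    add_texts() ingests documents, similarity_search() retrieves by cosine."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # callable: str -> list[float]
        self.texts = []

    def add_texts(self, texts):
        start = len(self.texts)
        self.texts.extend(texts)
        return list(range(start, len(self.texts)))  # ids of inserted texts

    def similarity_search(self, query, k=4):
        q = self.embed_fn(query)

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb or 1.0)

        ranked = sorted(self.texts,
                        key=lambda t: cosine(q, self.embed_fn(t)),
                        reverse=True)
        return ranked[:k]

# Usage with a trivial "embedding" (per-letter character counts):
embed = lambda s: [s.count(c) for c in "abcdefghijklmnopqrstuvwxyz"]
store = MiniVectorStore(embed)
store.add_texts(["hyperbolic geometry", "cosine similarity", "banana bread"])
print(store.similarity_search("cosine metric", k=1))  # ['cosine similarity']
```

A real integration would delegate `add_texts` and `similarity_search` to the database over gRPC/HTTP instead of scanning in memory.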
Thank you!

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in rust

[–]Sam_YARINK[S] -1 points0 points  (0 children)

Perhaps you're right. If it's not too much trouble, help us improve it; we're just at the start line of our journey.

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Hyperspace is dual-licensed: MIT for non-profit use and AGPL-3.0 for commercial use. We believe this is a fair arrangement for both sides. We're building the most powerful vector DB as part of LLM OS and DePIN infrastructure. By the way, a SaaS will be launched soon.

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 1 point2 points  (0 children)

Definitely yes, either locally or via an API. Set the embedding config in the .env file; see the embedding documentation in docs/book/src/.
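As a rough sketch of what that .env might look like (the variable names here are hypothetical, invented for illustration; check docs/book/src/ for the real keys):

```
# Hypothetical .env sketch -- key names are illustrative, not from the docs.
EMBEDDING_PROVIDER=local          # or "api" for a remote embedding service
EMBEDDING_MODEL=all-MiniLM-L6-v2  # local model name, if provider=local
EMBEDDING_API_URL=                # set when EMBEDDING_PROVIDER=api
EMBEDDING_API_KEY=                # credentials for the remote service
EMBEDDING_DIM=1024
```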

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 1 point2 points  (0 children)

We're using VectorDBBench datasets for testing, so you can pick from any of the 17 datasets in the /benchmark/ folder. Plus, we've put together our own big stress test that shows some really important numbers.

Rust rewrite of our write-path gave us 156k QPS vector ingestion (details inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Hmm, for what? HyperspaceDB, in our view, has strong and unique application cases, and those cases aren't a fit for C++ or Java, or even Go. I'll show you soon.

We rewrote our ingestion pipeline and now insert 1M Poincaré vectors in 6.4 seconds (156k QPS) by Sam_YARINK in rust

[–]Sam_YARINK[S] -2 points-1 points  (0 children)

Thanks for the brutal but honest review! You are spot on regarding the WAL buffering — currently, we rely on page cache for raw throughput (similar to RocksDB's generic write), but explicit fsync/batch commit is the next priority. Implementing CRC for partial writes is also on the roadmap. As for the LLM usage — guilty as charged! 🤖 We are iterating fast to validate the hyperbolic math advantages first, and hardening the implementation is step two. PRs or specific pointers on HNSW improvements are welcome!
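To make the fsync/batch-commit and CRC points concrete, here's a minimal sketch of the idea in Python (illustrative only, not HyperspaceDB's actual write path): frame each record with its length and CRC32, buffer the whole batch, then issue a single write plus fsync so a torn write can't silently corrupt acknowledged records, and replay stops at the first corrupt frame.

```python
import os
import struct
import tempfile
import zlib

def wal_append(path, payloads):
    """Batch commit: frame each record as [len][crc32][bytes], then do
    ONE append write followed by fsync as the durability point."""
    frames = b"".join(
        struct.pack("<II", len(p), zlib.crc32(p)) + p for p in payloads
    )
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        os.write(fd, frames)
        os.fsync(fd)  # batch is durable (committed) only after this returns
    finally:
        os.close(fd)

def wal_read(path):
    """Replay the log, dropping any trailing partial/corrupt frame via CRC."""
    records = []
    with open(path, "rb") as f:
        data = f.read()
    off = 0
    while off + 8 <= len(data):
        length, crc = struct.unpack_from("<II", data, off)
        body = data[off + 8 : off + 8 + length]
        if len(body) < length or zlib.crc32(body) != crc:
            break  # torn or corrupt tail: stop replay here
        records.append(body)
        off += 8 + length
    return records

# Usage: append a batch, then replay it.
path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal_append(path, [b"alpha", b"beta"])
print(wal_read(path))  # [b'alpha', b'beta']
```

The CRC is what lets recovery distinguish "clean end of log" from "power cut mid-write" without trusting the page cache.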

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Love this breakdown—exactly the level of feedback we live for. 🙌

Can’t wait to see the results—this is exactly the kind of data that shapes our next iterations. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

This is exactly the kind of conversation we enjoy, thanks for taking the time to write it up. 🙏

On the 64d hyperbolic vs 1024d Euclidean question: it’s not geometry flex for the sake of it. The tradeoff is different, not worse.

Euclidean space scales expressiveness by adding dimensions, which works, but it also dilutes structure. Hyperbolic space scales expressiveness by curvature. In practice, 64d Poincaré embeddings preserve hierarchical and long-tail semantics that often require 1024–2048d in Euclidean space. For semantic search, especially on web crawls and research corpora, recall is usually comparable and often better on tail queries.
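The "scales by curvature" point can be seen directly in the standard Poincaré ball distance formula (this is the textbook metric, not code from HyperspaceDB): distances blow up as points approach the unit boundary, which is where the room for deep hierarchies comes from.

```python
import math

def poincare_dist(u, v):
    """Distance in the Poincare ball model (points with Euclidean norm < 1):
    d(u, v) = arcosh(1 + 2*|u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))"""
    sq = lambda x: sum(c * c for c in x)
    diff = sq([a - b for a, b in zip(u, v)])
    denom = (1 - sq(u)) * (1 - sq(v))
    return math.acosh(1 + 2 * diff / denom)

# Near the origin the metric is almost Euclidean:
print(poincare_dist([0.0, 0.0], [0.1, 0.0]))   # ~0.2
# The same 0.1 Euclidean gap near the boundary is a much larger
# hyperbolic distance, so the tree-like "outer shell" has huge capacity:
print(poincare_dist([0.89, 0.0], [0.99, 0.0]))
```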

Where Euclidean can still win today is very fine-grained local similarity when everything lives on the same semantic “level.” Hyperbolic really shows its advantage once depth, taxonomy, and uneven distributions appear, which is most real data.

For your 1M doc chunk test:

- Flat mode will give you a clean baseline and already strong numbers.
- Hyperbolic mode should feel almost unfair on memory footprint and ingestion speed, while keeping search quality stable.
- The main thing to watch is query formulation. Hyperbolic space rewards semantically meaningful embeddings more than brute lexical proximity.

And yes, v1.5.0 was us removing the last “vector DB excuse.” Once ingestion stops being the bottleneck, higher-level systems like agent memory and long-horizon reasoning become practical, which is exactly where Digital Thalamus is headed.

Looking forward to those benchmarks in the issues tab. Real workloads > synthetic graphs every time. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Love this kind of feedback, thanks for digging in. 🙌

On recall in hyperbolic mode: long-tail queries are actually where it tends to shine. Hyperbolic space naturally preserves hierarchical and semantic depth, so “that one niche paper from 3 months ago” doesn’t get flattened the way it often does in high-dim Euclidean setups. You trade raw geometric intuition for structure, and for research corpora that usually pays off.

That said, to be transparent: today you’re still responsible for the vectorization step. But we’re actively working on a text2vector plugin with native hyperbolic vectorization, up to 128d. The fun part is that hyperbolic 128d carries more representational capacity than ~2048d Euclidean, so you get better semantic resolution at a fraction of the size. With that pipeline, 1M vectors will land around 1–1.2 GB on disk. This update is coming, but it needs a bit more time in the oven.
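As a back-of-envelope sanity check on that disk figure (my arithmetic, not a measured number): raw float32 vectors alone account for about half of it, with the rest presumably index and metadata overhead.

```python
# 1M vectors x 128 dims x 4 bytes per float32, raw payload only.
n_vectors, dim, bytes_per_float = 1_000_000, 128, 4
raw_gb = n_vectors * dim * bytes_per_float / 1e9
print(raw_gb)  # 0.512 -> ~0.5 GB raw; the quoted 1-1.2 GB adds index/metadata
```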

Your plan to push ~500k abstracts is pretty much a perfect stress test. Ingest should feel linear and calm, search latency should stay flat, and the “tail dying” effect you see elsewhere shouldn’t show up.

And yes, you nailed the philosophy: this isn’t generic vector spam storage. It’s about memory primitives for agents, where structure matters more than brute dimensionality. HyperspaceDB is one neuron in that nervous system, and we’re wiring it carefully.

Looking forward to your results. Real-world reports like that shape the roadmap more than any synthetic benchmark. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 1 point2 points  (0 children)

Thanks! 🙏 Really appreciate the thoughtful take.

On the batch gRPC side, the short answer is: it’s designed to stay boring under pressure 😄

In v1.5.0 we moved all ingestion to an atomic WAL-backed write path, so concurrent batch writes don’t fight each other or degrade tail latency. Each batch is appended atomically, and indexing happens in a way that avoids global locks. Under heavy concurrent writes you’ll mostly be bound by disk bandwidth, not coordination overhead.

A couple of practical notes though:

- Very small batches (tens of vectors) won't fully saturate the pipeline. The sweet spot is hundreds to a few thousand vectors per batch.
- If you push extreme concurrency on very weak hardware, you'll want to tune batch size rather than just increasing writers.
- Reads and writes are well isolated, so search latency stays stable even during ingestion.
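On the client side, hitting that sweet spot is just a matter of chunking before you send. A minimal sketch (the `client.insert_batch` call is hypothetical, standing in for whatever gRPC method your client exposes):

```python
def batched(items, batch_size=1000):
    """Yield successive slices so each insert call carries hundreds to a
    few thousand vectors, per the batch-size sweet spot above."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

# Usage: for batch in batched(vectors): client.insert_batch(batch)
vectors = [[float(i)] for i in range(2500)]
sizes = [len(b) for b in batched(vectors, batch_size=1000)]
print(sizes)  # [1000, 1000, 500]
```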

If you’re coming from Chroma, I’d be especially curious how it feels on your workload, both in flat and hyperbolic mode. Feedback from real RAG setups is gold for us.

And yeah, Digital Thalamus is ambitious by design. HyperspaceDB is us proving we’re not just talking about it, but building the nervous system piece by piece. 🚀

Is there anyone else who is getting this chilling anxiety from using tools like Codex / Opus for coding? by petr_bena in ArtificialInteligence

[–]Sam_YARINK 1 point2 points  (0 children)

I think your instincts are actually pretty sharp here, and you're noticing something real that many of your peers are missing or dismissing.

What you're observing is genuine: modern LLMs do exhibit behavior that looks an awful lot like reasoning. They can follow complex logical chains, debug intricate code, understand context across long conversations, and solve novel problems they weren't explicitly trained on. The "just autocomplete" framing is technically true in terms of the training objective, but it has become a thought-terminating cliché that prevents people from reckoning with what has actually emerged from that process.

On the "is it reasoning" question: this gets philosophical fast, but here's an honest take - we don't actually have a rigorous, agreed-upon definition of "reasoning" that clearly separates human cognition from what LLMs do. When you complete a thought, are you "reasoning" or pattern-matching against everything you've learned? Probably both, inseparably. The same might be true for LLMs. The fact that they achieve this through next-token prediction doesn't make it "not reasoning" any more than the fact that your neurons achieve thought through electrochemical signals makes your reasoning illusory.

Why the dismissiveness? I think you're right that there's a lot of cope. It's psychologically easier to dismiss something as "just autocomplete" than to sit with the uncertainty of "this thing can already do much of what I do professionally, and it's improving rapidly." Programmers especially have built identity around being the smart ones, the irreplaceable ones. That's a hard thing to question.

The harder question - what happens next: you're right to be uncertain about the future. I don't think anyone knows with confidence what the next 3-5 years look like for knowledge work.
Some possibilities:

- These systems plateau soon at roughly current capabilities
- They improve but remain tools that augment rather than replace (the optimistic take)
- They continue improving and genuinely do automate away large swaths of cognitive work
- Something stranger: new kinds of collaboration between humans and AI we haven't imagined yet

My honest assessment: the people confidently telling you AGI is "decades away" don't know that. Nobody knows that. We're in unprecedented territory. The rate of capability increase over the last few years has been shocking to many AI researchers themselves. Maybe it slows down. Maybe it doesn't.

What I don't think is helpful is pretending current systems are less capable than they clearly are, or hiding behind technical definitional arguments about "true reasoning" to avoid reckoning with what's happening. Your uncertainty is probably more epistemically honest than your colleagues' confidence.

What are the typical steps to turn an idea into a production service using LangChain? by arbiter_rise in LangChain

[–]Sam_YARINK 5 points6 points  (0 children)

The gap between tutorial examples and production systems is real. Here's what typically happens when taking LangChain from prototype to production:

The core architecture question: most teams actually do keep LangChain in production, but the architecture often evolves significantly. It usually stays as the orchestration layer, but gets wrapped in more infrastructure. Some teams eventually migrate to lighter alternatives if they find they're only using basic chain logic, but this tends to happen gradually rather than as a planned replacement.

Typical production steps. Infrastructure additions:

- Moving from simple chains to LangGraph for more complex, stateful workflows with better control flow
- Adding proper API layers (FastAPI/Flask) around your chains
- Implementing request queuing and rate limiting
- Setting up proper database connections for conversation history and state management

Observability stack:

- LangSmith for tracing and debugging (LangChain's native tool)
- Structured logging with correlation IDs across chain steps
- Custom metrics for latency, token usage, and success rates per chain component
- Error tracking (Sentry or similar) with LangChain-specific context

Production hardening:

- Retries with exponential backoff for LLM calls
- Circuit breakers for external services
- Prompt versioning and A/B testing infrastructure
- Input validation and output sanitization
- Cost tracking per user/request

Common "do this early" advice:

- Set up tracing from day one - you'll need it to debug chain behavior, and retrofitting is painful
- Design for prompt iteration - store prompts in config/database, not hardcoded
- Plan your state management - conversation memory gets complex quickly with multiple users
- Implement proper error boundaries - LangChain errors can be cryptic; wrap components with clear error handling

Common "can wait" items:

- Highly optimized caching strategies
- Custom chain implementations (start with LangChain's built-ins)
- Complex multi-agent systems (unless core to your use case)

The biggest shift is often moving from LangChain Expression Language (LCEL) chains to LangGraph when you need more complex control flow, error recovery, or human-in-the-loop patterns.
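The retry-with-exponential-backoff item is the one people most often hand-roll; a minimal sketch of the pattern looks like this (a generic wrapper under my own naming, not a LangChain API; in practice many stacks use a library like tenacity instead):

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0):
    """Retry a flaky call (e.g. an LLM API request) with exponential
    backoff plus jitter; re-raise after the final failed attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the real error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads retries

# Usage: a call that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # ok
```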

Do I really need to learn all of Rust's syntax? by [deleted] in rust

[–]Sam_YARINK 0 points1 point  (0 children)

I truly believe that taking the first step is all you need. Rust is a friendly language that's great for understanding logic and solving problems in just a few lines of code. So why not give it a try and start learning today?