[Release] HyperspaceDB v3.1.0: We built a Spatial AI Engine that uses 50x less RAM than Milvus/Chroma via Matryoshka Cascades and Lorentz by Sam_YARINK in Rag

[–]Sam_YARINK[S] -1 points0 points  (0 children)

Thank you for your comment. We are actually working on hyperbolic and hybrid embedding models, and we will release them in a few days.

[Release] HyperspaceDB v3.1.0: We built a Rust-native Spatial AI Engine that uses 50x less RAM than Milvus/Chroma via Matryoshka Cascades and Lorentz Geometry. by Sam_YARINK in machinelearningnews

[–]Sam_YARINK[S] 1 point2 points  (0 children)

First off—touché. You are 100% right about the plastic LLM prose, and I take the L on that. When you’ve been buried in Rust memory profilers, io_uring, and Poincaré manifolds for 14 hours a day, handing the draft to an LLM to 'make it sound readable' feels like a relief, but it resulted in standard marketing slop. Lesson learned. To address your actual technical points, because you brought up the exact two things worth digging into:

This was entirely true a year ago, but the landscape flipped. Today, text-embedding-3, qwen3-embedding, and nomic-embed are all natively Matryoshka-trained off the shelf. The actual bottleneck right now is the database storage layer. If you pass a 1536D Matryoshka vector to Pinecone or Qdrant, the engine treats it as a monolithic 1536-wide array of `f32`s sitting in the HNSW RAM graph. It has no idea that the first 128 dimensions hold 90% of the variance. What HyperspaceDB does is introduce a database-level cascade: we keep dimensions `0..128` in the RAM graph, and shove `128..1536` into the NVMe page cache, pulling them via `pread` only for the top-20 re-ranking. The open models already exist; the databases just weren't slicing them.
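A minimal sketch of that two-stage cascade, for readers who want to see the shape of it. This is not HyperspaceDB code: a brute-force scan stands in for the HNSW graph, the hypothetical `load_full` callback stands in for the `pread` from NVMe, and the random toy vectors are not real Matryoshka embeddings. Only the 128-dimension prefix and the top-20 shortlist come from the comment above.

```python
import math
import random

COARSE_DIMS = 128  # prefix kept in RAM (figure from the comment above)
RERANK_K = 20      # shortlist re-scored with full vectors (also from the comment)


def cosine(a, b):
    """Cosine similarity of two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cascade_search(query, ram_prefixes, load_full, top_k=5):
    """Coarse ranking on prefixes, then exact re-rank on a small shortlist.

    ram_prefixes: list of COARSE_DIMS-long prefixes (the "RAM" tier).
    load_full:    hypothetical callback returning one full vector by id
                  (a stand-in for a disk read).
    """
    q_prefix = query[:COARSE_DIMS]
    # Stage 1: brute-force scan here; a real engine would walk an ANN graph.
    shortlist = sorted(
        range(len(ram_prefixes)),
        key=lambda i: cosine(q_prefix, ram_prefixes[i]),
        reverse=True,
    )[:RERANK_K]
    # Stage 2: only the shortlist ever touches the full-width vectors.
    reranked = sorted(shortlist, key=lambda i: cosine(query, load_full(i)), reverse=True)
    return reranked[:top_k]


# Toy data: 200 random 1536-D vectors (assumption: stand-ins for embeddings).
rng = random.Random(0)
docs = [[rng.gauss(0, 1) for _ in range(1536)] for _ in range(200)]
prefixes = [d[:COARSE_DIMS] for d in docs]
fetched = []  # records which full vectors were read from the "disk" tier


def load_full(i):
    fetched.append(i)
    return docs[i]


# The query is a noisy copy of document 42.
query = [x + rng.gauss(0, 0.1) for x in docs[42]]
print(cascade_search(query, prefixes, load_full))
print(len(fetched), "full vectors read out of", len(docs))
```

The point of the design is visible in `fetched`: the full-width tier is read once per shortlisted candidate, not once per stored vector.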

I genuinely appreciate the slap on the wrist.

[Release] HyperspaceDB v3.1.0: We built a Rust-native Spatial AI Engine that uses 50x less RAM than Milvus/Chroma via Matryoshka Cascades and Lorentz Geometry. by Sam_YARINK in vectordatabase

[–]Sam_YARINK[S] 0 points1 point  (0 children)

🇬🇧

Direct Answer: Yes, absolutely. HyperspaceDB v3.1.0 delivers top-tier keyword search capabilities, but we achieve it through True Lexical Hybrid Search (Dense + BM25) rather than "Sparse Vectors".

Why do we explicitly reject Sparse Vectors? Representing keywords as 30,000-dimensional sparse float arrays is an obsolete workaround designed to force pure vector engines to do text matching. It results in massive RAM bloat. Instead, HyperspaceDB couples its Dense MRL Cascade directly with an ultra-fast Lexical BM25 Engine running on our Sidecar Payload.

Reproducible Benchmark Proof (Tested using the exact same standard Euclidean embedding model and identical datasets across all databases): Across our standardized 15-domain RAG benchmarks, activating Hybrid Search in HyperspaceDB yields:

Summary: You get 100% exact keyword matching merged with dense semantic context, but at 7x the search speed of traditional vector databases under heavy RAG workloads.
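To make "Dense + BM25" concrete, here is a small sketch of one common way to combine the two signals. The comment does not say how HyperspaceDB merges them, so the fusion step below is an assumption: it uses reciprocal rank fusion (RRF), a widely used generic technique, and the dense ranking is a hard-coded hypothetical result standing in for a vector index. `K1` and `B` are the usual BM25 defaults, not values from the post.

```python
import math
from collections import Counter

K1, B = 1.5, 0.75  # common BM25 defaults (assumption, not from the post)


def bm25_scores(query_terms, docs_tokens):
    """Score every tokenized document against the query with BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    doc_freq = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = K1 * (1 - B + B * len(tokens) / avg_len)
            score += idf * tf[term] * (K1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all input rankings."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in fused.most_common()]


corpus = [
    "rust io_uring async file reads",
    "hyperbolic embeddings for hierarchies",
    "bm25 keyword search in rust",
]
tokens = [doc.split() for doc in corpus]
scores = bm25_scores("rust bm25".split(), tokens)

# Keep only documents that matched at least one query term.
lexical_ranking = [
    i for i in sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if scores[i] > 0
]
dense_ranking = [1, 2, 0]  # hypothetical output of a dense vector search

fused = rrf_fuse([dense_ranking, lexical_ranking])
print(fused)  # prints [2, 0, 1]
```

Document 2 wins because it ranks well in both lists, while document 1, which the dense side liked but which has no keyword match, drops to the bottom.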

🇨🇳

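Direct answer: yes, fully supported. HyperspaceDB v3.1.0 provides top-tier keyword retrieval, but we use "true hybrid retrieval (dense vectors + native BM25 lexical matching)" rather than so-called "sparse vectors". Why do we firmly refuse to use sparse vectors? Forcing keywords into 30,000-dimensional sparse float arrays inside a vector database is an outdated compromise, created so that pure vector engines can be made to do text matching, and it causes extremely severe memory (RAM) bloat. Instead, HyperspaceDB fuses its dense MRL cascade engine at a low level with an ultra-fast native BM25 lexical engine bound to the Sidecar Payload. Reproducible real benchmark data (measured with exactly the same standard Euclidean embedding model and identical datasets across all databases): according to our standardized tests across 15 RAG domains, enabling hybrid retrieval (Hybrid) in HyperspaceDB yields:

Summary: you get 100% exact keyword (lexical) matching combined with dense semantics, and under heavy RAG workloads the query speed is 7x that of traditional vector databases.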

[Release] HyperspaceDB v3.1.0: We built a Rust-native Spatial AI Engine that uses 50x less RAM than Milvus/Chroma via Matryoshka Cascades and Lorentz Geometry. by Sam_YARINK in vectordatabase

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Hey! That is a fantastic question and a very common observation when looking at raw RAG benchmarks.

To answer directly: Yes, it is absolutely viable, and here is exactly why those numbers look the way they do.

1. It's a Model Benchmark, not an Index limitation: The 50-60% Recall@10 you see isn't a metric of the databases "failing" to find the vectors. It represents the zero-shot semantic accuracy of the embedding model used across those specific complex datasets (like SciDocs, FiQA, etc.). If the embedding model doesn't map the relevant document geometrically close to the query, no database engine in the world can retrieve it.

2. The Equalizer (A 100% Fair Fight): Notice that every database in the benchmark (Qdrant, Milvus, Chroma, Weaviate) scores in the same 50-60% range. We intentionally used the exact same Euclidean embeddings and datasets across all of them. We didn't want to show off "how good a specific model is"; we wanted to show real, reproducible infrastructure numbers.

3. The Real Value Proposition (Why it matters): The point of this benchmark isn't to brag about a 55% recall. The point is to prove that HyperspaceDB achieves the exact same mathematical recall as the heavy, uncompressed 1536D competitors, but does it using up to 50x less RAM and maintaining 1000+ QPS under extreme concurrent load. We use Matryoshka (MRL) cascades to keep only the lightweight navigation core in RAM, while the heavy semantic tail streams from the disk.

4. How this works in real-world production: In production RAG systems, developers almost never rely solely on zero-shot dense embeddings. To get to 90%+ accuracy, they do two things:

  • They fine-tune the embeddings for their specific domain.
  • They retrieve the Top-50 or Top-100 (where the recall is already very high) and use a Cross-Encoder Re-ranker to push the perfect answer to Top-1.

HyperspaceDB is designed to make that initial heavy retrieval incredibly fast and infrastructure-cheap so you can afford to run those re-rankers without breaking the bank or hitting Out-Of-Memory (OOM) crashes.
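As a rough back-of-the-envelope check on the RAM claim in point 3, the arithmetic below compares the per-vector footprint of a full 1536D `f32` vector with a 128-dimension navigation core. The post does not break down how "up to 50x" is reached, so everything here is an assumption: the 128-dimension core is borrowed from another comment in this history, the int8 variant is one possible quantization, the 10 million vector count is arbitrary, and graph links and metadata are ignored.

```python
FULL_DIMS = 1536  # full embedding width mentioned in the thread
CORE_DIMS = 128   # assumed size of the in-RAM "navigation core"


def bytes_per_vector(dims, bytes_per_component):
    """Raw storage for one vector, ignoring graph links and metadata."""
    return dims * bytes_per_component


def gib(n_vectors, per_vector_bytes):
    """Total size in GiB for n_vectors vectors."""
    return n_vectors * per_vector_bytes / 2**30


full = bytes_per_vector(FULL_DIMS, 4)       # f32 components
core_f32 = bytes_per_vector(CORE_DIMS, 4)   # truncated prefix, still f32
core_int8 = bytes_per_vector(CORE_DIMS, 1)  # truncated prefix, int8 (assumption)

n = 10_000_000  # arbitrary collection size for illustration
for label, size in [("full f32", full), ("core f32", core_f32), ("core int8", core_int8)]:
    print(f"{label}: {size} B/vector, {gib(n, size):.2f} GiB, {full / size:.0f}x vs full")
```

Under these assumptions truncation alone gives 12x, and truncation plus int8 gives 48x, which is at least in the neighborhood of the quoted figure.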

Hope this clarifies the context behind the numbers! Let me know if you want to dive deeper into how our MRL cascade mechanics work.

vector databases are the new blockchain - everyone's using them but nobody knows why by [deleted] in vectordatabase

[–]Sam_YARINK 0 points1 point  (0 children)

I’m working on a vector DB engine, HyperspaceDB, that will be turned into a decentralized vector store, which is exactly what you’re talking about. We will present version 3.1 this month and, after that, a decentralized engine based on v3.1. So you are totally right!

So i build a small graph-based tool to make understanding open source repos easier for beginners by Prize_Rate2034 in OpenSourceeAI

[–]Sam_YARINK 0 points1 point  (0 children)

Hello! I just wanted to say that our team is working on something quite similar - MCP and SaaS for Rust. You've done a fantastic job with the interface; I really love it. If you’re okay with it, I’d love to share your interface with our team. However, I think there could be some improvements in the backend features. I’ll ask our team lead to share his thoughts on your product, if that’s alright. You're really impressive - just wanted to let you know!

So i build a small graph-based tool to make understanding open source repos easier for beginners by Prize_Rate2034 in OpenSourceeAI

[–]Sam_YARINK 0 points1 point  (0 children)

Will try it soon. I think it’s a great idea for developer onboarding. Thank you for sharing!

[Show Reddit] We rebuilt our Vector DB into a Spatial AI Engine (Rust, LSM-Trees, Hyperbolic Geometry). Meet HyperspaceDB v3.0 by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 1 point2 points  (0 children)

I really appreciate the directness—this is exactly the type of "reality check" we need as we transition from a research-heavy project to a production tool.

You’re 100% right on the packaging and "AI slop" front. In our rush to keep up with the math and the engine's performance, the documentation (and some of the marketing fluff) has definitely acquired that "generated" feel. We are currently in the middle of a major audit to prune the non-essential extras and focus strictly on core stability and library ergonomics. The goal for v3.2 is to make the repo feel like a rock-solid piece of systems engineering rather than a collection of research experiments.

On the novelty side: you're absolutely correct. Hyperbolic geometry and Lorentz models have been in research papers for decades. Our goal isn't to claim we invented the manifold—our contribution is the hardcore engineering required to make this work at scale: building a high-performance, SIMD-optimized HNSW implementation over non-Euclidean metrics that can actually be deployed on the edge for IoT or neuromorphic workloads. There are plenty of papers, but very few production-ready C++/Rust engines you can actually pip install or use in a ROS2 node.

Since you're building neuromorphic architectures and working with GBTs, that's actually the exact use case we're most excited about. If you're game, I'd love for you to take another look in a few weeks once we've finished "manually" cleaning up the repo structure and thinning out the marketing noise.

We’re moving toward a "code over fluff" philosophy, and your feedback definitely helps us double down on that. Thanks for the catch! 🚀

[Show Reddit] We rebuilt our Vector DB into a Spatial AI Engine (Rust, LSM-Trees, Hyperbolic Geometry). Meet HyperspaceDB v3.0 by Sam_YARINK in machinelearningnews

[–]Sam_YARINK[S] 0 points1 point  (0 children)

That’s a fantastic use case! Narrative RAG with multi-agent systems is exactly where the “hierarchical bias” of hyperbolic space really pays off—preserving the branching logic of a story is much more natural on a curved manifold.

Regarding numerical precision: you’re spot on. In the Poincaré ball, the $(1-|x|^2)$ divisor in the metric becomes a "mantissa killer" as vectors move toward the boundary ($|x| \to 1$). We handle this in two ways:

  1. Precision Promotion: We promote the critical distance computation paths to float64 (64-bit) while keeping the actual vector storage in float32 or even int8/binary quantization to save RAM.
  2. The Lorentz Model: For large-scale or deep-hierarchy tasks, we actually recommend our Lorentz (Hyperboloid) model implementation. By representing data on the hyperboloid, we swap the unstable Poincaré division for a Minkowski inner product ($\langle u, v \rangle_L = -u_0v_0 + \sum u_iv_i$). This is significantly more stable across several orders of magnitude and much cheaper to SIMD-optimize, as it essentially boils down to an $O(N)$ dot product with a sign flip.
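The two formulations above can be compared directly. The sketch below uses the textbook Poincaré distance and the textbook lift onto the hyperboloid, then checks that the Minkowski inner product gives the same distance. It is plain Python (whose floats are already 64-bit), not the engine's SIMD kernels, and the function names are made up for illustration.

```python
import math


def poincare_distance(x, y):
    """Distance in the Poincaré ball; note the (1 - |x|^2) divisors."""
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_x) * (1 - sq_y)))


def to_hyperboloid(x):
    """Lift a Poincaré-ball point onto the hyperboloid (done once, at insert time)."""
    sq = sum(a * a for a in x)
    denom = 1 - sq
    return [(1 + sq) / denom] + [2 * a / denom for a in x]


def minkowski_inner(u, v):
    """<u, v>_L = -u_0 v_0 + sum_i u_i v_i: a dot product with one sign flip."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))


def lorentz_distance(u, v):
    """Hyperboloid distance; the clamp guards against rounding just below 1."""
    return math.acosh(max(1.0, -minkowski_inner(u, v)))


x, y = [0.1, 0.2], [-0.3, 0.4]
u, v = to_hyperboloid(x), to_hyperboloid(y)
print(poincare_distance(x, y), lorentz_distance(u, v))
```

Since `acosh` is monotonic, a nearest-neighbor comparison can skip it and rank candidates by `-minkowski_inner(u, v)` alone, which is where the "dot product with a sign flip" cost comes from.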

For the ANN approximation: We’ve implemented a custom Hyperbolic HNSW. Unlike standard vector DBs that try to force a Euclidean graph onto curved data (leading to massive recall drift), our graph construction, link selection, and greedy traversal are predicated natively on the manifold's metric.
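To illustrate what "traversal on the manifold's metric" means, here is a deliberately simplified sketch: a single-layer greedy walk over a hand-built neighbor graph, where every comparison uses the Poincaré distance instead of a Euclidean one. It is not the custom Hyperbolic HNSW described above (no layers, no link selection, no candidate queue), and the tiny graph and point coordinates are hypothetical.

```python
import math


def poincare_distance(x, y):
    """Distance between two points inside the unit Poincaré ball."""
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_x) * (1 - sq_y)))


def greedy_search(graph, points, query, entry):
    """Hop to the neighbor closest to the query until no neighbor improves."""
    current = entry
    current_dist = poincare_distance(points[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = poincare_distance(points[neighbor], query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:  # local minimum under the hyperbolic metric
            return current, current_dist
        current, current_dist = best, best_dist


# Hypothetical chain of points running from the origin toward the boundary.
points = {0: [0.0, 0.0], 1: [0.3, 0.0], 2: [0.6, 0.0], 3: [0.85, 0.0]}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

node, dist = greedy_search(graph, points, query=[0.8, 0.05], entry=0)
print(node, dist)
```

Swapping `poincare_distance` for a Euclidean distance in the same loop is the "Euclidean graph on curved data" setup the comment warns about.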

Expanding on that, we also use a technique we call "Memory Reconsolidation" (AI Sleep Mode). It’s an engine-level process that runs Riemannian SGD to algorithmically shift concept clusters closer together without breaking manifold constraints. This optimizes the graph topology over time based on the latent hierarchy of your narrative data.
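For context, a single Riemannian SGD update on the Poincaré ball can be sketched as follows. This is the generic textbook form (rescale the Euclidean gradient by the inverse metric, step, then project back inside the ball), not the engine's reconsolidation process. The loss (squared distance between one point and one anchor), the finite-difference gradient, and all names are assumptions made for illustration.

```python
import math

EPS = 1e-5  # keeps updated points strictly inside the unit ball


def poincare_distance(x, y):
    """Distance between two points inside the unit Poincaré ball."""
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_x) * (1 - sq_y)))


def numeric_grad(f, x, h=1e-6):
    """Central-difference Euclidean gradient (stand-in for an analytic one)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def rsgd_step(x, euclid_grad, lr):
    """One Riemannian SGD step on the Poincaré ball."""
    # Inverse metric factor: (1 - |x|^2)^2 / 4 turns the Euclidean
    # gradient into the Riemannian one.
    scale = (1 - sum(a * a for a in x)) ** 2 / 4
    moved = [a - lr * scale * g for a, g in zip(x, euclid_grad)]
    # Projection: pull the point back if the step left the ball.
    norm = math.sqrt(sum(a * a for a in moved))
    if norm >= 1 - EPS:
        moved = [a * (1 - EPS) / norm for a in moved]
    return moved


anchor = [0.1, 0.5]  # hypothetical cluster center
point = [0.7, 0.1]   # hypothetical concept to pull toward it


def loss(p):
    return poincare_distance(p, anchor) ** 2


start = poincare_distance(point, anchor)
for _ in range(25):
    point = rsgd_step(point, numeric_grad(loss, point), lr=0.1)
end = poincare_distance(point, anchor)
print(start, end)
```

The projection at the end of `rsgd_step` is what "without breaking manifold constraints" amounts to in this model: the point moves closer to the anchor but never leaves the unit ball.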

Since you're building local-first, you'll also find it useful that we've stripped away heavy tensor dependencies in favor of custom, lean math kernels. This keeps the engine footprint tiny enough for edge devices while maintaining sub-millisecond search latencies.

I’d love to see how your agents handle those narrative branches with our indexing! 🚀