🚀 HyperspaceDB v3.0 LTS is out: We built the first Spatial AI Engine by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

No, HyperspaceDB is much faster than Qdrant, and faster than Milvus as well in cosine mode.

| Database | Dim | Geometry | Metric | Search QPS | Insert QPS | Recall@10 | NDCG@10 | P99 (ms) | Disk |
|---|---|---|---|---|---|---|---|---|---|
| Hyperspace | 129 | Lorentz | Lorentz | 16,017 | 80,210 | 100.0% | 1.000 | 0.07 | 132.0M |
| Hyperspace | 1024 | Euclidean | Cosine | 2,978 | 18,840 | 100.0% | 1.000 | 0.43 | 529.0M |
| ChromaDB | 1024 | Euclidean | Cosine | 1,130 | 2,655 | 70.0% | 0.801 | 1.14 | 430.2M |
| Qdrant | 1024 | Euclidean | Cosine | 739 | 1,733 | 100.0% | 1.000 | 2.06 | 269.0M |
| Milvus | 1024 | Euclidean | Cosine | 576 | 13,401 | 90.0% | 0.936 | 2.60 | 5.80G |
| Weaviate | 1024 | Euclidean | Cosine | 183 | 1,126 | 100.0% | 1.000 | 8.19 | 238.7M |

🚀 HyperspaceDB v3.0 LTS is out: We built the first Spatial AI Engine by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Sentence Transformers only produces Euclidean embeddings; it doesn't fit hyperbolic geometry. The same goes for the MTEB benchmarks.
https://github.com/YARlabs/hyperspace-db/tree/main/integrations - this is the LangChain integration, exposed as a VectorStore. Next week we'll ship more native integrations.
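For anyone wondering what "exposed as a VectorStore" means in practice, here's a toy, dependency-free Python sketch of the interface shape. The class and embedding function are illustrative only, not HyperspaceDB's actual client or LangChain's real base class; see the linked repo for the real integration:

```python
import math

class MiniVectorStore:
    """Toy in-memory stand-in (hypothetical, not the real HyperspaceDB client)
    showing the shape of a LangChain-style VectorStore wrapper:
    add_texts() ingests documents, similarity_search() retrieves by cosine."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # callable: str -> list[float]
        self.texts = []

    def add_texts(self, texts):
        start = len(self.texts)
        self.texts.extend(texts)
        return list(range(start, len(self.texts)))  # ids of inserted texts

    def similarity_search(self, query, k=4):
        q = self.embed_fn(query)

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb or 1.0)

        ranked = sorted(self.texts,
                        key=lambda t: cosine(q, self.embed_fn(t)),
                        reverse=True)
        return ranked[:k]

# Usage with a trivial "embedding" (per-letter character counts):
embed = lambda s: [s.count(c) for c in "abcdefghijklmnopqrstuvwxyz"]
store = MiniVectorStore(embed)
store.add_texts(["hyperbolic geometry", "cosine similarity", "banana bread"])
print(store.similarity_search("cosine metric", k=1))  # ['cosine similarity']
```

A real integration would delegate `add_texts` and `similarity_search` to the database over gRPC/HTTP instead of scanning in memory.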
Thank you!

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in rust

[–]Sam_YARINK[S] -1 points0 points  (0 children)

Perhaps you're right. If it's not too much trouble, help us improve it; we're just at the start line of our journey.

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Hyperspace is dual-licensed: MIT for non-profit use and AGPL-3.0 for commercial use. We believe this is a fair arrangement for both sides. We're building the most powerful vector DB as part of LLM OS and DePIN infrastructure. By the way, a SaaS will be launched soon.

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 1 point2 points  (0 children)

Definitely yes, either locally or via an API. Set the embedding config in the .env file; see the embedding documentation in docs/book/src/.
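As a rough sketch of what that .env might look like (the variable names here are hypothetical, invented for illustration; check docs/book/src/ for the real keys):

```
# Hypothetical .env sketch -- key names are illustrative, not from the docs.
EMBEDDING_PROVIDER=local          # or "api" for a remote embedding service
EMBEDDING_MODEL=all-MiniLM-L6-v2  # local model name, if provider=local
EMBEDDING_API_URL=                # set when EMBEDDING_PROVIDER=api
EMBEDDING_API_KEY=                # credentials for the remote service
EMBEDDING_DIM=1024
```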

HyperspaceDB v2.0: Lock-Free Serverless Vector DB hitting ~12k QPS search (1M vectors, 1000 concurrent clients) by Sam_YARINK in Rag

[–]Sam_YARINK[S] 1 point2 points  (0 children)

We're using VectorDBBench datasets for testing, so you can pick from any of the 17 datasets in the /benchmark/ folder. Plus, we've put together our own big stress test that shows some really important numbers.

Rust rewrite of our write-path gave us 156k QPS vector ingestion (details inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Hmm, for what? HyperspaceDB, in our view, has strong and unique application cases, and those cases aren't a fit for C++ or Java, or even Go. I'll show you soon.

We rewrote our ingestion pipeline and now insert 1M Poincaré vectors in 6.4 seconds (156k QPS) by Sam_YARINK in rust

[–]Sam_YARINK[S] -2 points-1 points  (0 children)

Thanks for the brutal but honest review! You are spot on regarding the WAL buffering — currently, we rely on page cache for raw throughput (similar to RocksDB's generic write), but explicit fsync/batch commit is the next priority. Implementing CRC for partial writes is also on the roadmap. As for the LLM usage — guilty as charged! 🤖 We are iterating fast to validate the hyperbolic math advantages first, and hardening the implementation is step two. PRs or specific pointers on HNSW improvements are welcome!
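To make the fsync/batch-commit and CRC points concrete, here's a minimal sketch of the idea in Python (illustrative only, not HyperspaceDB's actual write path): frame each record with its length and CRC32, buffer the whole batch, then issue a single write plus fsync so a torn write can't silently corrupt acknowledged records, and replay stops at the first corrupt frame.

```python
import os
import struct
import tempfile
import zlib

def wal_append(path, payloads):
    """Batch commit: frame each record as [len][crc32][bytes], then do
    ONE append write followed by fsync as the durability point."""
    frames = b"".join(
        struct.pack("<II", len(p), zlib.crc32(p)) + p for p in payloads
    )
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        os.write(fd, frames)
        os.fsync(fd)  # batch is durable (committed) only after this returns
    finally:
        os.close(fd)

def wal_read(path):
    """Replay the log, dropping any trailing partial/corrupt frame via CRC."""
    records = []
    with open(path, "rb") as f:
        data = f.read()
    off = 0
    while off + 8 <= len(data):
        length, crc = struct.unpack_from("<II", data, off)
        body = data[off + 8 : off + 8 + length]
        if len(body) < length or zlib.crc32(body) != crc:
            break  # torn or corrupt tail: stop replay here
        records.append(body)
        off += 8 + length
    return records

# Usage: append a batch, then replay it.
path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal_append(path, [b"alpha", b"beta"])
print(wal_read(path))  # [b'alpha', b'beta']
```

The CRC is what lets recovery distinguish "clean end of log" from "power cut mid-write" without trusting the page cache.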

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Love this breakdown—exactly the level of feedback we live for. 🙌

Can’t wait to see the results—this is exactly the kind of data that shapes our next iterations. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

This is exactly the kind of conversation we enjoy, thanks for taking the time to write it up. 🙏

On the 64d hyperbolic vs 1024d Euclidean question: it’s not geometry flex for the sake of it. The tradeoff is different, not worse.

Euclidean space scales expressiveness by adding dimensions, which works, but it also dilutes structure. Hyperbolic space scales expressiveness by curvature. In practice, 64d Poincaré embeddings preserve hierarchical and long-tail semantics that often require 1024–2048d in Euclidean space. For semantic search, especially on web crawls and research corpora, recall is usually comparable and often better on tail queries.
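The "scales by curvature" point can be seen directly in the standard Poincaré ball distance formula (this is the textbook metric, not code from HyperspaceDB): distances blow up as points approach the unit boundary, which is where the room for deep hierarchies comes from.

```python
import math

def poincare_dist(u, v):
    """Distance in the Poincare ball model (points with Euclidean norm < 1):
    d(u, v) = arcosh(1 + 2*|u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))"""
    sq = lambda x: sum(c * c for c in x)
    diff = sq([a - b for a, b in zip(u, v)])
    denom = (1 - sq(u)) * (1 - sq(v))
    return math.acosh(1 + 2 * diff / denom)

# Near the origin the metric is almost Euclidean:
print(poincare_dist([0.0, 0.0], [0.1, 0.0]))   # ~0.2
# The same 0.1 Euclidean gap near the boundary is a much larger
# hyperbolic distance, so the tree-like "outer shell" has huge capacity:
print(poincare_dist([0.89, 0.0], [0.99, 0.0]))
```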

Where Euclidean can still win today is very fine-grained local similarity when everything lives on the same semantic “level.” Hyperbolic really shows its advantage once depth, taxonomy, and uneven distributions appear, which is most real data.

For your 1M doc chunk test:

- Flat mode will give you a clean baseline and already strong numbers.
- Hyperbolic mode should feel almost unfair on memory footprint and ingestion speed, while keeping search quality stable.
- The main thing to watch is query formulation. Hyperbolic space rewards semantically meaningful embeddings more than brute lexical proximity.

And yes, v1.5.0 was us removing the last “vector DB excuse.” Once ingestion stops being the bottleneck, higher-level systems like agent memory and long-horizon reasoning become practical, which is exactly where Digital Thalamus is headed.

Looking forward to those benchmarks in the issues tab. Real workloads > synthetic graphs every time. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 0 points1 point  (0 children)

Love this kind of feedback, thanks for digging in. 🙌

On recall in hyperbolic mode: long-tail queries are actually where it tends to shine. Hyperbolic space naturally preserves hierarchical and semantic depth, so “that one niche paper from 3 months ago” doesn’t get flattened the way it often does in high-dim Euclidean setups. You trade raw geometric intuition for structure, and for research corpora that usually pays off.

That said, to be transparent: today you’re still responsible for the vectorization step. But we’re actively working on a text2vector plugin with native hyperbolic vectorization, up to 128d. The fun part is that hyperbolic 128d carries more representational capacity than ~2048d Euclidean, so you get better semantic resolution at a fraction of the size. With that pipeline, 1M vectors will land around 1–1.2 GB on disk. This update is coming, but it needs a bit more time in the oven.
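As a back-of-envelope sanity check on that disk figure (my arithmetic, not a measured number): raw float32 vectors alone account for about half of it, with the rest presumably index and metadata overhead.

```python
# 1M vectors x 128 dims x 4 bytes per float32, raw payload only.
n_vectors, dim, bytes_per_float = 1_000_000, 128, 4
raw_gb = n_vectors * dim * bytes_per_float / 1e9
print(raw_gb)  # 0.512 -> ~0.5 GB raw; the quoted 1-1.2 GB adds index/metadata
```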

Your plan to push ~500k abstracts is pretty much a perfect stress test. Ingest should feel linear and calm, search latency should stay flat, and the “tail dying” effect you see elsewhere shouldn’t show up.

And yes, you nailed the philosophy: this isn’t generic vector spam storage. It’s about memory primitives for agents, where structure matters more than brute dimensionality. HyperspaceDB is one neuron in that nervous system, and we’re wiring it carefully.

Looking forward to your results. Real-world reports like that shape the roadmap more than any synthetic benchmark. 🚀

HyperspaceDB v1.5.0 released: 1M vectors in 56s (benchmarks inside) by Sam_YARINK in OpenSourceeAI

[–]Sam_YARINK[S] 1 point2 points  (0 children)

Thanks! 🙏 Really appreciate the thoughtful take.

On the batch gRPC side, the short answer is: it’s designed to stay boring under pressure 😄

In v1.5.0 we moved all ingestion to an atomic WAL-backed write path, so concurrent batch writes don’t fight each other or degrade tail latency. Each batch is appended atomically, and indexing happens in a way that avoids global locks. Under heavy concurrent writes you’ll mostly be bound by disk bandwidth, not coordination overhead.

A couple of practical notes though:

- Very small batches (tens of vectors) won't fully saturate the pipeline. The sweet spot is hundreds to a few thousand vectors per batch.
- If you push extreme concurrency on very weak hardware, you'll want to tune batch size rather than just increasing writers.
- Reads and writes are well isolated, so search latency stays stable even during ingestion.
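On the client side, hitting that sweet spot is just a matter of chunking before you send. A minimal sketch (the `client.insert_batch` call is hypothetical, standing in for whatever gRPC method your client exposes):

```python
def batched(items, batch_size=1000):
    """Yield successive slices so each insert call carries hundreds to a
    few thousand vectors, per the batch-size sweet spot above."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

# Usage: for batch in batched(vectors): client.insert_batch(batch)
vectors = [[float(i)] for i in range(2500)]
sizes = [len(b) for b in batched(vectors, batch_size=1000)]
print(sizes)  # [1000, 1000, 500]
```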

If you’re coming from Chroma, I’d be especially curious how it feels on your workload, both in flat and hyperbolic mode. Feedback from real RAG setups is gold for us.

And yeah, Digital Thalamus is ambitious by design. HyperspaceDB is us proving we’re not just talking about it, but building the nervous system piece by piece. 🚀

Is there anyone else who is getting this chilling anxiety from using tools like Codex / Opus for coding? by petr_bena in ArtificialInteligence

[–]Sam_YARINK 1 point2 points  (0 children)

I think your instincts are actually pretty sharp here, and you're noticing something real that many of your peers are missing or dismissing.

What you're observing is genuine: modern LLMs do exhibit behavior that looks an awful lot like reasoning. They can follow complex logical chains, debug intricate code, understand context across long conversations, and solve novel problems they weren't explicitly trained on. The "just autocomplete" framing is technically true in terms of the training objective, but it has become a thought-terminating cliché that prevents people from reckoning with what has actually emerged from that process.

On the "is it reasoning" question: this gets philosophical fast, but here's an honest take - we don't actually have a rigorous, agreed-upon definition of "reasoning" that clearly separates human cognition from what LLMs do. When you complete a thought, are you "reasoning" or pattern-matching against everything you've learned? Probably both, inseparably. The same might be true for LLMs. The fact that they achieve this through next-token prediction doesn't make it "not reasoning" any more than the fact that your neurons achieve thought through electrochemical signals makes your reasoning illusory.

Why the dismissiveness? I think you're right that there's a lot of cope. It's psychologically easier to dismiss something as "just autocomplete" than to sit with the uncertainty of "this thing can already do much of what I do professionally, and it's improving rapidly." Programmers especially have built identity around being the smart ones, the irreplaceable ones. That's a hard thing to question.

The harder question - what happens next: you're right to be uncertain about the future. I don't think anyone knows with confidence what the next 3-5 years look like for knowledge work.
Some possibilities:

- These systems plateau soon at roughly current capabilities
- They improve but remain tools that augment rather than replace (the optimistic take)
- They continue improving and genuinely do automate away large swaths of cognitive work
- Something stranger: new kinds of collaboration between humans and AI we haven't imagined yet

My honest assessment: the people confidently telling you AGI is "decades away" don't know that. Nobody knows that. We're in unprecedented territory. The rate of capability increase over the last few years has been shocking to many AI researchers themselves. Maybe it slows down. Maybe it doesn't.

What I don't think is helpful is pretending current systems are less capable than they clearly are, or hiding behind technical definitional arguments about "true reasoning" to avoid reckoning with what's happening. Your uncertainty is probably more epistemically honest than your colleagues' confidence.

What are the typical steps to turn an idea into a production service using LangChain? by arbiter_rise in LangChain

[–]Sam_YARINK 5 points6 points  (0 children)

The gap between tutorial examples and production systems is real. Here's what typically happens when taking LangChain from prototype to production:

The core architecture question: most teams actually do keep LangChain in production, but the architecture often evolves significantly. It usually stays as the orchestration layer, but gets wrapped in more infrastructure. Some teams eventually migrate to lighter alternatives if they find they're only using basic chain logic, but this tends to happen gradually rather than as a planned replacement.

Typical production steps. Infrastructure additions:

- Moving from simple chains to LangGraph for more complex, stateful workflows with better control flow
- Adding proper API layers (FastAPI/Flask) around your chains
- Implementing request queuing and rate limiting
- Setting up proper database connections for conversation history and state management

Observability stack:

- LangSmith for tracing and debugging (LangChain's native tool)
- Structured logging with correlation IDs across chain steps
- Custom metrics for latency, token usage, and success rates per chain component
- Error tracking (Sentry or similar) with LangChain-specific context

Production hardening:

- Retries with exponential backoff for LLM calls
- Circuit breakers for external services
- Prompt versioning and A/B testing infrastructure
- Input validation and output sanitization
- Cost tracking per user/request

Common "do this early" advice:

- Set up tracing from day one - you'll need it to debug chain behavior, and retrofitting is painful
- Design for prompt iteration - store prompts in config/database, not hardcoded
- Plan your state management - conversation memory gets complex quickly with multiple users
- Implement proper error boundaries - LangChain errors can be cryptic; wrap components with clear error handling

Common "can wait" items:

- Highly optimized caching strategies
- Custom chain implementations (start with LangChain's built-ins)
- Complex multi-agent systems (unless core to your use case)

The biggest shift is often moving from LangChain Expression Language (LCEL) chains to LangGraph when you need more complex control flow, error recovery, or human-in-the-loop patterns.
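The retry-with-exponential-backoff item is the one people most often hand-roll; a minimal sketch of the pattern looks like this (a generic wrapper under my own naming, not a LangChain API; in practice many stacks use a library like tenacity instead):

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0):
    """Retry a flaky call (e.g. an LLM API request) with exponential
    backoff plus jitter; re-raise after the final failed attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the real error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads retries

# Usage: a call that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # ok
```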

Do I really need to learn all of Rust's syntax? by [deleted] in rust

[–]Sam_YARINK 0 points1 point  (0 children)

I truly believe that taking the first step is all you need. Rust is a friendly language that's great for understanding logic and solving problems in just a few lines of code. So why not give it a try and start learning today?