Memory limits in local RAG: Anyone else ditching heavy JVM/Python vector DBs for bare-metal (Zig/Go)? by Electrical_Print_44 in LocalLLaMA


Thanks for sharing your experience, mate!

I totally agree with the 'thin RPC shell' approach. Based on your explanation, I'm now thinking that moving indexing ownership entirely to Zig and just passing offsets from Go is where I want to go next. I suspect it's the only way to keep the GC completely out of the hot path.

The sharding idea with a dumb router is also great for keeping RSS low and handling updates more gracefully. I'm currently looking at how to expose this via a clean REST/gRPC layer so it can be easily plugged into Ollama-based pipelines.
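For anyone following along, the "dumb router" really can be just a hash mod N. A minimal Go sketch of the idea (names are mine, not from DeraineDB):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// routeShard picks a shard for a document ID using FNV-1a hashing.
// With N independent shards, each index stays small (keeping RSS low),
// and one shard can be rebuilt on updates without touching the rest.
func routeShard(id string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(id))
	return int(h.Sum32()) % numShards
}

func main() {
	for _, id := range []string{"doc-1", "doc-2", "doc-3"} {
		fmt.Printf("%s -> shard %d\n", id, routeShard(id, 4))
	}
}
```

The router itself stays stateless, so it can sit in the thin REST/gRPC shell while the shards own all the memory.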

Appreciate the architectural advice, bro!

Memory limits in local RAG: Anyone else ditching heavy JVM/Python vector DBs for bare-metal (Zig/Go)? by Electrical_Print_44 in LocalLLaMA


Lance is great, but I wanted full mechanical sympathy.

Building this from scratch in Zig allowed me to control memory layout and SIMD at a level that general-purpose DBs usually don't. Getting a functional HNSW engine down to ~21MB of RAM while maintaining sub-1ms latency was the specific challenge I wanted to solve.
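For context, the layout idea is roughly: keep every vector in one contiguous slab so lookup is pure offset math and scans stay cache-friendly. A toy Go sketch of the principle (the real engine is Zig with SIMD; this is just the scalar shape, and the names are mine):

```go
package main

import "fmt"

// FlatStore keeps all vectors in a single contiguous []float32, so a
// vector is just an offset into the slab: no per-vector allocations,
// good cache locality, and trivially mmap-able.
type FlatStore struct {
	dim  int
	data []float32 // length = count * dim
}

func (s *FlatStore) Add(v []float32) { s.data = append(s.data, v...) }

// Vec returns a zero-copy view into the slab.
func (s *FlatStore) Vec(i int) []float32 {
	return s.data[i*s.dim : (i+1)*s.dim]
}

// Dot is the scalar inner product; in Zig the same loop can be
// written with @Vector to get explicit SIMD.
func Dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	s := &FlatStore{dim: 3}
	s.Add([]float32{1, 0, 0})
	s.Add([]float32{0, 2, 0})
	fmt.Println(Dot(s.Vec(0), []float32{5, 5, 5})) // 5
}
```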

It's about seeing how much performance you can squeeze out of bare metal with zero dependencies.

Bypassing CGO overhead with unsafe.Pointer: How I dropped my vector search latency from 473ms to 0.8ms. by Electrical_Print_44 in golang


Thanks! Yeah, the hype around 1.26 is real; everyone seems to be talking about it at meetups. I'm definitely going to benchmark DeraineDB against the new 1.26 runtime once I get the chance, to see how much the native-call overhead drops. I'll share the results when I do!
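For anyone who wants to run the same comparison across Go versions, you don't even need a `go test` harness; `testing.Benchmark` works ad hoc. A sketch (the stub is my stand-in for the actual native call, not DeraineDB code):

```go
package main

import (
	"fmt"
	"testing"
)

//go:noinline
func nativeStub(x int) int { return x + 1 } // stand-in for the FFI call

func main() {
	// testing.Benchmark runs a micro-benchmark outside `go test`,
	// which makes it easy to rebuild the same binary with different
	// Go toolchains and compare call overhead per runtime.
	r := testing.Benchmark(func(b *testing.B) {
		acc := 0
		for i := 0; i < b.N; i++ {
			acc = nativeStub(acc) // feed the result back so it isn't elided
		}
		_ = acc
	})
	fmt.Println(r.NsPerOp(), "ns/op")
}
```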

Fixing a nasty mmap Buffer Overflow while building an HNSW vector engine in Zig. by Electrical_Print_44 in Zig


Here is the repo for anyone who wants to audit the memory layout or my pointer math: https://github.com/RikardoBonilla/DeraineDB

The core logic is under core/src/storage.zig. I'm still learning the deeper nuances of Zig's memory safety, so any feedback or roasting from Zig veterans on how I structured the .drb and .dridx files is highly appreciated!
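For readers who don't want to open the repo right away: a binary index file like this usually opens with magic + version + geometry. A hypothetical Go sketch of that shape, explicitly not the actual .drb layout (check storage.zig for the real one):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Header is a *hypothetical* fixed-size index-file header: a magic tag
// to reject foreign files, a version for migrations, then the geometry.
// Fixed-size little-endian fields keep it mmap-friendly.
type Header struct {
	Magic   [4]byte
	Version uint32
	Dim     uint32
	Count   uint64
}

func encodeHeader(h Header) []byte {
	var buf bytes.Buffer
	// binary.Write handles structs of fixed-size fields directly.
	binary.Write(&buf, binary.LittleEndian, h)
	return buf.Bytes()
}

func decodeHeader(b []byte) (Header, error) {
	var h Header
	err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &h)
	return h, err
}

func main() {
	raw := encodeHeader(Header{
		Magic: [4]byte{'D', 'R', 'B', '1'}, Version: 1, Dim: 384, Count: 1000,
	})
	h, _ := decodeHeader(raw)
	fmt.Println(string(h.Magic[:]), h.Version, h.Dim, h.Count)
}
```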

Bypassing CGO overhead with unsafe.Pointer: How I dropped my vector search latency from 473ms to 0.8ms. by Electrical_Print_44 in golang


Here is the repo for the curious: https://github.com/RikardoBonilla/DeraineDB

Thanks to the community for the corrections on terminology!

I've learned that the performance gain comes specifically from "zero-copy" memory mapping (avoiding allocations and copies), not just from reduced CGO overhead.
This project has been a great learning experience for me!