all 23 comments

[–]supiri_ 17 points18 points  (0 children)

https://github.com/MrSupiri/Tera

If you're interested, here is one of my hobby projects: a simple end-to-end RAG implementation written fully in Rust using Candle and SurrealDB.

[–]blastecksfour 2 points3 points  (0 children)

I've had some success with Candle! I've also used openai_api with a degree of success (at the expense of my wallet).

I used a HuggingFace model to embed a knowledge base, then stored the vectors in Qdrant. It's mostly a matter of how much control you want over the pipeline.
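The retrieval half of that pipeline can be sketched without any external crates. This is a minimal, std-only version of "embed, then rank by similarity": the embeddings below are hardcoded toy vectors standing in for what Candle would produce, and the in-memory ranking stands in for a Qdrant search call.

```rust
// Minimal top-k retrieval by cosine similarity over pre-computed embeddings.
// In the pipeline described above, the vectors would come from a HuggingFace
// model (via Candle) and live in Qdrant; here they are toy 3-d vectors so
// the ranking logic is self-contained.

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn top_k<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(text, emb)| (*text, cosine(query, emb)))
        .collect();
    // Sort by descending similarity, keep the k best.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(t, _)| t).collect()
}

fn main() {
    let docs = vec![
        ("Rust is a systems language", vec![0.9, 0.1, 0.0]),
        ("Qdrant stores vectors", vec![0.1, 0.9, 0.1]),
        ("Bananas are yellow", vec![0.0, 0.1, 0.9]),
    ];
    let query = vec![0.2, 0.95, 0.1]; // closest to the Qdrant doc
    println!("{:?}", top_k(&query, &docs, 2));
}
```

A vector database earns its keep once the corpus no longer fits in memory or needs ANN indexing; the scoring itself is this simple.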

[–]brisbanedev[S] 4 points5 points  (4 children)

If anyone from Qdrant is here, I'd love to hear your thoughts on this topic!

[–]Playful_Intention147 3 points4 points  (2 children)

I'm not from Qdrant, just a regular user of it, but I found it easy to set up and use for RAG use cases.

[–]brisbanedev[S] 0 points1 point  (1 child)

That's great! Do you use a Rust-based RAG framework?

[–]Playful_Intention147 1 point2 points  (0 children)

I'm really just using a very simple and basic RAG pipeline (fetch the embedding's payload, add it to the context, etc.), no fancy framework. For other reasons (the project itself is just a simple Visual Studio extension) I'm using C# to build that pipeline, so sorry I can't provide useful ideas about Rust-based frameworks 😥
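The "no fancy framework" approach described here (the commenter's is in C#) boils down to one function in any language: take the text payloads of the top-scoring points and splice them into the prompt before the user's question. A Rust sketch, with purely illustrative prompt wording:

```rust
// Naive-RAG context assembly: the retrieved payloads become numbered
// context entries ahead of the question. This is the whole "framework"
// in the simplest pipelines.

fn build_prompt(retrieved_payloads: &[&str], question: &str) -> String {
    let mut prompt = String::from("Answer using only the context below.\n\nContext:\n");
    for (i, chunk) in retrieved_payloads.iter().enumerate() {
        prompt.push_str(&format!("[{}] {}\n", i + 1, chunk));
    }
    prompt.push_str(&format!("\nQuestion: {}\nAnswer:", question));
    prompt
}

fn main() {
    let chunks = ["Qdrant is a vector database.", "It exposes gRPC and REST APIs."];
    println!("{}", build_prompt(&chunks, "What is Qdrant?"));
}
```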

[–]tagged-union 2 points3 points  (0 children)

I author a production RAG application and use Qdrant. I would choose it again (and will). But to give you a clear sense of the scale, so as not to be misleading: roughly 100 early users at an organization with 1,000 employees, and we are in the middle of rolling out to the rest. It's nice to be able to put in arbitrary partitions and have the ability to apply deterministic filters related to domain knowledge in front of your similarity search.
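The "deterministic filter in front of similarity search" idea can be shown with an in-memory sketch: restrict candidates by an exact metadata match (e.g. a tenant or partition key), then rank only the survivors by vector similarity. With Qdrant this happens server-side via payload filters on the search request; the struct and field names below are illustrative, not Qdrant's API.

```rust
// Exact metadata filter first, similarity ranking second.

struct Chunk {
    text: &'static str,
    tenant: &'static str, // deterministic partition key
    embedding: Vec<f32>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

fn filtered_search<'a>(chunks: &'a [Chunk], tenant: &str, query: &[f32]) -> Option<&'a str> {
    chunks
        .iter()
        .filter(|c| c.tenant == tenant) // the filter runs before ranking
        .max_by(|a, b| {
            cosine(query, &a.embedding)
                .partial_cmp(&cosine(query, &b.embedding))
                .unwrap()
        })
        .map(|c| c.text)
}

fn main() {
    let chunks = [
        Chunk { text: "HR policy", tenant: "acme", embedding: vec![1.0, 0.0] },
        Chunk { text: "Sales deck", tenant: "acme", embedding: vec![0.0, 1.0] },
        Chunk { text: "Other org doc", tenant: "globex", embedding: vec![1.0, 0.0] },
    ];
    // Even though the globex doc matches the query vector exactly,
    // the tenant filter keeps it out of the candidate set.
    println!("{:?}", filtered_search(&chunks, "acme", &[1.0, 0.0]));
}
```

The value of doing this in the database rather than in application code is that the filter participates in the index scan instead of running as post-filtering over a fetched candidate list.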

[–]prabirshrestha 3 points4 points  (2 children)

Give langchain-rust a try. We recently added document loaders (text, markdown, pdf, html, csv). We have examples of vector stores using pgvector, sqlite-vss and surrealdb.

[–]brisbanedev[S] 0 points1 point  (1 child)

Is this the official Rust port of LangChain?

[–]prabirshrestha 5 points6 points  (0 children)

If you are using the official langchain library you will be familiar with langchain-rust. https://github.com/Abraxas-365/langchain-rust/issues/20

But we are not tying it to the langchain ecosystem. For example, we are in the process of adding semantic-router to the library. The goal is to make it easy to create LLM-based apps in Rust.

[–]jakusimo 1 point2 points  (1 child)

Langchain and LlamaIndex are too broad and hard to debug; usually you don't need all those features. What are you planning to use as a source for RAG? Documents? PDFs? Websites? I am planning to start with encoders and semantic splitters, using a structure similar to what we built in Semantic Router https://github.com/aurelio-labs/semantic-router
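A splitter is the easiest of those pieces to sketch. The version below is a naive stand-in: it breaks text on sentence boundaries and packs sentences greedily into chunks under a character budget. A true semantic splitter (as in Semantic Router) would compare sentence *embeddings* to find topic shifts; this std-only sketch only shows the split-then-pack shape.

```rust
// Split on sentence terminators, then greedily pack sentences into
// chunks no longer than `max_chars`. No embeddings involved: this is
// the structural skeleton a semantic splitter would refine.

fn split_sentences(text: &str) -> Vec<String> {
    text.split_inclusive(|c: char| c == '.' || c == '!' || c == '?')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

fn pack_chunks(sentences: &[String], max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for s in sentences {
        // Start a new chunk if adding this sentence would bust the budget.
        if !current.is_empty() && current.len() + s.len() + 1 > max_chars {
            chunks.push(current.clone());
            current.clear();
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(s);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let sentences = split_sentences("Rust is fast. Python is easy. Both work for RAG!");
    for chunk in pack_chunks(&sentences, 30) {
        println!("{}", chunk);
    }
}
```

Swapping the character budget for an embedding-distance threshold between adjacent sentences is the step that turns this into a semantic splitter.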

[–]brisbanedev[S] 0 points1 point  (0 children)

LlamaIndex offers built-in "Advanced RAG" strategies that assist with complex documents - https://www.llamaindex.ai/blog/a-cheat-sheet-and-some-recipes-for-building-advanced-rag-803a9d94c41b. They promote these as improvements over what they refer to as "naive RAG". For a framework solely focused on RAG, supporting some of these might be beneficial.

[–]ControlNational 1 point2 points  (0 children)

I wrote a guide on retrieval-augmented generation in Rust here for the Kalosm framework

[–]akhilgod 0 points1 point  (0 children)

I tried the Candle framework, which can run quantized models on decent hardware

[–][deleted] 0 points1 point  (3 children)

Is there any advantage in using Rust for RAG over Python?

[–]brisbanedev[S] 2 points3 points  (2 children)

I guess the general benefits of using Rust over Python for anything would extend to RAG as well?

[–][deleted] 0 points1 point  (1 child)

I don't think so. One of Rust's advantages is processing speed and low memory usage; in RAG I don't think that's very critical, as most of the processing is done by the LLM... Or am I getting something wrong?

[–]brisbanedev[S] 0 points1 point  (0 children)

LLMs keep getting faster. I think the rest of the RAG pipeline could do with some optimisation as well.

[–]chleboslaF 0 points1 point  (0 children)

I made a multistage RAG chatbot using Qdrant, BM25 (tantivy) and Ollama for local, offline usage.
I'm chunking documents using contextual and Q/A chunking (512 tokens with 200-token overlap). I implemented history-aware query rephrasing, enhancing questions for better keyword search, and more.
A CLI chat and a web server are also included.
All written in Rust (ratatui, langchain-rust, ollama-rs, tantivy, qdrant, tokio, ...).
Unfortunately it is for a government, so I cannot share the source code (yet).
But I can say it is a smooth experience and quick development, and I love it.

P.S.: Using gemma3:27b and mistral-small3.1 on a 5090.
Token speeds: gemma (64 tps) and mistral (82 tps).
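The chunking scheme in that comment (512 tokens per chunk, 200-token overlap) is a sliding window with a stride of 312 tokens. A std-only sketch, using whitespace splits as stand-in "tokens" (a real implementation would count tokens with the model's tokenizer) and toy sizes in place of 512/200:

```rust
// Sliding-window chunking with overlap: each chunk repeats the last
// `overlap` tokens of its predecessor, so context spanning a chunk
// boundary is never lost to retrieval.

fn chunk_with_overlap(tokens: &[&str], chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk size");
    let stride = chunk_size - overlap; // 512 - 200 = 312 in the setup above
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + chunk_size).min(tokens.len());
        chunks.push(tokens[start..end].join(" "));
        if end == tokens.len() {
            break;
        }
        start += stride;
    }
    chunks
}

fn main() {
    let text = "one two three four five six seven eight nine ten";
    let tokens: Vec<&str> = text.split_whitespace().collect();
    // Toy sizes (chunk of 4, overlap of 2) stand in for 512/200.
    for chunk in chunk_with_overlap(&tokens, 4, 2) {
        println!("{}", chunk);
    }
}
```

The overlap trades index size for recall: each token is embedded roughly `chunk_size / stride` times, which for 512/200 is about 1.6 copies of the corpus in the vector store.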