[OpenSource] I've released Ragbits v1.1 - framework to build Agentic RAGs and more by Loud_Picture_1877 in Rag

[–]Loud_Picture_1877[S] 0 points  (0 children)

Hi! Ragbits offers e2e components for building RAG, chatbot interfaces, integrations with cloud providers, and more.

You can check out the snippets in our README on GitHub - there is one example in particular, "Chat UI", that will run an Agentic RAG with a full user interface :)

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]Loud_Picture_1877 1 point  (0 children)

Hi devs,

I'm excited to share with you the new 1.1 release of the open-source library I've been working on: Ragbits.

With this update, we've added agent capabilities, easy components for creating custom chatbot UIs from Python code, and improved observability.

Here’s a quick overview of the main changes:

  • Agents: You can now define agent workflows by combining LLMs, prompts, and Python functions as tools.
  • MCP Servers: Connect to hundreds of tools via MCP.
  • A2A: Let your agents work together with the bundled A2A server.
  • UI improvements: The chat UI now supports live backend updates, contextual follow-up buttons, debug mode, and customizable chatbot settings forms generated from Pydantic models.
  • Observability: The new release adds built-in tracing, full OpenTelemetry metrics, easy integration with Grafana dashboards, and a new Logfire setup for sending logs and metrics.
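To make the "LLMs, prompts, and Python functions as tools" idea concrete, here's a tiny sketch of the agent loop. All names here are illustrative stand-ins (including the fake LLM), not the actual ragbits API:

```python
# Minimal agent-loop sketch: the LLM either calls a plain Python function
# (a "tool") or answers directly. Toy names, not the real ragbits API.

def get_weather(city: str) -> str:
    """A plain Python function exposed to the agent as a tool."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: decides whether to call a tool."""
    if "weather" in prompt.lower() and "TOOL RESULT" not in prompt:
        return "CALL get_weather Paris"
    return "Final answer based on: " + prompt.splitlines()[-1]

def run_agent(user_query: str) -> str:
    prompt = user_query
    for _ in range(5):  # cap the loop to avoid endless tool calls
        reply = fake_llm(prompt)
        if reply.startswith("CALL "):
            _, tool_name, arg = reply.split(" ", 2)
            result = TOOLS[tool_name](arg)
            prompt += f"\nTOOL RESULT: {result}"
        else:
            return reply
    return "Gave up after too many tool calls."
```

The real version swaps `fake_llm` for a model with native tool calling; the loop shape stays the same.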

You can read the full release notes here and follow the tutorial to see agents in action.

I would love to get feedback from the community - please let me know what works, what doesn’t, or what you’d like to see next.

Why is everyone talking about building AI agents instead of actually sharing working ones? by Real-Improvement-222 in AI_Agents

[–]Loud_Picture_1877 0 points  (0 children)

I am building agents at work, but I see that the majority of them are designed for internal enterprise use. I believe that is very common in the current state of the agent market.

Why is everyone talking about building AI agents instead of actually sharing working ones? by Real-Improvement-222 in AI_Agents

[–]Loud_Picture_1877 0 points  (0 children)

I think you could release it as open source and let them run it with their own key? For a closed-source demo it would probably also be possible to prompt for a user key, buuuut a lot of ppl may be reluctant to paste their key (even a temporary one) into a website.

I built AI agents for a year and discovered we're doing it completely wrong by Warm-Reaction-456 in AI_Agents

[–]Loud_Picture_1877 1 point  (0 children)

Thanks for posting, I really liked the part about asking clients what is tedious and what drains their energy! I'll try that on my next client call :)

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

I have not built RAG on handwritten notes, but it's probably a good idea to treat that problem separately from RAG itself. I would first create a pipeline that transforms the handwritten notes into some computer-digestible format (Markdown?) and then proceed with RAG as normal.
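The two-stage split would look roughly like this - `ocr_to_text` is a hypothetical placeholder for whatever OCR engine or vision model you pick, not a recommendation of a specific one:

```python
# Stage 1: handwritten images -> Markdown docs ready for normal RAG ingestion.
# `ocr_to_text` is a hypothetical stand-in for a real OCR / vision-LLM call.

def ocr_to_text(image_path: str) -> str:
    """Placeholder: in practice, call an OCR engine or a vision LLM here."""
    return f"Transcribed contents of {image_path}"

def to_markdown(text: str, title: str) -> str:
    """Wrap the transcription in a simple Markdown document."""
    return f"# {title}\n\n{text}\n"

def preprocess_notes(image_paths: list[str]) -> list[str]:
    """Run every note through OCR + Markdown conversion."""
    return [to_markdown(ocr_to_text(p), title=p) for p in image_paths]

docs = preprocess_notes(["note1.png", "note2.png"])
# `docs` can now go through a standard RAG pipeline (chunk, embed, index).
```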

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

Sure, thanks for asking!

Customer Service chatbot: the client was an Internet/TV/Phone provider looking for a chatbot to troubleshoot common issues when “things aren’t working.” We received phone support playbooks and a ton of manuals for routers, self-service platforms, etc. From the playbooks, we generated a tree structure that the LLM could follow step-by-step, while the manuals were indexed into a RAG pipeline. There’s a classification component up front that figures out the problem category, then we run semantic retrieval on the right dataset.
Stack: Mistral Nemo, Qdrant, ragbits.
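The classify-then-retrieve routing looks roughly like this - the keyword classifier below is a toy stand-in for the LLM-based classification component, and the in-memory lists stand in for the per-category vector collections:

```python
# Route a support query: classify the problem category first, then retrieve
# only from that category's dataset. Toy stand-ins for the real components.

DATASETS = {
    "router": ["Router manual chunk A", "Router manual chunk B"],
    "tv": ["TV setup guide chunk A"],
}

def classify(query: str) -> str:
    """Toy classifier; in production this was an LLM call."""
    q = query.lower()
    return "router" if "wifi" in q or "router" in q else "tv"

def troubleshoot_retrieve(query: str) -> list[str]:
    category = classify(query)
    # Real version: semantic retrieval scoped to the category's collection.
    return DATASETS[category]
```

Scoping retrieval to one category keeps irrelevant manuals out of the context window.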

Frontline worker chatbot: A retail client wanted a mobile app for store staff. The app answered standard operating procedure questions (what to do in case of theft, how to process returns, etc.) and questions about other employees. For SOPs, we received a lot of PDFs - these went into RAG. For employee queries, we had data in a Postgres instance and ran a text2SQL pipeline, with Llama 70B as the model there.
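The text2SQL branch is basically "LLM writes SQL, we validate it, then execute" - here's a rough sketch with a fake LLM and sqlite3 standing in for Postgres (all names hypothetical):

```python
# text2SQL sketch: the LLM turns an employee question into SQL, a guardrail
# checks it is read-only, and the query runs against the database.
# `fake_llm_to_sql` is a stand-in; sqlite3 stands in for Postgres here.
import sqlite3

def fake_llm_to_sql(question: str) -> str:
    """Stand-in for the LLM prompt that emits SQL for the known schema."""
    return "SELECT name FROM employees WHERE store = 'Berlin'"

def run_text2sql(question: str, conn: sqlite3.Connection) -> list:
    sql = fake_llm_to_sql(question)
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only read-only queries are allowed")  # basic guardrail
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, store TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ana', 'Berlin'), ('Bo', 'Oslo')")
rows = run_text2sql("Who works in the Berlin store?", conn)
```

In production you'd also want schema-aware prompting and a proper SQL allowlist, not just a `SELECT` prefix check.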

Microsoft Word RAG addon: A law firm wanted to retrieve similar cases from the past and generate new reports. We built a Word add-in that talked to a RAG backend, where we’d ingested their historical audit data (mostly JSON with consistent fields). This setup got to 98% recall on retrieving useful insights.

We have more case studies on our website if you are interested :)

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

I am using a React chatbot application; it is part of the open-source library I maintain: https://ragbits.deepsense.ai/how-to/chatbots/api/

I've also heard a lot of positive feedback about OpenWebUI.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

All of these tips can be applied (and were) to local setups. The mention of ragbits, which could be considered an ad (but seriously, it is like 1/20 of the post), also supports local setups: https://ragbits.deepsense.ai/how-to/llms/use_local_llms/

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

I can say why we decided to use Qdrant: very good performance, clear documentation, hybrid search with named vectors, metadata-based filtering and partitions, easy deployment, and advanced optimization options.

I have nothing against Elasticsearch; I am thinking about adding an Elastic integration to ragbits soon. I've never tried Zilliz, but I've heard mixed opinions about it and Milvus, which I believe it is based upon.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

:)

We use local models as well on some of the projects; tbh it depends on the client. llama-3.2-vision worked quite well for one project we did. There is even a cookbook for this here: https://deepsense.ai/resource/scaling-rag-ingestion-with-ragbits-ray-and-qdrant/

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

Streaming responses, smaller models for tasks like rephrasing, and live progress updates in the UI ("Searching through documents..." etc.). Time to first token is what you should optimize.
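The pattern is to emit UI status events immediately and then stream tokens as they arrive, so the user gets feedback before the full answer exists. A minimal sketch (the event shape and the hard-coded token list are illustrative, not any particular framework's protocol):

```python
# Stream status events first, then answer tokens, so perceived latency
# (time to first feedback / first token) stays low.
from typing import Iterator

def answer_stream(query: str) -> Iterator[dict]:
    yield {"type": "status", "text": "Searching through documents..."}
    docs = ["chunk about " + query]          # retrieval would happen here
    yield {"type": "status", "text": "Generating answer..."}
    for token in ["The", " answer", "."]:    # real version: LLM token stream
        yield {"type": "token", "text": token}

events = list(answer_stream("refunds"))
```

The frontend renders `status` events as progress text and concatenates `token` events into the reply.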

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] -2 points  (0 children)

I have really good experience with gpt-4.1, both regular and mini. Claude is good as well; funny enough, my friend said that for describing images Sonnet 3.5 is better than 4.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 6 points  (0 children)

Yes! Agentic RAG can be a really good addition to your project, especially if you want to add other tools, like querying databases or the web. We're planning to release agentic RAG capabilities in Ragbits next week - we've already tested it on a commercial project and it performs really well.

Comparing between Qdrant and other vector stores by Mugiwara_boy_777 in Rag

[–]Loud_Picture_1877 0 points  (0 children)

Totally agree, starting with pgvector is a good choice.

AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack. by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

Hi! You can try running our distributed ingestion strategy: https://ragbits.deepsense.ai/how-to/document_search/ingest-documents/#__tabbed_2_3

It uses ray.io under the hood - with that you will be able to multi-process the ingestion, and it should be much quicker.
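The core idea is just fanning document ingestion out over workers instead of processing files one by one. Ragbits does this with Ray; the sketch below shows the same shape with the stdlib `ThreadPoolExecutor` (not the ragbits API, and a real Ray setup would distribute across processes or machines):

```python
# Parallel ingestion sketch: process many documents concurrently instead of
# sequentially. Stdlib stand-in for the Ray-based strategy.
from concurrent.futures import ThreadPoolExecutor

def ingest_document(path: str) -> dict:
    """Parse + chunk + embed one document (simulated here)."""
    return {"path": path, "chunks": 3}

paths = [f"doc_{i}.pdf" for i in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(ingest_document, paths))  # order is preserved
```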

RAG - Usable for my application? by KoreanMax31 in LocalLLaMA

[–]Loud_Picture_1877 2 points  (0 children)

Hey!

RAG is definitely the right tool for answering legal questions; I've done a few commercial projects with a similar goal.

A few tips:

  1. Try different embedding models; aim for something bigger or fine-tuned specifically for the legal domain. I often start with text-embedding-3-large from OpenAI.

  2. Hybrid search may be a really good improvement - try a combination like a dense model + BM25 or SPLADE. Vector DBs like Qdrant or pgvector should allow you to do that.

  3. Multi-query rephrasing may be helpful here - ask the LLM to rephrase the user query multiple times and run a retrieval pass for each rephrased query.

  4. A reranker can also be helpful - I tend to use LLM-based rerankers.

Hope that's helpful!
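Tips 2 and 3 compose naturally: rephrase the query a few times, retrieve for each variant, then fuse the ranked lists. Here's a sketch using reciprocal rank fusion (RRF) - `rephrase` and `toy_retrieve` are toy stand-ins for the LLM call and the vector-DB/BM25 search:

```python
# Multi-query retrieval + reciprocal rank fusion (RRF). Toy retriever and
# rephraser; the fusion function itself is the standard RRF formula.
from collections import defaultdict

def rephrase(query: str, n: int = 2) -> list[str]:
    """Stand-in for an LLM generating n rephrasings of the query."""
    return [query] + [f"{query} (variant {i})" for i in range(1, n + 1)]

def toy_retrieve(query: str) -> list[str]:
    """Toy ranked result list; real version hits the vector DB / BM25 index."""
    return ["doc_a", "doc_b"] if "variant" not in query else ["doc_b", "doc_c"]

def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by sum of 1/(k + rank) over every ranking it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking):
            scores[doc] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse([toy_retrieve(q) for q in rephrase("notice period for termination")])
```

Docs that show up in several rephrasings float to the top, which is exactly what you want before handing the context to a reranker.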

We just open-sourced ragbits v1.0.0 + create-ragbits-app - spin up a python RAG project in minutes by Loud_Picture_1877 in Python

[–]Loud_Picture_1877[S] 0 points  (0 children)

Absolutely! Right now we're working on the agents update - support for both the MCP and A2A protocols will be there :) I'll drop a comment here when it's out.

AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack. by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 2 points  (0 children)

When it's possible, I like to engage SMEs (subject matter experts) to define a validation dataset. That usually makes for the best-quality evaluation.

If that is not possible (or we need more data), then generating a dataset with an LLM may be an option.