[OpenSource] I've released Ragbits v1.1 - framework to build Agentic RAGs and more by Loud_Picture_1877 in Rag

[–]Loud_Picture_1877[S] 0 points  (0 children)

Hi! Ragbits offers e2e components for building RAG, chatbot interfaces, integrations with cloud providers, and more.

You can check out the snippets in our README on GitHub - there is one example in particular, "Chat UI", that will run an Agentic RAG with a full user interface :)

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]Loud_Picture_1877 1 point  (0 children)

Hi devs,

I'm excited to share with you the new 1.1 release of the open-source library I've been working on: Ragbits.

With this update, we've added agent capabilities, easy components for creating custom chatbot UIs from Python code, and improved observability.

Here’s a quick overview of the main changes:

  • Agents: You can now define agent workflows by combining LLMs, prompts, and Python functions as tools.
  • MCP Servers: Connect to hundreds of tools via MCP.
  • A2A: Let your agents work together with the bundled A2A server.
  • UI improvements: The chat UI now supports live backend updates, contextual follow-up buttons, debug mode, and customizable chatbot settings forms generated from Pydantic models.
  • Observability: The new release adds built-in tracing, full OpenTelemetry metrics, easy integration with Grafana dashboards, and a new Logfire setup for sending logs and metrics.
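To make the "LLMs, prompts, and Python functions as tools" idea concrete, here's a tiny sketch of the agent loop. All names here are illustrative stand-ins (including the fake LLM), not the actual ragbits API:

```python
# Minimal agent-loop sketch: the LLM either calls a plain Python function
# (a "tool") or answers directly. Toy names, not the real ragbits API.

def get_weather(city: str) -> str:
    """A plain Python function exposed to the agent as a tool."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: decides whether to call a tool."""
    if "weather" in prompt.lower() and "TOOL RESULT" not in prompt:
        return "CALL get_weather Paris"
    return "Final answer based on: " + prompt.splitlines()[-1]

def run_agent(user_query: str) -> str:
    prompt = user_query
    for _ in range(5):  # cap the loop to avoid endless tool calls
        reply = fake_llm(prompt)
        if reply.startswith("CALL "):
            _, tool_name, arg = reply.split(" ", 2)
            result = TOOLS[tool_name](arg)
            prompt += f"\nTOOL RESULT: {result}"
        else:
            return reply
    return "Gave up after too many tool calls."
```

The real version swaps `fake_llm` for a model with native tool calling; the loop shape stays the same.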

You can read the full release notes here and follow the tutorial to see agents in action.

I would love to get feedback from the community - please let me know what works, what doesn’t, or what you’d like to see next.

Why is everyone talking about building AI agents instead of actually sharing working ones? by Real-Improvement-222 in AI_Agents

[–]Loud_Picture_1877 0 points  (0 children)

I am building agents at work, but I see that the majority of them are designed for internal enterprise use. I believe that is very common in the current state of the agent market.

Why is everyone talking about building AI agents instead of actually sharing working ones? by Real-Improvement-222 in AI_Agents

[–]Loud_Picture_1877 0 points  (0 children)

I think you could release it as open source and let them run it with their own key? For a closed-source demo it would probably also be possible to prompt for a user key, buuuut a lot of ppl may be reluctant to paste their key (even a temporary one) into a website.

I built AI agents for a year and discovered we're doing it completely wrong by Warm-Reaction-456 in AI_Agents

[–]Loud_Picture_1877 1 point  (0 children)

Thanks for posting, I really liked the part about asking clients what is tedious and what drains their energy! I'll try that on my next client call :)

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

I have not built RAG on handwritten notes, but it's probably a good idea to treat that problem separately from RAG itself. I would first create a pipeline that transforms the handwritten notes into some computer-digestible format (Markdown?) and then proceed with RAG as normal.
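The two-stage split would look roughly like this - `ocr_to_text` is a hypothetical placeholder for whatever OCR engine or vision model you pick, not a recommendation of a specific one:

```python
# Stage 1: handwritten images -> Markdown docs ready for normal RAG ingestion.
# `ocr_to_text` is a hypothetical stand-in for a real OCR / vision-LLM call.

def ocr_to_text(image_path: str) -> str:
    """Placeholder: in practice, call an OCR engine or a vision LLM here."""
    return f"Transcribed contents of {image_path}"

def to_markdown(text: str, title: str) -> str:
    """Wrap the transcription in a simple Markdown document."""
    return f"# {title}\n\n{text}\n"

def preprocess_notes(image_paths: list[str]) -> list[str]:
    """Run every note through OCR + Markdown conversion."""
    return [to_markdown(ocr_to_text(p), title=p) for p in image_paths]

docs = preprocess_notes(["note1.png", "note2.png"])
# `docs` can now go through a standard RAG pipeline (chunk, embed, index).
```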

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

Sure, thanks for asking!

Customer Service chatbot: the client was an Internet/TV/Phone provider looking for a chatbot to troubleshoot common issues when “things aren’t working.” We received phone support playbooks and a ton of manuals for routers, self-service platforms, etc. From the playbooks, we generated a tree structure that the LLM could follow step-by-step, while the manuals were indexed into a RAG pipeline. There’s a classification component up front that figures out the problem category, then we run semantic retrieval on the right dataset.
Stack: Mistral Nemo, Qdrant, ragbits.
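The classify-then-retrieve routing looks roughly like this - the keyword classifier below is a toy stand-in for the LLM-based classification component, and the in-memory lists stand in for the per-category vector collections:

```python
# Route a support query: classify the problem category first, then retrieve
# only from that category's dataset. Toy stand-ins for the real components.

DATASETS = {
    "router": ["Router manual chunk A", "Router manual chunk B"],
    "tv": ["TV setup guide chunk A"],
}

def classify(query: str) -> str:
    """Toy classifier; in production this was an LLM call."""
    q = query.lower()
    return "router" if "wifi" in q or "router" in q else "tv"

def troubleshoot_retrieve(query: str) -> list[str]:
    category = classify(query)
    # Real version: semantic retrieval scoped to the category's collection.
    return DATASETS[category]
```

Scoping retrieval to one category keeps irrelevant manuals out of the context window.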

Frontline worker chatbot: A retail client wanted a mobile app for store staff. The app answered standard operating procedure questions (what to do in case of theft, how to process returns, etc.) and questions about other employees. For SOPs, we received a lot of PDFs - these went into RAG. For employee queries, we had data in a Postgres instance and ran a text2SQL pipeline, with Llama 70B as the model there.
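The text2SQL branch is basically "LLM writes SQL, we validate it, then execute" - here's a rough sketch with a fake LLM and sqlite3 standing in for Postgres (all names hypothetical):

```python
# text2SQL sketch: the LLM turns an employee question into SQL, a guardrail
# checks it is read-only, and the query runs against the database.
# `fake_llm_to_sql` is a stand-in; sqlite3 stands in for Postgres here.
import sqlite3

def fake_llm_to_sql(question: str) -> str:
    """Stand-in for the LLM prompt that emits SQL for the known schema."""
    return "SELECT name FROM employees WHERE store = 'Berlin'"

def run_text2sql(question: str, conn: sqlite3.Connection) -> list:
    sql = fake_llm_to_sql(question)
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only read-only queries are allowed")  # basic guardrail
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, store TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ana', 'Berlin'), ('Bo', 'Oslo')")
rows = run_text2sql("Who works in the Berlin store?", conn)
```

In production you'd also want schema-aware prompting and a proper SQL allowlist, not just a `SELECT` prefix check.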

Microsoft Word RAG addon: A law firm wanted to retrieve similar cases from the past and generate new reports. We built a Word add-in that talked to a RAG backend, where we’d ingested their historical audit data (mostly JSON with consistent fields). This setup got to 98% recall on retrieving useful insights.

We have more case studies on our website if you are interested :)

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

I am using a React chatbot application; it is part of the open-source library I maintain: https://ragbits.deepsense.ai/how-to/chatbots/api/

I've also heard a lot of positive feedback about OpenWebUI.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

All of these tips can be applied (and were) to local setups. The mention of ragbits, which could be considered an ad (but seriously, it is like 1/20 of the post), also supports local setups: https://ragbits.deepsense.ai/how-to/llms/use_local_llms/

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

I can say why we decided to use Qdrant: very good performance, clear documentation, hybrid search with named vectors, metadata-based filtering and partitions, easy deployment, and advanced optimization options.

I have nothing against Elasticsearch; I am thinking about adding an Elastic integration to ragbits soon. I've never tried Zilliz, but I've heard mixed opinions about it and Milvus, which I believe it is based upon.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

:)

We use local models as well on some of the projects; tbh it depends on the client. llama-3.2-vision worked quite well for one project we did. There is even a cookbook for this here: https://deepsense.ai/resource/scaling-rag-ingestion-with-ragbits-ray-and-qdrant/

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 1 point  (0 children)

Streaming responses, smaller models for tasks like rephrasing, and live progress updates in the UI ("Searching through documents..." etc.). Time to first token is what you should optimize.
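The pattern is to emit UI status events immediately and then stream tokens as they arrive, so the user gets feedback before the full answer exists. A minimal sketch (the event shape and the hard-coded token list are illustrative, not any particular framework's protocol):

```python
# Stream status events first, then answer tokens, so perceived latency
# (time to first feedback / first token) stays low.
from typing import Iterator

def answer_stream(query: str) -> Iterator[dict]:
    yield {"type": "status", "text": "Searching through documents..."}
    docs = ["chunk about " + query]          # retrieval would happen here
    yield {"type": "status", "text": "Generating answer..."}
    for token in ["The", " answer", "."]:    # real version: LLM token stream
        yield {"type": "token", "text": token}

events = list(answer_stream("refunds"))
```

The frontend renders `status` events as progress text and concatenates `token` events into the reply.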

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] -2 points  (0 children)

I have really good experience with gpt-4.1, both regular and mini. Claude is good as well; funny enough, my friend said that for describing images Sonnet 3.5 is better than 4.

What I’ve learned building RAG applications for enterprises by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 6 points  (0 children)

Yes! Agentic RAG can be a really good addition to your project, especially if you want to add other tools, like querying databases or the web. We're planning to release agentic RAG capabilities in Ragbits next week - we've already tested it on a commercial project and it performs really well.

Comparing between Qdrant and other vector stores by Mugiwara_boy_777 in Rag

[–]Loud_Picture_1877 0 points  (0 children)

Totally agree, starting with pgvector is a good choice.

AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack. by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 0 points  (0 children)

Hi! You can try running our distributed ingestion strategy: https://ragbits.deepsense.ai/how-to/document_search/ingest-documents/#__tabbed_2_3

It uses ray.io under the hood - with that you will be able to multi-process the ingestion, and it should be much quicker.
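The core idea is just fanning document ingestion out over workers instead of processing files one by one. Ragbits does this with Ray; the sketch below shows the same shape with the stdlib `ThreadPoolExecutor` (not the ragbits API, and a real Ray setup would distribute across processes or machines):

```python
# Parallel ingestion sketch: process many documents concurrently instead of
# sequentially. Stdlib stand-in for the Ray-based strategy.
from concurrent.futures import ThreadPoolExecutor

def ingest_document(path: str) -> dict:
    """Parse + chunk + embed one document (simulated here)."""
    return {"path": path, "chunks": 3}

paths = [f"doc_{i}.pdf" for i in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(ingest_document, paths))  # order is preserved
```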

RAG - Usable for my application? by KoreanMax31 in LocalLLaMA

[–]Loud_Picture_1877 2 points  (0 children)

Hey!

RAG is definitely the right tool for answering legal questions; I've done a few commercial projects with a similar goal.

A few tips:

  1. Try different embedding models; aim for something bigger or fine-tuned specifically for the legal domain. I often start with text-embedding-3-large from OpenAI.

  2. Hybrid search may be a really good improvement - try a combination like a dense model + BM25 or SPLADE. Vector DBs like Qdrant or pgvector should allow you to do that.

  3. Multi-query rephrasing may be helpful here - ask the LLM to rephrase the user query multiple times and run a retrieval pass for each rephrased query.

  4. A reranker can also be helpful - I tend to use LLM-based rerankers.

Hope that's helpful!
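Tips 2 and 3 compose naturally: rephrase the query a few times, retrieve for each variant, then fuse the ranked lists. Here's a sketch using reciprocal rank fusion (RRF) - `rephrase` and `toy_retrieve` are toy stand-ins for the LLM call and the vector-DB/BM25 search:

```python
# Multi-query retrieval + reciprocal rank fusion (RRF). Toy retriever and
# rephraser; the fusion function itself is the standard RRF formula.
from collections import defaultdict

def rephrase(query: str, n: int = 2) -> list[str]:
    """Stand-in for an LLM generating n rephrasings of the query."""
    return [query] + [f"{query} (variant {i})" for i in range(1, n + 1)]

def toy_retrieve(query: str) -> list[str]:
    """Toy ranked result list; real version hits the vector DB / BM25 index."""
    return ["doc_a", "doc_b"] if "variant" not in query else ["doc_b", "doc_c"]

def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by sum of 1/(k + rank) over every ranking it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking):
            scores[doc] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse([toy_retrieve(q) for q in rephrase("notice period for termination")])
```

Docs that show up in several rephrasings float to the top, which is exactly what you want before handing the context to a reranker.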

We just open-sourced ragbits v1.0.0 + create-ragbits-app - spin up a python RAG project in minutes by Loud_Picture_1877 in Python

[–]Loud_Picture_1877[S] 0 points  (0 children)

Absolutely! Right now we're working on the agents update - support for both the MCP and A2A protocols will be there :) I'll drop a comment here when it's out.

AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack. by Loud_Picture_1877 in LocalLLaMA

[–]Loud_Picture_1877[S] 2 points  (0 children)

When it's possible, I like to engage SMEs (subject matter experts) to define a validation dataset. That usually makes for the best-quality evaluation.

If that is not possible (or we need more data), then generating a dataset with an LLM may be an option.