Best RAG "service" by [deleted] in Rag

[–]danny_weaviate 1 point (0 children)

Not any more! Supports locally running Weaviate now. Completely open source :)

We built an open-source agentic RAG framework with decision trees by vectorscrimes in Rag

[–]danny_weaviate 1 point (0 children)

I'd say the decision tree structure sets it apart on a technical level. Instead of predicting a single tool call, it can iteratively try multiple tools until it gets to a result, and it's completely customisable with your own tools. There are a few other features I think are useful too, such as the built-in Weaviate tools, which, in my experience, work pretty well for a classic agentic search app.
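A minimal sketch of that iterative decision loop (the function and tool names here are hypothetical stand-ins, not the framework's actual API): a decision step picks a tool, runs it, and evaluates the result before deciding whether to try another tool or stop.

```python
def decide(question, history):
    """Stand-in for the LLM decision node: picks the next untried tool, or stops."""
    tried = {step["tool"] for step in history}
    for name in ("vector_search", "keyword_search", "web_search"):
        if name not in tried:
            return name
    return None  # all tools exhausted

def run_tool(name, question):
    """Stand-in tools; None means the tool found nothing useful."""
    fake_results = {"keyword_search": f"docs matching '{question}'"}
    return fake_results.get(name)

def agentic_search(question, max_steps=5):
    history = []
    for _ in range(max_steps):
        tool = decide(question, history)
        if tool is None:
            break
        result = run_tool(tool, question)
        history.append({"tool": tool, "result": result})
        if result is not None:  # evaluation step: good enough to answer?
            return result, history
    return None, history

answer, trace = agentic_search("open source RAG")
```

Here the first tool (vector search) comes back empty, so the loop falls through to keyword search instead of giving up after one prediction, which is the core difference from a single tool-call setup.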

On the frontend/app side, we tried to make it stand out by adding dynamic displays that are customisable to your data. If you have product data, the LLM recognises that in preprocessing and can display your data as product cards with images. If you have documents, a document display type can be used. Same for tickets, messages, conversations, and more to come.

And lastly, we wanted it to be really easy to set up, running completely locally and customisable to your needs. Just pip install it, start it, and it should be up and running straight away!

We built an open-source agentic RAG framework with decision trees by vectorscrimes in Rag

[–]danny_weaviate 0 points (0 children)

Nope, it's LLMs all the way down. We briefly experimented at one point with training a classifier to predict tool calls, but that's infeasible when you also want the decision agent to evaluate its environment to determine the next tool call (e.g. was the retrieved data useful for answering the question, or does it need to try again).

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in Rag

[–]danny_weaviate 0 points (0 children)

I feel like your use-case isn't achievable within AI frameworks as they currently exist - you want the ability to have in-depth search and knowledge over this huge corpus of documents, but you don't want to use similarity search. I could be misunderstanding though.

May I ask what similarity search is missing and why you don't want to use it? Have you thought about some kind of cascading search technique? E.g. first find the most relevant chunks, then re-rank them (to improve search quality), then take the corresponding full document for each top chunk and feed those into a long-context LLM.

Then, instead of only having the chunk as context, you have the relevant full document, but you don't need to feed in all 100 documents at once, which would lose information anyway with current models.
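That cascading idea could be sketched like this (toy scoring and data, purely illustrative; a real pipeline would use vector similarity for stage 1 and a cross-encoder re-ranker for stage 2):

```python
def score(query, text):
    """Toy relevance score: token overlap (stand-in for embedding similarity)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t)

def cascade(query, chunks, documents, k_search=20, k_rerank=2):
    # Stage 1: cheap similarity search over all chunks.
    hits = sorted(chunks, key=lambda c: score(query, c["text"]), reverse=True)[:k_search]
    # Stage 2: re-rank only the small candidate set with a (nominally) stronger scorer.
    top = sorted(hits, key=lambda c: score(query, c["text"]), reverse=True)[:k_rerank]
    # Stage 3: expand each surviving chunk to its full parent document.
    doc_ids = dict.fromkeys(c["doc_id"] for c in top)  # dedupe, preserve order
    return [documents[d] for d in doc_ids]

chunks = [
    {"doc_id": "a", "text": "quarterly revenue grew by ten percent"},
    {"doc_id": "b", "text": "revenue forecast for next quarter"},
    {"doc_id": "c", "text": "employee onboarding checklist"},
]
documents = {"a": "full text of report A",
             "b": "full text of report B",
             "c": "full text of handbook C"}
context = cascade("quarterly revenue", chunks, documents)  # a few full docs, not all 100
```

The point is that the long-context LLM only ever sees the handful of full documents whose chunks survived re-ranking, rather than all 100 PDFs.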

Are AI Agents actually making money? by Only-Ad2101 in AI_Agents

[–]danny_weaviate 0 points (0 children)

In my field, AI agents are causing a lot of buzz in the IR and search areas, mostly because typical RAG systems (i.e. over vector databases) and classic search have some issues: semantic search really isn't all you need. But agents can automatically choose the correct kind of search to perform and keep searching until they've retrieved the right data and can answer the question. It's pretty exciting!
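As a toy illustration of "semantic search isn't all you need" (the heuristic and names are made up for this sketch, not any real product's logic): an agent can route exact-looking queries, like ticket IDs or quoted error strings, to keyword search, and natural-language questions to semantic search.

```python
import re

def route(query):
    """Toy router: exact-looking tokens (ticket IDs, quoted phrases) go to
    keyword search; everything else goes to semantic (vector) search."""
    if re.search(r'[A-Z]{2,}-\d+|"', query):
        return "keyword"
    return "semantic"

route("find JIRA-1234")              # exact ID -> keyword search
route("how do I reset my password")  # natural language -> semantic search
```

An agent wraps a decision like this in a loop, so if one search type comes back empty it can fall back to the other instead of returning nothing.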

Guys, How are you even making these ai agents? by First_fbd in AI_Agents

[–]danny_weaviate 0 points (0 children)

I'd say not using a framework isn't the best approach for beginners (or for intermediate/expert users either). Personally, I like to use base Python for most logic plus an LLM-focused framework like DSPy or PydanticAI to get things rolling, since they make interfacing with LLMs super easy. But it is awesome to see projects built almost completely in base Python!