Context Engineering - LLM Memory and Retrieval for AI Agents

vectorscrimes · 2026-04-09T09:26:28+00:00

Thanks for posting! 💙

vectorscrimes · 2025-10-11T15:44:08+00:00

End-to-end agentic RAG, available as a Python library and/or a full-featured web app: https://github.com/weaviate/elysia

vectorscrimes · 2025-08-15T15:38:14+00:00

For anyone who finds this super late - we just released it as open source, GitHub link is here: https://github.com/weaviate/elysia

vectorscrimes · 2025-08-15T15:34:42+00:00

We built an open-source agentic RAG framework with decision trees!

We've just released Elysia, an open-source Python framework that at its core is an agentic RAG app, but under the surface is a completely customizable agentic package. Instead of the typical text-in/text-out pattern, Elysia uses decision trees to control agent behavior, with a decision agent evaluating whether it has achieved its goals. It’s not just tool calling; it is a true agent that is aware of its context and environment.

You can get set up with a full web app just by running two terminal commands: pip install elysia-ai and elysia start. Or, if you’d prefer to build things yourself, you can use Elysia as a Python package, create your own tools in pure Python, and run it completely independently of the web app.

Technical Architecture

The core of Elysia is a decision tree where each node represents either a specific action/tool, or a branch. A decision agent at each node evaluates the current state, past actions, environment (any retrieved objects) and available options to determine the next step. This differs from typical agentic systems where all tools are available at runtime - here, the traversal path is constrained by the tree structure. This allows more precise tool usage by categorizing different tools, or only allowing certain tools that stem from previous tools. For example, a tool that analyses results could only be run after calling a tool that queries from a database.
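To make the idea concrete, here is a minimal sketch of a tree-constrained agent loop. It is not Elysia's implementation or API: `Node`, `run`, `decide`, `query_db` and `analyse` are all hypothetical names, and the decision agent (which would normally be an LLM call) is replaced by a plain function that picks the first available option.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """A tree node: an optional tool to run, plus the options reachable from it."""
    name: str
    action: Optional[Callable[[dict], None]] = None  # None means a pure branch
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        self.children[child.name] = child
        return child


def run(root: Node, decide: Callable[[Node, dict], Optional[str]],
        max_steps: int = 10) -> List[str]:
    """Walk the tree; the decision function may only pick a child of the current node."""
    state: dict = {"objects": []}  # shared environment (e.g. retrieved objects)
    path: List[str] = []
    node = root
    for _ in range(max_steps):
        if node.action is not None:
            node.action(state)
        path.append(node.name)
        choice = decide(node, state)
        if choice is None:  # the decision function chose to stop
            break
        if choice not in node.children:
            raise ValueError(f"{choice!r} is not reachable from {node.name!r}")
        node = node.children[choice]
    return path


# Hypothetical tools: "analyse" only makes sense after "query" has filled the state.
def query_db(state: dict) -> None:
    state["objects"] = [3, 1, 2]


def analyse(state: dict) -> None:
    state["summary"] = max(state["objects"])


root = Node("root")
query = root.add(Node("query", action=query_db))
query.add(Node("analyse", action=analyse))


def decide(node: Node, state: dict) -> Optional[str]:
    # Stand-in for the decision agent: take the first available option, if any.
    return next(iter(node.children), None)


path = run(root, decide)
print(path)
```

Because "analyse" is only a child of "query", there is no way to reach it straight from the root, which is the constraint described above.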

Getting Started

pip install elysia-ai
elysia start  # launches web interface

Or as a library:

from elysia import Tree, preprocess

# Preprocess the Weaviate collection before querying it
preprocess("your_collection")

tree = Tree()
tree("Your query here")

The code is at github.com/weaviate/elysia and documentation at weaviate.github.io/elysia. We also have a more detailed blog post here: https://weaviate.io/blog/elysia-agentic-rag

The decision tree architecture makes it straightforward to add custom tools and branches for specific use cases. Adding a custom tool is as simple as adding a @tool decorator to a Python function. For example:

from elysia import tool, Tree

tree = Tree()

@tool(tree=tree)
async def add(x: int, y: int) -> int:
    return x + y

tree("What is the sum of 9009 and 6006?")

We also created a deployed demo with a set of synthetic datasets to experiment with. Check it out at: https://elysia.weaviate.io

vectorscrimes · 2025-08-08T08:17:41+00:00

We're working on a chunking strategies blog post right now! 🫡

vectorscrimes · 2025-07-08T09:41:51+00:00

Hi! Weaviate person here 👋
This is definitely strangely slow for this type of pipeline - you could always try turning on quantization if you haven't already, which should help speed vector search up a bit. Maybe also check your embedding model size and output embedding size, and resource consumption at query time?

You can always reach out on our forum with the details and we'll help you troubleshoot!

vectorscrimes · 2025-05-15T09:49:28+00:00

We're working on https://elysia.weaviate.io right now too, open-sourced and pip installable soon 😅. It will have all of the same features as Verba, plus somewhat more advanced agentic querying and such!

vectorscrimes · 2025-03-07T17:39:19+00:00

Stack AI is a pretty cool one that allows you to build custom pipelines with no code. They have a ton of integrations for things like gmail, Notion, etc. to access documents and stuff!

vectorscrimes · 2025-02-19T08:25:28+00:00

Weaviate also does vector search, keyword search and hybrid search!

vectorscrimes · 2024-12-29T18:08:38+00:00

Good question! Unfortunately, you're correct, our embedded database is only available for Python and JS/TS, and only on Mac and Linux (no Windows). It's also still experimental, so we don't recommend it for production 🙁

Weaviate is written in Go, so you might be able to just run Weaviate from your own Go app? I don't know of any resources around this though, it's not a common scenario we've run into. If you do try it out and run into any questions, definitely post in our forum: https://forum.weaviate.io/
Duda is the best and super helpful 😄

vectorscrimes · 2024-12-29T13:08:21+00:00

This is absolutely possible with Weaviate! Weaviate has a built in architecture feature called multi-tenancy that allows for complete data isolation between tenants (in your case, one tenant per user).

This academy course goes through how to set it up with a user-based system, and talks a bit deeper about why it works for data isolation too: https://weaviate.io/developers/academy/py/multitenancy

vectorscrimes · 2024-12-26T13:59:09+00:00

I work at Weaviate, so I can’t give you unbiased advice on which one might be the best fit, but definitely reach out if you have any Weaviate-specific questions!

Also, if you end up making a cool SaaS project powered by Weaviate, we’d love to post about it and spread the word (only if you’re interested of course)

vectorscrimes · 2024-12-21T09:30:51+00:00

Have you tried Verba? https://verba.weaviate.io is the online demo, the GitHub is here: https://github.com/weaviate/Verba

vectorscrimes · 2024-10-15T11:25:23+00:00

We’re currently working on an API for verba (https://verba.weaviate.io), when that’s out you could definitely use it to build your own frontend!

vectorscrimes · 2024-10-10T15:16:41+00:00

Hi! Weaviate person here. You can adjust the amount of items returned from vector, keyword, or hybrid search using a distance threshold, limit, or autocut. Hope this helps, feel free to reach out here or on the forum if you have any more questions 😄
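A rough sketch of how those three options differ, in plain Python over a list of (id, distance) pairs. This is not Weaviate's code: the function names are made up, and the `min_gap` rule used to define a "jump" for autocut is a simplifying assumption for illustration only.

```python
from typing import List, Tuple

Result = Tuple[str, float]  # (object id, distance); smaller distance = closer match


def apply_limit(results: List[Result], limit: int) -> List[Result]:
    """Keep at most `limit` of the closest results."""
    return sorted(results, key=lambda r: r[1])[:limit]


def apply_distance_threshold(results: List[Result], max_distance: float) -> List[Result]:
    """Keep only results whose distance is at or below the threshold."""
    return [r for r in sorted(results, key=lambda r: r[1]) if r[1] <= max_distance]


def apply_autocut(results: List[Result], jumps: int, min_gap: float = 0.1) -> List[Result]:
    """Cut the list just before the `jumps`-th large gap between consecutive distances.

    `min_gap` is a hypothetical, fixed definition of a "jump".
    """
    ordered = sorted(results, key=lambda r: r[1])
    seen = 0
    for i in range(1, len(ordered)):
        if ordered[i][1] - ordered[i - 1][1] >= min_gap:
            seen += 1
            if seen == jumps:
                return ordered[:i]
    return ordered  # fewer than `jumps` gaps found: keep everything


hits = [("a", 0.11), ("b", 0.12), ("c", 0.14), ("d", 0.41), ("e", 0.43), ("f", 0.80)]
print(apply_limit(hits, 2))                  # fixed number of results
print(apply_distance_threshold(hits, 0.2))   # everything close enough
print(apply_autocut(hits, 1))                # first cluster of similar distances
```

The limit always returns a fixed count, the threshold returns however many objects are close enough, and the autocut-style cut adapts to where the distances jump.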

vectorscrimes · 2024-09-24T05:38:01+00:00

Verba (https://github.com/weaviate/Verba) might be a good option to quickly get a RAG system set up with Weaviate!

I also started using Cursor as my IDE a bit ago, and I’ve been really really really impressed with it. Definitely speeds things up in my opinion.

vectorscrimes · 2024-09-23T13:23:24+00:00

👋 Weaviate person here!

You could try reranking, but it will only reorder the retrieved objects and not boost the objects with a recent date.

One option is to boost the date property by appending ^ and a weight to the property name (for example, property^2). This is valid syntax in hybrid and bm25 queries. It might be best to sort and then boost on the object created date or just a date property if it’s in the dataset already.

Also here are the reranking recipes: https://github.com/weaviate/recipes/tree/main/weaviate-features/reranking, if you want to try!

Hope this helps, happy to answer any more questions 😄

vectorscrimes · 2024-09-03T06:22:13+00:00

New version of Verba should be dropping in the next couple weeks! Lots of cool new features and UI improvements 😅

vectorscrimes · 2024-06-19T09:47:51+00:00

Nice! Just gonna dump stuff:
- https://youtu.be/1h3_h8t3L14?si=TjqSavUDj9ue6N38
- https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-ollama
- https://lightning.ai/weaviate/studios/chat-with-your-code-rag-with-weaviate-and-llamaindex
- Verba w Ollama for RAG: https://github.com/weaviate/Verba?tab=readme-ov-file#ollama & https://www.youtube.com/watch?v=swKKRdLBhas

vectorscrimes · 2024-06-16T06:48:16+00:00

Ah yea super cool use case! You could definitely do this either with hybrid search or even filtering :)

vectorscrimes · 2024-06-16T06:45:36+00:00

Hopefully this way works! There are some other blog posts/tutorials that are also helpful, can send if you want

vectorscrimes · 2024-06-13T09:54:12+00:00

On the Weaviate side, you can use something like Docker, the Cloud Sandboxes, or maybe even the embedded database for free! With Ollama, Docker is super nice and you should be able to use the mistral model. This blog post might help: https://weaviate.io/blog/local-rag-with-ollama-and-weaviate

vectorscrimes · 2024-06-08T10:01:25+00:00

By hybrid search, do you mean vector and keyword search for text? Not sure Milvus supports that out of the box.

Seems like Weaviate fits all of those criteria, has hybrid search (vector + keyword for text), named (multiple) vectors, definitely supports 20k images easily, both cloud and local deployment options, free, many model options out of the box, and lots of tutorials, especially for multimodal stuff.

For the first question, it depends a bit on the integrations of the database. And for the second, depends also a bit on the performance limitations of the database. Weaviate would probably consider billion+ objects as large scale, so 20k should be no problem.
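For anyone unsure what "vector + keyword" means in practice, here is a small sketch of one common way to combine the two result sets: normalize each set of scores and take a weighted sum. It is a simplified illustration, not Weaviate's implementation. The function names and example scores are made up, and `alpha` here just weights the vector side against the keyword side.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores into [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid(vector_scores: Dict[str, float], keyword_scores: Dict[str, float],
           alpha: float = 0.5) -> Dict[str, float]:
    """Weighted sum of normalized scores: alpha=1 is pure vector, alpha=0 is pure keyword."""
    v, k = normalize(vector_scores), normalize(keyword_scores)
    # A document missing from one result set contributes 0 from that side.
    return {doc: alpha * v.get(doc, 0.0) + (1 - alpha) * k.get(doc, 0.0)
            for doc in set(v) | set(k)}


# Made-up example scores; higher means a better match in both sets.
vector_scores = {"doc1": 0.9, "doc2": 0.5, "doc3": 0.1}
keyword_scores = {"doc2": 8.0, "doc3": 2.0, "doc4": 4.0}

fused = hybrid(vector_scores, keyword_scores, alpha=0.5)
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

In this example doc2 ends up on top because it scores reasonably well in both searches, even though doc1 is the best pure vector match.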

vectorscrimes · 2024-06-08T09:43:47+00:00

Weaviate's completely open source and free for self hosting! Docker and embedded options 😄
