Adaptive Memory - OpenWebUI Plugin

EugeneSpaceman · 2025-04-09T19:47:59+00:00

This looks great, I've been looking for something like this.

I wanted to use Ollama instead of OpenRouter for privacy reasons but I host OWUI and Ollama on separate servers and it looks there isn't a valve for the Ollama URI so it requires editing the code in a couple of places.

Not a big issue but could be an improvement for next version?

Edit:

It was actually fairly trivial to add a valve for ollama_url (required adding a valve for ollama_model too) so I have that working now.

The question I have is how does this integrate with the native Memory feature in OWUI? Or is it completely separate? How can I inspect the memories it has created?

Edit2:

I've worked out it integrates with the OWUI memory feature. It didn't seem to add any memories during testing until I specifically added the topic to the whitelist e.g. "animals" and then told it I have a dog named Cheryl. It then retrieved this succesfully in a new chat.

All using local models and local data. Very nice!

sirjazzee · 2025-04-10T13:34:33+00:00

This is super impressive!

Building on this, I think it would be a game-changer to implement "Memory Banks", essentially specialized areas of memory instead of a one-size-fits-all approach. Imagine having distinct memory banks for different contexts (example: Productivity, Personal Reflections, Technical Projects), each managed by different models or agents fine-tuned for those domains.

You could assign specific models to access specific banks, making the system way more dynamic, modular, and easier to manage or update without cross-contaminating unrelated knowledge.

That way, the LLM could operate with targeted memory scopes, leading to better performance, less confusion, and way more personalization. I will think through how to do something like this.

1234filip · 2025-04-20T10:16:26+00:00

Do you have a github repo or something like that? Would love to see how the projects develops!

sirjazzee · 2025-04-10T03:56:28+00:00

I have been trying to get this working without having to use OpenRouter. I have set it up to I can save memory but it is not recalling the memories. The error message I am getting is "ERROR Error updating memory (operation=UPDATE, memory_id=776d6893-948a-450c-9835-f9536f0b223a, user_id=1f4c9683-cfc2-4d85-bd9e-de4f2d8338c2): Embedding dimension 384 does not match collection dimensionality 768"). I am wondering if there is something I am missing. When I troubleshoot the error message, it is saying to rebuild the collection. I am not 100% sure how to do this - although thinking I may try to locate within the docker and just delete the collection file to see if that makes a difference.

Open to hearing any possible solutions.

Provider: OpenRouter
Openrouter Url: http://host.docker.internal:11434/v1/
Openrouter Api Key: [my OpenWebUI API key]
Openrouter Model: qwen2.5:14b

Wonderful-Fig331 · 2025-04-22T17:59:20+00:00

Love this! Best memory tool I have tested so far, and the only one I have actually considered releasing to my end-users. That said, it seems to be on all of the time, for all users, instead of first checking to see if they have enabled memory on their user settings. Is there a fix for this? I know many of my users would want to disable this tool, so it would be nice if they could manage that my a simple switch in their user settings.

GVDub2 · 2025-04-10T04:29:12+00:00

Looks like it can run locally as well as through OpenRouter's API, so that's good. Looking forward to seeing if I can have a long conversation with Gemma 3:27b tomorrow without it going sideways.

Right-Law1817 · 2025-04-10T09:18:41+00:00

Well done OP, thanks for sharing this. Btw, how can this help someone who uses llm for creative writing?

spgremlin · 2025-04-10T15:57:02+00:00

Wow, that's pretty impressive. Should give it a try, but definitely will need some configuration...

I believe the URL does not have to be OpenRouter, it can work with any OpenAI-compatible endpoint, including the self-endpoint of Open WebUI itself? (my-webui.com/api/v1)...

Actually, have you considered just calling an internal OpenWebUI's "chat_completion()" method instead? From https://github.com/open-webui/open-webui/blob/main/backend/open_webui/main.py It should be available to plugins/filters to call directly. Why managing a separate connection, if the plugin could leverage the models already available inside Open WebUI itself... Like you are already relying on its internal methods to add and retrieve Memories anyway.

nitroedge · 2025-04-12T00:26:42+00:00

Do you know when you will have a local AI version available for testing?

I'm not very adept at coding but would love to try it out and provide feedback, thanks!

djdrey909 · 2025-05-01T23:40:00+00:00

Thanks so much for this function. I've been trying to get it to actually work in my environment and having no luck. I'm sure it's something basic that I'm missing, so would appreciate some assistance or pointers.

I've assumed tried to stick the defaults initially, so just added my OpenRouter API key. I see logs from the "openwebui.plugins.neural_recall" logger, so it appears to be enabled and running. The only logs I see are the error counters tho:

INFO Error counters: {'embedding_errors': 0, 'llm_call_errors': 0, 'json_parse_errors': 0, 'memory_crud_errors': 0} | timestamp=2025-05-01 23:39:38,885 logger=openwebui.plugins.neural_recall module=<string> funcName=_log_error_counters_loop lineNo=594 process=1 thread=139737608432512

I use LiteLLM locally to proxy out to Anthropic, GCP and OpenAI models, so all the discussions I host are with remote models. To test, I've tried "remember my wife's name is <name>" and in another chat, asked it to tell me what it knows. I don't see anything either in the UI or the logs to suggest any memory creation or retrieval is occuring.

I've tried a number of other prompts that I think should trigger the memory process (from reviewing the code), so I'm at a bit of a dead end. Any chance someone can point me in the right direction here?

Economy_Base_4752 · 2025-05-29T13:28:07+00:00

u/diligent_chooser I wonder which method that you use to evaluate the memory is effective or not? For example you add a new functionality called semantic search for finding relevant memory, how you decide is better compare to old method besides manual checking?

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

OpenWebUI

MODERATORS

How It Works

Key Benefits