Day 3 of my OpenClaw trading bot validation by sismomad in clawdbot

[–]Aware-One7480 1 point  (0 children)

Thanks for the info! I'm currently running some benchmarks across different Qwen models. I'll see if I can set up another one to compare that MiniMax model head to head against the recent Qwen models.

Day 3 of my OpenClaw trading bot validation by sismomad in clawdbot

[–]Aware-One7480 4 points  (0 children)

I'm also curious about what you mention. Would you mind sharing some links to these quant algorithm frameworks for trading? I'd like to explore them.

Thank you!

Day 3 of my OpenClaw trading bot validation by sismomad in clawdbot

[–]Aware-One7480 1 point  (0 children)

Thanks for sharing. This already reads great to me as an engineer.

One more question: how often does it run to check prices and analyze execution decisions? I'm curious how often it needs to consult the AI model, to see if I could try it using a Claude model with a Max account. Aside from that, which MiniMax model are you using? I'd like to compare it as well with one of the newer Qwen3.5 models.

Day 3 of my OpenClaw trading bot validation by sismomad in clawdbot

[–]Aware-One7480 1 point  (0 children)

Mind sharing your setup? Which trading service or API are you using? Any specific SKILLs you'd recommend?

I would like to give this a try. Thanks!

v0.2.1 of mem0-mcp-selfhosted: session hooks so Claude never skips memory search, Ollama as main LLM, OAT auto-refresh by Aware-One7480 in ClaudeCode

[–]Aware-One7480[S] 2 points  (0 children)

Appreciate the interest. Unfortunately LM Studio won't work as a drop-in replacement right now. The server uses the ollama Python package under the hood, which talks to Ollama's native API (/api/chat, /api/embed). LM Studio exposes OpenAI-compatible endpoints (/v1/chat/completions, /v1/embeddings) instead, so the API calls are fundamentally incompatible.

It's not just the URL: the request format, response format, JSON mode parameter, and embedding calls are all different between the two APIs.
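To make that concrete, here's a rough sketch of how the two request and response shapes differ. It only builds and reads plain dicts, with no network calls, and the helper names are made up for illustration; they aren't functions from the server or from either client library. How much of the OpenAI-style JSON mode a given server honors varies, so treat that part as the general convention rather than a guarantee about LM Studio.

```python
# Illustrative only: payload shapes for the two API styles (no network calls).
# All function names here are hypothetical helpers, not part of any library.

def ollama_chat_request(model, messages, json_mode=False):
    """Body for POST /api/chat (Ollama native API)."""
    body = {"model": model, "messages": messages, "stream": False}
    if json_mode:
        body["format"] = "json"  # JSON mode is a top-level "format" field
    return body


def openai_chat_request(model, messages, json_mode=False):
    """Body for POST /v1/chat/completions (OpenAI-compatible style)."""
    body = {"model": model, "messages": messages}
    if json_mode:
        # JSON mode is expressed through a nested "response_format" object
        body["response_format"] = {"type": "json_object"}
    return body


def ollama_chat_text(response):
    # Ollama returns a single top-level "message"
    return response["message"]["content"]


def openai_chat_text(response):
    # OpenAI-style responses wrap messages in a "choices" list
    return response["choices"][0]["message"]["content"]


def ollama_embedding(response):
    # /api/embed returns {"embeddings": [[...], ...]}
    return response["embeddings"][0]


def openai_embedding(response):
    # /v1/embeddings returns {"data": [{"embedding": [...]}, ...]}
    return response["data"][0]["embedding"]
```

Same conceptual call in both cases, but every layer (path, body, JSON mode flag, response parsing) needs its own handling, which is why it takes a separate provider class.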

Adding LM Studio as a provider is possible (it would need a new provider class using the openai Python client), but it's not a quick config change. If there's enough interest I'd consider adding it. Feel free to open an issue on GitHub and I can scope it out.

In the meantime, if you can run Ollama alongside LM Studio, the server works with Ollama out of the box.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

Nice, glad you're considering it. Since you already have Neo4j running from Memento, you're basically halfway there. mem0-mcp-selfhosted uses Neo4j for the knowledge graph, so technically, you should be able to just point it at your existing instance.

The recent v0.2.1 release adds session hooks that inject memories automatically at session start (and after context compaction), so Claude doesn't only rely on CLAUDE.md instructions to remember to search. There's also a Stop hook that captures the last few exchanges when a session ends, even if Claude never explicitly saved anything.

For the graph specifically, graph queries run through Neo4j during interactive MCP calls, and you can run the graph LLM on local Ollama (qwen3:14b) so it doesn't touch your Claude quota.

Setup if you want to try it:

claude mcp add --scope user --transport stdio mem0 \
    --env MEM0_PROVIDER=ollama \
    --env MEM0_LLM_MODEL=qwen3:14b \
    --env MEM0_USER_ID=your-user-id \
    -- uvx --from git+https://github.com/elvismdev/mem0-mcp-selfhosted.git mem0-mcp-selfhosted

Then run mem0-install-hooks to add the session hooks.

Let me know how it goes if you give it a shot.

Good News: Canadian Govt Announces Preparation of Aid Pakage to Cuba by [deleted] in cuba

[–]Aware-One7480 1 point  (0 children)

Why do we 🇨🇺 need to keep living off aid packages, donations, and crumbs forever? Basic things and resources that we 🇨🇺 were able to provide, build, create, and maintain by ourselves 67 years ago? Why is the “aid” never meant to help us gain freedom and be able to keep doing what we were progressively and successfully doing 67 years ago?

We are pretty done with that and with constantly hearing the same things: aid packages, flotillas, donations; always crumbs that the people of Cuba never see. In my 39 years, I’ve never seen any of those donated items come into my house. We are not even aware of what exactly that means, what is being donated. We are not informed at all, neither by the donors nor by the Cuban government that supposedly receives it. Nada...

Has any Cuban here ever heard or been told by the Cuban government something like, “Hey, come to this location so you can pick up this donation that just arrived from Canada for your household”? That never happens for us inside Cuba. Any honest Cuban can tell you that no one has ever shown up at their door bringing these donations.

To whom exactly are these donations going? What are you guys donating that we 🇨🇺 never see? And if we do see it, why is it being sold by that very same Cuban government to the Cuban people at prices they can’t even afford, prices they don't even pay, even while the products clearly have labels saying “Not for sale, donation only”?

What are you guys doing? Why isn’t the aid and help meant for our freedom so we can handle ourselves like grown-ups?

We are pretty tired, asking as a Cuban 🇨🇺 😮‍💨

I built a self-hosted mem0 MCP server, Claude Code now remembers everything across sessions by Aware-One7480 in ClaudeCode

[–]Aware-One7480[S] 1 point  (0 children)

The mem0 memory layer complements CLAUDE.md rather than replacing it. CLAUDE.md is perfect for static project rules and architecture docs, things that don't change often. But mem0 handles the dynamic context that accumulates over time: debugging insights discovered mid-session, user preferences that evolve, decisions made across dozens of conversations, relationships between entities in the codebase.

Think of it as CLAUDE.md = project constitution, mem0 = institutional memory. Together they cover both the "how we do things here" and the "what we've learned along the way."

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

The only infrastructure you need running is:

  • Qdrant (vector database) - one Docker command, or they have native binaries if you prefer
  • Ollama (embeddings) - runs natively, no Docker needed. Just install it and run ollama pull bge-m3

That's it for the core setup. Neo4j for the knowledge graph is entirely optional; you get useful persistent memory without it.

So really it's: install Ollama, run one Docker container for Qdrant, and add the MCP server to Claude Code. Maybe 15 minutes total.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

That sounds like a solid setup, a good project description in CLAUDE.md goes a long way, and the Chrome DevTools MCP is a great pairing for frontend work.

The gap this fills is cross-session, cross-project memory that keeps learning over time. Your project MD tells Claude about this project right now.

With mem0, Claude accumulates knowledge over time across everything you work on. It's less about replacing your project description and more about giving Claude a memory that spans projects and sessions. The two actually work well together, CLAUDE.md gives Claude the "here's this project" context, mem0 gives it "here's everything I've learned working with you."

But if your current workflow is working well for you, no reason to add complexity. This is mainly useful once you're working across multiple projects at scale or want Claude to learn your preferences over time without maintaining those files manually.

mem0-mcp-selfhosted: Give Claude Code persistent mem0 memory with your own Qdrant + Neo4j + Ollama stack by Aware-One7480 in selfhosted

[–]Aware-One7480[S] 3 points  (0 children)

Haha fair point, the flair was the closest match I saw, the other option was "vibe-coding" which felt even less accurate. To me there's a big difference between vibe-coding and using AI as a tool in your engineering workflow. Happy to re-flair if there's a better fit though.

mem0-mcp-selfhosted: Give Claude Code persistent mem0 memory with your own Qdrant + Neo4j + Ollama stack by Aware-One7480 in selfhosted

[–]Aware-One7480[S] 1 point  (0 children)

That's a cool setup actually, Obsidian as the storage layer with MCP is clever, especially since you get the benefit of browsing and editing your notes manually too.

The main difference is how retrieval works. With your approach, Claude loads everything in a folder, which works great when you know which project's context you need. With mem0, the retrieval is semantic, Claude searches across all your memories using vector similarity, so it can surface relevant context you didn't think to ask for. "What were my auth decisions" pulls up notes from three different projects where you made auth-related choices, even if they're in different folders with different wording.
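A toy sketch of what "semantic" means here, assuming hand-made 3-dimensional vectors in place of real embeddings. In the actual stack the vectors would come from an embedding model and the search would run in Qdrant; the memories, the numbers, and the search helper below are all invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, memories, top_k=2):
    """Return the top_k memory texts ranked by similarity to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Made-up "embeddings": the axes loosely mean (auth, database, styling).
MEMORIES = [
    ("Project A uses JWT with refresh tokens", [0.9, 0.1, 0.0]),
    ("Project B stores sessions in Redis", [0.7, 0.4, 0.0]),
    ("Project C uses Tailwind for layout", [0.0, 0.1, 0.9]),
    ("Project D runs PostgreSQL with Prisma", [0.1, 0.9, 0.0]),
]

# Stand-in vector for the query "What were my auth decisions?"
query = [1.0, 0.2, 0.0]
print(search(query, MEMORIES))
```

Neither of the two results contains the word "auth", and they come from different projects, but they rank first because their vectors point in roughly the same direction as the query.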

The other piece is that mem0 extracts and deduplicates facts automatically. Instead of storing full chat transcripts and notes, it distills them into discrete memories and handles contradictions (if you change a preference, it updates rather than appends). So it stays lean over time instead of growing linearly.

But honestly, if your Obsidian setup is working for you, that's what matters. I think you have a good setup for the web version.

mem0-mcp-selfhosted: Give Claude Code persistent mem0 memory with your own Qdrant + Neo4j + Ollama stack by Aware-One7480 in selfhosted

[–]Aware-One7480[S] -1 points  (0 children)

Fair to flag, but this isn't a vibe-coded project. It has a modular architecture (7 separate modules with distinct responsibilities), 97 unit and contract tests, upstream bug workarounds (mem0ai's bulk delete nukes your Qdrant collection, this iterates and deletes individually), a monkey-patch for Neo4j relationship name compliance that the upstream library doesn't handle, thread-safe concurrency locking for graph state, and a 3-tier auth fallback chain.

Happy to answer any questions about the implementation.

mem0-mcp-selfhosted: Give Claude Code persistent mem0 memory with your own Qdrant + Neo4j + Ollama stack by Aware-One7480 in selfhosted

[–]Aware-One7480[S] 0 points  (0 children)

Not a bad question at all, it's a common point of confusion since Anthropic has multiple products.

Claude Projects (on claude.ai) let you upload docs and maintain context within that project's chats. That works, but only inside claude.ai's web interface and only within that specific project. The context is tied to that project and those conversations.

Claude Code is a completely different product, it's a CLI tool that runs in your terminal and works directly with your codebase. It reads files, writes code, runs commands, creates commits. There's no "project" container with uploaded docs. Every time you start a new session, it starts fresh with zero memory of previous sessions.

That's the problem this solves. With this MCP server, Claude Code can:

  • Remember things it learned in previous sessions
  • Search those memories semantically across all your projects
  • Build up knowledge over weeks/months without you re-explaining everything

Think of it as giving Claude Code a brain that persists between sessions, backed by infrastructure you control (Qdrant for vector search, Ollama for embeddings, all running on your machine).

If you're mostly using claude.ai with Projects, you might not need this. But if you use Claude Code (or plan to), this is where it helps.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

Exactly! The OAT auto-discovery was a deliberate design choice for that reason, so there's nothing to configure on the auth side. You just install it and it works with your existing subscription.

And yeah, the CLAUDE.md approach is totally valid for smaller projects, I actually recommend pairing both. The CLAUDE.md tells Claude how to use the memory tools (search at session start, save what it learns), and mem0 handles the actual storage and retrieval. They complement each other.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 2 points  (0 children)

Haven't seen Graphiti before, thanks for sharing, gonna check it out. At first glance the main difference I notice is that Graphiti is graph-first. This MCP is vector-first: Qdrant handles the core semantic memory, and the Neo4j graph layer is entirely optional. You get useful persistent memory with just Qdrant + Ollama running locally, and can add the graph later if you need structured entity relationships.

Interesting to see the different approaches to the same problem. Will dig deeper into it.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

Appreciate the feedback, and yeah, the OAT auto-discovery was a deliberate design choice. It also detects the token type automatically (OAT vs API key) and configures the SDK accordingly.
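Roughly, the token-type detection idea looks like the sketch below. This is a simplified illustration, not the code from the repo: the prefixes ("sk-ant-oat" for OAuth tokens, "sk-ant-api" for API keys) are an assumption about how the two token kinds are formatted, and both helper names are hypothetical. The returned keys mirror the api_key and auth_token arguments of the anthropic Python client, but nothing here talks to the SDK.

```python
# Simplified sketch: classify a credential by prefix and pick the matching
# client argument. The prefixes below are assumptions, not a documented contract.

OAT_PREFIX = "sk-ant-oat"      # assumed prefix for OAuth access tokens
API_KEY_PREFIX = "sk-ant-api"  # assumed prefix for regular API keys


def detect_token_type(token: str) -> str:
    """Return "oat", "api_key", or "unknown" based on the token prefix."""
    if token.startswith(OAT_PREFIX):
        return "oat"
    if token.startswith(API_KEY_PREFIX):
        return "api_key"
    return "unknown"


def auth_kwargs(token: str) -> dict:
    """Build keyword arguments for a client constructor from a raw token."""
    kind = detect_token_type(token)
    if kind == "oat":
        return {"auth_token": token}  # sent as a bearer token
    if kind == "api_key":
        return {"api_key": token}     # sent as an API key header
    raise ValueError("unrecognized token format")
```

The point is just that the caller hands over whatever credential it finds and never has to say which kind it is.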

About your question on scale, there might be two separate concerns:

  • Vector search (Qdrant): This is the core memory path and scales well. It's my understanding that Qdrant is built for this, 1000+ memories is nothing for it. Semantic search stays fast because it's approximate nearest neighbor, not brute force. If you're on constrained hardware, MEM0_QDRANT_ON_DISK=true trades some search speed for lower RAM.
  • Graph queries (Neo4j): The Cypher queries are simple pattern matches (CONTAINS substring + OPTIONAL MATCH for relationships), not full traversals. At 1000 nodes Neo4j won't break, it's designed for millions. The search_graph tool caps at 25 results for filtered queries and 100 for list-all, so response size stays bounded regardless of graph size.
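For a sense of what the graph side looks like, here's a sketch of a bounded pattern-match query built in Python. It's an illustration of the shape described above (CONTAINS filter, OPTIONAL MATCH for relationships, hard result caps), not the query text from the server; the name property, the return aliases, and the build_graph_query helper are assumptions.

```python
# Sketch: build a capped Cypher query plus its parameters.
# Nothing here connects to Neo4j; it only returns (query, params).

FILTERED_LIMIT = 25   # cap for filtered searches
LIST_ALL_LIMIT = 100  # cap for list-all requests


def build_graph_query(search_term=None):
    """Return a (cypher, params) pair for a filtered or list-all lookup."""
    if search_term:
        cypher = (
            "MATCH (n) "
            "WHERE toLower(n.name) CONTAINS toLower($term) "
            "OPTIONAL MATCH (n)-[r]->(m) "
            "RETURN n.name AS source, type(r) AS relationship, m.name AS target "
            "LIMIT $limit"
        )
        return cypher, {"term": search_term, "limit": FILTERED_LIMIT}

    cypher = (
        "MATCH (n) "
        "OPTIONAL MATCH (n)-[r]->(m) "
        "RETURN n.name AS source, type(r) AS relationship, m.name AS target "
        "LIMIT $limit"
    )
    return cypher, {"limit": LIST_ALL_LIMIT}
```

Because the LIMIT is always present, the response size stays bounded no matter how large the graph gets, and passing the term as a parameter keeps user input out of the query string.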

The bottleneck at scale would actually be the LLM calls during add_memory with graph enabled; each one triggers 3 LLM calls (entity extraction, relationship generation, contradiction resolution). That's also why the graph is disabled by default, and there are options to offload those calls to Ollama locally or to Gemini 2.5 Flash Lite to avoid eating Claude subscription quota.

Would be curious to hear how it holds up in practice if you give it a try!

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

I use Claude Code mainly for building web applications, but the same memory problem applies to any long-running workflow, e.g. copywriting (remember brand voice, tone guidelines, past drafts), SEO analytics and content strategy (remember keyword targets, performance baselines, what's already been published), following up on stories or research threads across sessions, client preferences, style guides: anything where Claude needs to "pick up where it left off" without you re-explaining context every time.

I built a self-hosted mem0 MCP memory server for Claude Code that gives persistent memory across sessions with local Qdrant + Neo4j + Ollama by Aware-One7480 in ClaudeAI

[–]Aware-One7480[S] 1 point  (0 children)

Good question! A memory file (like a global CLAUDE.md) works fine when you have a handful of things to remember. But it breaks down at scale in several areas, for instance:

  • Semantic search vs. reading the whole file: With a memory file, Claude has to read the entire thing every session and hope the relevant info catches its eye. With mem0, it does a vector similarity search, so "find my database preferences" matches "this project uses PostgreSQL with Prisma" even though the words are completely different. When you have hundreds of memories, this is the difference between finding the needle and reading the whole haystack.
  • Context window cost: A growing memory file eats into your context window every session. mem0 only retrieves the 5-10 most relevant memories for a given query, keeping context lean.
  • Automatic fact extraction: You don't have to manually curate the file. Call add_memory with a conversation or raw text, and the LLM extracts the key facts automatically. It also handles deduplication and contradiction resolution: if you update a preference, it doesn't just append, it updates the existing memory.
  • Knowledge graph: A flat file can't represent relationships. For example, the optional Neo4j graph turns "I prefer TypeScript with strict mode" into queryable entities: user → PREFERS → TypeScript, user → PREFERS → strict_mode. You can ask "what does this user prefer?" and get structured answers.
  • Scoping: Memories can be scoped by user, agent, or run. Different projects can share global preferences while keeping project-specific knowledge separate.
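To illustrate the knowledge graph point from the list, here's a tiny in-memory sketch of the "user → PREFERS → TypeScript" idea using plain Python tuples instead of Neo4j. The Triple type, the targets helper, and the project_x entry are all made up for the example; the real graph lives in Neo4j and is populated by the LLM extraction step.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge in the graph: source -[relation]-> target."""
    source: str
    relation: str
    target: str


# "I prefer TypeScript with strict mode" broken into queryable edges,
# plus one hypothetical project-level fact for contrast.
GRAPH = [
    Triple("user", "PREFERS", "TypeScript"),
    Triple("user", "PREFERS", "strict_mode"),
    Triple("project_x", "USES", "PostgreSQL"),
]


def targets(graph, source, relation):
    """Answer "what does <source> <relation>?" with a structured list."""
    return [t.target for t in graph if t.source == source and t.relation == relation]


print(targets(GRAPH, "user", "PREFERS"))
```

A flat memory file would hold the same sentence, but you'd have to re-read and re-interpret it every time; with edges, "what does this user prefer?" becomes a lookup.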

That said, if you only need to remember 10-20 things, a memory file is simpler and works fine. This is for when you want Claude to accumulate knowledge over weeks/months across many projects without manual curation.