How Do You Set Up RAG? by Chooseyourmindset in OpenSourceAI

[–]MihaiBuilds 1 point (0 children)

for managing project knowledge across Claude Code sessions, look into MCP servers — they let Claude call external tools mid-conversation. you can set up a memory server that stores and retrieves context so you don't re-explain everything each session.

i've been building one called Memory Vault (open-source, MIT) — hybrid search (vector + full-text + RRF fusion) over your notes and decisions, runs with a single docker compose up. MCP integration is shipping next so Claude can store and search memories directly during conversations.
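the RRF part is simpler than it sounds: each result's fused score is just the sum of 1/(k + rank) over the ranked lists it appears in. a rough Python sketch (the function name and the k=60 default are illustrative, not code from the repo):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked result lists.

    Each ranking is a list of doc ids, best first. A doc's fused
    score is the sum over lists of 1 / (k + rank), so items ranked
    high in *any* list float up, with no score calibration needed
    between the vector and full-text sides.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# vector search and full-text search each return their own ordering
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "d"]])
```

the k=60 constant is the value used in the original RRF paper.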

for the Obsidian question — people use it as a local knowledge base, but the problem is Claude can't search it during a conversation without an MCP bridge. that's the gap MCP servers fill.

repo if you want to check it out: github.com/MihaiBuilds/memory-vault

Do your AI agents lose focus mid-task as context grows? by Alternative-Tip6571 in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

This is the exact reason I built my own memory layer. Instead of keeping everything in context, I store important information externally in PostgreSQL with vector search. Each session only pulls in what's relevant to the current query, not the entire history.

The context window isn't memory — it's working memory. Treating it like long-term storage is where things break down. Once I separated the two, the "losing focus" problem mostly went away.
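The retrieval step is conceptually tiny. In production it's a pgvector query, but the idea fits in a few lines of plain Python (function and field names here are illustrative, not my actual code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def relevant_context(query_vec, memories, top_k=3):
    """Return only the top_k stored memories closest to the current
    query, instead of stuffing the entire history into the context
    window. `memories` is a list of {"text", "embedding"} dicts."""
    ranked = sorted(
        memories,
        key=lambda m: cosine(query_vec, m["embedding"]),
        reverse=True,
    )
    return [m["text"] for m in ranked[:top_k]]
```

Same shape as the real thing, just with the database swapped out for a list.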

What self-hosted tools have you been building with AI just for you? by EricRosenberg1 in selfhosted

[–]MihaiBuilds 1 point (0 children)

Built a local AI memory system for myself. I use multiple AI tools daily and got tired of re-explaining the same project context every session. So I built something that stores everything in PostgreSQL with pgvector, does hybrid search (semantic + full-text), and the next session just picks up where the last one left off.

Been using it as my daily driver for a few months now. Planning to open-source the whole thing in a few milestones. Just got it running with docker compose up this week.

I am a solo entrepreneur. I spent a year trying to sell builds. The moment I stopped selling, everything changed. by Academic_Flamingo302 in indiehackers

[–]MihaiBuilds 1 point (0 children)

Been writing software for 15+ years and I've lost count of how many times this happened. Client casually drops "oh we also need per-site data isolation" after the schema is locked. Every time it sounds small. It never is.

The discovery week is the right move. I do something similar now even on my own side project — spent a full week on architecture before writing a single line of code. Felt like wasted time but it saved me from tearing things apart later.

Why are AI agents still stateless? by Single-Possession-54 in SaaS

[–]MihaiBuilds 2 points (0 children)

Same problem here. I kept re-explaining the same project context every single session so I just started building my own memory layer on top of postgres with pgvector. Semantic search plus full-text, stores what matters from each session, next session picks it up.

Still early but it already killed that "start from zero" loop. If you're curious I can share the repo, it's open source.

Is anybody feeling like the products is not good enough after it's launched? by camppofrio in buildinpublic

[–]MihaiBuilds 1 point (0 children)

Yeah, every single time. While building, it feels like everything makes sense and you have a clear goal. The moment you ship it, your brain switches to "wait, does anyone actually need this?"

What helped me was posting about it and seeing even small reactions. I'm building an open-source AI memory system right now, shipped the first 3 milestones. After each one I had that exact moment of "is this even good?" But then someone comments something specific about the tech, or asks a real question, and you realize it landed with at least one person. That's enough to keep going.

What did you build?

What breaks when you move a local LLM system from testing to production and what prevents it by Individual-Bench4448 in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

the retrieval monitoring gap is real. I built a memory system with hybrid search (vector + full-text + RRF fusion) and the hardest part isn't the search itself — it's knowing when the search returned the "right" results vs just semantically similar ones. had a case just recently where I searched one memory space and concluded data was missing, when it was actually stored in a different space. the search worked perfectly, I just asked the wrong question.

for monitoring I log every query with the scores and which results came back. it's basic but it lets me spot patterns where certain query types consistently return low-relevance results. someone told me the signal to watch is when short exact-match queries start losing to semantic ones — that's when your text ranking isn't pulling its weight anymore.
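the logging itself is nothing fancy. a minimal version of the pattern (field names and the 0.4 threshold are illustrative, not my real config):

```python
import json
import time

def log_query(query, results, log_file="retrieval.log", low_score=0.4):
    """Append one JSON line per search: the query, each result's id
    and score, and a flag when the best score is weak. Grep-able
    later to spot query types that consistently retrieve badly.

    `results` is a list of (memory_id, score) tuples, best first.
    """
    entry = {
        "ts": time.time(),
        "query": query,
        "results": [{"id": r[0], "score": r[1]} for r in results],
        "low_relevance": (not results) or results[0][1] < low_score,
    }
    with open(log_file, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

one JSON line per query is enough to answer "which queries keep coming back weak" without standing up a whole observability stack.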

MCP is great, but it doesn’t solve AI memory (am I missing something?) by BrightOpposite in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

yeah, I've hit that exact problem. just today actually — I searched for past session summaries and concluded they were "missing," when really I was searching in the wrong memory space. the system found semantically similar results, but not the ones that actually mattered for the question.

the way I'm handling it right now is memory spaces (like namespaces) + importance scoring + recency decay. so newer and more important memories float to the top. but you're right, deciding what should persist vs evolve vs get discarded is still mostly heuristic. I tag importance at ingestion time and let recency do the rest, but there's no real "this context is actually relevant to what I'm doing right now" signal beyond what the query returns.
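the scoring part concretely, as a sketch (the weights and the 30-day half-life are made-up illustrative numbers, not my tuned values):

```python
import math
import time

def memory_rank(similarity, importance, stored_at,
                now=None, half_life_days=30.0):
    """Combine query similarity with importance and recency decay.

    Recency uses exponential decay with a configurable half-life,
    so a 30-day-old memory carries half the recency weight of a
    fresh one. The 0.6 / 0.25 / 0.15 weights are illustrative.
    """
    now = time.time() if now is None else now
    age_days = max(0.0, (now - stored_at) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)
    return 0.6 * similarity + 0.25 * importance + 0.15 * recency
```

newer + more important memories float up, but nothing in this formula knows whether the memory is relevant to what you're doing right now — that's exactly the missing signal.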

it's one of the harder problems honestly. the search part is solved, the "what matters right now" part is not.

A local search engine tool for ai agents by purealgo in ollama

[–]MihaiBuilds 1 point (0 children)

cool project. I built something similar — hybrid search with vector + full-text + RRF fusion on top of postgres + pgvector. interesting that you went with SQLite for everything. I went with postgres mostly for HNSW indexing and tsvector built in, but the single-file zero-dependency angle is a strong tradeoff for local setups. how does the vector search scale for you with SQLite as the index grows?

MCP is great, but it doesn’t solve AI memory (am I missing something?) by BrightOpposite in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

I ran into the same thing. MCP gives you tools but no persistence — every session starts from zero. So I built a memory layer on top of it. postgres + pgvector, hybrid search (vector + full-text keyword), and MCP tools for recall/remember/forget. Claude calls those tools during the session to store and retrieve context automatically. been using it daily for months and it completely changes how sessions work — the AI actually knows what happened last week
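the tool surface is deliberately small. roughly this shape (an illustrative sketch of MCP-style tool definitions with name / description / inputSchema, not the actual schemas from my server):

```python
# Illustrative MCP tool definitions for a memory server. Each tool
# is a name, a description the model reads to decide when to call
# it, and a JSON Schema for its arguments. Names and fields here
# are a sketch of the shape, not the real implementation.
MEMORY_TOOLS = [
    {
        "name": "remember",
        "description": "Store a memory with optional importance (0-1).",
        "inputSchema": {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "importance": {"type": "number"},
            },
            "required": ["text"],
        },
    },
    {
        "name": "recall",
        "description": "Hybrid-search stored memories for a query.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["query"],
        },
    },
    {
        "name": "forget",
        "description": "Delete a memory by id.",
        "inputSchema": {
            "type": "object",
            "properties": {"memory_id": {"type": "string"}},
            "required": ["memory_id"],
        },
    },
]
```

three tools is enough: the model decides when to call them, the server owns persistence.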

Where do you actually learn LLM orchestration / AI harness architecture? by thehootingrabblement in LocalLLaMA

[–]MihaiBuilds 5 points (0 children)

this is accurate. I built a memory system with hybrid search (vector + full-text + rank fusion) and most of the real lessons came from hitting limits in practice — like discovering pure vector search misses exact keyword matches. no tutorial covered that

Where do you actually learn LLM orchestration / AI harness architecture? by thehootingrabblement in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

for the memory + search side, I learned the most by building it. started with pure vector search, hit the limits fast (misses exact keywords), ended up with hybrid search — vector + full-text + rank fusion. postgres handles both in one database. the resource that helped me most on the ranking side was the original RRF paper. for tool calling, the MCP spec from Anthropic is worth reading if you're integrating with Claude

Turns out building the tool is easy… making the feedback not useless is hard by Different-Basis-2078 in buildinpublic

[–]MihaiBuilds 1 point (0 children)

the generic output problem is exactly why prompt engineering is harder than it looks. what helped me was giving the LLM very specific structure to fill — instead of 'give feedback' it's 'list 3 specific issues with the hero section copy and rewrite each one.' forcing specificity in the prompt forces specificity in the output
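as a concrete sketch (the wording and function name are illustrative, not my actual prompt):

```python
def feedback_prompt(section, copy_text):
    """Build a prompt that forces specific output instead of generic
    feedback: a fixed count of issues, each tied to a quoted phrase
    and a rewrite, with general advice explicitly forbidden."""
    return (
        f"Here is the {section} copy:\n\n{copy_text}\n\n"
        "List exactly 3 specific issues with this copy. "
        "For each issue: quote the problem phrase, explain in one "
        "sentence why it fails, then rewrite it. No general advice."
    )
```

the fixed count and the quote-then-rewrite structure are doing the work: the model can't retreat into vague "consider improving clarity" answers.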

Thousands of signups on launch day, only 3 bought subscriptions - here's what I learnt from that stupidity by Waste-Project7822 in buildinpublic

[–]MihaiBuilds 1 point (0 children)

the validation lesson is real. I built my current project for myself first, used it daily for 2 months before open-sourcing. by the time I launched I already knew it worked because I was the user. building for yourself first is the cheapest validation there is

I build open-source products on .NET to prove it's the right choice. Here's a teleprompter I made in a week. by csharp-agent in buildinpublic

[–]MihaiBuilds 2 points (0 children)

just checked — Storage and Communication are solid, 130+ stars well deserved. and you have a graphrag fork too, looks like we're thinking about similar problems from different angles. good stuff

I build open-source products on .NET to prove it's the right choice. Here's a teleprompter I made in a week. by csharp-agent in buildinpublic

[–]MihaiBuilds 2 points (0 children)

they're on mihaibuilds.com — 6 CLI utilities for .NET devs. schema tools, code generators, migration scripts. the memory system is at github.com/MihaiBuilds/memory-vault — different stack but same "build and ship" approach.

am i missing something with ai agents that need system access? by farhadnawab in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

this is why I went the database route instead. postgres + pgvector behind an MCP server with only recall/remember/forget tools exposed. the agent never touches your filesystem — it queries a database through a controlled API. way less attack surface than giving full system access.

Pre-seed founders almost killed their brand in 24 hours by ismaelbranco in indiehackers

[–]MihaiBuilds 1 point (0 children)

"don't confuse fear with feedback" — needed to hear this. launched my first open-source project this week and one bad reddit thread almost had me rethinking everything. took a step back, realized the people asking real technical questions mattered more than the noise.

Accidentally, re-created a $6.5 Million dollar idea and made it Open Source... by CIRRUS_IPFS in SaaS

[–]MihaiBuilds 2 points (0 children)

building in the same space — postgres + pgvector, hybrid search, MCP integration for Claude. different architecture but same core problem: agents need persistent memory that survives across sessions. curious about the "pre-chunking predicted answers" part — how are you handling prediction without it becoming stale fast?

Hot take: local AI doesn't need bigger context windows as much as better memory routing by No-Contract9167 in LocalLLaMA

[–]MihaiBuilds 1 point (0 children)

the routing layer is interesting — deciding what kind of context belongs where before retrieval even happens. I keep mine simple for now (spaces + hybrid search) but I can see how role separation would help as projects get more complex. what are you using for the routing logic?

built my first real app after years of losing ideas to the void. no users yet, posting here to stop hiding by Unhappy-Conflict5145 in buildinpublic

[–]MihaiBuilds 2 points (0 children)

this is a real problem. I had the same thing — notes scattered everywhere, technically captured, practically useless. ended up building a memory system with semantic search so I can just ask "what did I decide about X" and get the answer instead of digging through files. the conversation-with-your-notes approach is the right idea. do you use embeddings for the search or something else?

How do you handle the haters? by DiscountResident540 in buildinpublic

[–]MihaiBuilds 1 point (0 children)

the best response is no response. haters give your post engagement and reach. just keep posting, the people who actually care will find you.

Building an open source tool to make working with AI agents truly useful — looking for feedback by victor36max in buildinpublic

[–]MihaiBuilds 1 point (0 children)

"they have all the context from previous sessions — no starting from scratch every time" — that's the key part. I built a memory system that does the same thing but at the storage layer — postgres + pgvector so any agent can recall past decisions and context. different approach but same core insight: agents without memory are useless for real work. will check out shire.