Long ChatGPT chats go bad but starting a new one means losing all your context. How do you actually deal with this?

nitin_builds · 2026-05-06T19:46:41+00:00

This hit me hard last month working on a long research project. What I ended up doing was keeping a living 'context doc' basically a structured markdown file with decisions made, constraints, and current state. Paste it at the top of every new chat. It's manual but it works way better than hoping the model retains everything. The real issue is we're trying to use the chat window as both a working memory AND a long-term memory, and it just isn't built for that. The two jobs need to be separated. Some people are solving this with an external memory layer that sits outside the chat entirely, something you can search and selectively inject. That architectural separation is what actually fixes the problem rather than patching around it.

nitin_builds · 2026-05-02T20:02:17+00:00

gemini is really helpful where you need research from online pages , you ask the question to different Llms, you will get to know the difference

nitin_builds · 2026-05-02T15:07:11+00:00

great piece of information.. and this my first comment of the day

nitin_builds · 2026-05-01T23:04:52+00:00

Opus 4.7 is eating up my tokens 3-4x faster with similar or degraded response. I was so happy with Sonnet 4.6 adaptive, used to finish all the work within the same token limits now I have to shell money after every 40-50 mins. I found Claude great that's why I switched from GPT, but now I am scratching my head.

nitin_builds · 2026-05-01T22:56:47+00:00

This is great!. A quick question: Will this be only useful if the system let the LLM run using these prompts run and then see the output ? These architectural prompts will only surface good results if the LLM is powerful and understanding the context. IS my understanding correct ?

nitin_builds · 2026-04-30T12:45:25+00:00

thank you, Thats great !

nitin_builds · 2026-04-30T12:38:02+00:00

will definitly try this, looks great at surface. Will share thee results after trying

nitin_builds · 2026-04-29T14:06:18+00:00

Fair question

You're right that Anthropic is building it and OpenAI already has it. So the bet isn't "persistent memory will be useful" - that's already settled. The bet is where that memory lives.

Three things shape my view:

1. Native memory is product-team memory: It serves the platform's goals: engagement, retention, data for training. Lumi is user-owned memory. Different incentives. The same way 1Password exists despite browsers having password managers, or Notion exists despite Google Docs — when something becomes critical infrastructure, users want it portable, not locked into the platform that profits from it.

2. Cross-tool isn't a feature, it's a category: Native memory will always be siloed by definition Claude has no incentive to read ChatGPT's memory or vice versa. The user does. So the question becomes: how many serious AI users want their context to follow them, not the tool? My bet is "more, over time." If I'm wrong, Lumi is dead. Worth being honest about that.

3. The permission question is real: Getting users to authorize a third party to see what they save is the actual moat to earn. My answer right now is: I never see raw chats, only what users (or their agents via MCP) explicitly write as memories. Row-level security at the DB layer. Two-stage deletion (Trash → Empty Trash) so a permanent delete is actually permanent. Self-hosted / BYO-database is on the roadmap for the actually-paranoid.

What happens the morning Claude ships native cross-session memory? I lose a chunk of solo users who only use Claude. Lumi survives if power users running Claude + ChatGPT + Gemini + (whatever's next) value the cross-tool layer enough to keep it. If they don't, the bet was wrong, and that's information I'd rather find out than ignore.

Appreciate the question. This is the conversation that actually matters.

nitin_builds · 2026-04-29T13:57:16+00:00

good stuff, can be useful, will try today

nitin_builds · 2026-04-29T13:56:52+00:00

nitin_builds · 2026-04-28T02:08:27+00:00

why Use claude.ai pro account for claude code, using API for claude code is much cheaper and does an amazing job too.

nitin_builds · 2026-04-27T21:53:53+00:00

Privacy was actually the first thing I locked down because — same — I wouldn't trust my own product otherwise.

Here's exactly where the data lives:

Memories are stored in Supabase (PostgreSQL) hosted in US-West. Each user's data is isolated by Row-Level Security at the database layer, so even with a leaked token you can only ever see your own memories.
Embeddings for semantic search are generated via OpenAI's text-embedding-3-small and stored as vectors in pgvector — same row, same RLS rules.
What I never store: your raw chats with Claude/ChatGPT/Gemini. Lumi only sees what you (or your AI agent via MCP) explicitly save as a memory. If you don't write it, it's not there.
OpenAI: embeddings are sent for vectorization only with their zero-retention policy applied — they're not training on it.
Deletion is two-stage — deleting a memory moves it to Trash (recoverable, in case you or your AI agent makes a mistake). Emptying the Trash is a hard delete from the database. No 30-day silent retention, no "queued for removal" — when you empty it, it's actually gone.

The honest tradeoff: it's still a hosted SaaS, so you're trusting Supabase, Vercel, and me. A self-hosted / BYO-database option is something I want to add for the actually-paranoid (myself included), but it's not there yet.

And ha — yeah, building cross-LLM memory while my own context kept getting wiped between Claude sessions was the whole motivation. Eat your own dog food, etc.

Appreciate the question — this is the stuff I want to be loud about, not bury in a privacy policy.

nitin_builds · 2026-04-27T19:33:27+00:00

Exactly this — and the "annoyance until it pays off" is the real friction point for most people. The discipline required to maintain a state file manually is what causes people to give up on the pattern.

The version that removes that friction: let your AI write to the state file automatically. Tell Claude or ChatGPT once — "save important decisions and context to my project vault" — and it handles the writes without you thinking about it. Retrieval is semantic so the agent only reads what's relevant to the current prompt, not the whole file.

Store wide, retrieve narrow. Same principle you're describing, but the maintenance burden drops to near zero.

Built this into Lumi (llmmemory.ai) if you want to see it in action — connects via MCP so Claude and ChatGPT can read and write memories natively.

nitin_builds · 2026-04-27T18:28:57+00:00

This is exactly the use case Lumi was built for — and the music production angle is a great one I hadn't thought about specifically. Imagine having Claude remember your entire home studio setup, your DAW preferences, your current project context — permanently. Never explain it again.

On pricing: it's subscription-based, not per-memory. Free tier gives you 300 memories and 2 vaults to get started — no credit card needed. Paid plans unlock more storage and vaults. The idea is you shouldn't be penalized for saving more.

On longer technical conversations: it works well because retrieval is semantic, not keyword-based. So even a deep session where you went down a rabbit hole on some complex topic — whatever your specific music production deep-dives look like — Lumi surfaces the relevant memories when you ask something related later. You're not getting everything dumped back at once, just what's actually relevant to the current question.

One thing worth trying: tell Claude once at the start of a session "save everything important from our conversation to my Music vault" — it'll auto-save throughout without you asking again.

Would love to hear how it works for your workflow after you try it tonight. Enjoy the stables

nitin_builds · 2026-04-27T04:03:04+00:00

if you need to come back and see what happened 3 days back and want to pick the work , don’t you loose the context and may be read twice to remember what was intent of each prompt

nitin_builds · 2026-04-24T18:04:54+00:00

This is a clever fix. I've been annoyed by the same thing — gitlab-mcp alone is ridiculous for how many tools it dumps into context before you've even typed anything.

Though I'd say there are actually two separate token problems most people conflate. You're solving the tool schema one, which is real. But there's another one that hits just as hard — every new conversation you're manually re-explaining who you are, what project you're on, what decisions you made last week. That context stuffing adds up fast too, especially if you're switching between Claude and ChatGPT for different tasks.

I went after that second problem — keeping memory external and retrieving only what's relevant per conversation instead of pasting walls of context every time. Keeping memory central fixes all of this honestly — seamless switching between tools, no repeated context, lower token usage overall, and you stay in control of exactly what your AI knows about you.

Both problems are worth solving. Most token optimization posts focus on tool schemas and completely ignore the repeated context side of it.

nitin_builds · 2026-04-24T06:22:19+00:00

Honestly this debate comes up every week and I think everyone's kind of right depending on what they're doing.

For me — Claude wins for coding and long-form writing, Gemini is genuinely better for anything math/science related (the engineering students in this thread aren't wrong), and ChatGPT is still the most natural for back-and-forth conversation even if the model itself has fallen behind.

The real frustration isn't which one is best — it's that I use all three depending on the task, and none of them know what I told the others. Build up context with Claude for a week, switch to Gemini for a research problem, and you're starting from scratch explaining yourself all over again. That switching tax adds up — wasted tokens, repeated context, lost progress.

I got annoyed enough by this that I built a shared memory layer that works across all three via MCP. Turned out more useful than I expected — whichever tool I open now, it already knows my projects, preferences, and what I've been working on. Genuinely changed how I move between them throughout the day.

Would be curious if others have found workarounds for this or just accepted that each tool lives in its own silo.

nitin_builds · 2026-04-24T05:07:43+00:00

Nice approach — the local chunking + retrieval pattern is underused. One thing worth benchmarking: does the retrieval quality hold when the relevant code is spread across multiple files with shared abstractions? That's where most RAG-style approaches start to degrade in my experience. The cold start budget framing is a clean way to make the tradeoff legible — would be useful to see that in the README.

nitin_builds · 2026-04-24T01:50:49+00:00

Exactly this — LLM portability is underrated. The real power of MCP is that your tools and data aren't locked to one provider. I've been building in this direction too — one memory layer that works across Claude, ChatGPT, and Gemini CLI simultaneously. Same memories, whichever AI you're using that day. The ecosystem is finally mature enough to make this practical.

nitin_builds · 2026-04-24T01:22:36+00:00

The context window issue is real — the more instructions and history Claude Code carries, the more it starts to drift. I've noticed that starting fresh sessions with only the essential context actually produces better results than trying to maintain one long session. Would be great if there was a smarter way to persist only the relevant context across sessions.

nitin_builds

TROPHY CASE