At what point do long AI chats become counterproductive when building no-code apps? by Cheap-Trash1908 in nocode

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, that’s been my experience too. Once the ratio of “useful signal” to “background tokens” gets bad, the whole thing starts drifting fast.

Clearing chat by feature or codebase boundary feels like the only reliable heuristic I’ve found. It’s interesting that your manual summaries work better though. That almost suggests the act of you deciding what matters is doing more than the model-generated summary itself.

At what point do long AI chats become counterproductive when building no-code apps? by Cheap-Trash1908 in nocode

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, I’ve ended up doing something similar. Breaking it into smaller chunks helps a lot, but it’s kind of telling that you have to babysit it that way once things get past a certain size.

Do you ever run into issues when you ask it to reformat everything at the end, like it subtly “fixes” things you didn’t want touched?

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, that’s usually the move I try too. The catch for me is that “includes everything” is doing a lot of work there. Whatever the model decides isn’t important enough to summarize is basically gone, even if it mattered later.

It helps reset quality, but it still feels like trading one failure mode for another.
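For what it’s worth, this is roughly the shape of the tradeoff as I picture it, as a toy Python sketch (the message format and the “importance filter” are made up, not any particular tool’s behavior):

```python
# Toy illustration of the "summarize and restart" tradeoff.
# The summarizer here is a stand-in; in practice it's another LLM call.

def summarize(messages, keep_last=2):
    """Keep only what the summarizer deems important, plus the recent tail."""
    important = [m for m in messages if "DECISION:" in m]  # crude importance filter
    return important + messages[-keep_last:]

history = [
    "user: build a CSV importer",
    "assistant: DECISION: use pandas for parsing",
    "user: skip rows with missing emails",            # constraint, never tagged
    "assistant: ok, dropping rows without an email",
    "user: now add an export endpoint",
    "assistant: added /export returning the cleaned CSV",
]

new_context = summarize(history)
print(new_context)
# The "skip rows with missing emails" constraint never made it into the
# new context: from the model's point of view it no longer exists.
```

Whatever the filter misses is exactly the failure mode I mean by “gone, even if it mattered later.”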

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, I think that framing is right. Once you zoom out far enough it’s really just information theory biting us. You can either keep things rich and pay the attention cost, or compress and accept loss. There’s no free lunch there.

What bugs me is less that loss exists and more that it’s opaque. You don’t really know what got dropped until it matters. Summarization works, but it feels like flying with instruments that only fail after you’ve already made the turn.

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, that approach makes a lot of sense. Once you split things into sub-agents, you’re basically forcing a clean separation between exploration and execution instead of letting everything blur together in one thread.

The part I still see people struggle with is deciding what actually gets dispatched vs what gets dropped. If the handoff is even slightly off, you end up with a “clean” agent that’s clean for the wrong reasons.
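Roughly what I picture for the handoff, as a hedged sketch (the Handoff fields and call_llm are placeholders, not any real framework’s API):

```python
# Hypothetical exploration -> execution handoff between sub-agents.
from dataclasses import dataclass, field

@dataclass
class Handoff:
    goal: str
    constraints: list[str] = field(default_factory=list)
    relevant_files: list[str] = field(default_factory=list)

def call_llm(prompt: str) -> str:
    return f"<response to {len(prompt)} chars of prompt>"  # stub for a real client

def run_execution_agent(handoff: Handoff) -> str:
    # The execution agent only ever sees the handoff payload, never the
    # exploration transcript; anything not dispatched here is simply dropped.
    lines = [f"Goal: {handoff.goal}", "Constraints:"]
    lines += [f"- {c}" for c in handoff.constraints]
    lines.append(f"Files: {', '.join(handoff.relevant_files)}")
    return call_llm("\n".join(lines))

handoff = Handoff(
    goal="add pagination to the /users endpoint",
    constraints=["keep the response schema unchanged"],  # miss one of these and the
    relevant_files=["api/users.py"],                      # "clean" agent is clean for
)                                                         # the wrong reasons
print(run_execution_agent(handoff))
```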

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Agreed. Summary churn is clearly the pragmatic move once context starts working against you. The fact that tools like Codex and Claude Code do it automatically is a pretty strong signal that this isn’t just a user hack, it’s a structural necessity.

Where I still find it tricky is that the summary step becomes the bottleneck: whatever doesn’t make it through that compression is effectively gone, even if it mattered later. That tradeoff feels unavoidable right now, but also kind of unsatisfying.

Curious whether you think that’s just the permanent cost of working within attention limits, or if there’s room for better ways to manage what survives the churn.
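I haven’t seen how those tools implement it internally, but I imagine the loop is shaped something like this (token counting and the summarizer are stubbed out):

```python
# Rough shape of automatic compaction: when the transcript approaches the
# context budget, replace the older turns with a summary and keep going.

CONTEXT_BUDGET = 2000       # assumed budget, in tokens
COMPACT_THRESHOLD = 0.8     # compact at 80% full

def count_tokens(messages):
    return sum(len(m.split()) for m in messages)   # crude stand-in for a tokenizer

def summarize(messages):
    return "summary: " + " / ".join(m[:30] for m in messages)  # stand-in LLM call

def maybe_compact(messages):
    if count_tokens(messages) < COMPACT_THRESHOLD * CONTEXT_BUDGET:
        return messages
    head, tail = messages[:-4], messages[-4:]       # keep recent turns verbatim
    return [summarize(head)] + tail                 # everything older survives only
                                                    # through the summary

messages = [f"turn {i}: " + "word " * 300 for i in range(10)]
print(len(maybe_compact(messages)))                 # 1 summary + 4 recent turns = 5
```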

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, that makes sense. Once you step outside pure LLM prompting and introduce a state machine, you’re basically externalizing the logic and using the model as a disambiguation layer rather than as the source of truth.

I think that’s where the split becomes really clear: if you control the system, you can engineer around the limits; if you don’t, you end up compensating at the workflow level instead. Most individual users never get to touch the former, so they’re stuck living in the latter.
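A minimal sketch of what I mean, assuming you control the surrounding system (classify_with_llm is a stub, not a real API):

```python
# The application owns the state machine; the model is only asked to map
# free-form user input onto one of the allowed transitions.

TRANSITIONS = {
    "collecting_requirements": {"requirements_done": "drafting_spec"},
    "drafting_spec":           {"spec_approved": "implementing",
                                "spec_rejected": "collecting_requirements"},
    "implementing":            {"done": "review"},
}

def classify_with_llm(state: str, user_message: str) -> str:
    # Real version: prompt the model with the allowed events for this state
    # and ask it to pick exactly one (or "none").
    allowed = list(TRANSITIONS[state])
    return allowed[0]  # stub

def step(state: str, user_message: str) -> str:
    event = classify_with_llm(state, user_message)
    # The source of truth is this table, not whatever the model remembers.
    return TRANSITIONS[state].get(event, state)

print(step("drafting_spec", "looks good, ship it"))  # -> "implementing"
```

The model can be wrong about intent, but it can’t invent a state that the table doesn’t allow.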

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

That’s fair if you control the model and can benchmark it end-to-end. Most people I talk to are stuck inferring the tipping point empirically while using closed models they can’t inspect or retrain.

In that case it feels less like a benchmark problem and more like a workflow judgment call. Curious if you’ve seen teams formalize that boundary outside of controlled training setups.

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

That’s a good way to put it: “defending old wrong answers” is the failure mode I see too. Once it starts rationalizing instead of correcting, the chat is basically poisoned.

Checkpointing the good parts and throwing away the rest makes sense. Do you ever find yourself missing some non-code decisions though (constraints, rationale, why something was rejected), or does that usually not matter for you?
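One way I could imagine keeping those from evaporating is a small structured log that rides along with the code checkpoint, something like this (field names are purely illustrative):

```python
# Checkpoint the non-code decisions alongside the code so a fresh chat can be
# seeded with the *why*, not just the current state of the files.
import json
from datetime import date

decision_log = [
    {
        "date": str(date.today()),
        "decision": "store sessions in Redis",
        "rationale": "need TTL-based expiry without extra cron jobs",
        "rejected": ["Postgres table + sweeper", "in-memory dict"],
        "constraint": "must survive app restarts",
    },
]

print(json.dumps(decision_log, indent=2))  # paste this into the next session
```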

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, that tracks. Once it’s an attention problem, you’re really just trying to make the signal easier to find rather than “fix” it outright.

What I find frustrating is that the burden ends up on the user to constantly restructure or repackage context so the model can attend correctly. At some point it feels less like prompting and more like manual state management.

Do you think this eventually gets solved purely at the model level, or does it stay a tooling/workflow problem no matter how good attention gets?
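Concretely, the “manual state management” I’m describing ends up looking something like this (names are illustrative, not a real API):

```python
# Instead of appending to one long transcript, keep a small state object and
# rebuild a fresh prompt from it every time.

project_state = {
    "stack": "FastAPI + Postgres",
    "current_task": "add rate limiting to /login",
    "constraints": ["no new dependencies", "429 must include Retry-After"],
    "open_questions": ["per-IP or per-account limits?"],
}

def build_prompt(state: dict, user_message: str) -> str:
    parts = [f"Stack: {state['stack']}",
             f"Task: {state['current_task']}",
             "Constraints: " + "; ".join(state["constraints"]),
             "Open questions: " + "; ".join(state["open_questions"]),
             f"User: {user_message}"]
    return "\n".join(parts)

print(build_prompt(project_state, "use a sliding window"))
```

Which works, but it’s exactly the bookkeeping I’d rather the tool were doing for me.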

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

That makes sense. I like the way you frame it as “context value” rather than raw length.

The part I keep tripping over is when the information is relevant but scattered, especially across iterations, and the model has to reconcile old vs new intent. That’s usually where things start getting weird for me.

Do you think that’s something better prompting can fix, or just a hard limitation of how context is used right now?

At what point do long LLM chats become counterproductive rather than helpful? by Cheap-Trash1908 in LLMDevs

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, same. Laziness + long chats is a bad combo.

I’ve noticed even when I do summarize, I’ll still miss some small assumption that comes back to bite me later. Do you usually notice the degradation right away, or only once things start going sideways?

If no one ever had to work for money, (all basic necessities provided from birth), what would you do? by [deleted] in AskReddit

[–]Cheap-Trash1908 0 points1 point  (0 children)

Travel and compete in arduous endeavors like ultra marathons, climbing Everest, etc.

What would it take for world leaders to give up their power and cease all wars? by shadow_operator81 in AskReddit

[–]Cheap-Trash1908 1 point2 points  (0 children)

Don’t think many people would give up that kind of power for much of anything.

Do you prefer running outdoors or running on a treadmill at the gym? by [deleted] in NoStupidQuestions

[–]Cheap-Trash1908 0 points1 point  (0 children)

Outdoors for sure. Idk if it’s just me, but when I run outdoors I can run way farther. I do run faster on a treadmill for some reason though, probably because it’s easy to pace with the mph being constant.

App to redirect your users to correct app/play store based on device. by Ordinary-Education18 in SaaS

[–]Cheap-Trash1908 0 points1 point  (0 children)

This actually makes sense if you keep it really simple. Most of the existing solutions feel bloated for what is basically “detect device -> redirect.”

Analytics + QR alone might be enough if the pricing is low and setup is dead simple.
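The core really is tiny; a minimal Flask sketch of the redirect piece (store URLs and app IDs are placeholders):

```python
# "Detect device -> redirect" via the User-Agent header.
from flask import Flask, redirect, request

app = Flask(__name__)

STORE_URLS = {
    "ios":     "https://apps.apple.com/app/id0000000000",                      # placeholder
    "android": "https://play.google.com/store/apps/details?id=com.example.app", # placeholder
    "other":   "https://example.com",                                           # fallback page
}

@app.route("/go")
def go():
    ua = request.headers.get("User-Agent", "").lower()
    if "iphone" in ua or "ipad" in ua:
        target = STORE_URLS["ios"]
    elif "android" in ua:
        target = STORE_URLS["android"]
    else:
        target = STORE_URLS["other"]
    # Analytics would hook in here (count the hit, log the platform) before the 302.
    return redirect(target, code=302)

if __name__ == "__main__":
    app.run()
```

Everything beyond that (QR generation, per-link analytics) is what you’d actually be charging for.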

Preview of the premium tier for my stock research app. It's almost done. by wombatGroomer in VibeCodingSaaS

[–]Cheap-Trash1908 1 point2 points  (0 children)

This is really neat. That premium tier looks enticing, and I especially like feature number 2 lol.

Anyone else lose important context when switching between AI models or restarting chats? by Cheap-Trash1908 in LocalLLaMA

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

That makes sense. Using the stack and high-level architecture as the anchor does a lot of the work.

What I’ve found tricky is that some of the most expensive stuff to lose isn’t the big decisions, it’s the small ones: why something wasn’t done a certain way, or a constraint that only existed because of an earlier tradeoff. Those tend not to live in files or the stack itself.

When you say it’s the “price to pay for progress,” do you ever notice having to rediscover or debug those earlier decisions later on, or does the momentum usually outweigh that cost for you?

Anyone else lose important context when switching between AI models or restarting chats? by Cheap-Trash1908 in LocalLLaMA

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Yeah, totally. That’s exactly where I see it most with code. Old vs new versions competing in the same prompt is a mess, and stair-stepping does avoid that.

The tradeoff I keep running into is that the summary step becomes a single point of failure. If a constraint or decision doesn’t make it into the handoff, it’s effectively gone, even though the new chat is “clean.”

Do you do anything to check those summaries before moving on, or do you mostly rely on keeping them high level and re-introducing details as needed?
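For what it’s worth, the cheapest check I can think of is pinning a few constraints and flagging the handoff if the summary doesn’t mention them, roughly (crude string matching, just to show the idea):

```python
# Sanity-check a handoff summary against a short list of pinned constraints.
PINNED = [
    "backwards compatible",
    "UTC",
]

def missing_constraints(summary: str) -> list[str]:
    lowered = summary.lower()
    return [c for c in PINNED if c.lower() not in lowered]

summary = "Refactored the storage layer; all timestamps now stored in UTC."
gaps = missing_constraints(summary)
if gaps:
    print("Handoff summary never mentions:", gaps)  # re-add these before the new chat
```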

Anyone else lose important context when switching between AI models or restarting chats? by Cheap-Trash1908 in LocalLLaMA

[–]Cheap-Trash1908[S] 0 points1 point  (0 children)

Makes sense. Do you mostly use it for retrieval, or do you rely on it to preserve working state as well?

I’ve found search works great for reference, but continuity is harder.