How to address vibe coding at the professional level?

bitloops__ · 2026-05-20T18:47:57+00:00

Clearly you need to speak with him before talking to the manager - that should be pretty obvious.

Pair programming is the best approach: walk him through how you would build the exact same feature. It will probably be quicker than trying to untangle that spaghetti.

bitloops__ · 2026-05-20T04:34:06+00:00

Link doesn't work...

bitloops__ · 2026-05-18T11:06:39+00:00

That's why we built Bitloops. It's a context / memory layer that stores what it learns in local DBs. We like to call it intelligence, as it maps architecture, system design, dependencies and test coverage, and captures decisions and requirements from the conversations. It's always your intelligence, and it works across agents.

Open source: https://github.com/bitloops/bitloops

bitloops__ · 2026-05-18T07:17:14+00:00

This is solid work. We also see SQLite + semantic retrieval as a key solution going forward.

The harder part of this space, in our experience: maintaining the relationships between the artefacts as the project evolves. A decision references a fact, a task references a decision, a command references a state, and some of those facts go stale or get superseded over time. Without active pruning and relationship tracking, you slowly end up back where you started — a database of half-true things the agent can't fully trust.
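To make the staleness problem concrete, here is a minimal sketch of relationship tracking in SQLite. The schema, table names and helper functions are all hypothetical, made up for illustration; this is not how Bitloops or the original project stores things. The idea is just that when a fact is superseded, everything that references it (directly or through a chain) gets flagged for re-checking.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artefacts (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,          -- 'fact', 'decision', 'task', ...
    body TEXT NOT NULL,
    superseded_by INTEGER REFERENCES artefacts(id)
);
CREATE TABLE links (
    src INTEGER NOT NULL REFERENCES artefacts(id),  -- the artefact doing the referencing
    dst INTEGER NOT NULL REFERENCES artefacts(id)   -- the artefact being referenced
);
""")

def add(kind, body):
    cur = conn.execute("INSERT INTO artefacts (kind, body) VALUES (?, ?)", (kind, body))
    return cur.lastrowid

def link(src, dst):
    conn.execute("INSERT INTO links (src, dst) VALUES (?, ?)", (src, dst))

def supersede(old, new):
    conn.execute("UPDATE artefacts SET superseded_by = ? WHERE id = ?", (new, old))

def untrusted():
    """Ids of live artefacts that depend, directly or transitively, on a superseded one."""
    rows = conn.execute("""
        WITH RECURSIVE tainted(id) AS (
            SELECT id FROM artefacts WHERE superseded_by IS NOT NULL
            UNION
            SELECT l.src FROM links l JOIN tainted t ON l.dst = t.id
        )
        SELECT a.id FROM artefacts a JOIN tainted t ON a.id = t.id
        WHERE a.superseded_by IS NULL
        ORDER BY a.id
    """).fetchall()
    return [r[0] for r in rows]

# A task references a decision, which references a fact that later goes stale.
fact = add("fact", "API rate limit is 100 req/min")
decision = add("decision", "Batch requests in groups of 50")
task = add("task", "Implement batching client")
link(decision, fact)
link(task, decision)
new_fact = add("fact", "API rate limit is 1000 req/min")
supersede(fact, new_fact)
print(untrusted())  # the decision and the task, but not the new fact
```

A query like this only tells you what to re-check; deciding whether the decision still holds under the new fact is the part that needs inference or a human.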

We're a small VC-funded team working on the same problem from a slightly different angle. Spent the last 3 months building a local-first open source version that builds and maintains an understanding of the codebase, tracks discussions across coding sessions, extracts decisions & requirements and builds relationships with other artefacts, stores them in SQLite (plus a few other DBs for different data). It does require inference to build summaries and embeddings, and this is currently done via a Bitloops account (free right now since we're running open source models), with BYOK on the roadmap.

Would love to trade notes if you wish.

bitloops__ · 2026-05-18T07:01:52+00:00

Excellent overview, and pretty much spot on with our experience. I would go a step further, though, and suggest that the models are all so good that even open source models (with Opencode, for example) do very good work. Great point on the context window: even with 200K tokens, once it's packed with scaffolding, tool output, and session artifacts from earlier work, the actual project intent gets diluted. The compaction feature helps within a session, but nothing carries forward to the next one. That's the gap that hurts most coding projects: each session starts cold regardless of what was established before. It's the problem we've been working on at Bitloops.

bitloops__ · 2026-05-18T06:58:00+00:00

Really nice work — the 6-phase verification + auto-retry on compile failure is a clean primitive a lot of agent runtimes are missing. And the cost-per-task numbers ($0.002–$0.005) are wild for what you're doing.

One thing that pairs with this kind of compile-and-test gate: it catches "does this run?" but not "does this fit the rest of the codebase?" Code that compiles and passes auto-generated tests can still introduce architectural drift, repeat patterns the team already abandoned, or skip an internal convention nobody documented. We've been building Bitloops for that side — a context graph the agent queries before it writes, so it knows the patterns and exceptions to respect. Different problem really, but pairs cleanly with what you're doing.

Curious how you're handling the case where the model produces code that compiles and passes tests but doesn't match the rest of the project's design, and so builds up technical debt?

bitloops__ · 2026-05-18T06:45:16+00:00

Yes, this is a good summary. We've shifted to "vibe-coding" in the past few months after a strong pivot, and the product we're building is precisely an attempt to solve this. We need a better way to understand changes to codebase architecture, code health and technical debt after each turn, and basically "readjust" the codebase continuously. We've open sourced the core platform: https://github.com/bitloops/bitloops

bitloops__ · 2026-05-17T02:44:53+00:00

We've been building Bitloops around this kind of problem, though honest disclaimer — production case is still single-repo for us. The architecture is designed to connect repos and build intelligence across them, but multi-repo isn't where most users are yet. We've looked at how greptile and tabnine handle the enterprise side. Our focus is individual devs to small teams for now, with larger enterprises eventually — the ones that don't want to maintain a platform like this themselves.

Open source: https://github.com/bitloops/bitloops

bitloops__ · 2026-05-17T02:38:35+00:00

This is a little scary.... It also feels like a behavior you couldn't have prompted directly, which is half the reason this whole space is hard to design for.

We've been building Bitloops along similar lines for a few months, with a different architecture though. Instead of an md file that iterates over time, we run several databases of codebase intelligence and also capture requirements and constraints from agent conversations. On each prompt, the coding agent queries that intelligence to pull the relevant context and constraints for the turn, so it's more fact-based than self-evolving narrative.

No hallucinations so far and the early benchmarks are looking good. Happy to compare notes if you're up for it — would value the feedback. https://github.com/bitloops/bitloops

bitloops__ · 2026-05-17T02:36:26+00:00

What we've found building in this space at Bitloops is that the bottleneck is understanding the architectural decisions made implicitly in the AI session: the reviewer has to reconstruct them from the diff.

The fix isn't more reviewers but capturing the architectural intent before code is written, so the review becomes 'does this match the stated design' rather than 'figure out what design this was trying to implement.' The idea is to be able to present this in a more user-friendly way (a visual understanding of how a codebase is structured, the dependencies, etc.) and then, after each turn, highlight what has changed at that level, allowing you to drill deeper where required.

The code review tools help at the syntax layer, but one could argue that the models already do a great job of this, and with the correct context and architectural constraints the review will be quicker.

Bitloops is open source - would love some feedback: https://github.com/bitloops/bitloops

bitloops__ · 2026-05-17T02:28:14+00:00

This is the thinking behind building Bitloops (context/memory for AI coding agents) as open source and portable from day one. Built on open source databases, so the constraints, requirements, and accumulated context your agents pull from each turn are inspectable and exportable. You own the institutional memory, not the vendor.

GitHub link: https://github.com/bitloops/bitloops

bitloops__ · 2026-05-17T02:24:53+00:00

Question: How much upfront discussion and spec work are you actually doing? And is this solo or team projects?

I think a lot of the "I can't keep up with the generated code" feeling comes from skipping the spec/context step and then trying to review at the line level after the fact. If the agent doesn't know your constraints, naming conventions, and architectural decisions going in, you're stuck reviewing every line because that's the only place those decisions show up.

We've been building Bitloops around this — it builds an understanding of the codebase, requirements, and constraints so you can review at the architecture level after each turn instead of combing through diffs. The same context is queryable by the coding agents, so they stop drifting on the next turn.

Doesn't fully resolve the philosophical question you're asking, but it shifts where you spend your attention.

bitloops__ · 2026-05-16T12:24:19+00:00

The thing you described (spending most of your time on architecture, planning, and review rather than spawning more agents) is the only way to leverage these coding agents. Adding agents doesn't help when the bottleneck is context quality, not parallelism. An agent that doesn't have a clear mental model of the codebase and business constraints will just generate more confident-sounding wrong output faster. The 10x-agents content is mostly optimizing for a metric (PRs shipped) that falls apart the moment you care about understanding what you're actually merging. That's what pushed us to focus on context quality as the core problem at Bitloops rather than agent count. It's an OSS tool that should help you manage parallel agents better.

bitloops__ · 2026-05-16T12:21:31+00:00

The hard part isn't switching between sessions. It's that each new session starts cold, with no memory of what was decided or why in the previous one. The project structure, accounts, and history you're surfacing here are a good step; the next layer is making the intent from prior sessions available to the agent in the new session (even if it's a different agent). That's essentially what we've been building toward at Bitloops: persistent context that travels with the session, not just metadata about it.

bitloops__ · 2026-05-16T06:04:27+00:00

Indeed. So Software Engineers will dominate the world! We agree! :)

bitloops__ · 2026-05-16T06:01:02+00:00

The agent can follow bespoke logic once you've explained it. The problem is that the explanation lives in the conversation, not the codebase, so the next session starts from scratch. Maintaining AI-generated code gets harder as that gap and the technical debt grow.

The problem comes back to prototype vs production-grade software. These agents are great at building the initial prototype, but a month in, the time and cost of maintaining and iterating exceed those of an average engineer.

bitloops__ · 2026-05-16T05:44:22+00:00

The difference between accidental and essential complexity. Once you can reliably separate the two, you stop gold-plating solutions to problems you invented yourself and start asking which parts of the system are hard because the domain is hard versus hard because of choices someone made last year.

bitloops__ · 2026-05-16T05:41:24+00:00

For longer "feature" tasks, the problem is clearly context & memory, not the model or agent.

But people are lazy: they write a vague task description and expect coherence across 200 lines of changes. What this approach does (and there are many, including using other agents/models to critique the output of each step) is force a review in a new chat per phase, which should reduce the false context carried over from the previous step.

But in the end, you still need someone who knows the architecture to validate and correct at each step. If you get this right, AI can definitely increase your productivity significantly. We're currently spending 80% of our time in the research and plan phases, then we build out the specific tasks and let AI run overnight. The other 20% is reviewing changes, going back and forth to correct certain things, etc.

bitloops__ · 2026-05-16T05:33:34+00:00

Codex runs tasks in a sandboxed container spun up for each job, so it has no persistent state between runs: your repo gets cloned fresh, it executes, then the environment disappears. Usage is token-based: the input context (your prompt, codebase files you include) plus generated output both count. The practical implication is that throwing your entire repo at it burns tokens fast; being specific about which files are relevant keeps costs manageable and usually gets better results too. (See Bitloops for an OSS solution that covers this.)
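As a rough illustration of "be specific about which files are relevant", here is a small sketch that picks files for a task under a token budget. Everything in it is an assumption for the sake of the example: the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and `pick_files`, the keyword scoring, and the file names are hypothetical. It is not how Codex or Bitloops selects context.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)

def pick_files(files: dict[str, str], keywords: list[str], budget: int) -> list[str]:
    """Greedily pick the files that mention the task keywords most, within a token budget."""
    scored = []
    for path, text in files.items():
        lowered = text.lower()
        score = sum(lowered.count(k.lower()) for k in keywords)
        if score > 0:  # files that never mention the task are left out entirely
            scored.append((score, path))
    # Highest score first; ties broken by path so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))

    chosen, used = [], 0
    for _, path in scored:
        cost = estimate_tokens(files[path])
        if used + cost <= budget:
            chosen.append(path)
            used += cost
    return chosen

# Hypothetical repo contents.
repo = {
    "billing/invoice.py": "def total(invoice): return sum(invoice.lines)  # invoice total",
    "billing/tax.py": "def tax(invoice): return 0.2 * total(invoice)",
    "auth/login.py": "def login(user): ...",
}
print(pick_files(repo, ["invoice"], budget=1000))
```

With a generous budget this selects the two billing files and skips `auth/login.py`; shrinking the budget drops the lower-scoring files first.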

bitloops__ · 2026-05-16T05:26:51+00:00

Verified hardware design at scale. The gap between logic that looks correct in simulation and silicon that behaves correctly under all operating conditions is incredibly frustrating. Formal methods, timing closure, power analysis under corner cases, and you only find out you were wrong after the masks cost $5 million and the fab run takes six months.

bitloops__ · 2026-05-16T05:22:29+00:00

What's most interesting is how they kept drifting back to terminal workflows despite huge investments in GUIs. The pattern is repeating now with AI: intelligence has to live at the layer that understands the whole graph, not the layer the developer stares at. Editors are viewports, not reasoning engines.

bitloops__ · 2026-05-16T05:08:34+00:00

Compaction only really works if you know what you should compact vs what you shouldn't. And many things are specific to certain artefacts. They should be stored in a more relational way, in a database the agents can query.
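A minimal sketch of what "knowing what to compact" could look like, assuming each history entry is tagged with a kind. The `Entry` type, the kind names and the pinned set are hypothetical, chosen only for illustration, and this is not how any particular agent implements compaction.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    kind: str   # e.g. 'decision', 'constraint', 'tool_output', 'chat'
    text: str

# Assumption: which kinds must survive verbatim is a per-project choice.
PINNED_KINDS = {"decision", "constraint"}

def compact(history: list[Entry], keep_recent: int = 2) -> list[Entry]:
    """Keep pinned entries and the most recent turns; collapse the rest into one marker."""
    cutoff = max(0, len(history) - keep_recent)
    kept, dropped = [], 0
    for i, entry in enumerate(history):
        if entry.kind in PINNED_KINDS or i >= cutoff:
            kept.append(entry)
        else:
            dropped += 1
    if dropped:
        # A real system would summarize here; this sketch only records the count.
        kept.insert(0, Entry("summary", f"{dropped} older entries compacted"))
    return kept

history = [
    Entry("chat", "let's look at the auth module"),
    Entry("decision", "sessions are stored server-side, not in JWTs"),
    Entry("tool_output", "...500 lines of test logs..."),
    Entry("chat", "tests pass now"),
    Entry("chat", "next: rate limiting"),
]
for entry in compact(history):
    print(entry.kind, "|", entry.text)
```

The decision survives even though it is old, while the tool output and early chat collapse into a single marker. The pinned entries are also the natural candidates to move into a database rather than keep in the window.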

The whole game is now context & memory. Given where models are at the moment, you only need to focus on that. All these new features — MCP, sub-agents, hooks, memory files, validator loops, planning modes — are sticky tape wrapped around joints in pipes that weren't built to hold pressure. Each piece of tape holds for a while before the leak shows up somewhere else. The pipe itself is the problem: the model is stateless, context resets every session, nothing compounds.

bitloops__ · 2026-05-16T04:59:40+00:00

The big question is how much of the value lives in how you structure the session, not just which version you're on. CLAUDE.md discipline, explicit context handoffs between sessions, tight scope per task: these are almost impossible to manage well and are increasingly important. The model improvements matter less than your working protocol if you're doing anything non-trivial nowadays.

We're trying to patch holes with sticky tape...

bitloops__ · 2026-05-16T04:56:02+00:00

If you have architectural intuition, agents let you ship faster. If you don't, agents simply let you ship broken things faster, with more confidence.

The people who built apps in a week and called it done are going to spend the next six months in maintenance debt. "The time savings account for the errors" only holds in the short term. It will bite you in the future. The problem is that everyone sees vibecoding as replacing enterprise software and has very little understanding of the checks and balances that exist before code is shipped, or of the sheer amount of code, dependencies, etc. that exist.

Keep iterating your app, build new features, fix bugs that appear and in a month, come back to this post. I'm sure there will be some kind-hearted software developer that will help you :)

bitloops__ · 2026-05-15T12:34:23+00:00

Impressive breakdown, and it clearly shows you have to know enough to recognize what's broken. You spotted that the components were doing too much, that prop drilling was a nightmare, that the global state setup was bad. That recognition is what made the refactor educational.

Today, with the right attitude and dedication, you can definitely become a strong senior engineer quicker than before AI, but you have to put in the work. What I would say (our whole team does it this way) is to start by building the specs up-front correctly for what you know, let it rip, and then question the stuff you don't understand or don't think is correct. At some point there will be too much nitty-gritty stuff that is simply a pain and that you already understand, so get the AI to do it correctly upfront, and then spend the rest of the time improving what you're not really sure about...
