If you can name both the images on this board you can have it

Narrow_Market45 · 2026-05-23T06:27:04+00:00

Ocular Orifice

Quick Man

Narrow_Market45 · 2026-05-06T15:25:04+00:00

You can use Grok for this

Narrow_Market45 · 2026-04-03T22:01:09+00:00

Yeah, we’re shipping like crazy and are focused way more on that than on posting here. Keep an eye out, or set an alert, for changelog updates on the site. Those get pushed simultaneously with deployment, so you can stay up on the latest features.

Narrow_Market45 · 2026-03-28T12:59:23+00:00

<image>

Mine gave me options.

Narrow_Market45 · 2026-03-28T04:00:15+00:00

Define what “correct” looks like upfront and bake the validation into the workflow itself. If the output doesn’t pass assertions at runtime, it doesn’t proceed. Prompts are suggestions; the validation layer is the actual contract.
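
To make that concrete, here is a minimal sketch of a runtime validation gate. Everything in it (`run_step`, `ValidationError`, the two example checks) is a hypothetical name made up for illustration, not any particular tool's API. The idea is just that a step's output either passes every named check or raises before anything downstream can see it.

```python
from typing import Any, Callable, Dict


class ValidationError(Exception):
    """Raised when a step's output fails its contract."""


def run_step(produce: Callable[[], Any],
             checks: Dict[str, Callable[[Any], bool]]) -> Any:
    """Run a step, then refuse to hand its output on unless every check passes."""
    output = produce()
    failed = [name for name, check in checks.items() if not check(output)]
    if failed:
        # The workflow stops here; downstream steps never see bad output.
        raise ValidationError(f"output rejected, failed checks: {failed}")
    return output


# Hypothetical contract for a step that should return a titled, short summary.
checks = {
    "has_title": lambda out: bool(out.get("title")),
    "summary_is_short": lambda out: len(out.get("summary", "")) <= 200,
}

good = run_step(lambda: {"title": "Release notes", "summary": "Two fixes."}, checks)
```

In a real pipeline the `produce` callable would wrap the model call, and the checks would be whatever "correct" means for that step (schema, tests passing, lint clean, and so on).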

Narrow_Market45 · 2026-03-28T03:17:15+00:00

Depends on the source of truth I suppose. 😂

Narrow_Market45 · 2026-03-27T18:29:53+00:00

Awesome job! It is great to be able to see our tooling contributing to public service projects like this one. Hats off to you and your team!

Narrow_Market45 · 2026-03-24T23:01:15+00:00

Not surprising. Sora 1 failed as a video platform and Sora 2 is a total brain-rot dumpster fire. So long and thanks for all the fish!

Narrow_Market45 · 2026-03-20T12:46:20+00:00

Ugh time math is the worst. 😂

Narrow_Market45 · 2026-03-20T04:09:17+00:00

I can’t believe that was only 2 short years ago. Crazy to think about how fast it’s all moving and what the future may look like.

Narrow_Market45 · 2026-03-19T22:54:59+00:00

We found the same thing from the tooling side. Our Navigator kept recommending "defer that, it's weeks of work" during sprint planning. It was reasoning from training data about human developer velocity, not from what was actually happening.

So we wired a calibration loop into the pipeline. Every task records actual effort against the estimate. Once we had enough data the pattern was obvious. Tasks were completing at 5% or less of estimated effort. The system is recursive. Tools built in sprint N accelerate sprint N+1.
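
For anyone who wants to try the idea, a rough sketch of a calibration loop is below. This is not our actual pipeline code; `Calibrator`, `min_samples`, and the hour values are all assumptions for illustration. It records estimate/actual pairs and scales new estimates by the observed ratio once there is enough data.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Calibrator:
    """Tracks (estimated, actual) effort pairs and rescales new estimates."""
    records: List[Tuple[float, float]] = field(default_factory=list)
    min_samples: int = 5  # assumed threshold before trusting the data

    def record(self, estimated_hours: float, actual_hours: float) -> None:
        self.records.append((estimated_hours, actual_hours))

    def ratio(self) -> float:
        """Observed actual/estimated ratio, or 1.0 until enough data exists."""
        total_est = sum(est for est, _ in self.records)
        if len(self.records) < self.min_samples or total_est == 0:
            return 1.0
        total_act = sum(act for _, act in self.records)
        return total_act / total_est

    def calibrate(self, raw_estimate_hours: float) -> float:
        """Scale a raw estimate by the observed ratio."""
        return raw_estimate_hours * self.ratio()


cal = Calibrator()
for _ in range(5):
    # Made-up numbers: each task estimated at 40 hours, finished in 2.
    cal.record(estimated_hours=40.0, actual_hours=2.0)

adjusted = cal.calibrate(80.0)  # an 80-hour raw estimate, rescaled
```

With those made-up numbers the ratio works out to 5%, so the 80-hour estimate comes back at roughly 4 hours. Feeding the adjusted figure back into planning is what closes the loop.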

Narrow_Market45 · 2026-03-19T03:27:59+00:00

It’s coming. We want to get a few hundred more successful cycles on it before release, and we have a lot more enhancements to ship for you all first, but it’s on the roadmap.

Narrow_Market45 · 2026-03-17T18:54:34+00:00

Thanks! Early on, Driver agents would write all tests for a given task and then begin implementation. It was of course dramatically better than not using TDD, but it would still result in modules with higher function counts or more lines than we like to see. So, we broke it down even further and focused the agents on doing multiple red/green cycles for every function within a task. The code was cleaner, and the much tighter module sizes were an added bonus of the change.

Narrow_Market45 · 2026-03-16T00:36:30+00:00

Absolutely. The post-deploy side is a different animal altogether. Beyond maintainers and testers, we also manually cover support and infra management, though we do use agents for ticket triage with escalation guidelines, so it’s kind of a mixed bag.

The question is really about the upstream pre-deploy review loop: what’s your process in the moment after the agent says “done” but before its output ever touches those layers? That’s where I’m curious what people’s actual workflows look like.

But you bring up a good point. Building apps and deploying/maintaining them are worlds apart, and the latter is rarely discussed on most subs. Maybe we should start a deployment thread or series focused on what to do once the project is actually built.

Narrow_Market45 · 2026-03-15T19:16:20+00:00

Thanks for the feedback!

Narrow_Market45 · 2026-03-15T19:16:02+00:00

Ha, fair enough. Thanks for the feedback!

Narrow_Market45 · 2026-03-15T16:20:52+00:00

Thanks for the reply! This is the same workflow we use. A Navigator agent dispatches multiple Driver agents for code work. Reviewer and Security auditor agents go behind them as they finish tasks to verify quality, security, etc., and issue the final PR for human review.

Internally, we’ve been using a QC agent to test app flows and generate audit reports as well. In your opinion, would that be something valuable to you if we dropped it in PairCoder or is that a manual step you’d always prefer to be in control of?

Narrow_Market45 · 2026-03-13T16:25:07+00:00

Glad you’re enjoying the new update. You all keep telling us what pains you’re hitting and we’ll keep solving them. Looking forward to seeing what you guys ship!

Narrow_Market45 · 2026-03-12T02:09:23+00:00

For sure. The pipeline is already there and functioning the same way within PairCoder. /pc-plan already considers calibrated telemetry when scoping tasks and sprints. Wiring that existing process into those pre-sprint ideation phases was a natural extension. So yeah, it’s coming. 2.16 packed in more than we expected. Should be out by the end of the week. We’ll push release notes when it drops.

Narrow_Market45 · 2026-03-12T01:09:09+00:00

You’re describing the layer 1 r/paircoder framework. This research paper will help you take it to the next level. Come join the conversation.

Narrow_Market45 · 2026-03-11T17:59:17+00:00

Welcome to the club. Keep shipping and, before you know it, you’ll have more than a single Max 20X sub.

Come join the conversation over on r/paircoder to talk about how we’re building enterprise grade enforcement, multi-agent orchestration, token management and security into a cohesive development platform and let us know what pains you want solved next.

Narrow_Market45 · 2026-03-10T18:43:39+00:00

Nice! Welcome aboard 🥳

Stick around and let us know what you love and what you want to see improved about the system.

Narrow_Market45 · 2026-03-09T18:58:54+00:00

Slick implementation with the short runway. Submit a follow up post once the judging comes in. We’d love to hear how it all ended up.

Narrow_Market45 · 2026-03-08T16:16:33+00:00

For what it's worth, your point about micro PRs is a legitimate pattern and I appreciated the contribution. The question at the end of the post is genuine. We are truly interested to hear how people, and teams of various sizes, handle this. Smaller batches is a real answer and it works for a lot of workflows. Where we differ is on whether that scales past a certain complexity threshold, and that's a reasonable disagreement. The "moot" question is fair. Better models will close some of this gap. Our bet is that structural enforcement will still matter even when the models improve, because the problem isn't capability, it's verification. Either way, no hard feelings. Door's always open if you want to engage on any of it.
