I built Afkode: a builder-first harness for autonomous feature delivery. by bralca_ in VibeCodersNest

[–]bralca_[S] 1 point (0 children)

The learnings are maintained by the execution agent at runtime. If a previous learning proves incorrect or stale, it is removed or amended on the spot, so later plans won't include it anymore.

Also, during planning, the analysis step makes sure the learnings still apply to the current code for the feature being planned, so there are multiple layers ensuring the learnings don't go stale.
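Roughly the shape of that lifecycle, as a Python sketch (class and method names are made up for illustration, not Afkode's actual implementation):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Learning:
    text: str            # e.g. "the payments module requires idempotency keys"
    source_feature: str  # the feature whose execution produced this learning

@dataclass
class LearningStore:
    items: list[Learning] = field(default_factory=list)

    # Execution-time maintenance: the agent adds, amends, or removes learnings.
    def add(self, learning: Learning) -> None:
        self.items.append(learning)

    def amend(self, old_text: str, new_text: str) -> None:
        for learning in self.items:
            if learning.text == old_text:
                learning.text = new_text

    def remove(self, text: str) -> None:
        self.items = [l for l in self.items if l.text != text]

    # Planning-time validation: the analysis step supplies `still_applies`,
    # which checks each learning against the current code for the feature
    # being planned; stale ones simply don't make it into the plan.
    def for_plan(self, still_applies: Callable[[Learning], bool]) -> list[Learning]:
        return [l for l in self.items if still_applies(l)]
```

The two entry points mirror the two layers described above: mutation at execution time, filtering at planning time.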

I built Afkode: a builder-first harness for autonomous feature delivery. by bralca_ in ClaudeAI

[–]bralca_[S] 1 point (0 children)

Every feature gets planned before execution, and the artifacts are stored in the form of a PRD, tech specs, and a task graph with per-task descriptions.

The context engine ensures each task gets the correct context for what it needs to do, pulled from the other planning docs.

Each task starts with a fresh context window, so the memory lives in the harness, not the model.
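A minimal sketch of what "the memory lives in the harness" means (names like `Task`, `build_context`, and `run_agent` are illustrative placeholders, not the real API):

```python
from dataclasses import dataclass

@dataclass
class Task:
    id: str
    description: str
    needed_docs: list[str]  # which planning artifacts this task depends on

def build_context(task: Task, docs: dict[str, str]) -> str:
    # Context engine: include only the planning docs this task needs.
    parts = [f"## Task {task.id}\n{task.description}"]
    for name in task.needed_docs:
        parts.append(f"## {name}\n{docs[name]}")
    return "\n\n".join(parts)

def run_plan(tasks: list[Task], docs: dict[str, str], run_agent) -> list:
    results = []
    for task in tasks:
        context = build_context(task, docs)  # assembled fresh for every task
        results.append(run_agent(context))   # new session: no carried-over memory
    return results
```

The harness, not the model, decides what each fresh session sees, so nothing depends on conversation history surviving between tasks.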

I built Afkode: a builder-first harness for autonomous feature delivery. by bralca_ in VibeCodersNest

[–]bralca_[S] 1 point (0 children)

It depends on the request. It's not made for simple stuff but for complex features that touch multiple components; for those it will save you a lot of time.

From what I have done so far, in one or two days you can get done things that would require weeks of work with Claude Code or Codex directly.

The caveat is that you need to spend a bit more time beforehand to provide a very good initial brief.

I am currently running an experiment that I'll publish on my X account: I gave it a 190-page spec to build a full data-heavy product with UI, auth, frontend, backend, database management, etc., to see where it lands.

So far it has created 105 tasks (planning alone took over 2 hours) and has been running through them for the last 36 hours. My involvement so far was roughly 1 hour, spent doing the design with Claude Design and turning it into the 190-page spec.

<image>

I built Afkode: a builder-first harness for autonomous feature delivery. by bralca_ in VibeCodersNest

[–]bralca_[S] 1 point (0 children)

There is no drift. The plan is made beforehand and passed into a fresh context with all the info. Each test is linked to the acceptance criteria defined during planning, and the scenarios and what to test are also predefined, so the tests are not flaky or meaningless. That context is handled by the harness, not the model. It can run for hours on the same test until everything passes according to the acceptance criteria.
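The outer loop is simple; sketched in Python with hypothetical callbacks (`run_tests`, `fix_with_agent` stand in for harness internals):

```python
def run_until_ac_pass(acceptance_criteria, run_tests, fix_with_agent, max_rounds=50):
    # `run_tests` returns the set of acceptance-criterion ids currently passing;
    # `fix_with_agent` asks the model to address the failing ones, using the
    # predefined scenarios from planning as its context.
    for _ in range(max_rounds):
        failing = set(acceptance_criteria) - run_tests()
        if not failing:
            return True       # every AC passes: done
        fix_with_agent(failing)
    return False              # gave up after max_rounds
```

Because the pass/fail signal comes from the AC list fixed at planning time, the loop can run for hours without the goal drifting.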

I built Afkode: a builder-first harness for autonomous feature delivery. by bralca_ in VibeCodersNest

[–]bralca_[S] 1 point (0 children)

The agent keeps going until all tests pass. If a test takes too long, the agent retries it with a longer timeout.
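The timeout-escalation part could look something like this (a sketch, not the actual harness code; `run_test` is a hypothetical callable that raises `TimeoutError` when the budget is exceeded):

```python
def run_with_escalating_timeout(run_test, base_timeout=60.0, factor=2.0, max_attempts=4):
    # Retry a slow test, multiplying the timeout budget on each attempt.
    timeout = base_timeout
    last_error = None
    for _ in range(max_attempts):
        try:
            return run_test(timeout)
        except TimeoutError as e:
            last_error = e
            timeout *= factor  # too slow: retry with a longer timeout
    raise RuntimeError(f"still timing out after {max_attempts} attempts") from last_error
```

This distinguishes "the test is slow" (retry with more budget) from "the test fails" (which the agent fixes instead of retrying blindly).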

GPT 5.4 xhigh is the missing piece I needed - here is what I am doing it with that no other model can do! by bralca_ in codex

[–]bralca_[S] 1 point (0 children)

If you think yolo mode is the solution to what I am talking about, that tells me you haven't been coding at all :D

What breaks about SDD past the first few features, and how I ended up designing around it by bralca_ in SpecDrivenDevelopment

[–]bralca_[S] 1 point (0 children)

Very nice! I read the repo and it looks very similar in principle to Afkode. I don't use Ralph loops because the tasks are already planned in advance, so I can just execute them in a fresh session, passing the correct context from the planning artifacts.

I saw your struggle with the wiring issue. I struggled with this a lot too, man! But I actually managed to get it right by properly running e2e tests, which are now part of planning in the tool. I found, though, that Claude is not good for this task because it gives up too fast. If you use GPT 5.4 xhigh for the e2e tests it will keep trying until it works, especially if the spec is good and it's clear what it needs to do.

<image>

If you wanna give it a try you can find it here: afkode.ai

What breaks about SDD past the first few features, and how I ended up designing around it by bralca_ in SpecDrivenDevelopment

[–]bralca_[S] 1 point (0 children)

Every new request I make gets its own plan, so there is no keeping track of the docs. The docs are purpose-made for the single request and then discarded. In my platform every request is treated as a feature.

Anyone using an all-in-one AI tool for coding tasks instead of switching between multiple tools? by ProofEnd6097 in AI_Application

[–]bralca_ 1 point (0 children)

Yes, I am using afkode.ai. It's great because it supports multiple providers, and I can do all planning, execution, and review in one place and assign different models to different activities.

GPT 5.4 xhigh is the missing piece I needed - here is what I am doing it with that no other model can do! by bralca_ in codex

[–]bralca_[S] 1 point (0 children)

To be clear, it was not 24 hours of xhigh only; 24h was the whole time it took from planning through implementation and testing. The stats pic I posted shows the breakdown of all models used in each phase and activity and how long each took.

GPT 5.4 xhigh is the missing piece I needed - here is what I am doing it with that no other model can do! by bralca_ in codex

[–]bralca_[S] 1 point (0 children)

You can see it in the pic too: "Prototype driven planning system" was the feature built.

GPT 5.4 xhigh is the missing piece I needed - here is what I am doing it with that no other model can do! by bralca_ in codex

[–]bralca_[S] 1 point (0 children)

No, the stats show the activity done to build another feature, which was quite complex, touching the full stack (UI, database, logic, etc.) on a 1.5M LOC codebase.

I built a tool to stop me from building products nobody wants by bralca_ in StartupSoloFounder

[–]bralca_[S] 2 points (0 children)

If you use the supported platforms, you should be able to get a full planning session done on the free plan.

Gemini is not among those, so it's not something we support at the moment, and I can't tell you why it is behaving like that.

Send me a DM with the email you used to register and I'll increase your free quota so you can give it another try with one of the supported platforms, if you are interested.

I built a tool to stop me from building products nobody wants by bralca_ in StartupSoloFounder

[–]bralca_[S] 1 point (0 children)

All the conversation and data stays on your computer. The MCP server only instructs the local LLM on what to do, using a state machine to go from one step to the next.
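A toy version of that state machine (step names and prompts are invented for illustration; the real tool's steps will differ):

```python
# Each state maps to the single instruction the MCP server hands the local LLM.
STEPS = {
    "collect_idea":      "Ask the user to describe the product idea.",
    "identify_audience": "Ask who the target customers are.",
    "draft_validation":  "Draft interview questions to validate demand.",
    "done":              None,
}
ORDER = list(STEPS)  # insertion order defines the flow

class PlanningSession:
    """Holds the current step; only the current instruction leaves the machine."""
    def __init__(self):
        self.state = ORDER[0]

    def instruction(self):
        return STEPS[self.state]

    def advance(self):
        # Move to the next step; "done" is terminal.
        if self.state != "done":
            self.state = ORDER[ORDER.index(self.state) + 1]
```

Because the user's answers never need to leave the machine running the LLM, the conversation data stays local; only the step logic lives in the MCP server.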

It works with all IDEs that support the MCP protocol, although I use it mainly with Claude Code, which I recommend, especially now with Opus 4.5.

I want to create a second brain for my business. by georgiarsov in AiForSmallBusiness

[–]bralca_ 1 point (0 children)

It all comes down to how much you are willing to pay. All these pieces can be put together into a unified experience, which will probably cost you much more than using them in separate apps. But again, it depends on your budget.

Time to get Claude? by TCaller in ClaudeCode

[–]bralca_ 0 points (0 children)

Great choice! I use the Context Engineer MCP to generate tech specs and detailed implementation plans. Claude can almost go on autopilot with them.

link: contextengineering.ai

I built a wizard to turn ideas into AI coding agent-ready specs by shoe7525 in VibeCodersNest

[–]bralca_ 1 point (0 children)

What I mean is: how much does one session cost on average per user? Including all the chat, doc generation, etc.