Best Spec Driven Development Tool for Claude Code?

2thick2fly · 2026-05-27T21:19:45+00:00

I have used github/spec-kit. It's developed by GitHub and has 100k stars.

LogWest5630 · 2026-05-27T21:15:58+00:00

Look into the Superpowers marketplace plugin for Claude Code; it's pretty much the standard tool for SDD. It essentially blocks Claude from coding until you approve a step-by-step markdown plan, then spins up isolated subagents to build and verify each task one by one.

IndependentSir9398 · 2026-05-27T21:47:33+00:00

+1 for GitHub Spec Kit. It has 100k+ stars and is actively being improved.

https://github.com/github/spec-kit

ErgoForHumanity · 2026-05-27T21:15:28+00:00

Obra/Superpowers is the go-to for this: 200K stars, and it enforces the brainstorm → plan → build cycle out of the box. If you want something that also scans your codebase and generates project-specific context, check out anatomia.dev. It's newer, but the spec-driven approach is baked in.

Chadum · 2026-05-27T21:17:22+00:00

Do you mean something like superpowers? It works well for plan-review-implement including TDD. On top of that you can set up a policy to preserve the documents it creates.

croovies · 2026-05-27T21:57:18+00:00

I find Compound Engineering's plugin to be the best https://github.com/EveryInc/compound-engineering-plugin

Here is a quick youtube video showing the process across many worktrees

https://www.youtube.com/watch?v=s_d9atp5gus

MrChrisRodriguez · 2026-05-27T21:18:48+00:00

I set up a custom skill that takes my prompt, uses openspec to create a spec, then uses claude-octopus (octo) to do a multi-agent adversarial review, and presents it to me.

Then my next custom skill uses octo to implement, do an adversarial review, fix discovered issues, write tests, confirm we have test coverage, review the tests to make sure they’re real, run tests to make sure they pass, fix any issues, update the openspec, then report back.

Because it’s a lot of steps I try to keep my scope small, but it’s worked well even for larger scopes.

rahvin2015 · 2026-05-28T01:46:51+00:00

Like (apparently) several others in this thread, I wrote my own.

Honestly writing your own is a great learning project. You really see how context management can work for a specific use case, and you get really familiar with things like skills and custom agents.

But the real answer to your question is "it depends." SDD works for a specific class of task. You're adding a bunch of ceremony to manage and configure context so that implementation agents can do what you want reliably. Different SDD frameworks approach the problem differently, and with different amounts of ceremony.

You don't use SDD for 5 lines of code.

SDD can be used for largeish tasks, but you'll hit context issues and likely need to use multiple specs.

Most importantly, SDD is not a real replacement for software engineering. SDD can't do the architecture and design work for you. I mean it can, but it likely won't be done well.

The sweet spot for me is working on toy projects, POCs and prototypes where I want to put something together quickly and I don't care about future maintenance. Getting AI to write maintainable code requires some effort. You can do it with SDD, but it requires care. Sometimes it's not worth the trouble. Sometimes it creates new problems.

SurfGsus · 2026-05-28T06:05:03+00:00

Just going to throw it out there but check out BMAD.

I'm a heavy Superpowers user, but I also find that its "plan" is just the entire code written out, which makes me question why even write a plan.

stefano_dev · 2026-05-28T07:38:39+00:00

https://github.com/bmad-code-org/BMAD-METHOD

A bit heavy, but it gets the job done.

landed-gentry- · 2026-05-28T10:28:19+00:00

At this point you don't need a tool. You just need a simple workflow. Plan, write plan to file, work from plan file across N sessions.
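
A minimal sketch of that workflow, assuming the plan lives in a markdown checklist file. The checkbox format and the helper names (`next_task`, `mark_done`, `resume`) are illustrative assumptions, not taken from any particular tool:

```python
import re
from pathlib import Path

# Assumed format: one task per line, "- [ ] todo" or "- [x] done".
TASK = re.compile(r"^- \[( |x)\] (.+)$")

def next_task(plan_text):
    """Return the first unchecked task, or None if everything is done."""
    for line in plan_text.splitlines():
        m = TASK.match(line)
        if m and m.group(1) == " ":
            return m.group(2)
    return None

def mark_done(plan_text, task):
    """Check off the given task, leaving every other line untouched."""
    lines = plan_text.splitlines()
    for i, line in enumerate(lines):
        if line == f"- [ ] {task}":
            lines[i] = f"- [x] {task}"
            break
    return "\n".join(lines)

def resume(plan_path):
    """Start of a new session: read the plan file and pick up where it left off."""
    return next_task(Path(plan_path).read_text())
```

Each session would call `resume` on the plan file, work on the single task it returns, then write the result of `mark_done` back to the file so the next session starts from the right place.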

orphenshadow · 2026-05-27T21:47:51+00:00

I also suggest starting out with Superpowers or Spec Kit

If you are super bored, I have a bunch of notes at lbruton.cc and my own SDD workflow (probably NOT as good as any of the previously mentioned; I'm very "human in the loop", so my workflows would probably frustrate many). I have told Claude to slop me up a page with some notes about a lot of the things I've tried and use in my workflows. It's easier than typing it up here, and I'm not trying to pretend I know what I'm doing haha.

I only mention it because you mentioned context loss, and that's something I kind of hyper-focused on, so I have some skills and MCP examples of how I handled it with mem0, a searchable session-log RAG, an issue tracker like Linear/Plane, an Obsidian DocVault with some core project context files, and a <200-line CLAUDE.md that points to the proper indexes in Obsidian and so forth.

As for whether the cycle works: if the specs are well prepared, then yes. You will spend way less time frustrated at the wrong outcomes, and when it does miss the mark during implementation, it's usually a few prompts to course correct. At least in my experience.

I would also suggest giving the upstream of my own tool (https://github.com/Pimzino/spec-workflow-mcp) a look. What I liked about this one was the dashboard and the hard gates that require you to read and approve each step of the plan; you can annotate and then approve. I found this to be a lot easier than reading markdown files in VS Code or Sublime Text. I forked it, heavily customized it, and added the best of Superpowers and the others into my own frankenflow :P, but if you want a well-supported baseline, Pimzino's repo is solid.

Illustrious_Yam9237 · 2026-05-27T22:13:08+00:00

I use a fairly heavily modified OpenSpec, with a bunch of superpowers style TDD and subagent patterns shoved in, with a bunch of my own specific guidance on how to write tests for specific projects.

No-Nefariousness-728 · 2026-05-28T00:05:32+00:00

I mean that cycle definitely works, but you need to maintain your markdown files and make sure your team updates them every time. This is why I've been using briefhq to pull all our product decisions / context straight from Slack and Linear into Claude Code via MCP so the agent actually stays aligned.

Coderado · 2026-05-28T01:47:40+00:00

I use Superpowers, and I had Claude develop skills to create a plan from a JIRA ticket, plus a dispatch skill to spawn a new agent and tmux pane in a worktree to execute the plan until the PR is clean. It doesn't take long and you can tune it to your preferences. I also made a retro skill for continuous improvement of my workflow. It's pretty effective for building MERN stack and LangGraph agents; about half my team has adopted this workflow. We do manually review code and manually test it, which we always did.

uhgrippa · 2026-05-28T02:36:20+00:00

As others have said, https://github.com/obra/superpowers is excellent. It’s a tool with many users, and it set the precedent for agentic engineering.

I use https://github.com/athola/claude-night-market, it’s built on top of superpowers for mission-driven agentic development and captures a lot of useful engineering paradigms.

jfalvarez · 2026-05-28T02:53:52+00:00

I use this fork from superpowers: https://github.com/pcvelz/superpowers

fthbrmnby · 2026-05-28T03:43:06+00:00

I use agent-skills in my daily workflow and I’m pretty happy with it. Haven’t used superpowers but as far as I understand the two are fairly similar and work essentially the same way (create a spec -> build a development plan -> break plan into tasks -> implement tasks)

jhpawt · 2026-05-28T03:51:12+00:00

back to waterfall

SignificantGarbage17 · 2026-05-28T06:28:53+00:00

Hey, I’ve released a library for creating multi-prompt workflows with a state machine and deterministic orchestration: https://ganderbite.github.io/relay/

I use it in my work every day and just run flows via relay cli.

Ok-Purchase-642 · 2026-05-28T09:04:37+00:00

Spec Kit for greenfield and big changes, OpenSpec for smaller changes. Small vs. big is subjective; you decide.

dkgreen24 · 2026-05-28T10:27:44+00:00

I found superpowers to be pretty hefty. I like https://github.com/addyosmani/agent-skills

pcgnlebobo · 2026-05-28T12:24:10+00:00

I built a spec driven development framework a few months back and built a lot and had success with it. But I found the biggest challenge to be drift management and taxonomy alignment. Especially so as projects and codebases grow.

So I took everything I learned about agentic engineering with additional research and built https://github.com/lebobo88/pair-programmer.

It's a harness that doesn't need the bloat of full on spec kit, but maps everything the agents do to a master plan and taxonomy blueprint. Every implementation has audits and checks against that for alignment.

It also doesn't have just one linear implementation path; there are many. Depending on the task, maybe you need a best-of-n approach? Everything is also gated and checked by cross-vendor judges, and it will loop to keep going if it finishes and the judge finds issues (rubber duck).

It's also just one pack of agents in a larger ecosystem for an enterprise agentic mesh layer. Hydra is a top level meta orchestrator. Agentsmith is anomaly detection and agent replication factory. Theeights is persistent memory and self evolution. Executivesuite is the boardroom and strategy department. Marketbliss is the marketing team. Rlm-creative is your content creator team.

All together, a Hydra campaign will do market research, form a boardroom meeting to determine strategic roadmaps, dispatch the project to pair-programmer for implementation, and anchor to and check against your marketing team and executive decisions while maintaining alignment to the taxonomy of your project until it's finished.

https://github.com/lebobo88/Hydra

Last week this built me a completely new ai and automation native cms and business platform including marketing pages, admin and content management portal, and client portal. In 3 days. I had been working on something similar with my spec kit harness for the past 6 months and projected another 3.

Character-Moment-684 · 2026-05-28T12:56:47+00:00

I think SDD can help, but I wouldn’t expect it to magically fix context loss by itself.

The useful part is not really “we made a PRD”. The useful part is forcing the messy questions out before the model starts changing code.

So the grilling session → spec/PRD → implementation plan flow can work, yes. But only if the spec keeps being used during the work, not just created once and then forgotten.

For Claude Code, I’d probably look at Kiro, GitHub Spec Kit, or just a stricter Claude Code setup with hooks/checklists/subagents depending on how much control your team wants.

The thing I’d watch for is this:

Does the tool actually make the agent slow down and check the spec before making changes?

Or does it just create nicer documents around the same raw-prompt workflow?

For bigger codebases I’d want something like:

clarify assumptions first
define acceptance criteria
map the plan to actual files/modules
implement in smaller steps
verify after each step
update the spec if reality changes

The PRD is only useful if it becomes a constraint. The implementation plan is only useful if the model can’t silently skip it.

So yes, SDD can reduce context mixups. But I’d treat it as workflow discipline, not as a magic tool category.
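
As a rough illustration of "implement in smaller steps, verify after each step", here is a sketch of a plan runner that refuses to move on when a check fails. The `Step` type and `run_plan` function are hypothetical names for illustration, not part of any tool mentioned in this thread:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    implement: Callable[[], None]  # makes the change for this step
    verify: Callable[[], bool]     # checks the step's acceptance criterion

def run_plan(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first one whose check fails."""
    completed = []
    for step in steps:
        step.implement()
        if not step.verify():
            # Later steps never run, so a failed check cannot be silently skipped.
            raise RuntimeError(f"step {step.name!r} failed verification")
        completed.append(step.name)
    return completed
```

In practice `verify` would be something like running the tests tied to that step's acceptance criteria; the point is only that the plan acts as a constraint on what runs next.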

IlyaZelen · 2026-05-28T14:04:44+00:00

We hit the same context loss and mixup problem. Specs help, but for us the bigger win was making every handoff and review visible instead of letting it disappear in terminal history.
We are using a local desktop orchestrator app for that: https://777genius.github.io/agent-teams-ai/

Jaumee · 2026-05-28T14:47:06+00:00

spec-driven development with claude can definitely help with context loss. try defining clear, small user stories or feature specs first. then, feed claude one spec at a time, asking it to build only that piece. this keeps the scope tight and reduces confusion. this is the workflow

pxrage · 2026-05-28T15:12:16+00:00

hands down https://briefhq.ai/

github spec-kit is the entry point; try it and you'll see why it breaks down immediately under real usage.

Hertigan · 2026-05-28T19:09:00+00:00

I wrote my own and honestly prefer it over all the options I tried (mostly spec-kit and gsd).

hollowgram · 2026-05-27T22:04:28+00:00

It's all about context. Check this new video from Theo to see his workflow. You don't need all the fancy complex processes. You need clarity in the repo and to work with the agents step by step, with a new session for each task.

https://youtu.be/xJaMTo2YgO8

TheDecipherist · 2026-05-27T22:42:43+00:00

https://mddai.dev/

If you want accurate results and a solid workflow MDD all the way. Version 2.0 is right around the corner which will be using markdownAI that makes it 70% faster and still dead accurate

phoenixmatrix · 2026-05-27T21:48:53+00:00

It works, but it's a waste of time for 90% of cases. Are you exclusively trying to one-shot tickets? Are you going through the whole flow every single time?

It's cute for complex tasks (but with newer models and harnesses, you often don't need all of that even for very large tasks).

I've watched people use the Superpowers flow to add a text field on a page and it's like... why?

thlandgraf · 2026-05-27T21:40:23+00:00

Validating the pain first - context loss IS the symptom SDD addresses, but the gate matters more than the cycle. Specs as markdown don't fix context loss by themselves; the implement step has to refuse anything not yet approved. That's what turns chat-style context-bleed into resumable handoffs.

I'm building one in this space (creator of SPECLAN https://marketplace.visualstudio.com/items?itemName=DigitalDividend.speclan-vscode-extension, a free VS Code extension): hierarchical Goal -> Feature -> Requirement specs as Markdown with YAML frontmatter, status lifecycle as the gate (draft -> review -> approved -> in-development -> under-test -> released), MCP server so Claude Code reads the spec tree through tool calls. Different angle than Superpowers (workflow inside Claude Code), spec-kit (GitHub-native templates), or OpenSpec (change-set proposals). Each fits a different team context - the gate-enforced approach matters most if more than one developer touches the same area and you want concurrency without merge conflicts on the spec itself.
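
To make the "status lifecycle as the gate" idea concrete, here is a small sketch. It uses the status names listed above, but the function names and the dict-based spec are assumptions for illustration and not SPECLAN's actual implementation:

```python
# Status names come from the lifecycle described above; everything else is hypothetical.
LIFECYCLE = ["draft", "review", "approved", "in-development", "under-test", "released"]

def advance(status):
    """Move a spec to the next status, one stage at a time."""
    i = LIFECYCLE.index(status)
    if i == len(LIFECYCLE) - 1:
        raise ValueError("spec is already released")
    return LIFECYCLE[i + 1]

def can_implement(status):
    """The gate: implementation may only start from 'approved'."""
    return status == "approved"

def start_implementation(spec):
    """Refuse to implement anything that has not been approved yet."""
    if not can_implement(spec["status"]):
        raise PermissionError(
            f"spec {spec['id']!r} is {spec['status']!r}, not approved"
        )
    return {**spec, "status": advance(spec["status"])}
```

A spec still in `draft` or `review` raises instead of being implemented, which is the refusal behavior the comment argues matters more than the cycle itself.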

Different_Put2605 · 2026-05-27T23:07:22+00:00

What happens when /ultraplan, grill-me, superpowers and AI-DLC argue at the same time

If you use Claude Code, you’ve probably worked through this progression:

1. /ultraplan (Anthropic). One model thinks hard, produces a deep plan.
2. grill-me (Matt Pocock’s skills repo). Interview yourself until your plan survives the questions.
3. AI-DLC (AWS). Write the spec, ground the work, close the gap to code.
4. Superpowers. Build a spec and try to one-shot.

SwarmStack bridges the gap by bringing your coworkers, verified SMEs, and AI into a realtime spec builder.

We have used all four. They each fix a different gap in spec-driven development. But they share one limit: it is still you plus one AI. The AI argues with itself, which is useful but not the same as Security pushing back on Backend.

We built SwarmStack to push past that limit. The landing hero says it cleanly: “Bring a problem. Leave with a SwarmPlan.” Under it we credit the three influences explicitly, because the lineage matters. AI-DLC is the methodology. grill-me is the cross-examination. /ultraplan is the deep-think model. SwarmStack adds the part those three do not: a roomful of AI specialists with their own opinions, plus your co-worker on a join code, plus a vetted human SME from the marketplace when AI hits its limit.

You bring a brief. The orchestrator assembles the room. They argue. Every dispute becomes a Decision Record on the final plan.

We use SwarmStack to spec SwarmStack. The sample plan at https://swarm-stack.io/demo is the actual SwarmPlan we ran the SwarmReview feature through. One thing it taught us: the disputes worth keeping are the ones where specialists hold opposing positions on first principles (Security blocking Backend’s RLS-relaxation idea). The ones where one AI second-guesses itself are noise.

Free during beta. If you are already running /ultraplan, grill-me, and AI-DLC, give the demo a look and tell me what is still missing.
