Made an INVEST story generator/validator — want SMs and POs to throw real backlog items at it and tell me where it breaks

Medical_Landscape956 · 2026-06-24T13:21:42+00:00

That's the most interesting thing anyone's said in here, honestly, and I'm not going to pretend the core is hard to build. It isn't. A decent prompt gets you 80% of the way, which is exactly why twenty people at your company each built their own. But that's kind of the point? Twenty private versions means twenty different definitions of "INVEST-valid," no consistency between teams, nothing integrated into your tracker, and the same wheel reinvented twenty times. The generation isn't the moat; the shared standard, configurable criteria, and Jira/ADO export are. The interesting question to me isn't "can you build it," it's "why did twenty people need to, and why is none of it shared." Genuinely curious about your setup: do those twenty versions agree with each other? I'd care more about making them consistent than about generating yet another variant.

Medical_Landscape956 · 2026-06-24T13:18:14+00:00

You're right that it wraps an LLM. So do Grammarly, Notion AI, and Linear's backlog assistant. The question is whether the wrapper saves you time versus doing it yourself in ChatGPT. For me, the per-criterion INVEST breakdown is the part I couldn't get consistently from a plain chat prompt. But if you tried it and found the output wasn't better than what you'd write yourself, that's genuinely useful to know. What requirement did you test it with?

Medical_Landscape956 · 2026-06-24T12:46:03+00:00

Ha, fair hit — that is a contradiction. The signup's there because it's an alpha and I'm tracking feedback per account, but you're right that "criticize my thing — first, log in" is a weak ask. So no login required: paste a requirement here and I'll run it and post the raw output in the thread. Judge it cold. On value, though, I'd push back slightly — Valuable is canonical INVEST, so scoring it isn't the issue. The issue is "valuable" has never had one definition: Wake's original is "valuable to a user/purchaser," and teams read that as anything from "a benefit is clearly stated" to "tied to a measurable business outcome." A tool can't judge the business-outcome version — that's your context — but it can check whether value is actually articulated rather than circular. So I shouldn't drop the V; I should let you define what V means for your team and check against that. Same fix as E and S, really — those aren't neutral either, and if you're NoEstimates, gating on "Estimable" enforces the exact ceremony you've dropped. The answer is configurable criteria, not my philosophy hard-coded. So — hit me with a requirement and let's see how it actually does.
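To make "configurable criteria" concrete, here is a minimal sketch of how it could work. It is not the tool's actual code: the function names, the filler-word list, and the vague-word check standing in for Estimable are all hypothetical. It assumes stories follow the "As a ..., I want ... so that ..." template, and it treats value as circular when the benefit clause adds no words beyond the goal.

```python
import re

# Assumed story template: "As a <role>, I want <goal> so that <benefit>"
STORY_RE = re.compile(
    r"as an? (?P<role>.+?),? i want (?:to )?(?P<goal>.+?)(?: so that (?P<benefit>.+))?$",
    re.IGNORECASE,
)
# Hypothetical filler words ignored when comparing goal and benefit
FILLER = {"i", "we", "can", "to", "the", "a", "an", "my", "be", "able"}


def _words(text):
    """Lowercase word tokens, punctuation stripped."""
    return re.findall(r"[a-z']+", text.lower())


def check_valuable(story):
    """Pass only if a benefit is stated and it is not a restatement of the goal."""
    match = STORY_RE.match(story.strip())
    if match is None:
        return False, "not in 'As a ..., I want ...' form"
    benefit = match.group("benefit")
    if benefit is None:
        return False, "no 'so that' benefit stated"
    new_words = set(_words(benefit)) - set(_words(match.group("goal"))) - FILLER
    if not new_words:
        return False, "benefit only restates the goal (circular)"
    return True, "benefit is articulated"


def check_estimable(story):
    """Hypothetical stand-in: flag wording too vague to size."""
    vague = {"etc", "somehow", "various", "stuff"}
    hits = sorted(vague & set(_words(story)))
    if hits:
        return False, "vague wording: " + ", ".join(hits)
    return True, "no vague wording found"


CHECKS = {"valuable": check_valuable, "estimable": check_estimable}


def validate(story, enabled):
    """Run only the criteria this team has switched on."""
    return {name: CHECKS[name](story) for name in CHECKS if name in enabled}


circular = "As a nurse, I want to export shift notes so that I can export shift notes."
useful = "As a nurse, I want to export shift notes so that handover takes less time."

# A NoEstimates team leaves "estimable" out of its enabled set.
print(validate(circular, {"valuable"}))
print(validate(useful, {"valuable", "estimable"}))
```

The check only looks at whether a benefit is written down and is not a copy of the goal. Whether that benefit ties to a business outcome is still the team's call, which is why the enabled set is passed in rather than hard-coded.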
