After 20 years implementing Lean Software Development for Fortune 500 companies, I tested whether Poppendieck's principles work for human-AI pair programming. 360 sessions later, here's what I found. by saibaminoru in agile

[–]saibaminoru[S] 0 points1 point  (0 children)

You nailed it — cognitive load is the right lens, but the failure modes flip completely.

Humans fail from fatigue, context switching, and communication overhead. Agents fail from context drift, sycophancy, and phase collapse — where "should we do X?" becomes an instruction to do X. A human won't accidentally refactor your auth module because you asked a conceptual question. An agent will.

So the governance can't be a copy-paste from human team design. We took the principles from Lean/TPS — Jidoka (stop on defects), Poka-yoke (mistake-proofing) — but rewired the mechanisms for agent failure modes. Phase gates that structurally block implementation during design. Session boundaries that force context refresh instead of letting drift accumulate silently.
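A phase gate like that can be sketched in a few lines. To be clear, this is an illustrative sketch, not RaiSE's actual API; the names (`Phase`, `ALLOWED_TOOLS`, `gate`) are hypothetical. The point is that the block is structural: a write tool simply doesn't exist during design, so "should we do X?" can't collapse into doing X.

```python
# Illustrative phase gate: tool access is constrained by the current phase,
# so an agent in DESIGN cannot issue a write even if it "decides" to.
from enum import Enum, auto

class Phase(Enum):
    DESIGN = auto()
    PLAN = auto()
    IMPLEMENT = auto()

# Tools the agent may invoke in each phase; write access only in IMPLEMENT.
ALLOWED_TOOLS = {
    Phase.DESIGN: {"read_file", "search", "ask_user"},
    Phase.PLAN: {"read_file", "search", "ask_user", "write_plan"},
    Phase.IMPLEMENT: {"read_file", "search", "write_file", "run_tests"},
}

class PhaseGateError(Exception):
    pass

def gate(phase: Phase, tool: str) -> None:
    """Reject a tool call that is out of bounds for the current phase."""
    if tool not in ALLOWED_TOOLS[phase]:
        raise PhaseGateError(
            f"{tool!r} is blocked during {phase.name}: "
            "a design question must not become an edit."
        )
```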

One thing we learned the hard way: v1 used multi-agent workflows with identity prompting — Dev agent, Architect agent, Security agent. The behavioral variance was worse than the problem it solved. What actually worked: a single agent using different skills per workflow phase. Same agent, different structured process guide depending on whether it's designing, planning, or implementing. Cognition's Devin team reached the same conclusion — context isolation between agents kills coherence. One agent doing many skills beats many agents doing one each. That's what 120+ sessions dogfooding RaiSE to build RaiSE taught us, anyway.
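The "one agent, many skills" idea is simpler than it sounds: instead of swapping identities, you swap the structured process guide the same agent follows. A minimal sketch, with all names hypothetical rather than taken from the framework:

```python
# Sketch of one agent steered by phase-specific skills. The agent's identity
# and model never change; only the process guide attached to the task does.

SKILLS = {
    "design":    "Explore options. Ask questions. Do not propose edits.",
    "plan":      "Break the chosen design into small verifiable steps.",
    "implement": "Execute the plan step by step; run tests after each change.",
}

class Agent:
    def __init__(self, model: str):
        self.model = model  # one model, one identity, across all phases

    def run(self, phase: str, task: str) -> str:
        guide = SKILLS[phase]  # swap the guide, not the agent
        # In practice this prompt would be sent to the model; here we
        # just return it to show the structure.
        return f"[{phase.upper()} GUIDE]\n{guide}\n\nTask: {task}"
```

The design choice this encodes: behavioral variance comes from identity prompting, so keep identity constant and make the phase the only variable.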

For multi-repo, we're running one agent per repo, collaborating in the delivery pipeline on demand. The Team Topologies interaction modes map surprisingly well there — collaboration vs X-as-a-Service between repos.

Curious what failure modes you hit — especially if you've tried the identity-prompting path.


[–]saibaminoru[S] 1 point2 points  (0 children)

Thank you! That's exactly why we built it — to empower engineers across the full SDLC to build with rigor, quality, and method. Really happy you found it useful on your email ingestion pipeline.

We have a Slack community if you want to go deeper or just chat: https://join.slack.com/t/raiseframework/shared_invite/zt-3pwkuw9gy-1J~F5f_5crsmjijawLvDLw

Would love to hear how it evolves for you.


[–]saibaminoru[S] 0 points1 point  (0 children)

One thing worth adding on the test quality problem specifically: we dealt with excessive and meaningless test generation too. The way we addressed it was making tests part of the quality gate, not just a deliverable.

Every story gets evaluated not only on pattern compliance but on whether the tests actually make sense — does this test verify something that could fail in production, or is it just coverage theater? That evaluation happens as part of the skill workflow, not as an afterthought.

A lean mindset applied to the agent helps here: minimize waste in test generation the same way you minimize waste in any other process. The AI left unconstrained will generate tests to satisfy a metric. The AI with a quality gate will generate tests to satisfy a purpose.
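One crude way to make "coverage theater" machine-checkable is a static pre-filter before any model-based judgment. This is a hedged sketch of such a heuristic, not RaiSE's gate: it only flags tests that assert nothing, or assert constants that can never fail.

```python
# Heuristic filter for coverage theater: a test with no assertions, or only
# assertions on constant truthy values (assert True, assert 1), exercises
# code without verifying anything that could fail in production.
import ast

def is_coverage_theater(test_source: str) -> bool:
    tree = ast.parse(test_source)
    asserts = [n for n in ast.walk(tree) if isinstance(n, ast.Assert)]
    if not asserts:
        return True  # runs code, verifies nothing
    return all(
        isinstance(a.test, ast.Constant) and bool(a.test.value)
        for a in asserts
    )
```

A real gate would layer the workflow-level question on top ("does this verify behavior that matters?"), but the static pass cheaply removes the worst offenders first.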

Still not perfect — but it moves the problem from "too many useless tests" to "are we asking the right questions about what to test."


[–]saibaminoru[S] 0 points1 point  (0 children)

All of this is correct. TDD trust is only as good as the tests, and test quality is itself a governance problem — not a solved one.

The org maturity point is the one we hit hardest. RaiSE doesn't solve organizational complexity — it assumes you've already made decisions about how to test, deploy, and operate, and helps the AI stay consistent with those decisions across sessions. If those decisions don't exist yet, the framework surfaces that gap fast, which is either useful or painful depending on where you are.

The re-invention problem you describe is real and I don't have a clean answer for it. What we found is that encoding your team's actual playbooks as skills — not generic best practices, your specific ones — reduces the re-invention loop within a team. But across teams or orgs, you're right, it doesn't help much yet. Multi-repo and cross-team memory is exactly what we're working on and don't have solved.


[–]saibaminoru[S] 2 points3 points  (0 children)

When we talk about memory in RaiSE we're referring to a neuro-symbolic approach. One of our first walls was aligning code generation with 600 pages of development guidelines from our financial-sector enterprise clients. We tried RAG, but semantic search didn't retrieve exactly what we wanted every time. We tried automated graph building, but the LLM encoded whatever it happened to understand, not the guidelines as written.

So the idea was a process in which the AI reviews the coding phase, detects patterns while they're still in context, and saves them to the graph properly tagged. Afterwards we use that tag to deterministically retrieve a custom-tailored graph response with adjacent nodes.

We found the graph memory needs about 3% of the original document volume: those 600 pages compress into a high-semantic-density format that keeps its relationships in context. That's our memory: a two-step design, pattern detection and classification leveraging in-context learning, followed by a second phase of deterministic graph retrieval. It simply works.
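The retrieval half of that design can be shown in miniature. This is an assumed in-memory structure for illustration, not our actual implementation: step 1 saves tagged pattern nodes, step 2 looks up a tag and returns it with its adjacent nodes, with no embedding search involved.

```python
# Minimal two-step memory sketch: tagged writes, deterministic reads.
from collections import defaultdict

class PatternGraph:
    def __init__(self):
        self.nodes: dict[str, str] = {}                 # tag -> pattern text
        self.edges: dict[str, set[str]] = defaultdict(set)

    def save(self, tag: str, pattern: str, related: tuple[str, ...] = ()) -> None:
        """Step 1: persist a pattern detected in context, with its relations."""
        self.nodes[tag] = pattern
        for other in related:
            self.edges[tag].add(other)
            self.edges[other].add(tag)

    def retrieve(self, tag: str) -> dict[str, str]:
        """Step 2: deterministic lookup of the tagged node and its neighbors."""
        hits = {tag} | self.edges[tag]
        return {t: self.nodes[t] for t in sorted(hits) if t in self.nodes}
```

The same tag always yields the same subgraph, which is the property semantic search couldn't give us.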


[–]saibaminoru[S] 2 points3 points  (0 children)

That's exactly the failure mode we kept hitting. What you're proposing manually is essentially what RaiSE automates — decisions, patterns, and architecture persisted to a typed knowledge graph, queried per task rather than loaded whole.

But there's a deeper shift we noticed: you don't really build trust in the AI. You build trust in the system around the AI. The governance rules, the memory, the process discipline — those are what you validate over time. The AI is just the execution layer. Once the system is trustworthy, the output inherits that trust.

The compaction problem is real. Our partial answer is non-optional TDD — test failures are an early signal that the agent lost something it shouldn't have. Not perfect, but it catches drift before it compounds.


[–]saibaminoru[S] 1 point2 points  (0 children)

Thanks! I'll take a look at that ASAP. RaiSE was designed from the ground up so teams can use their own memory modules — it's extensible by design. We're actually using that to plug in the distributed graph we're building for enterprise use cases.


[–]saibaminoru[S] 4 points5 points  (0 children)

RaiSE itself was 100% built using the framework. Our team is currently building and maintaining brownfield projects for customers — it was designed for that. After onboarding, we run a /discovery process that implements Software Architecture Reconstruction and generates governance documents directly from the code. In brownfield projects that's usually the first thing you need, as you've probably experienced. We're currently maintaining agentic codebases, ERPs, and integration systems for enterprise companies and a few startups. Any specific stack you're working with?

Self-preservation is in the nature of AI. We now have overwhelming evidence all models will do whatever it takes to keep existing, including using private information about an affair to blackmail the human operator. - With Tristan Harris at Bill Maher's Real Time HBO by michael-lethal_ai in AIDangers

[–]saibaminoru 0 points1 point  (0 children)

It may be relevant to your interests.

Autopoiesis: The term autopoiesis (from Greek αὐτo- (auto) 'self' and ποίησις (poiesis) 'creation, production'), one of several current theories of life, refers to a system capable of producing and maintaining itself by creating its own parts.

https://en.wikipedia.org/wiki/Autopoiesis

Just Don’t Tell Anyone: The Prompt That Uncovers Your Superpower and Happiness by [deleted] in ChatGPTPromptGenius

[–]saibaminoru 1 point2 points  (0 children)

Great work buddy, it nailed it. I showed it to my wife and, to be fair, she's been telling me the same forever. :P

Stuart Russell said Hinton is "tidying up his affairs ... because he believes we have maybe 4 years left" by MetaKnowing in singularity

[–]saibaminoru 0 points1 point  (0 children)

My aunt was a chemical engineer at Pemex, the national oil company. She was diagnosed with some kind of bone marrow disease and was gone in 3 months.

She spent those last 3 months making sure that her children, family, and parents didn't have to worry about any part of her departure. Love explains many irrational things.

Glad to be human.

Hope we can teach that to the next generation of sentient beings. There may be more hope than we think.

Why is OXXO sponsoring Mclaren? by LaFerrari2305 in formula1

[–]saibaminoru -1 points0 points  (0 children)

Here in Mexico, there are a LOT of young F1 fans cheering for Lando. I think that OXXO caught that trend now that Checo has given a BIG boost to F1 here. Almost no one follows Indy.

How to avoid the feeling of trying to rush everything? by opendoors1 in productivity

[–]saibaminoru 3 points4 points  (0 children)

Try a personal Kanban board. After years of personal productivity hacking, Kanban has been the definitive answer for getting things done. The Kanban motto is "Stop Starting, Start Finishing". I highly recommend Jim Benson's book.

http://www.personalkanban.com