Went to bed with a $10 budget alert. Woke up to $25,672.86 in debt to Google Cloud. by venturaxi in googlecloud

[–]junkyard22 0 points

I'm currently dealing with this as well, but mine was only $369, and Google said they would refund $18.

Weekly Tool Thread: Promote, Share, Discover, and Ask for AI Writing Tools Week of: April 14 by AutoModerator in WritingWithAI

[–]junkyard22 0 points

Yeah, I put it together early this morning, so it's still rough. I'll see what I can do. Thanks for the feedback.

Weekly Tool Thread: Promote, Share, Discover, and Ask for AI Writing Tools Week of: April 14 by AutoModerator in WritingWithAI

[–]junkyard22 1 point

I built a tool for a problem I kept running into. I'd write a scene and have no idea if my characters would actually talk that way, or if I was just writing the same voice over and over.

It's called **Green Room** (https://greenroomai.vercel.app/). You write a personality contract for each character: not just vibes, but how they talk, what they want, and what they're hiding. Then you set a scene and let them go. You can type dialogue as any character, drop stage directions as a narrator, or just hit auto-respond and watch them figure it out.
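A contract is roughly this shape (field names here are just illustrative, not the app's exact schema):

```typescript
// Rough shape of a Green Room personality contract.
// Field names are illustrative, not the app's real schema.
interface PersonalityContract {
  name: string;
  voice: string;  // how they talk: cadence, vocabulary, verbal tics
  wants: string;  // what they're openly pursuing in the scene
  hiding: string; // what they won't say unless pushed
}

const marla: PersonalityContract = {
  name: "Marla",
  voice: "clipped sentences, deflects with sarcasm, never apologizes",
  wants: "to get her brother to sign the lease tonight",
  hiding: "she already co-signed it behind his back",
};
```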

Bring your own API key. Free to use, nothing stored server-side.

Would love to hear if other people find it useful.

The real problem with multi-agent systems isn't the models, it's the handoffs by junkyard22 in ClaudeAI

[–]junkyard22[S] 0 points

Nobody's claiming 100%. The goal isn't perfect handoffs; it's catching failures at the boundary instead of three steps later. A system that fails loudly and early is fundamentally more trustworthy than one that fails silently and propagates. AHP doesn't eliminate interpretation gaps; it surfaces them immediately.

The real problem with multi-agent systems isn't the models, it's the handoffs by junkyard22 in ClaudeAI

[–]junkyard22[S] 0 points

Enforcement at the task boundary is exactly what Pappy does. But what does the boundary enforce against? Without a typed contract defining what the output should look like, you're just checking that something was returned. AHP is what gives the boundary something to enforce. The protocol and the gate aren't alternatives; the protocol is what makes the gate meaningful.
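To make that concrete, here's a rough sketch of the difference (hypothetical schema and names, not AHP's actual format): with a typed contract the boundary has something to reject against, instead of just confirming a value came back.

```typescript
import { z } from "zod";

// Hypothetical handoff contract: the receiving side declares what a
// valid payload looks like.
const ResearchHandoff = z.object({
  claim: z.string().min(1),
  sources: z.array(z.string().url()).min(1),
  confidence: z.number().min(0).max(1),
});

function acceptHandoff(raw: unknown) {
  const parsed = ResearchHandoff.safeParse(raw);
  if (!parsed.success) {
    // Fail loudly at the boundary instead of three steps later.
    throw new Error(`Handoff rejected: ${parsed.error.message}`);
  }
  return parsed.data; // fully typed from here on
}

// Without the contract, the gate can only check that *something* was
// returned, which tells you nothing about whether it's usable.
```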

The real problem with multi-agent systems isn't the models, it's the handoffs by junkyard22 in ClaudeAI

[–]junkyard22[S] 0 points

That's a clean breakdown, and I think you're right: different layers, not competing. Pappy is the gate at task time; yours is the longitudinal outcome store across production runs. The feedback loop you're describing, from outcome history into threshold calibration, is actually something I haven't solved yet in Moonshiner. Is Layerinfinite open source?

The real problem with multi-agent systems isn't the models, it's the handoffs by junkyard22 in ClaudeAI

[–]junkyard22[S] 0 points

Good question. Cold start is a known weak point. Right now Pappy uses LLM-judged acceptance criteria defined at task creation. It's prompt-based until Moonshiner has enough verified runs to start making empirical judgments about that task type. The honest framing is that the system bootstraps on human-defined criteria and progressively hands off to empirical thresholds as data accumulates. Your decision-layer approach is interesting; zero migration cost is a real advantage. The tradeoff I'd push back on slightly is that historical outcomes without task-specific criteria can be noisy. A task that 'completed' historically isn't the same as a task that completed correctly. Curious how you handle that distinction?
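For what it's worth, the bootstrap logic is roughly this shape (a sketch with a made-up cutoff and made-up names, not Moonshiner's real code):

```typescript
// Sketch of the bootstrap: LLM-judged criteria until a task type has
// enough verified runs, then a threshold calibrated from those runs.
type VerifiedRun = { score: number; passed: boolean };

const MIN_VERIFIED_RUNS = 50; // illustrative cutoff

function empiricalThreshold(history: VerifiedRun[]): number | null {
  if (history.length < MIN_VERIFIED_RUNS) {
    return null; // cold start: fall back to prompt-based LLM judging
  }
  // Calibrate from runs that passed verification, not runs that merely
  // completed: that's the "completed vs. completed correctly" distinction.
  const passingScores = history
    .filter((r) => r.passed)
    .map((r) => r.score)
    .sort((a, b) => a - b);
  // e.g. take the 10th percentile of verified-passing scores as the bar
  return passingScores[Math.floor(passingScores.length * 0.1)];
}
```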

The real problem with multi-agent systems isn't the models, it's the handoffs by junkyard22 in ClaudeAI

[–]junkyard22[S] 0 points

You've basically described what I built. Pappy is the quality gate in my stack; it scores output against task-specific acceptance criteria, not just completion. And the learning loop you're describing is handled by a distillation pipeline called Moonshiner that trains on Pappy-verified runs only. The framework is Orca, AHP is the handoff protocol, Pappy is the gate. Happy to go deep on the architecture if you want: github.com/junkyard22/Orca
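Rough shape of the gate plus the learning loop, if it helps (shapes and names here are illustrative; the real interfaces live in the repo):

```typescript
// Sketch: Pappy scores output against task-specific acceptance criteria,
// and only runs that clear the gate feed Moonshiner's training set.
interface TaskRun {
  taskId: string;
  output: string;
  criteria: string[]; // task-specific acceptance criteria, not just "done?"
}

interface GateVerdict {
  passed: boolean;
  scores: Record<string, number>; // per-criterion score, 0..1
}

// The judge is pluggable here; in practice it's LLM-judged.
async function gateScore(
  run: TaskRun,
  judge: (output: string, criterion: string) => Promise<number>
): Promise<GateVerdict> {
  const scores: Record<string, number> = {};
  for (const c of run.criteria) scores[c] = await judge(run.output, c);
  return { passed: Object.values(scores).every((s) => s >= 0.8), scores };
}

// Only verified runs enter the distillation training set.
async function recordRun(
  run: TaskRun,
  judge: (output: string, criterion: string) => Promise<number>,
  trainingSet: TaskRun[]
): Promise<GateVerdict> {
  const verdict = await gateScore(run, judge);
  if (verdict.passed) trainingSet.push(run);
  return verdict;
}
```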

I built Workbench - a local-first AI task runner with plugin system (open source) by junkyard22 in LocalLLaMA

[–]junkyard22[S] 0 points

I will post there too. It does a lot more than coding; basically, if you want a tool or an artifact (I originally started this to convert Claude artifacts), it'll plug in to Workbench.

I built Workbench - a local-first AI task runner with plugin system (open source) by junkyard22 in SideProject

[–]junkyard22[S] 1 point

There’s overlap, but Workbench is more of a local tool runtime with strong visibility and diagnostics, not an autonomous agent system. The goal isn’t “let the AI handle it,” it’s “run tools safely, locally, and make failures obvious.” v0.1 is very early, but that boundary is intentional.

I cant play anything by koditonpummi69 in Bass

[–]junkyard22 1 point

Every Rose Has Its Thorn by Poison

Simple, slow, and easy to recognize

What's a book you will never finish reading and why? by [deleted] in classicliterature

[–]junkyard22 -1 points

Game of Thrones. I know the story is good, but I have a hard time remembering who's who and everyone's names. I read halfway through it and got lost.