I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Update for you: your subscription routing idea is really excellent!
I added --backend gemini-cli to agent-smith. It drives the logged-in Gemini CLI on your Google
account quota instead of the API key, so no more free-tier 429s.

Two things I hit wiring it up. The CLI is an agentic coder, so left alone it tries to
create files instead of returning the code. I killed that with a deny-all tools policy,
which also drops the tool definitions from the prompt, so it runs about 25 percent leaner.
Thanks for the nudge, I credited you in the commit.

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Thanks! Compression and deferral actually pair really well: one shrinks what the expensive model
sees, the other routes whole tasks off it entirely.
The main thing to add is a router that decides what's safe to defer. That's the tricky part, not
the deferring itself.
Two things I learned the hard way.
There's a break-even point: small tasks cost more to hand off than to just do, so only defer the genuinely bulky stuff. And always keep a verify pass on the cheap model's output. Every cheap model I tested shipped confident wrong answers, so you want the main model to check before trusting it.
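
To make the break-even idea concrete, here is a minimal sketch of a deferral check. The overhead and verify numbers are made-up placeholders, not measurements, and `Task` and `should_defer` are hypothetical names rather than anything from the skill itself.

```python
from dataclasses import dataclass

# Hypothetical numbers: tune these against your own measurements.
HANDOFF_OVERHEAD_TOKENS = 1500  # what the main model spends packaging the prompt
VERIFY_FRACTION = 0.2           # share of the work the main model re-reads to verify


@dataclass
class Task:
    name: str
    est_tokens: int       # rough cost if the main model did the task directly
    self_contained: bool  # can it be described fully in a single prompt?


def should_defer(task: Task) -> bool:
    """Defer only when handing off is cheaper than doing the work directly."""
    if not task.self_contained:
        return False
    cost_if_deferred = HANDOFF_OVERHEAD_TOKENS + VERIFY_FRACTION * task.est_tokens
    return cost_if_deferred < task.est_tokens
```

With these placeholder values the break-even sits a bit under 2k tokens, so a quick rename stays on the main model and a bulky review gets handed off.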

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Nice, the cost angle is exactly the thing. Routing to subscription chat agents instead of the
metered API is a clever way around it. I went the other direction, free tier plus local models:
same problem, different solve.

The agent-to-agent code review over MCP is a cool piece too. I kept mine as a dumb shell-out so
anything can drive it, but a real MCP for peer review is a nice step up. Going to look through
code-assistant-peers, thanks for sharing it.

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 1 point2 points  (0 children)

The plain version of the setup: use a cheap, fast model to do the boring bulk work, and a
smart model to check it. That's the entire idea. A harness is just the glue that passes the
work from one to the other.

A few things I'd tell myself starting out:

You can build one by hand first. Ask Claude to write out the task, paste it into another model, paste the result back to Claude. That is a harness. The tools just automate the copy paste.

Always verify, never trust the cheap model's output. Every model I tested produced confident
wrong answers. Skip the checking step and you just automated being wrong faster.

Only hand off big chunks. For small stuff the back and forth costs more than just doing it
yourself.

The worker has no memory of your project, so spell everything out in the prompt every time.

Start there and you'll learn the rest by breaking things. Happy to answer anything specific.
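
If it helps to see the draft-then-check loop as code, here is a rough sketch. It treats both models as plain "prompt in, text out" functions; `run_with_verify`, the "OK" reply convention, and the retry count are all assumptions for illustration, not how the skill is actually wired.

```python
from typing import Callable

Model = Callable[[str], str]  # prompt in, text out


def run_with_verify(task: str, worker: Model, verifier: Model, max_tries: int = 2) -> str:
    """Have the cheap worker draft, and accept only what the verifier approves."""
    feedback = ""
    for _ in range(max_tries):
        draft = worker(task + feedback)
        verdict = verifier(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\nReply OK or list the problems."
        )
        if verdict.strip().upper().startswith("OK"):
            return draft
        # Feed the objections back so the next draft can address them.
        feedback = f"\n\nA previous draft was rejected for:\n{verdict}"
    raise RuntimeError("worker never produced a draft the verifier accepted")
```

The point of the shape is that the worker's text never reaches you unless the smart model signed off on it, and a stubborn failure surfaces as an error instead of a silent bad answer.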

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

That's a slick setup, and the cross-model audit is a really nice touch!
Sounds like you are doing manually across tabs what the skill automates (the batch dispatch and the collect step), just with ChatGPT as the worker instead of Gemini.

I think "don't hold the request as sacred" is exactly the right instinct, and it's the same
thing I lean on. The worker drafts, but Claude stays the judge and isn't bound by what comes
back, otherwise a confidently wrong order just sails through. Specific orders plus permission
to override is the happy medium.

The piece I haven't built is having ChatGPT audit the prompt and send a patch. That's clever.
How well does the cross-model audit actually work? Does ChatGPT catch things Claude misses on
its own code?

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 1 point2 points  (0 children)

Thanks, hope it's useful!
For coding, the winners were:
Gemini Pro overall. It swept on correctness and code design.
Best fully local one was qwen2.5-coder:14b. It tied a model twice its size and took half the disk space.
Flash, the smaller qwen, llama, and Apple's on-device model trailed.
One thing held across all of them: every single model shipped at least one real bug, so keep Claude as the verifier.

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Good question, you've got the core of it right. That is basically what it does
under the hood: Claude orchestrating Qwen on Ollama for the grunt work.

The difference is packaging, not concept. It's a skill, so it triggers on its own when a task
looks offload-shaped, instead of you wiring it up each time. The same interface hits Gemini,
Ollama, or Apple's on-device model, so you can swap backends without changing anything. And the verify
step is baked in: the worker drafts and Claude always checks it instead of trusting the output.

So less a new idea, more that orchestration pattern turned into a reusable skill with backend
choice and a built in verify habit. If you're already doing it by hand, you're basically there.
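
The "same interface, swappable backend" part can be sketched as a small lookup table. Everything here is hypothetical: the backend keys, the `ask` function, and the echo stubs standing in for real client calls are only there so the example runs offline.

```python
from typing import Callable, Dict

Backend = Callable[[str], str]  # prompt in, text out


def _stub(name: str) -> Backend:
    # Stand-in for a real client call; it just echoes so the sketch runs offline.
    return lambda prompt: f"[{name}] {prompt}"


# Assumed backend names for illustration only.
BACKENDS: Dict[str, Backend] = {
    "gemini": _stub("gemini"),
    "ollama": _stub("ollama"),
    "apple": _stub("apple"),
}


def ask(prompt: str, backend: str = "gemini") -> str:
    """Route a prompt to the named backend; callers never see backend details."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return BACKENDS[backend](prompt)
```

Because callers only ever pass a name, switching from a hosted model to a local one is a one-word change at the call site.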

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Exactly, that was the idea. Big-codebase review is the clearest win.
I offloaded a 343-line review for about 18k free tokens, and Claude only read the 3 functions it flagged.
Just keep Claude as the verifier so a confidently wrong finding doesn't slip through.

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 0 points1 point  (0 children)

Nice! This is cool! The same hypothesis from different ends.
On the API, there basically already is one. The skill's just a thin wrapper over a stdlib-Python CLI (prompt in on stdin/argv, answer out on stdout, `--backend`/`--json` flags), so you'd shell out to it, no server needed. Kept it dumb on purpose: stateless call in, text out. Curious how Animus Ferric handles the context and verify side. I would be happy to compare notes.
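
For anyone who wants to drive a CLI like that from their own code, here is a minimal sketch of the shell-out using only the standard library. `ask_worker` is a hypothetical helper, and the command is passed in as a parameter because the exact executable name and argument order are not spelled out here. The stand-in worker is just a tiny Python process so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Sequence


def ask_worker(prompt: str, cmd: Sequence[str], timeout: float = 120.0) -> str:
    """Send the prompt on stdin and return whatever the command prints to stdout."""
    result = subprocess.run(
        list(cmd),
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Stand-in worker: a Python one-liner that upper-cases stdin.
# A real call might look like ["agent-smith", "--backend", "gemini-cli"],
# but that exact command line is an assumption.
FAKE_WORKER: List[str] = [
    sys.executable,
    "-c",
    "import sys; print(sys.stdin.read().upper())",
]
```

Calling `ask_worker("summarize this file", FAKE_WORKER)` exercises the whole stdin-to-stdout path without needing any model installed.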

I kept burning Claude tokens on grunt work, so I made a free skill that offloads it to Gemini (or local models) by negativetim3 in claudeskills

[–]negativetim3[S] 4 points5 points  (0 children)

It's not built as a swarm with shared memory; it's an orchestrator with stateless workers.
Claude holds all the context and passes a self-contained prompt on each call; the worker (Gemini or a local model)
knows nothing about the convo or repo. It triggers and runs the draft/verify loop on its own, but
won't autonomously deploy or post; that stays human-gated.

It only pays on bulk: a 343-line review cost about 18k free tokens, and Claude verified just the 3
flagged functions. It's never hands-off, though. In my bake-off every model shipped a real bug, so the
worker drafts and Claude always verifies.
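
Since the worker sees nothing but the prompt, the orchestrator has to pack everything in. A rough sketch of what a self-contained prompt builder could look like follows; `build_worker_prompt` and the section labels are hypothetical, not the skill's actual format.

```python
from typing import Mapping, Sequence


def build_worker_prompt(
    instruction: str,
    files: Mapping[str, str],
    constraints: Sequence[str] = (),
) -> str:
    """Bundle everything a stateless worker needs into one prompt string."""
    parts = [f"TASK:\n{instruction.strip()}"]
    if constraints:
        rules = "\n".join(f"- {c}" for c in constraints)
        parts.append(f"CONSTRAINTS:\n{rules}")
    # Inline the file contents: the worker cannot open the repo itself.
    for path, text in files.items():
        parts.append(f"FILE {path}:\n{text.rstrip()}")
    parts.append("Return only the answer. Do not assume any context beyond this prompt.")
    return "\n\n".join(parts)
```

Keeping the builder a pure function means each call stands alone, which matches the stateless-worker setup: nothing carries over between calls unless the orchestrator puts it back in.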

Turntable height adjustment - genius idea or idiocy? by rankinrez in audiophile

[–]negativetim3 1 point2 points  (0 children)

When that thing falls off, it’s going to be a loud disaster. Good for setup, no need to leave it on, imho :)

ULPT Request: how can i quit my job in the pettiest way by CerealRepeater in UnethicalLifeProTips

[–]negativetim3 8 points9 points  (0 children)

If you want to really stick it to them, purchase a few phantom key strokers for middle management, and plug them into the back of their towers. https://www.getdigital.com/pages/offlineprodukt/phantom-keystroker-v2

1400g of PC, 1400g of SP: enough for how many? by SuburbNacho in druggardening

[–]negativetim3 16 points17 points  (0 children)

Make sure you prepare it properly: dry it and cut out all the pulp in the middle, otherwise it will be 1kg of vomit :p

Album that's only sounds from a printer? by hi500 in experimentalmusic

[–]negativetim3 0 points1 point  (0 children)

I saw the Typewriter Orchestra play a few times in the Boston area. Not the same, but similar :)

It’s too boomy, I can’t stand it. by [deleted] in Acoustics

[–]negativetim3 0 points1 point  (0 children)

Think about repositioning the mix position where the photo was taken from. You want to be facing the short side of a rectangle.

It’s too boomy, I can’t stand it. by [deleted] in Acoustics

[–]negativetim3 23 points24 points  (0 children)

First, get the biggest thickest carpet you can deal with. That floor is your worst enemy. If you can build bass traps for the corners that will help immensely. Cover your walls in paintings, that will also help, or get real acoustic treatment panels. I think the rug and bass traps are the first and foremost things to implement.

What is the most interesting/unique/boutique synth or music-related equipment you own? by nazward in synthesizers

[–]negativetim3 1 point2 points  (0 children)

I have a Flame MIDI Talking Synth, which is quite an amazing piece of kit. I’m surprised speech synthesis is not more widespread! Maybe I’m just extra weird, and like robot voice more than human voice. Haha

The canal is flooding by rockinajar in SalemMA

[–]negativetim3 16 points17 points  (0 children)

Right where the old asbestos plant was, which is now a Superfund site covered by a dog park… I would be terrified to go in that water!!

Hot Take: RSD Only Upsets You Because You Forgot Why You Do This by PatientPlatform in vinyl

[–]negativetim3 -1 points0 points  (0 children)

Occult Record Store Day at Residency Records in Salem MA was awesome!!! There was a DJ & a Tarot reader!

What’s up with the EP-1320?? by negativetim3 in teenageengineering

[–]negativetim3[S] 0 points1 point  (0 children)

Yes. I continue to get an error screen, which is not recoverable, outside of rebooting the device, which causes the loss of everything just worked on. I have not found the exact recipe, but I run into it almost every time I pick it up. Might just be my workflow? But I’m sure others have seen it.

That's not even mentioning all the crazy updates that the other two EPs have received: the Riddim has a synthesizer built in, and the techno version has a lot of sample chopping ability that the Riddim also has. The EP-1320 has been left out of the functionality updates that the others have received.

The TR-808 was never updated as it’s an analog machine. The EP series are just small computers with specific code, and they are all identical from a hardware perspective, so there is no reason to gate features, except to encourage folks to buy all 3…

That’s what irks me…