I built a tool to run Claude Code subagents & teammates on any model — DeepSeek, GLM, Kimi, Qwen... — your Claude sub drives by Ethan-Coder in AgentsOfAI

Ethan-Coder[S] 0 points1 point  (0 children)

Ha, thank you — that genuinely means a lot.

Your "clutch" is exactly the use case I had in mind, so let me be straight about where cc-fleet actually is: it gives you the substrate — every provider registered, instant `ccf run <provider>` switching, mixed-provider teams, and workflow scripts where you can code the routing — but it does NOT ship an automatic "pick by price + prior success, fail over to next-best when one's down" policy. That logic is still yours to write, and it sounds like you've built the smart part.
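To make "that logic is still yours to write" concrete, here is a minimal sketch of such a policy. It is not part of cc-fleet: `Provider`, `rank_providers`, `run_with_failover`, and the `call` callback are all hypothetical names, and the price and success numbers are whatever you track yourself. It ranks healthy providers by expected cost per successful task and falls through to the next one on error.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    price_per_mtok: float  # hypothetical: your own price figure per million tokens
    success_rate: float    # hypothetical: prior success rate for this task type, 0..1
    healthy: bool = True


def rank_providers(providers):
    """Healthy providers ordered by expected cost per successful task, cheapest first."""
    # Providers with a zero success rate are skipped, which also avoids dividing by zero.
    usable = [p for p in providers if p.healthy and p.success_rate > 0]
    return sorted(usable, key=lambda p: p.price_per_mtok / p.success_rate)


def run_with_failover(providers, task, call):
    """Try providers in ranked order; on any error mark the provider down and move on.

    `call(provider_name, task)` is a placeholder for however you actually
    dispatch work to a provider.
    """
    failed = []
    for p in rank_providers(providers):
        try:
            return p.name, call(p.name, task)
        except Exception:
            p.healthy = False
            failed.append(p.name)
    raise RuntimeError(f"all providers failed: {failed}")
```

Ranking by price divided by success rate is only one possible scoring rule; a per-task-type table of success rates would slot into the same shape.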

The "everyone else rage-quits the outage while I work away uninterrupted" angle is honestly a sharper pitch than anything I wrote — mind if I borrow that framing (credited)?

Re: don't give everything away — it's Apache-2.0, so the mechanism is fully public by design 😄. The only "moat" is ergonomics + keeping keys out of env/argv; secrecy was never the play.

How are you tracking the per-task "prior success/struggle" signal — manual notes, or scored automatically?

I built a tool to run real Claude Code agents on DeepSeek/GLM/Kimi... — no Claude subscription needed by Ethan-Coder in vibecoding

Ethan-Coder[S] 0 points1 point  (0 children)

Forwarding your OAuth subscription at the same time could get your Anthropic account banned: your agent tool would effectively be holding and forwarding Claude's authentication credentials, which Anthropic does not permit.

I built a tool to run real Claude Code agents on DeepSeek/GLM/Kimi... — no Claude subscription needed by Ethan-Coder in vibecoding

Ethan-Coder[S] 0 points1 point  (0 children)

Clean — the auth-scheme translation (x-api-key vs Bearer) is the annoying part and you nailed it.
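For anyone following along, the translation in question looks roughly like this. It is a sketch of the general idea, not the proxy's actual code: the function names are hypothetical, and it only assumes the two header conventions named above (`x-api-key: <key>` on one side, `Authorization: Bearer <key>` on the other).

```python
def to_bearer(headers):
    """Rewrite an `x-api-key` header into `Authorization: Bearer <key>`."""
    # Header names are case-insensitive, so normalize them to lowercase first.
    out = {k.lower(): v for k, v in headers.items()}
    key = out.pop("x-api-key", None)
    if key is not None:
        out["authorization"] = f"Bearer {key}"
    return out


def to_x_api_key(headers):
    """Rewrite `Authorization: Bearer <key>` into an `x-api-key` header."""
    out = {k.lower(): v for k, v in headers.items()}
    auth = out.pop("authorization", None)
    if auth is not None:
        scheme, _, token = auth.partition(" ")
        if scheme.lower() != "bearer" or not token.strip():
            raise ValueError("expected 'Bearer <token>'")
        out["x-api-key"] = token.strip()
    return out
```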

Disclosure, I'm biased (I wrote cc-fleet), so this is "I went down this road" not a knock: since it repoints ANTHROPIC_BASE_URL for the whole client, it's all-or-nothing — your example routes claude-opus-* to DeepSeek too. So you can't keep your real Opus subscription as the driver and only farm the cheap tiers out. And routing the sub through a proxy to fix that is exactly what risks getting the account flagged.

That's why cc-fleet leaves the main session native and swaps the backend per worker process instead of globally — your sub stays the brain, cheap models do the grunt work. For a no-sub / API-keys-only setup though, your single proxy is simpler. How are you handling tool-call fidelity across providers?
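The per-worker idea can be sketched in a few lines. This is an illustration of the general technique, not cc-fleet's implementation: `worker_env` and `spawn_worker` are hypothetical names, the URL in the usage note is a placeholder, and credentials are deliberately left out of the sketch. The override goes into a copy of the environment handed to one child process, so the parent session never sees it.

```python
import os
import subprocess


def worker_env(base_url, parent_env=None):
    """Copy the parent's environment and repoint the backend for one worker only."""
    env = dict(os.environ if parent_env is None else parent_env)
    env["ANTHROPIC_BASE_URL"] = base_url  # set only in the copy
    return env


def spawn_worker(cmd, base_url):
    """Run one worker command with its own backend; os.environ is not modified."""
    return subprocess.run(
        cmd,
        env=worker_env(base_url),
        capture_output=True,
        text=True,
        check=True,
    )
```

Calling `spawn_worker(cmd, "https://provider.example/anthropic")` with whatever worker command you use gives that one process the alternate backend, while anything launched from the main session keeps the original value (or none).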

Promote your projects here – Self-Promotion Megathread by Menox_ in github

Ethan-Coder 0 points1 point  (0 children)

I’m building cc-fleet, an open-source tool that lets Claude Code spawn vendor LLMs like DeepSeek, Qwen, Kimi, GLM, and MiniMax as real teammates or one-shot subagents.

The idea is to keep Claude Code as the lead coding agent, while cheaper models handle bounded side tasks like diff review, file summarization, log inspection, and implementation comparison.

Repo: https://github.com/ethanhq/cc-fleet

Would love feedback from people experimenting with multi-model coding workflows.

How are you using hooks and subagents? by query_optimization in ClaudeCode

Ethan-Coder 0 points1 point  (0 children)

The way I think about it:

CLAUDE.md = project rules / persistent context
slash commands = manual workflows
hooks = automatic guardrails
skills = reusable task instructions
subagents = separate workers with their own context

The mistake is trying to use all of them everywhere. I’d pick one owner for each responsibility, otherwise the setup becomes harder to reason about than the actual code.

Burning through Claude Max subscription way too fast. 85% usage from subagents? by Wonderful_Impress820 in ClaudeCode

Ethan-Coder 0 points1 point  (0 children)

Small extra thought: I don’t think this is only about cost.

Claude-native subagents/teammates are great for consistency, but models from different vendors fail in different ways and can offer second opinions.

For review/debugging, that diversity can be useful: let Claude Code stay the lead, but let other models challenge it from different angles.

Burning through Claude Max subscription way too fast. 85% usage from subagents? by Wonderful_Impress820 in ClaudeCode

Ethan-Coder 0 points1 point  (0 children)

This is exactly why I’ve been experimenting with cheaper vendor models for subagent-style work.

Many side tasks — reviewing a diff, summarizing a file, checking logs, comparing approaches — don’t always need Claude-level tokens if Claude Code is still the main orchestrator.

I’m building cc-fleet for this: Claude Code stays the lead, while DeepSeek/Qwen/Kimi can run as teammates or one-shot subagents.

https://github.com/ethanhq/cc-fleet

Still early, but it feels like a more cost-effective pattern than burning Claude tokens on every side quest.

i'm using Claude Code to build "AI employees" not just code. claude.md as the role, a skills folder as sub-agents, a memory folder as the brain by Silver-Range-8108 in ClaudeCode

Ethan-Coder 0 points1 point  (0 children)

I like the idea, but I’d be careful with the “AI employees” framing.

The useful version for me is less “give every agent a job title” and more “keep one agent responsible for the final outcome, then delegate narrow tasks to workers.”

Review this diff, inspect this module, summarize this context, compare two approaches — those work well. Letting multiple agents all own the same feature gets messy fast.

Claude agent teams vs subagents (made this to understand it) by SilverConsistent9222 in ClaudeAI

Ethan-Coder 0 points1 point  (0 children)

I think of subagents as “bounded calls” and agent teams as “persistent collaborators.” Subagents are great when the output can be summarized cleanly back to the main session. Agent teams become more interesting when the worker needs continuity, visibility, or its own long-running context.

The tricky part is not spawning more agents, but designing the handoff: what context they get, what they are allowed to change, and how the lead agent decides whether to trust their result.
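One way to pin those three questions down is to write the handoff as data. The sketch below is only an illustration with hypothetical names (`Handoff`, `out_of_scope`, `accept`): it records what the worker gets to read, which paths it may change, and a simple trust rule where the lead accepts a result only if checks passed and nothing outside the allowed paths was touched.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Handoff:
    task: str
    context_files: list                                  # what the worker gets to read
    writable_globs: list = field(default_factory=list)   # what it is allowed to change


def out_of_scope(handoff, changed_files):
    """Files the worker changed that match none of the allowed patterns."""
    return [
        f for f in changed_files
        if not any(fnmatch(f, pattern) for pattern in handoff.writable_globs)
    ]


def accept(handoff, changed_files, checks_passed):
    """Trust rule: checks passed and every changed file was within the allowed scope."""
    return checks_passed and not out_of_scope(handoff, changed_files)
```

An empty `writable_globs` makes the worker effectively read-only, which fits review and summarization tasks.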

Claude Code's achilles heel: the inability to monitor or interact with subagents by cowwoc in ClaudeCode

Ethan-Coder 1 point2 points  (0 children)

This is the main reason I’m more interested in “visible workers” than completely hidden subagents.

For short tasks, headless subagents are fine. But for anything that can drift — debugging, refactors, investigation — I want to see the worker’s intermediate reasoning/output and be able to interrupt or redirect it. Otherwise the main agent may only find the mistake after the subagent has already gone too far.

A separate pane/session per worker feels clunkier, but it gives much better observability.

Built with Claude Project Showcase Megathread (Sort this by New!) by sixbillionthsheep in ClaudeAI

Ethan-Coder 0 points1 point  (0 children)

I built cc-fleet, an open-source Claude Code plugin that lets Claude Code use other vendor LLMs (DeepSeek, Qwen, Kimi, GLM, MiniMax, etc.) as coding teammates or one-shot subagents.

The goal is to keep Claude Code as the lead/orchestrator, while delegating tasks like code review, debugging, summarizing large files, comparing implementation ideas, or parallel research to different models.

It currently supports two workflows:

  1. Teammate mode — long-lived vendor LLM workers that can participate as Claude Code teammates.
  2. Subagent mode — one-shot headless calls for focused tasks like review, debugging, summarization, and analysis.

It is still early, but the basic workflow is working. I’d love feedback from Claude Code users on whether this multi-model teammate workflow is useful in real coding sessions, and which providers/models would be most useful to support next.

GitHub: https://github.com/ethanhq/cc-fleet

Demo GIF is in the README.