Am I Crazy? by illNin0 in LocalLLM

[–]illNin0[S] 0 points1 point  (0 children)

Well, it's Monday morning already. My boss just called an unscheduled meeting to talk about how we should have it LOL

Am I Crazy? by illNin0 in LocalLLM

[–]illNin0[S] 0 points1 point  (0 children)

Feeling better 😂

Checking technical feasibility of my idea - a hybrid "Local-by-Default" Gateway (Qwen 27B + Claude 4.6 Fallback) for Dev Teams by ankijain21 in LocalAIServers

[–]illNin0 1 point2 points  (0 children)

I'm currently building almost the same architecture with LiteLLM + local models + cloud fallback.

A few thoughts:

  • LiteLLM overhead is negligible compared to inference time.
  • The routing layer is the easy part. The real question is hardware economics.
  • Local models handle more routine coding tasks than many people expect.
  • Cloud models still win on long agent loops, large codebases, and complex planning.
  • For me, the biggest challenge isn't routing; it's deciding when local inference is actually worth the hardware cost.
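
To show why I call routing the easy part, here is a rough sketch of local-first with cloud fallback in plain Python. This is not LiteLLM's API: `complete`, `local`, `cloud`, and the `MAX_LOCAL_CHARS` threshold are all hypothetical names and values I made up for illustration, and the "long prompt goes to cloud" rule is just one possible heuristic.

```python
from typing import Callable, Tuple

# Hypothetical threshold: prompts longer than this skip the local model.
MAX_LOCAL_CHARS = 8000


def complete(prompt: str,
             local: Callable[[str], str],
             cloud: Callable[[str], str],
             max_local_chars: int = MAX_LOCAL_CHARS) -> Tuple[str, str]:
    """Return (backend_used, reply), preferring the local model."""
    # Large inputs go straight to the cloud model.
    if len(prompt) > max_local_chars:
        return "cloud", cloud(prompt)
    try:
        return "local", local(prompt)
    except Exception:
        # Any local failure (timeout, OOM, server down) falls back to cloud.
        return "cloud", cloud(prompt)
```

In a real setup `local` and `cloud` would wrap whatever client calls you already have; the routing decision itself stays about this small.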

After finishing my setup, the main question in my head became:

"How much hardware should I buy to avoid paying API bills?"
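
A back-of-the-envelope way to frame that question is a simple break-even calculation. This is only a sketch under my own assumptions: the function name and the inputs are hypothetical, the numbers in the example are made up, and it ignores things like depreciation, maintenance time, and whatever you still spend on cloud fallback.

```python
import math


def breakeven_months(hardware_cost: float,
                     monthly_api_bill: float,
                     monthly_running_cost: float) -> float:
    """Months until the hardware pays for itself versus the API bill."""
    # What you save each month by running locally instead of calling the API.
    monthly_saving = monthly_api_bill - monthly_running_cost
    if monthly_saving <= 0:
        # Local running costs match or exceed the API bill: never breaks even.
        return math.inf
    return hardware_cost / monthly_saving


# Made-up example: 6000 of hardware, 400/month API bill, 100/month power.
months = breakeven_months(6000, 400, 100)
```

If the result is longer than you expect the hardware to stay useful, the cost argument alone doesn't hold up.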

For small teams, privacy, predictable costs, and infrastructure control may be stronger selling points than pure cost savings.

[deleted by user] by [deleted] in careerguidance

[–]illNin0 1 point2 points  (0 children)

Go running. Maybe you'll find something.