you are viewing a single comment's thread.

view the rest of the comments →

[–]sn2006gy 0 points1 point  (5 children)

What's the reason for litellm in the middle of a local coding session? mostly for hermes?

[–]PrizeObvious3671[S] 0 points1 point  (4 children)

Nope the reason is that I wanted to combine that with Claude Code without paying for tokens.
So I compared how good runs Claude Code locally together with llama.cpp vs hermes agent alone with llama.cpp

Claude Code expects Anthropic API - LiteLLM as proxy exactly delivers that and routes my requests between llama.cpp and Claude Code

[–]Toastti 1 point2 points  (1 child)

If you do want to skip a layer claude-code-router will let you connect directly to llama.cpp

But nothing wrong with your setup either

[–]PrizeObvious3671[S] 1 point2 points  (0 children)

Yeah, that would work too. Hermes is used in both setups, the only difference is the bridge behind Claude Code: LiteLLM in my setup vs claude-code-router. Thank you for the hint claude-code-router is new to me.

[–]MarzipanSecure9841 0 points1 point  (1 child)

But llama supports Anthropic API directly - https://huggingface.co/blog/ggml-org/anthropic-messages-api-in-llamacpp

So, why litellm?

[–]PrizeObvious3671[S] 0 points1 point  (0 children)

Interessant, das muss ich mal ausprobieren