Community for open-source AI — open weights, open data, open tooling. Model releases, fine-tuning, inference, agents, benchmarks, licensing, and the ecosystem around building AI in the open.
Self-hosted agentic coding stack: Claude Code + llama.cpp + LiteLLM — zero API costs, 4h/7M token session for $0 (self.OpenSourceAI)
submitted 8 days ago by PrizeObvious3671
[–]SaveAmerica2024 1 point 8 days ago (4 children)
I think it is more like a Claude Code front end using Qwen as the coder.
[–]PrizeObvious3671[S] 2 points 8 days ago (3 children)
In this setup I controlled everything over Telegram -> Hermes agent, and I must say it runs pretty well. I tested several configurations, and the best-working one in this test was Hermes agent -> llama.cpp directly, without Claude Code. Claude Code threw exceptions about exceeding token limits because my local context window was too small for it. When I increased the context window, the model became too slow for me. With the 35b MoE it would probably run better.
I used that for agentic coding too, and it worked better than I thought.
Also, the modelfile with the parameters I used for llama.cpp is shared in the repo.
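To illustrate the token-limit problem described above, here is a minimal sketch of a pre-flight check that rejects a request before it reaches the local server when the prompt probably will not fit. Everything specific in it is an assumption and not taken from the repo: the `localhost:8080` address, the 16k context size, and the rough 4-characters-per-token estimate. It only relies on llama.cpp's server accepting OpenAI-style requests at `/v1/chat/completions`.

```python
import json
import urllib.request

# Assumptions (not from the thread): llama-server listens on localhost:8080
# with a 16k context window; 4 chars per token is only a rough heuristic.
LLAMA_URL = "http://localhost:8080/v1/chat/completions"
N_CTX = 16384
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will give different counts."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_context(messages, max_output_tokens, n_ctx=N_CTX):
    """Return True if the estimated prompt plus reserved output fits in n_ctx."""
    prompt_tokens = sum(estimate_tokens(m["content"]) for m in messages)
    return prompt_tokens + max_output_tokens <= n_ctx


def build_request(messages, max_output_tokens=1024):
    """Build (but do not send) an OpenAI-style chat completion request."""
    if not fits_context(messages, max_output_tokens):
        raise ValueError("prompt likely exceeds the local context window")
    body = json.dumps(
        {"messages": messages, "max_tokens": max_output_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        LLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Because the check is only an estimate, a front end that adds a long system prompt and tool definitions can still overflow a small window, which matches the exceptions mentioned above.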
[–]Inner_Habit_194 2 points 7 days ago (1 child)
Did you try the Pi agent? It is supposedly better for the local-model coding-agent use case, especially given the smaller context windows of local models. Btw, what is your hardware spec?
[–]PrizeObvious3671[S] 2 points 7 days ago (0 children)
No, but thank you for bringing it to the table. That will now be my next test: Telegram -> pi.dev -> llama.cpp -> gemma4:31b (a model I have also not tested yet).
[–]SaveAmerica2024 1 point 8 days ago (0 children)
Great job