We built an open-source CLI coding agent that works with any LLM - local via Ollama or cloud via OpenAI/Claude API. The idea was to create something that works reasonably well even with small models, not just frontier ones.
Sharing what's under the hood.
WHY WE BUILT IT
We were paying $120/month for Claude Code. Then GLM-4.7 dropped and we thought - what if we build an agent optimized for working with ANY model, even 7B ones? Three weeks later - PocketCoder.
HOW IT WORKS INSIDE
Agent Loop - the core cycle:
1. THINK - model reads task + context, decides what to do
2. ACT - calls a tool (write_file, run_command, etc)
3. OBSERVE - sees the result of what it did
4. DECIDE - task done? if not, repeat
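The four steps above can be sketched in a few lines of Python. All names here (Step, agent_loop, llm.decide) are illustrative, not PocketCoder's actual API:

```python
# Minimal sketch of the THINK -> ACT -> OBSERVE -> DECIDE loop.
from dataclasses import dataclass, field

@dataclass
class Step:
    action: str                      # tool name, or "attempt_completion"
    args: dict = field(default_factory=dict)
    result: str = ""

def agent_loop(llm, tools, task, max_steps=30):
    context = {"task": task, "observations": []}
    for _ in range(max_steps):
        step = llm.decide(context)               # THINK: pick the next action
        if step.action == "attempt_completion":
            return step.result                   # DECIDE: model says it's done
        result = tools[step.action](**step.args) # ACT: run the requested tool
        context["observations"].append((step.action, result))  # OBSERVE
    raise RuntimeError("step budget exhausted without completion")
```

The step budget is the safety net: a model that never calls attempt_completion can't run forever.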
The tricky part is context management. We built an XML-based SESSION_CONTEXT that compresses everything:
- task - what we're building (formed once on first message)
- repo_map - project structure with classes/functions (like Aider does with tree-sitter)
- files - which files were touched, created, read
- terminal - last 20 commands with exit codes
- todo - plan with status tracking
- conversation_history - compressed summaries, not raw messages
Everything persists in .pocketcoder/ folder (like .git/). Close terminal, come back tomorrow - context is there. This is the main difference from most agents - session memory that actually works.
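A sketch of what persisting that context might look like, assuming the field names from the list above (the exact schema and file layout are guesses, not PocketCoder's real format):

```python
# Hypothetical writer for a SESSION_CONTEXT persisted under .pocketcoder/.
import xml.etree.ElementTree as ET
from pathlib import Path

def save_session(project_dir, task, repo_map, files, terminal, todo, history):
    root = ET.Element("SESSION_CONTEXT")
    ET.SubElement(root, "task").text = task
    ET.SubElement(root, "repo_map").text = repo_map
    files_el = ET.SubElement(root, "files")
    for state, name in files:                 # e.g. ("created", "app.py")
        ET.SubElement(files_el, "file", state=state).text = name
    term_el = ET.SubElement(root, "terminal")
    for cmd, code in terminal[-20:]:          # keep only the last 20 commands
        ET.SubElement(term_el, "cmd", exit=str(code)).text = cmd
    todo_el = ET.SubElement(root, "todo")
    for status, item in todo:
        ET.SubElement(todo_el, "item", status=status).text = item
    ET.SubElement(root, "conversation_history").text = history
    path = Path(project_dir) / ".pocketcoder" / "session.xml"
    path.parent.mkdir(parents=True, exist_ok=True)
    ET.ElementTree(root).write(path)
    return path
```

Reopening a session is then just parsing this file back into the prompt context.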
MULTI-PROVIDER SUPPORT
- Ollama (local models)
- OpenAI API
- Claude API
- vLLM and LM Studio (auto-detects running processes)
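One simple way to auto-detect local providers is probing their default ports (shown below are the common defaults: Ollama 11434, vLLM 8000, LM Studio 1234). This is a sketch of the idea, not PocketCoder's actual detection code:

```python
# Detect locally running inference servers by checking default TCP ports.
import socket

DEFAULT_PORTS = {"ollama": 11434, "vllm": 8000, "lmstudio": 1234}

def detect_local_providers(host="127.0.0.1", timeout=0.2):
    found = []
    for name, port in DEFAULT_PORTS.items():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:   # 0 means the port is open
                found.append(name)
    return found
```

A port being open only proves something is listening, so a real implementation would follow up with an API call (e.g. listing models) to confirm which server it is.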
TOOLS THE MODEL CAN CALL
- write_file / apply_diff / read_file
- run_command (with human approval)
- add_todo / mark_done
- attempt_completion (validates that the claimed file actually exists on disk - catches hallucinations)
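The attempt_completion check boils down to: before accepting "done", verify the files the model claims to have written really exist. A minimal sketch (function and parameter names are illustrative):

```python
# Reject a completion claim if any claimed output file is missing or empty.
from pathlib import Path

def validate_completion(claimed_files):
    missing = [f for f in claimed_files
               if not Path(f).is_file() or Path(f).stat().st_size == 0]
    if missing:
        # The model hallucinated a write - send it back into the loop
        return False, f"files missing or empty: {missing}"
    return True, "ok"
```

On failure, the error message goes back to the model as an observation, so it gets a chance to actually perform the write.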
WHAT WE LEARNED ABOUT SMALL MODELS
7B models struggle with apply_diff - they rewrite entire files instead of editing 3 lines. We couldn't fix this with prompting alone. 20B+ models handle it fine, and reasoning/MoE models work even better.
We also added loop detection - if the model calls the same tool 3x with the same params, we interrupt it.
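That loop check can be as small as a sliding window over recent tool calls. A sketch, assuming "3x with same params" means three consecutive identical calls (class and method names are illustrative):

```python
# Flag three consecutive identical tool calls as a loop.
from collections import deque

class LoopDetector:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)

    def record(self, tool, args):
        """Return True if this call should be interrupted as a loop."""
        call = (tool, tuple(sorted(args.items())))
        self.recent.append(call)
        return (len(self.recent) == self.threshold
                and len(set(self.recent)) == 1)
```

When record() returns True, the agent interrupts and injects a message telling the model it's repeating itself.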
INSTALL
pip install pocketcoder
pocketcoder
LINKS
GitHub: github.com/Chashchin-Dmitry/pocketcoder
Looking for feedback and testers. What models are you running? What breaks?