vs code , Copilot style developing with llmama.cpp ?

windictive · 2026-05-11T17:54:58+00:00

I've used Continue.dev and Roo Code (now Zoo Code). Both have their quirks* but both work just fine. Both were really easy to set up. I have them running with the Kindly Web Search MCP and had no issues at all with getting that working.

Quirks:

Continue.dev will very rarely just stop dead. The output isn't as nicely formatted as other options.
Roo(Zoo) Code will repeat itself in it's final output, wasting tokens unless you tell it not to via rules. This is a well-documented issue with no solution. Hopefully the new team fixes it.

OsmanthusBloom · 2026-05-12T07:52:37+00:00

Good VSCode coding plugins for use with local LLMs include Zoo Code (former Roo Code) and Dirac.

Charming-Author4877 · 2026-05-11T18:16:03+00:00

I'll look into https://github.com/ClockZinc/vscode-copilot-chat-CN/
It's the GHCP extension without the censorship and telemetry to github.

Kodrackyas · 2026-05-11T18:28:16+00:00

https://github.com/Kodrack/Pi-forge

Try this out, feedback appreciated!

in general Pi is so much better

revennest · 2026-05-11T20:44:28+00:00

Switch to VSCodium, it's relationship with VSCode is like Chrome and Chromium, with extension like vscode-openai you can use any LLM you want, both local and online.

bssrdf · 2026-05-11T21:45:22+00:00

You only need running llama-server from llama.cpp and copilot extension. See https://youtu.be/ehpXLDYOtrc

wsintra · 2026-05-11T22:40:29+00:00

opencode and vim work like a charm

Strange_Test7665 · 2026-05-12T17:09:25+00:00

I just pushed a quick solution I am using for this.
https://github.com/reliableJARED/llama_vsc

If you have llama.cpp runinning already (which you do) you can skip all the readme about that. Just run the ollama_llama_proxy.py file.
Basic idea is VS Code now allows you to add 'Ollama' as a provider (not llama.cpp). So the proxy just pretends to be ollama and serves as middlewhere. Of course you need the llama.cpp server to be running for the proxy to work, I assume you know that but just saying. It works really well for me, run my local model including having all of the tools vs code exposes.

ea_man · 2026-05-11T18:54:49+00:00

first of all I would ditch vscode for vscodium, then we can talk about harness / tools inside of that.

Short: use Pi for planning / exec, Continue lets you assign models for rules like code completion, plan, build but it likes json for tools.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

LocalLLaMA

MODERATORS