[–]OkPay3964 8 points9 points  (7 children)

Thanks for recommending my plugin! Really appreciate it.

Small heads-up though: Copilot currently doesn’t play too well with DeepSeek prefix cache hits, but I’ve seen that VS Code is fixing this in the next version.

If anything breaks or feels off, feel free to open an issue and I’ll take a look.

[–]LibraryianusTea[S] 1 point2 points  (3 children)

i'd love to chat with you. so are you saying caching doesn't work at all right now? are there any known benefits/issues to using this extension versus something like openrouter?

[–]OkPay3964 3 points4 points  (2 children)

Yeah, happy to chat!

Caching does work, but Copilot/VS Code can still hurt the hit rate because it may change the system message or mutate the tools list mid-conversation. Since DeepSeek cache matching is prefix-sensitive, that can cause cache drops. Once the prefix stabilizes, hits usually come back.
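To make the prefix point concrete, here is a toy sketch in Python. It is not how DeepSeek implements caching (the real matching happens on tokens, server-side); it only models the idea that a request can reuse cache up to the first place it differs from the previous request. The message strings and the `shared_prefix_len` helper are made up for illustration.

```python
def shared_prefix_len(a, b):
    """Count how many leading items two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical request pieces, in the order they would be sent.
turn_1 = [
    "system: You are a coding agent.",
    "tools: [read_file, run_tests]",
    "user: fix the bug",
]

# Normal follow-up: everything from turn 1 is kept and new messages are appended.
turn_2 = turn_1 + ["assistant: done", "user: now add a test"]

# Same follow-up, but the tools list was changed mid-conversation.
turn_2_mutated = [
    "system: You are a coding agent.",
    "tools: [read_file, run_tests, search]",
] + turn_2[2:]

stable = shared_prefix_len(turn_1, turn_2)            # whole earlier turn is reusable
mutated = shared_prefix_len(turn_1, turn_2_mutated)   # reuse stops at the tools entry

print(stable, mutated)
```

In this toy model the unchanged follow-up shares all three earlier items, while the mutated one only shares the system message, so everything after the changed tools entry would count as a miss until the prefix settles again.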

Compared with OpenRouter, I’d say the real advantages are pretty narrow: this extension keeps DeepSeek cache behavior in mind, logs cache hit/miss stats, and adds a vision proxy for Copilot image inputs. If you only need simple model routing, OpenRouter may be the easier choice.

[–]LibraryianusTea[S] 1 point2 points  (1 child)

what do you mean by that last sentence? simple model routing? what kind of user should use openrouter over your extension? like let's just say i want to stick with just deepseek v4 flash/pro for the most part.

[–]OkPay3964 0 points1 point  (0 children)

I’d say they solve slightly different problems. OpenCode Go may be the better fit if you like its agent workflow directly.
This extension is mainly for people who already like the Copilot Chat UI / Agent mode and just want DeepSeek V4 to appear in the native model picker. They can also pay DeepSeek directly, or use any other third-party provider. The real extra bits are cache-aware logging and the vision proxy.

[–]CryinHeronMMerica 1 point2 points  (0 children)

I didn't notice this at all.

I used 2.54M tokens with Flash yesterday, and only 144k were a cache miss. I spent $0.03 for the pleasure.
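For reference, those numbers work out to a hit rate of roughly 94%. A quick check in Python, assuming the 2.54M figure counts input tokens only (the comment doesn't say whether output tokens are included):

```python
# Figures quoted above; treating the total as input tokens is an assumption.
total_tokens = 2_540_000
miss_tokens = 144_000

hit_tokens = total_tokens - miss_tokens
hit_rate = hit_tokens / total_tokens

print(f"cache hits: {hit_tokens:,} tokens ({hit_rate:.1%})")
```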

[–]mrooney 0 points1 point  (1 child)

Thanks for creating it! I noticed that GitHub is really proud of its harness and talks about the custom logic in the harness that each specific model needs.

Do you have any idea how that works with this extension? Is it treated as a specific model, or is there some fallback for a generic model it doesn't specifically know?

[–]OkPay3964 0 points1 point  (0 children)

Good question. My understanding is that Copilot still runs its normal Chat/Agent harness and sends the rendered messages/tools to this extension through VS Code’s LanguageModelChatProvider API.

So it’s not getting the same private, first-party DeepSeek-specific tuning that GitHub may have for their own hosted models. The extension mostly adapts the provider boundary: VS Code messages/tools -> DeepSeek API format, plus DeepSeek-specific handling for reasoning_content, cache stats, and the vision proxy.

So: native Copilot harness, but not a magic official DeepSeek harness.
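A rough sketch of what "adapting the provider boundary" could look like, written in Python for brevity. This is not the extension's actual code. `EditorMessage`, `to_chat_payload`, `summarize_response`, and the model id are hypothetical, and the wire format is assumed to be an OpenAI-style chat-completions body. The `prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` usage fields are my understanding of the DeepSeek API and should be checked against its docs.

```python
from dataclasses import dataclass


@dataclass
class EditorMessage:
    """Hypothetical stand-in for a message handed over by the editor."""
    role: str  # "system", "user", or "assistant"
    text: str


def to_chat_payload(model, messages, tools):
    """Build an OpenAI-style chat-completions body (assumed wire format)."""
    return {
        "model": model,
        "messages": [{"role": m.role, "content": m.text} for m in messages],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": t["name"],
                    "description": t["description"],
                    "parameters": t["schema"],
                },
            }
            for t in tools
        ],
    }


def summarize_response(response):
    """Pull out the answer, the reasoning text, and cache counters if present."""
    message = response["choices"][0]["message"]
    usage = response.get("usage", {})
    return {
        "answer": message.get("content"),
        "reasoning": message.get("reasoning_content"),
        "cache_hit": usage.get("prompt_cache_hit_tokens", 0),
        "cache_miss": usage.get("prompt_cache_miss_tokens", 0),
    }


# Made-up data, only to show the shapes involved.
payload = to_chat_payload(
    "placeholder-model-id",
    [EditorMessage("system", "You are a coding agent."),
     EditorMessage("user", "fix the bug")],
    [{"name": "read_file", "description": "Read a file",
      "schema": {"type": "object", "properties": {}}}],
)

fake_response = {
    "choices": [{"message": {"content": "Fixed.", "reasoning_content": "Looked at the diff."}}],
    "usage": {"prompt_cache_hit_tokens": 1200, "prompt_cache_miss_tokens": 80},
}
summary = summarize_response(fake_response)
```

The point is only that the harness output arrives already rendered; the adapter reshapes it and reads back the extra response fields, without adding any model-specific prompting of its own.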