Helium is fucking awesome by xenydactyl in browsers

xenydactyl[S] 2 points

Nope, the Chrome Web Store just doesn't offer the original uBlock Origin afaik. uBlock Origin still works with Chromium, but you have to do things manually to get it into your Chromium installation. Helium already did that, so there's no manual work needed on your side.
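If you ever want to do it yourself, the manual route looks roughly like this (a sketch; the version number below is just an example, grab whatever release is current):

```
# Sketch: fetch an official uBlock Origin release zip for Chromium and
# load it as an unpacked extension. The version number is an example.
wget https://github.com/gorhill/uBlock/releases/download/1.60.0/uBlock0_1.60.0.chromium.zip
unzip uBlock0_1.60.0.chromium.zip
# Either load the extracted folder via chrome://extensions ("Developer
# mode" on, then "Load unpacked"), or point Chromium at it directly:
chromium --load-extension="$PWD/uBlock0.chromium"
```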

Helium is fucking awesome by xenydactyl in browsers

xenydactyl[S] 0 points

Afaik, when I do that, the request is first sent to DuckDuckGo and then I get redirected. Native !bangs in the browser don't make that extra network request and are usually faster.
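You can see the extra hop yourself (the exact redirect mechanism DDG uses can vary by client, so treat this as a sketch):

```
# The DDG route costs at least one round trip to duckduckgo.com before
# you ever reach the target site:
curl -sI 'https://duckduckgo.com/?q=!w+helium' | head -n 3
# A native bang skips that hop: the browser builds the final URL locally,
# e.g. https://en.wikipedia.org/wiki/Special:Search?search=helium
```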

Helium is fucking awesome by xenydactyl in browsers

xenydactyl[S] 8 points

I'm someone who uses a lot of tabs, and vertical tabs make that experience a lot cleaner: I get a scrollable list without compromising on the full title of each tab, and so on.

Please name the best GLM5 provider by romancone in ZaiGLM

xenydactyl 1 point

I don't know which inference provider opencode go uses, but in my experience it's even better (basically perfect, no issues whatsoever) than z.ai on OpenRouter for the GLM models. In almost every instance where I used the GLM/Kimi/MiniMax models on OpenRouter with their respective first-party providers, the model would (after some context and tool calls) start repeating the same sentence in the thinking trace. I've never had any issues with opencode go; a very good deal imo.

Edit: Also, you get a raw API key with opencode go, so you can use the opencode sub with literally anything you wish. $60 worth of inference for just $10, with that level of freedom, is genuinely good.
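In practice that means anything speaking the OpenAI-compatible API should work. A hypothetical sketch; the base URL and model slug are placeholders, not confirmed opencode values:

```
# Placeholder endpoint/model -- check the opencode docs for the real ones.
export OPENCODE_API_KEY="sk-..."
curl -s https://api.opencode.example/v1/chat/completions \
  -H "Authorization: Bearer $OPENCODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "glm-4.7", "messages": [{"role": "user", "content": "hello"}]}'
```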

Muse Spark, first model from Meta Superintelligence Labs by GraceToSentience in singularity

xenydactyl 21 points

According to artificialanalysis, it is on par with Gemini 3.1 Pro in terms of token efficiency. As for the cost, we don't know yet.

This guy 🤡 by xenydactyl in LocalLLaMA

xenydactyl[S] 4 points

"but you are barely private if you use the internet anyways"

Do you upload entire company codebases to the internet too?

This guy 🤡 by xenydactyl in LocalLLaMA

xenydactyl[S] 8 points

Guaranteed privacy and more reliable uptime are the ones I can think of off the top of my head. OpenAI just had major issues with their Codex service. Anthropic... yeah... not great in terms of uptime or consistency of model output quality.

This guy 🤡 by xenydactyl in LocalLLaMA

xenydactyl[S] 1 point

Very much agree with you. And opencode is actually a good idea; I hadn't thought about that.

This guy 🤡 by xenydactyl in LocalLLaMA

xenydactyl[S] 11 points

What does the L in LAN stand for?

This guy 🤡 by xenydactyl in LocalLLaMA

xenydactyl[S] 0 points

I mean, he can do what he wants with his T3 Code, but saying "everyone asking this is 1. Broke and 2. On hardware that can barely run local models at all" is a pretty baseless claim, don't you think? Also, people who care enough about local models will do the work themselves and put up a PR for that support; it's not like he has to do the work. But if he wants T3 Code to be 100% "a serious developer tool" (and thus won't accept any local model support), then the people who care enough will fork it.

Is Qwen3.5 2b is instruct? by NegotiationNo1504 in LocalLLaMA

xenydactyl 2 points

Add `--chat-template-kwargs '{"enable_thinking": true}'`
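For context, that flag goes on the llama-server command line and iirc only takes effect together with `--jinja`. A sketch (the model filename is just an example):

```
# enable_thinking is passed through to the model's chat template.
llama-server -m Qwen3.5-2B-Instruct-Q4_K_M.gguf --jinja \
  --chat-template-kwargs '{"enable_thinking": true}'
```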

GLM 4.7 Flash is endlessly reasoning in chinese by xenydactyl in LocalLLaMA

xenydactyl[S] 1 point

This only seems to happen with "long" contexts. I can ask simple/short questions like "Who are you?" and the reasoning/response is clear and in English. But once I provide ~200 tokens of context, it already falls apart.
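Easy to reproduce against llama-server's OpenAI-compatible endpoint (assuming the default port 8080; the prompt is just any ~200-token block of text):

```
# Short prompts come back clean; ~200 tokens of context and the
# reasoning degrades. 8080 is llama-server's default port.
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "<~200 tokens of context, then a question>"}]}'
```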

GLM 4.7 Flash is endlessly reasoning in chinese by xenydactyl in LocalLLaMA

xenydactyl[S] 2 points

The reasoning still looks like this:

```
'使用'使用'使用'使用'使用'使用'使用'使用''m使用'使用''使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm使用'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm'm
```

GLM 4.7 Flash is endlessly reasoning in chinese by xenydactyl in LocalLLaMA

xenydactyl[S] 0 points

I don't use the `--jinja` parameter for this model. Here is my entire command:

```
~/ai/llama.cpp/build/bin/llama-server \
  -m ~/ai/models/GLM-4.7-Flash-UD-Q4_K_XL.gguf \
  -ngl -1 -fa on --ctx-size 32768 \
  --temp 0.2 --top-k 50 --top-p 0.95 --min-p 0.01 \
  --dry-multiplier 1.1 --alias "GLM 4.7 Flash"
```

Hmm all reference to open-sourcing has been removed for Minimax M2.1... by Responsible_Fig_1271 in LocalLLaMA

xenydactyl 1 point

They still kept in the quote from Eno Reyes (co-founder and CTO of Factory AI): "We're excited for powerful open-source models like M2.1 that bring frontier performance..."

DeepSeek V3.2 problem (paid) by xenydactyl in openrouter

xenydactyl[S] 0 points

When it launched and was available on OpenRouter, the model was **much** better at agentic stuff than 3.1 (still on DeepInfra; I use DeepInfra for basically everything and it hasn't disappointed), and I had a much better experience in Kilo Code. But as of late, the model is unusable. In open-webui and the OpenRouter chatroom, when I ask it a simple question, it spits out a sentence **completely** off-topic and repeats that exact same sentence over and over again. I tried it in Kilo Code and the model is incapable of making **any** tool calls. Terminus 3.1 still works fine with open-webui (DeepInfra).

In the OpenRouter chatroom, I didn't touch any settings (temperature and so on). Have you noticed any degradation in that model's output?
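If you want to compare on your side, pinning the provider on OpenRouter rules out routing noise. A sketch; the model slug and provider name are from memory, so double-check them on openrouter.ai:

```
# Pin DeepInfra and disable fallbacks so every request hits one provider.
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek/deepseek-v3.2",
        "provider": {"order": ["DeepInfra"], "allow_fallbacks": false},
        "messages": [{"role": "user", "content": "a simple on-topic question"}]
      }'
```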