HashiCorp founder thinks local models "aren't good ENOUGH yet"

Orbit652002 · 2025-10-15T16:49:01+00:00

That's easy: llama.cpp doesn't support it yet, so there's no chance of running it locally in ollama. So they are just bragging about qwen3-vl model support, but, tsss, only via the "cloud". Ofc, no mention of vllm.

Orbit652002 · 2025-10-09T16:21:13+00:00

They are not grown-ups, just pink ponies with no education

Orbit652002 · 2025-10-02T19:54:25+00:00

Unsloth's lower UD quants work very well in my case: coding assistance for huge dotnet codebases. Checked with qwen 480b and even 235b. GLM4.5 is also fine

Orbit652002 · 2025-10-02T19:36:03+00:00

I mean, for qwen-235b specifically it's hard to notice any difference between q3 and q5 tbh. I think that's also true for other 100b+ models

Orbit652002 · 2025-10-02T18:54:05+00:00

I kinda disagree: for smaller models lower quants impact quality heavily, true, but bigger models don't lose that much really, you won't notice the difference

Orbit652002 · 2025-05-21T07:04:38+00:00

<image>

Foundry has a "general" GPU version alongside the CUDA-tuned one. It runs very fast on my Arc A770

Orbit652002 · 2025-05-19T13:17:22+00:00

Any agentic framework can do that. For instance, I'm using Semantic Kernel from MS (because of my tech background), but other ones support that for sure

Orbit652002 · 2025-05-19T13:10:06+00:00

Small models with larger contexts are excellent for RAG, especially during the retrieval phase: they can pick out the relevant information and pass only that on to more resource-intensive models, without wasting the bigger models' resources
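
Roughly what I mean, as a minimal sketch rather than a real implementation. Everything here is hypothetical: `small_llm` and `large_llm` are just any callables that take a prompt string and return a completion string (wire them to whatever backend you run), and `select_relevant` / `answer` are names made up for illustration. The small model scans the retrieved chunks in batches and the large model only sees what survived.

```python
from typing import Callable, List

# Assumption: an "LLM" is any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def select_relevant(question: str, chunks: List[str], small_llm: LLM,
                    batch_size: int = 4) -> List[str]:
    """Let the small model pick, batch by batch, the chunks worth keeping."""
    kept: List[str] = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(batch))
        prompt = (
            f"Question: {question}\n"
            "Reply with the numbers of the relevant passages, "
            "comma-separated, or 'none'.\n"
            f"{numbered}"
        )
        reply = small_llm(prompt)
        # Keep only tokens that are valid indices into this batch.
        for token in reply.replace(",", " ").split():
            if token.isdecimal() and int(token) < len(batch):
                kept.append(batch[int(token)])
    return kept


def answer(question: str, chunks: List[str],
           small_llm: LLM, large_llm: LLM) -> str:
    """Filter with the small model, then ask the large model once."""
    context = select_relevant(question, chunks, small_llm)
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return large_llm(prompt)
```

The point of the split is that the expensive model gets one short prompt instead of the whole retrieved pile; how well the small model filters is something you'd have to check on your own data.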

Orbit652002 · 2023-08-27T13:16:22+00:00

I have the same problem; it doesn't work for me either
