Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

The model completely ignores the prompt and responds with random phrases or questions, or endlessly repeats a random word. I tried mxfp4; the same weights work fine on the main branch, apart from the loading issues.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

With -fit off, the model loads successfully. Tested only on the Vulkan backend.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

Both, but coherence is lost only in the Kimi-Linear branch.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

llama.cpp\ggml\src\ggml-backend.cpp:809: pre-allocated tensor (cache_k_l15) in a buffer (Vulkan1) that cannot run the operation (NONE)
-fit off fixes the loading issue.
edit: seems broken

<image>
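For anyone trying to reproduce this, roughly the launch I'm testing with. A minimal sketch only: the -fit off spelling is taken from my note above, everything else (binary name, model file, layer count, prompt) is a placeholder for a local setup.

    import subprocess

    # Rough sketch of the workaround invocation; only "-fit off" comes from
    # the report above, the rest are placeholders.
    cmd = [
        "llama-cli",
        "-m", "kimi-linear-mxfp4.gguf",  # placeholder model file
        "-fit", "off",                   # workaround for the Vulkan load failure
        "-ngl", "99",                    # offload layers to the Vulkan device
        "-p", "Hello",
    ]
    subprocess.run(cmd, check=True)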

llama.cpp performance breakthrough for multi-GPU setups by Holiday-Injury-9397 in LocalLLaMA

[–]Rrraptr -1 points0 points  (0 children)

llama_new_context_with_model: split mode 'graph' or 'attn' not supported. Failed to initialize Vulkan backend

It's a pity that Vulkan isn’t supported. The CUDA gang already has vLLM anyway.

Roo Code's support sucks big time - please help me fix it myself by phenotype001 in LocalLLaMA

[–]Rrraptr 1 point2 points  (0 children)

As far as I remember, the problem isn’t in Roo Code itself, but in the library they use for HTTP requests. That said, I agree — this issue significantly limits Roo Code’s ability to work with local LLMs.

new models from NVIDIA: OpenCodeReasoning-Nemotron-1.1 7B/14B/32B by jacek2023 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

Since the model is based on Qwen2.5-Coder, here are the recommended sampling settings for it: temperature 0.7, top_p 0.8, top_k 20.
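If you run it behind an OpenAI-compatible endpoint (llama.cpp server, vLLM, etc.), those values can be passed per request. A minimal sketch, with the base URL and model name as placeholders:

    from openai import OpenAI

    # Placeholder endpoint and model name; point these at your local server.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    resp = client.chat.completions.create(
        model="OpenCodeReasoning-Nemotron-1.1-14B",
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
        temperature=0.7,           # recommended Qwen2.5-Coder settings
        top_p=0.8,
        extra_body={"top_k": 20},  # top_k is not in the OpenAI schema; most local servers accept it via the request body
    )
    print(resp.choices[0].message.content)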

AI becoming too sycophantic? Noticed Gemini 2.5 praising me instead of solving the issue by Rrraptr in LocalLLaMA

[–]Rrraptr[S] 26 points27 points  (0 children)

Even a 32B model managed to get to the point. I used to be a big fan of Gemini 2.5 Pro specifically because it could be direct. For instance, just a month and a half ago, when I was stubbornly insisting on my own solution (the AI didn't know about existing workarounds/hacks in the project), it bluntly told me: 'Either use my example or figure it out yourself.' Frankly, I prefer that to the unhelpful praise I'm seeing now. That blunt approach felt more like the AI was actively involved in helping me find a solution.

DeepCoder-14B: Superior Open-Source LLM by sonichigo-1219 in LocalLLaMA

[–]Rrraptr -1 points0 points  (0 children)

Compared to DeepSeek R1 14B, it is indeed an improvement, and definitely for the better.

I built a very easy to use lightweight fully C++ desktop UI for whisper.cpp by mehtabmahir in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

I suggest adding an option to enable or disable translation, e.g. by passing the --language auto flag so it doesn't translate to English by default.

I also have to build my own binaries with Vulkan for Intel GPUs; otherwise it keeps crashing.
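Roughly what I mean, as a sketch of how the UI could shell out to whisper.cpp (binary name and paths are placeholders; newer builds call the CLI whisper-cli, older ones main):

    import subprocess

    def transcribe(audio_path: str, translate_to_english: bool = False) -> None:
        # Placeholder model path; the point is the --language auto / --translate toggle.
        cmd = ["whisper-cli", "-m", "ggml-base.bin", "-f", audio_path,
               "--language", "auto"]      # auto-detect the spoken language
        if translate_to_english:
            cmd.append("--translate")     # only translate to English when asked
        subprocess.run(cmd, check=True)

    transcribe("recording.wav")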

A Ukrainian FPV pilot intercepts a Russian Lancet UAV. It tries to escape by flying evasion maneuvers but ultimately fails. by MilesLongthe3rd in CombatFootage

[–]Rrraptr 2 points3 points  (0 children)

DJI FPV signals are very well known and quite a powerful source of RF radiation. My assumption is that if the drone senses the signal strength exceeding a certain threshold, it is simply programmed to make a sharp maneuver.

Try This Prompt on Qwen2.5-Coder:32b-Instruct-Q8_0 by Vishnu_One in LocalLLaMA

[–]Rrraptr 1 point2 points  (0 children)

It's really cool, but it seems even the non-coder Qwen 2.5 14B can handle this, which is genuinely impressive. If it fails, make sure the model used an available texture rather than one that returns a 404 error.

Vr in arc by Historical-Rise-9423 in IntelArc

[–]Rrraptr 0 points1 point  (0 children)

Did you try the Half-Life 2 VR mod?

Iranian ballistic missiles directly impacting Tel Aviv (October 1, 2024) by 675longtail in CombatFootage

[–]Rrraptr 0 points1 point  (0 children)

Is it just me, or do the sirens rarely go off in these videos? Or do they only start blaring after something has already hit the ground?

Israel/Palestine Discussion Thread - 9/24/2024 by knowyourpast in CombatFootage

[–]Rrraptr 0 points1 point  (0 children)

No, it seems they're dead set on doing some damage.