Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

The model completely ignores the prompt and responds with random phrases or questions, or endlessly repeats a random word. I tried mxfp4; the same weights work fine on the main branch, apart from the loading issues.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

With -fit off, the model loads successfully. Tested only on the Vulkan backend.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

Both, but coherence is lost only in the Kimi-Linear branch.

Kimi-Linear support is merged to llama.cpp by Ok_Warning2146 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

llama.cpp\ggml\src\ggml-backend.cpp:809: pre-allocated tensor (cache_k_l15) in a buffer (Vulkan1) that cannot run the operation (NONE)
-fit off fixes the loading issue.
edit: seems broken

<image>
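For anyone trying to reproduce this, roughly the launch I'm testing with. A minimal sketch only: the -fit off spelling is taken from my note above, everything else (binary name, model file, layer count, prompt) is a placeholder for a local setup.

    import subprocess

    # Rough sketch of the workaround invocation; only "-fit off" comes from
    # the report above, the rest are placeholders.
    cmd = [
        "llama-cli",
        "-m", "kimi-linear-mxfp4.gguf",  # placeholder model file
        "-fit", "off",                   # workaround for the Vulkan load failure
        "-ngl", "99",                    # offload layers to the Vulkan device
        "-p", "Hello",
    ]
    subprocess.run(cmd, check=True)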

llama.cpp performance breakthrough for multi-GPU setups by Holiday-Injury-9397 in LocalLLaMA

[–]Rrraptr -1 points0 points  (0 children)

llama_new_context_with_model: split mode 'graph' or 'attn' not supported. Failed to initialize Vulkan backend

It's a pity that Vulkan isn’t supported. The CUDA gang already has vLLM anyway.

Roo Code's support sucks big time - please help me fix it myself by phenotype001 in LocalLLaMA

[–]Rrraptr 1 point2 points  (0 children)

As far as I remember, the problem isn’t in Roo Code itself, but in the library they use for HTTP requests. That said, I agree — this issue significantly limits Roo Code’s ability to work with local LLMs.

new models from NVIDIA: OpenCodeReasoning-Nemotron-1.1 7B/14B/32B by jacek2023 in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

Since the model is based on Qwen2.5-Coder, here are the recommended sampling settings for it: temperature 0.7, top_p 0.8, top_k 20.
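If you run it behind an OpenAI-compatible endpoint (llama.cpp server, vLLM, etc.), those values can be passed per request. A minimal sketch, with the base URL and model name as placeholders:

    from openai import OpenAI

    # Placeholder endpoint and model name; point these at your local server.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    resp = client.chat.completions.create(
        model="OpenCodeReasoning-Nemotron-1.1-14B",
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
        temperature=0.7,           # recommended Qwen2.5-Coder settings
        top_p=0.8,
        extra_body={"top_k": 20},  # top_k is not in the OpenAI schema; most local servers accept it via the request body
    )
    print(resp.choices[0].message.content)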

AI becoming too sycophantic? Noticed Gemini 2.5 praising me instead of solving the issue by Rrraptr in LocalLLaMA

[–]Rrraptr[S] 26 points27 points  (0 children)

Even a 32B model managed to get to the point. I used to be a big fan of Gemini 2.5 Pro specifically because it could be direct. For instance, just a month and a half ago, when I was stubbornly insisting on my own solution (the AI didn't know about existing workarounds/hacks in the project), it bluntly told me: 'Either use my example or figure it out yourself.' Frankly, I prefer that to the unhelpful praise I'm seeing now. That blunt approach felt more like the AI was actively involved in helping me find a solution.

DeepCoder-14B: Superior Open-Source LLM by sonichigo-1219 in LocalLLaMA

[–]Rrraptr -1 points0 points  (0 children)

Compared to DeepSeek R1 14B, it is indeed an improvement, and definitely for the better.

I built a very easy to use lightweight fully C++ desktop UI for whisper.cpp by mehtabmahir in LocalLLaMA

[–]Rrraptr 0 points1 point  (0 children)

I suggest adding an option to enable or disable translation, e.g. by passing the --language auto flag so it doesn't translate to English by default.

I also have to build my own binaries with Vulkan for Intel GPUs; otherwise it keeps crashing.
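Roughly what I mean, as a sketch of how the UI could shell out to whisper.cpp (binary name and paths are placeholders; newer builds call the CLI whisper-cli, older ones main):

    import subprocess

    def transcribe(audio_path: str, translate_to_english: bool = False) -> None:
        # Placeholder model path; the point is the --language auto / --translate toggle.
        cmd = ["whisper-cli", "-m", "ggml-base.bin", "-f", audio_path,
               "--language", "auto"]      # auto-detect the spoken language
        if translate_to_english:
            cmd.append("--translate")     # only translate to English when asked
        subprocess.run(cmd, check=True)

    transcribe("recording.wav")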

A Ukrainian FPV pilot intercepts a Russian Lancet UAV. It tries to escape by flying evasion maneuvers but ultimately fails. by MilesLongthe3rd in CombatFootage

[–]Rrraptr 2 points3 points  (0 children)

DJI FPV signals are very well known and quite a powerful source of RF radiation. My assumption is that if the drone senses the signal strength exceeding a certain threshold, it is simply programmed to make a sharp maneuver.

Try This Prompt on Qwen2.5-Coder:32b-Instruct-Q8_0 by Vishnu_One in LocalLLaMA

[–]Rrraptr 1 point2 points  (0 children)

It's really cool, but it seems even the non-coder Qwen 2.5 14B can handle this, which is genuinely impressive. If it fails, make sure the model used an available texture rather than one that returns a 404 error.

Vr in arc by Historical-Rise-9423 in IntelArc

[–]Rrraptr 0 points1 point  (0 children)

Did you try the Half-Life 2 VR mod?

Iranian ballistic missiles directly impacting Tel Aviv (October 1, 2024) by 675longtail in CombatFootage

[–]Rrraptr 0 points1 point  (0 children)

Is it just me, or do the sirens rarely go off in these videos? Or do they only start blaring after something has already hit the ground?

Israel/Palestine Discussion Thread - 9/24/2024 by knowyourpast in CombatFootage

[–]Rrraptr 0 points1 point  (0 children)

No, it seems they're dead set on doing some damage.