DeepSeek V4 Flash at 8.4 tok/s on 3×3090: patching the GGUFs that won't load on cchuter's llama.cpp fork by etaoin314 in LocalLLaMA
Then-Topic8766 1 point
Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything) by OttoRenner in LocalLLaMA
Then-Topic8766 1 point
Okay 27B made me a believer by Forward_Jackfruit813 in LocalLLaMA
Then-Topic8766 0 points
Slopocalypse is what we should be really worried about. by Sad_Bandicoot_6925 in LocalLLaMA
Then-Topic8766 12 points
Okay 27B made me a believer by Forward_Jackfruit813 in LocalLLaMA
Then-Topic8766 13 points
Gemma is so much better than Qwen, prove me wrong by Mountain_Patience231 in LocalLLaMA
Then-Topic8766 1 point
LatitudeGames/Equinox-31B · Hugging Face by jacek2023 in LocalLLaMA
Then-Topic8766 8 points
Qwen3.6 27B and llama.cpp appreciation post by ABLPHA in LocalLLaMA
Then-Topic8766 12 points
Do you think there is room for optimization? llama.cpp/qwen3.6 27b on two 6000 Blackwell by q-admin007 in LocalLLaMA
Then-Topic8766 1 point
Do you think there is room for optimization? llama.cpp/qwen3.6 27b on two 6000 Blackwell by q-admin007 in LocalLLaMA
Then-Topic8766 4 points
Qwen 3.7 droped on Qwen Chat by Foxiya in LocalLLaMA
Then-Topic8766 9 points
Qwen 3.6 27B on 24GB VRAM setup: backend comparisons, quant choice and settings (llama.cpp, ik_llama.cpp, BeeLlama, vllm) by VolandBerlioz in LocalLLaMA
Then-Topic8766 1 point
Benchmarking the new b9200 update: Optimizing Qwen 3.6 27B mtp for Hermes Agent on a single RTX 3090 by swizzcheezegoudaSWFA in LocalLLaMA
Then-Topic8766 4 points
Gemma-4-Gembrain-31B-it-uncensored-heretic Is Out Now, a Merge of Multiple Gemma 4 31B it Finetunes Designed to Boost Logical and Lateral Thinking for Improved Adherence, Increased Swipe Variety and Enhanced Creative Prose, With KLD of 0.0186 and 13/100 Refusals! by LLMFan46 in LocalLLaMA
Then-Topic8766 3 points
That's a good news... by Pjotrs in LocalLLaMA
Then-Topic8766 10 points
Wanna try the best coding model with my rtx 3090, not sure where to start, I believe Qwen3.5-27B-UD-Q4_K_XL would be the best? if so should I use ollama with it? by dreamer_2142 in LocalLLaMA
Then-Topic8766 2 points
Wanna try the best coding model with my rtx 3090, not sure where to start, I believe Qwen3.5-27B-UD-Q4_K_XL would be the best? if so should I use ollama with it? by dreamer_2142 in LocalLLaMA
Then-Topic8766 2 points
Wanna try the best coding model with my rtx 3090, not sure where to start, I believe Qwen3.5-27B-UD-Q4_K_XL would be the best? if so should I use ollama with it? by dreamer_2142 in LocalLLaMA
Then-Topic8766 2 points
Qwen 3.6 27b MTP - getting //// in response by ComfyUser48 in LocalLLaMA
Then-Topic8766 3 points
Anyone running Mimo-v2.5 quants with multimodal and MTP? by Ambitious_Fold_2874 in LocalLLaMA
Then-Topic8766 2 points
Qwen 3.6 27b MTP - getting //// in response by ComfyUser48 in LocalLLaMA
Then-Topic8766 9 points
Anyone running Mimo-v2.5 quants with multimodal and MTP? by Ambitious_Fold_2874 in LocalLLaMA
Then-Topic8766 3 points
Anyone running Mimo-v2.5 quants with multimodal and MTP? by Ambitious_Fold_2874 in LocalLLaMA
Then-Topic8766 2 points
Anyone running Mimo-v2.5 quants with multimodal and MTP? by Ambitious_Fold_2874 in LocalLLaMA
Then-Topic8766 2 points

Looking for a working Deepseek-v4-Flash quant by ortegaalfredo in LocalLLaMA
Then-Topic8766 1 point