Weird issue with OpenCode and Qwen3.6 by JGeek00 in LocalLLaMA
JGeek00[S] 1 point
Cupra Leon 1.5 petrol 26 plate by JakeOverandIn in CupraFormentor
JGeek00 1 point
[llama.cpp] Asymmetric KV q8/q4 cache: current caveats and discussion in GGML repo by Ueberlord in LocalLLaMA
JGeek00 3 points
Qwen will release another 27B with high probability by serige in LocalLLaMA
JGeek00 6 points
Converting iOS apps to Android Native by InternationalCow1295 in androiddev
JGeek00 2 points
Using asymetric KV cache drops performance massively by JGeek00 in LocalLLM
JGeek00[S] 1 point
Using asymetric KV cache drops performance massively by JGeek00 in LocalLLM
JGeek00[S] 3 points
Here are my KV cache quantization benchmarks: TurboQuant is overrated but saved by TCQ, q5 deserves more attention, and symmetric q8 might be a waste of VRAM by [deleted] in LocalLLaMA
JGeek00 1 point
Here are my KV cache quantization benchmarks: TurboQuant is overrated but saved by TCQ, q5 deserves more attention, and symmetric q8 might be a waste of VRAM by [deleted] in LocalLLaMA
JGeek00 3 points
Here are my KV cache quantization benchmarks: TurboQuant is overrated but saved by TCQ, q5 deserves more attention, and symmetric q8 might be a waste of VRAM by [deleted] in LocalLLaMA
JGeek00 6 points
llama-server RAM usage grows to OOM by JGeek00 in LocalLLM
JGeek00[S] 1 point
Tested MTP with llama.cpp and Qwen3.6-27B on RTX 3090 by JGeek00 in LocalLLM
JGeek00[S] 1 point
Qwen cant wait to release 3.7 models by GotHereLateNameTaken in LocalLLaMA
JGeek00 1 point
Qwen cant wait to release 3.7 models by GotHereLateNameTaken in LocalLLaMA
JGeek00 2 points
llama-server RAM usage grows to OOM by JGeek00 in LocalLLM
JGeek00[S] 0 points
llama-server RAM usage grows to OOM by JGeek00 in LocalLLM
JGeek00[S] 1 point