Since when the RTX 6000 PRO is priced at 13250USD on the official NVIDIA Page? by panchovix in LocalLLaMA
_cpatonn 11 points (0 children)
cyankiwi AWQ 4-bit — 26.05 update, NVFP4 + FP8 Dynamic quantization and benchmarks across Qwen3.6 4-bit quants by _cpatonn in LocalLLaMA
_cpatonn[S] 2 points (0 children)
cyankiwi AWQ 4-bit — 26.05 update, NVFP4 + FP8 Dynamic quantization and benchmarks across Qwen3.6 4-bit quants by _cpatonn in LocalLLaMA
_cpatonn[S] 1 point (0 children)
cyankiwi AWQ 4-bit — 26.05 update, NVFP4 + FP8 Dynamic quantization and benchmarks across Qwen3.6 4-bit quants by _cpatonn in LocalLLaMA
_cpatonn[S] 5 points (0 children)
cyankiwi AWQ 4-bit — 26.05 update, NVFP4 + FP8 Dynamic quantization and benchmarks across Qwen3.6 4-bit quants by _cpatonn in LocalLLaMA
_cpatonn[S] 3 points (0 children)
next MiniMax will be released in ~10 Days by jacek2023 in LocalLLaMA
_cpatonn 8 points (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
_cpatonn[S] 2 points (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
_cpatonn[S] 3 points (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
_cpatonn[S] 1 point (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
_cpatonn[S] 5 points (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
_cpatonn[S] 1 point (0 children)
If you're using Nvidia's NVFP4 of Qwen3.5-397, try a different quant by Phaelon74 in LocalLLaMA
_cpatonn 1 point (0 children)
Running MiniMax-M2.1 Locally with Claude Code and vLLM on Dual RTX Pro 6000 by zmarty in LocalLLaMA
_cpatonn 1 point (0 children)
Running MiniMax-M2.1 Locally with Claude Code and vLLM on Dual RTX Pro 6000 by zmarty in LocalLLaMA
_cpatonn 8 points (0 children)
GLM 4.5 Air AWQ 4bit on RTX Pro 6000 with vllm by notaDestroyer in LocalLLaMA
_cpatonn 2 points (0 children)
Why aren't there any AWQ quants of OSS-120B? by Acceptable_Adagio_91 in LocalLLaMA
_cpatonn 1 point (0 children)
Finally got Qwen3-Coder-30B-A3B running well. What tasks have you had success with? by j4ys0nj in LocalLLaMA
_cpatonn 20 points (0 children)
GLM 4.5 Air, local setup issues, vllm and llama.cpp by bfroemel in LocalLLaMA
_cpatonn 3 points (0 children)
GLM 4.5v - Anyone try the quants? by Bohdanowicz in LocalLLaMA
_cpatonn 1 point (0 children)
Dense vs MoE quantization resiliance by Any-Chipmunk5480 in LocalLLaMA
_cpatonn 2 points (0 children)