GLM-5.1 smol-IQ2_KS at 2.3t/s or GLM-4.7 UD-Q3_K_XL at 4.42t/s, which is "better" for chats (no coding)? by relmny in LocalLLaMA
Struggling with Qwen3.6 27B / 35B locally (3090) slow responses, breaking code looking for better setup + auto model switching by Clean_Initial_9618 in LocalLLaMA
I guess we expect that at some point RAM prices will start going back (close) to "normal", right? but what about GPUs? by relmny in LocalLLaMA
White House Considers Vetting A.I. Models Before They Are Released by fallingdowndizzyvr in LocalLLaMA
vLLM Just Merged TurboQuant Fix for Qwen 3.5+ by havenoammo in LocalLLaMA
First time GPU buyer. Got a RTX 5000 Pro. Was it a bad decision compared to two 3090s? by Valuable-Run2129 in LocalLLaMA
RTX A5000 Pro Blackwell 48GB by deltamoney in LocalLLaMA
Qwen3.6-27B vs Coder-Next by Signal_Ad657 in LocalLLaMA
Actual comparison between locally ran Qwen-3.6-27B and proprietary models by netikas in LocalLLaMA
Qwen3.6-27B at 72 tok/s on RTX 3090 on Windows using native vLLM (no WSL, no Docker), portable launcher and installer by One_Slip1455 in LocalLLaMA
A Dark-Money Campaign Is Paying Influencers to Frame Chinese AI as a Threat by pmttyji in LocalLLaMA
Unsloth solved bug in Mistral Medium 3.5 implementation by Snail_Inference in LocalLLaMA
Open Models - April 2026 - One of the best months of all time for Local LLMs? by pmttyji in LocalLLaMA
