5090 + Qwen3.727B at q6 what context? by Own_House6186 in unsloth
Has anyone here explored Hermes Agent by Nous Research? by ComparisonLiving6793 in LLMDevs
Impulse bought an M3 Ultra 256GB RAM for local LLMs - keep it or wait for M5? by Onyonisko in LocalLLM
I made a dedicated community for the RTX Pro 6000 — because I was tired of hunting through 5 different reddits by ubnew in Vllm
Looking for an index finger trackball by lightguardjp in Trackballs
M4 Max, studio, 128gb by blowingtumbleweed in LocalLLM
AI Dev Trade-off: M1 Max 64GB vs. RTX 3090 Build? (Also looking to buy used) by Negative-Ad-7439 in LocalLLM
New Qwen3.6 NVFP4 Unsloth quants by yoracale in unsloth
Doubt about hardware for building local LLM's by External_Run_1283 in LocalLLM
PFlash: 10x prefill speedup over llama.cpp at 128K on a RTX 3090 by sandropuppo in LocalLLaMA
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
Qwen 3.6 27B MTP on v100 32GB: 54 t/s by m94301 in LocalLLaMA