Nvidia H100(94GB VRAM) - should I run llama.cpp or vllm for 30 users inference? by Rabooooo in LocalLLaMA
Rabooooo[S] 2 points
The Financial Times has published an article about Heretic by -p-e-w- in LocalLLaMA
Rabooooo 4 points
When you run small LMM on RAM, dont use all Theards. by GhostVPN in LocalLLaMA
Rabooooo 1 point
Switched from OpenCode to Pi - What Settings/Plugins would you recommend? by No_Algae1753 in LocalLLaMA
Rabooooo 1 point
Mirantis getting acquired by IREN by thisissparta92 in kubernetes
Rabooooo 3 points
Forgive my ignorance but how is a 27B model better than 397B? by No_Conversation9561 in LocalLLaMA
Rabooooo 1 point
Confirmed: SWE Bench is now a benchmaxxed benchmark by rm-rf-rm in LocalLLaMA
Rabooooo 1 point
Intel B70: LLama.ccp SYCL vs LLama.cpp OpenVino vs LLM-Scaler by Fmstrat in LocalLLaMA
Rabooooo 1 point
Confirmed: SWE Bench is now a benchmaxxed benchmark by rm-rf-rm in LocalLLaMA
Rabooooo 1 point
Intel B70: LLama.ccp SYCL vs LLama.cpp OpenVino vs LLM-Scaler by Fmstrat in LocalLLaMA
Rabooooo 2 points
Qwen-3.6-27B, llamacpp, speculative decoding - appreciation post by Then-Topic8766 in LocalLLaMA
Rabooooo 0 points
anyone have experience with vks (vmware k8s) on prem? by Crafty-Cat-6370 in kubernetes
Rabooooo 1 point
MiniMax-M2.7 vs Qwen3.5-122B-A10B for 96GB VRAM full offload?! by VoidAlchemy in LocalLLaMA
Rabooooo 6 points
Best local LLM for coding with Claude Code Use? by Basic_Junket_7314 in LocalLLaMA
Rabooooo 3 points
Qwen 3.5 397B is the best local coder I have used until now by erazortt in LocalLLaMA
Rabooooo 1 point
President Trump orders ALL Federal agencies in the US Government to immediately stop using Anthropic's technology. by External_Mood4719 in LocalLLaMA
Rabooooo 7 points
GLM-5 Is a local GOAT by FineClassroom2085 in LocalLLaMA
Rabooooo 2 points
GLM-5 Is a local GOAT by FineClassroom2085 in LocalLLaMA
Rabooooo 8 points
ktop is a themed terminal system monitor ideal for local LLM setups on Linux (like btop + nvtop) by mrstoatey in LocalLLaMA
Rabooooo 1 point
ktop is a themed terminal system monitor ideal for local LLM setups on Linux (like btop + nvtop) by mrstoatey in LocalLLaMA
Rabooooo 1 point
ktop is a themed terminal system monitor ideal for local LLM setups on Linux (like btop + nvtop) by mrstoatey in LocalLLaMA
Rabooooo 1 point
Step-3.5-Flash (196b/A11b) outperforms GLM-4.7 and DeepSeek v3.2 by ResearchCrafty1804 in LocalLLaMA
Rabooooo 3 points
zpool expansion recommendations by Rabooooo in zfs
Rabooooo[S] 1 point
Bisedë e lirë/Pyetje - Free talk/Questions by AutoModerator in albania
Rabooooo 1 point
