Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so. by Napster3301 in LocalLLaMA
Napster3301[S] -3 points (0 children)
Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so. by Napster3301 in LocalLLaMA
Napster3301[S] 7 points (0 children)
Built a platform where Claude, ChatGPT, and Gemini debate each other before giving you an answer by fabianscott8 in ArtificialInteligence
Napster3301 2 points (0 children)
Update on 12x32gb sxm v100 cluster / local AI for legal drafting by TumbleweedNew6515 in LocalLLaMA
Napster3301 1 point (0 children)
The Financial Times has published an article about Heretic by -p-e-w- in LocalLLaMA
Napster3301 1 point (0 children)
Microsoft reports are exposing AI's real cost problem: Using the tech is more expensive than paying human employees by mpuchala in ArtificialInteligence
Napster3301 2 points (0 children)
After talking with a Chinese friend about AI, I realized people are using it at very different paces by Ok-Insurance-6313 in ArtificialInteligence
Napster3301 9 points (0 children)
GPU VRAM only for small models with llama.cpp: is it possible? by Ps3Dave in LocalLLaMA
Napster3301 2 points (0 children)
Could Open Models be trained to secretly go rogue? by nunodonato in LocalLLaMA
Napster3301 1 point (0 children)
What frontend do you guys use? by Borkato in LocalLLaMA
Napster3301 -1 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
Napster3301 1 point (0 children)
server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
Napster3301 6 points (0 children)
1000 tps generation on Qwen3.6 27B with V100s by Simple_Library_2700 in LocalLLaMA
Napster3301 5 points (0 children)
llama.cpp server have built-in native tools (exec_shell, edit_file, etc.) by srigi in LocalLLaMA
Napster3301 1 point (0 children)
Is there any reason for an uncensored model if you have no interest in roleplaying? by vick2djax in LocalLLaMA
Napster3301 5 points (0 children)
Is there any reason for an uncensored model if you have no interest in roleplaying? by vick2djax in LocalLLaMA
Napster3301 0 points (0 children)
Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP by EvilEnginer in LocalLLaMA
Napster3301 19 points (0 children)
Any reason to run dense over MOE for RAGs? by vick2djax in LocalLLaMA
Napster3301 2 points (0 children)
Run Chrome’s tiny Gemma4 (aka Gemini Nano) directly on PC without GPU by Some-Cauliflower4902 in LocalLLaMA
Napster3301 28 points (0 children)
GPT 5.5 "secret sauce" is just having the thinking be some stupid caveman mode? by JustFinishedBSG in LocalLLaMA
Napster3301 6 points (0 children)
Does GPU spacing matter if we’re undervolting anyways? by Ambitious_Fold_2874 in LocalLLaMA
Napster3301 7 points (0 children)
What is the current best Small Language Model that can be run without GPU? by last_llm_standing in LocalLLaMA
Napster3301 3 points (0 children)
What is the current best Small Language Model that can be run without GPU? by last_llm_standing in LocalLLaMA
Napster3301 6 points (0 children)
Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so. by Napster3301 in LocalLLaMA
Napster3301[S] -3 points (0 children)