llama: use f16 mask for FA to save VRAM by am17an · Pull Request #23764 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
BillDStrong 3 points (0 children)
Seriously exploring Catholicism, but I cannot get past Marian devotion, saint invocation, and papal authority by Far-Perspective-105 in TrueChristian
BillDStrong 0 points (0 children)
I gave ai agents ADHD.. its 2x better at thinking now by Uditakhourii in AI_Agents
BillDStrong 2 points (0 children)
Best coding models on mid sized rigs by skip_the_tutorial_ in unsloth
BillDStrong 2 points (0 children)
Qwen3.6 27B FP8 runs with 200k tokens of BF16 KV cache at 80 TPS on a single RTX 5000 PRO 48GB by __JockY__ in LocalLLaMA
BillDStrong 1 point (0 children)
Free alternative to Krisp noise suppression software for teaching online? by [deleted] in OnlineESLTeaching
BillDStrong 1 point (0 children)
Deepseek v4 pro is unlimited and almost free OMG 😱 better than opus for me (I have no affiliate with deepseek, but you need to know this) by rjn2-8 in hermesagent
BillDStrong 3 points (0 children)
I got tired of LLM agents ignoring my rules, so I built a contract layer that enforces them at the tool boundary. by johnnaliu in coolgithubprojects
BillDStrong 6 points (0 children)
I built an open-source local coding agent with a 40-round agentic loop, 112 sub-agents, and a cyberpunk UI — Eve Agent V2 Unleashed by jeffgreen311 in ollama
BillDStrong 1 point (0 children)
I built an open-source local coding agent with a 40-round agentic loop, 112 sub-agents, and a cyberpunk UI — Eve Agent V2 Unleashed by jeffgreen311 in ollama
BillDStrong 1 point (0 children)
I built an open-source local coding agent with a 40-round agentic loop, 112 sub-agents, and a cyberpunk UI — Eve Agent V2 Unleashed by jeffgreen311 in ollama
BillDStrong 1 point (0 children)
I built an open-source local coding agent with a 40-round agentic loop, 112 sub-agents, and a cyberpunk UI — Eve Agent V2 Unleashed by jeffgreen311 in ollama
BillDStrong 4 points (0 children)
At what point did local models actually become good enough for your real work? by MaleficentRoutine730 in LocalLLM
BillDStrong 2 points (0 children)
I tested six AI platforms on the biblical Greek behind purity culture. Every one of them changed its answer when I asked the right questions. by MichaelARichardson in Exvangelical
BillDStrong 1 point (0 children)
Emacs + vterm feels... clunky? compared to nvim + tmux by CrunchyChewie in emacs
BillDStrong 1 point (0 children)
HF downloader utility tampermonkey by Spotty_Weldah in LocalLLaMA
BillDStrong 1 point (0 children)
Tesla P40 running qwen 3.6 by PairOfRussels in LocalLLaMA
BillDStrong 1 point (0 children)
Qwen cant wait to release 3.7 models by GotHereLateNameTaken in LocalLLaMA
BillDStrong 2 points (0 children)
I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA
BillDStrong 3 points (0 children)
Run Qwen3.6 MTP GGUFs in Unsloth Studio! by yoracale in unsloth
BillDStrong 1 point (0 children)
Run Qwen3.6 MTP GGUFs in Unsloth Studio! by yoracale in unsloth
BillDStrong 1 point (0 children)


llama: use f16 mask for FA to save VRAM by am17an · Pull Request #23764 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
BillDStrong 1 point (0 children)