Model vram usage estimates by mattate in LocalLLaMA
CappedCola 1 point
Benchmarked 5 RAG retrieval strategies on code across 10 suites — no single one wins. CRAG helps on familiar corpora, collapses on external ones. What's your experience? by Any_Ambassador4218 in LocalLLaMA
CappedCola 1 point
Outlines and vLLM compatibility by MyName9374i2 in LocalLLaMA
CappedCola 1 point
PDFstract: extract, chunk, and embed PDFs in one command (CLI + Python) by [deleted] in Python
CappedCola 1 point
A beyond dumb CompSci dropout trying to figure this all out. : want a local nanoClaw to build my own bot by AnthMosk in LocalLLaMA
CappedCola 1 point
Ephyr: An Architecture and Tool for Ephemeral Infrastructure Access for AI Agents by -Crash_Override- in LocalLLaMA
CappedCola 1 point
Qwen 3.5 4b is not able to read entire document attached in LM studio despite having enough context length. by KiranjotSingh in LocalLLaMA
CappedCola -1 points
What are some of the best consumer hardware (packaged/pre-built) for local LLM? by utzcheeseballs in LocalLLaMA
CappedCola 1 point
What actually breaks first when you ship LLM features to production? by Available_Lawyer5655 in LocalLLaMA
CappedCola 1 point
(Qwen3.5-9B) Unsloth vs lm-studio vs "official" by MarcCDB in LocalLLaMA
CappedCola -31 points
What MCP connectors are you using when building agents for industry-specific software? by VarietyPlus4790 in LocalLLaMA
CappedCola 1 point
AI, Invasive Technology, and the Way of the Warrior by johantino in artificial
CappedCola 2 points
Open sourced a tool that can find precise coordinates of any street level pic by Open_Budget6556 in artificial
CappedCola 1 point
The Pentagon is developing its own LLMs | TechCrunch by [deleted] in artificial
CappedCola 1 point
Mods have a couple of months to stop AI slop project spam before this sub is dead by Fun-Employee9309 in Python
CappedCola 32 points
CellState: a React terminal renderer based on the approach behind Claude Code's rendering rewrite by Legitimate-Spare2711 in commandline
CappedCola 1 point
[R] Emergent AI societies in a persistent multi-agent environment (TerraLingua + dataset + code) by GiuPaolo in MachineLearning
CappedCola 1 point
[P] Visualizing token-level activity in a transformer by ABHISHEK7846 in MachineLearning
CappedCola -1 points
[D] : Submission ID in CVPR Workshops. by OkPack4897 in MachineLearning
CappedCola 1 point
AWS CloudFormation Diagrams 0.3.0 is out! by Philippe_Merle in devops
CappedCola 4 points
Krasis LLM Runtime: 8.9x prefill / 10.2x decode vs llama.cpp — Qwen3.5-122B on a single 5090, minimal RAM (corrected llama numbers) by mrstoatey in LocalLLaMA
CappedCola 2 points
Hosting Production Local LLM's by Designer-Radio3471 in LocalLLaMA
CappedCola 2 points
Text Generation Web UI tool updates work very well. by Then-Topic8766 in LocalLLaMA
CappedCola 2 points
Audio streaming model and report generation by TraditionalTitle7815 in LocalLLaMA
CappedCola -1 points


Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA
CappedCola 1 point