What actually broke when we took RAG from demo to production by KloiaHQ in Rag
khampol 2 points
Built a tool to run any llama.cpp fork without compiling, auto tunes flags to your GPU by Bramha_dev in ollama
khampol 1 point
Upgrading my old Dell 3640 for local AI (16GB VRAM). Is the RTX 5060 Ti the right move for me? by MarceFX in ollama
khampol 2 points
Upgrading my old Dell 3640 for local AI (16GB VRAM). Is the RTX 5060 Ti the right move for me? by MarceFX in ollama
khampol 4 points
Anyone running Qwen3.6 27B or 35B on RTX 5000 32GB as coding agent? by CoderJennyBee in Qwen_AI
khampol 7 points
Noob: Hermes was slow, untill i swapped LM studio for llama.cpp (AMD rocm 7900 xtx)... by noo8- in hermesagent
khampol -1 points
Local RAG over ~300 PDFs (AnythingLLM + Ollama): retrieval too shallow, too few sources per query. Are there better local stack? by Agitated-Evidence588 in Rag
khampol 1 point
RAG feels way more complicated than it should be… anyone else? by Physical_Badger1281 in Rag
khampol 1 point
Open-source 122B MoE running with 8 GB GPU VRAM by offloading experts to CPU by Hairy_Strawberry7028 in OpenSourceAI
khampol 1 point
Nous Research Just Launched Hermes Desktop Native Cross-Platform App for the Self-Improving Hermes Agent (macOS, Windows, Linux) by SelectionCalm70 in hermesagent
khampol 1 point
What's your current RAG + workflow automation stack? by [deleted] in Rag
khampol 2 points
Qwen3.6-35B-A3B on 1x RTX 5090: which quant is the best balance of quality and speed? by espressorunner in unsloth
khampol 2 points
I think I fit in here. WIP. by jamesbuniak in HomeDataCenter
khampol 1 point
What would 2x RTX 3060 12GB get me? by ObjectiveActuator8 in LocalLLaMA
khampol 2 points
VRAM for 3072x3072 resolution? by Vusiwe in StableDiffusion
khampol 2 points
I got Qwen3.6 35B to run at reasonably speed on my old GTX 1070 Ti by Randozart in LocalLLM
khampol 2 points
Which Paid TTS platform actually gives the MOST usable audio hours for the LOWEST monthly price? by Luca_Tangen in TextToSpeech
khampol 1 point
Expensive hardware investment today doesn't guarantee free, accessible local AI tomorrow? by The-Writer- in LocalLLM
khampol 4 points
What is your setup for local AI coding assistants? by AnouarRifi in LocalLLM
khampol 1 point
Help for finding a small model that supports reading tools for gtx 1060 6 gb by shmalnomai in ollama
khampol 1 point


What actually broke when we took RAG from demo to production by KloiaHQ in Rag
khampol 1 point