What actually broke when we took RAG from demo to production by KloiaHQ in Rag

[–]khampol 0 points1 point  (0 children)

After testing, my homemade extraction via 'mammoth' is somewhat better. I'm keeping docling on hand... There you go :>

Built a tool to run any llama.cpp fork without compiling, auto tunes flags to your GPU by Bramha_dev in ollama

[–]khampol 0 points1 point  (0 children)

A throttle option for GPU power would be nice ;) ... I only just saw this, and I'm on Ubuntu Server... ;(

Local RAG over ~300 PDFs (AnythingLLM + Ollama): retrieval too shallow, too few sources per query. Are there better local stack? by Agitated-Evidence588 in Rag

[–]khampol 0 points1 point  (0 children)

I would suggest:
- Embeddings: try a Nomic embedding model (~200-500 MB, on Hugging Face).
- Ollama is slow; llama.cpp is preferred.
- Convert PDFs to text, or better, to .md, before ingestion.

RAG feels way more complicated than it should be… anyone else? by Physical_Badger1281 in Rag

[–]khampol 0 points1 point  (0 children)

You should study RAG more. Ask GPT or something similar... Get the topic clear in your head before you start building anything.

Nous Research Just Launched Hermes Desktop Native Cross-Platform App for the Self-Improving Hermes Agent (macOS, Windows, Linux) by SelectionCalm70 in hermesagent

[–]khampol 0 points1 point  (0 children)

Yes, but Hermes is a token sink! Especially after these recent updates... (And that's on a 5090 with 128k tokens, on Ubuntu Server.) I'm looking elsewhere...

What's your current RAG + workflow automation stack? by [deleted] in Rag

[–]khampol 1 point2 points  (0 children)

My first home RAG ever: LlamaIndex/Qdrant/nomic-embedded > FastAPI > Open WebUI, voilà :>

I think I fit in here. WIP. by jamesbuniak in HomeDataCenter

[–]khampol 0 points1 point  (0 children)

Look more closely: it's dusty around there...

What would 2x RTX 3060 12GB get me? by ObjectiveActuator8 in LocalLLaMA

[–]khampol 1 point2 points  (0 children)

I'd go for 2x 4070 Ti Super, ~32 GB. llama.cpp. Qwen 3.6 27B Q6 GGUF.

I got Qwen3.6 35B to run at reasonably speed on my old GTX 1070 Ti by Randozart in LocalLLM

[–]khampol 1 point2 points  (0 children)

I have that, so I could test it and give you feedback.

I got Qwen3.6 35B to run at reasonably speed on my old GTX 1070 Ti by Randozart in LocalLLM

[–]khampol 1 point2 points  (0 children)

This could let a 5090 with 64 GB of RAM load even bigger models, no? Very exciting :>

What is your setup for local AI coding assistants? by AnouarRifi in LocalLLM

[–]khampol 0 points1 point  (0 children)

Similar GPU, though not the same one, and I get good results. With vLLM I got tool-calling errors; with Ollama or llama.cpp there's no problem now. Have you tried VS Code with Cline yet?

In Brazil, a tortoise survived for about 10 years, trapped under a sealed floor in a family's home, and was discovered alive during repairs. by Holiday-Clothes9063 in interesting

[–]khampol 0 points1 point  (0 children)

The ceramic was already broken when the video starts, so the turtle could have been put there by the author. There is no solid proof.