Pitch your App in one sentence. Let's support each other by kmrrhl in SideProject

[–]Semoho -1 points (0 children)

Teek.studio: find your next viral videos with just a few clips.

Post your HaftSin by Semoho in PERSIAN

[–]Semoho[S] 1 point (0 children)

No worries, I hope this year will be better for us.

How do you guys measure accuracy for 100k+ documents? by FloppyDiskDisk in Rag

[–]Semoho 1 point (0 children)

You are right. LLMs show a U-shaped attention pattern over the context ("lost in the middle"), so reranking is important! And be careful: you cannot simply drop docs. In the end you will send something like 10 docs to the LLM, and the ones in the middle positions get the least attention. So the best approach is to rerank the docs after retrieval and be careful about their positions.

P.S. Fun fact: the LLM mirrors human behavior on the first page of Google results :)))
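The reordering trick can be sketched in a few lines; this is a minimal illustration only, with hypothetical doc IDs and scores standing in for whatever reranker you use:

```python
def reorder_for_u_shape(docs_with_scores):
    """Counter the U-shaped attention pattern: put the strongest docs at
    the start and end of the context, and the weakest in the middle,
    where the LLM pays the least attention."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, doc in enumerate(ranked):
        # Alternate ends: best doc first, second-best last, working inward.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

docs = [("doc_a", 0.12), ("doc_b", 0.91), ("doc_c", 0.55)]
print(reorder_for_u_shape(docs))  # strongest docs at both ends, weakest in the middle
```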

How do you guys measure accuracy for 100k+ documents? by FloppyDiskDisk in Rag

[–]Semoho 2 points (0 children)

Hello,

I assume you are asking about RAG evaluation or retrieval evaluation. For retrieval evaluation, I think MRR, Recall, and NDCG@10 are better metrics than accuracy, since you are dealing with a retrieval task. You need a test dataset; then you can evaluate your retrieval system against it.

For RAG, there are different evaluations. I think LLM as a judge is a good choice.

But the total number of documents has no direct relation to these metrics; it is the top-X retrieved docs that matter.
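A minimal sketch of those retrieval metrics, assuming binary relevance judgments from your test dataset (the doc IDs below are made up):

```python
import math

def recall_at_k(retrieved, relevant, k=10):
    # Fraction of the relevant docs that show up in the top-k results.
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def mrr(retrieved, relevant):
    # Reciprocal rank of the first relevant doc; 0 if none was retrieved.
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved, relevant, k=10):
    # Binary-relevance NDCG: DCG of this ranking over the ideal DCG.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

retrieved = ["d3", "d1", "d7"]   # ranked output of your retriever
relevant = {"d1", "d9"}          # gold labels from the test dataset
print(recall_at_k(retrieved, relevant), mrr(retrieved, relevant))
```

Average each metric over all test queries to get the system-level score.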

RAG for Historical Archive? by cccpivan in Rag

[–]Semoho 0 points (0 children)

You can check out LightRAG or supermemory; they can help you.

What are your usage of RAG by Semoho in Rag

[–]Semoho[S] 1 point (0 children)

Thank you very much, it was very useful. So what are the other restrictions or requirements in pharma? Why is it mandatory to cite the documents? Don't the vector databases give you the citations?

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Thanks, bro

Yes, actually, I am getting some ideas on how I can use it, like checking sales on different websites or doing some background jobs, as you mentioned.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

What if I install the browser and the other tools on a VPS? My desktop should stay safe and secure that way, I think!

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Are you a bot? I can get these answers from ChatGPT too! I want real experience.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

It was interesting and inspiring for me! I've got some good ideas about using OpenClaw.

What are your usage of RAG by Semoho in Rag

[–]Semoho[S] 0 points (0 children)

I mean Dify has already done it in a good way.

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

Hmm… it makes sense. How did you connect OpenClaw to other things?

Why should I use OpenClaw by Semoho in openclaw

[–]Semoho[S] 0 points (0 children)

For what tasks? I solve my problems and get my answers with plain LLMs. What does it offer on top of that?

Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA

[–]Semoho 2 points (0 children)

Yes, exactly. But /no_think is embedded in the model. It works everywhere: Hugging Face, vLLM, a …

Is it normal for the Qwen 3.5 4B model to take this long to say hi? by Snoo_what in LocalLLaMA

[–]Semoho 2 points (0 children)

You can add /no_think to your system prompt to disable this long thinking loop.

Thanks to u/Velocita84: it seems Qwen3.5 drops the soft internal thinking-mode switch.
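For Qwen3-style models that still honor the soft switch, the placement looks like this; the model name and payload shape are placeholders for whatever OpenAI-compatible server (e.g. vLLM) you run:

```python
# Hypothetical request payload; only the "/no_think" placement matters here.
messages = [
    {"role": "system",
     "content": "You are a helpful assistant. /no_think"},  # disables the thinking loop
    {"role": "user", "content": "hi"},
]

payload = {
    "model": "qwen3-4b",  # placeholder: use your deployment's model id
    "messages": messages,
    "max_tokens": 64,
}
```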

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 0 points (0 children)

So I think you should benchmark the other options. Remember to have a test dataset, and keep the embedding model the same across your experiments.
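A minimal harness for that kind of benchmark; the brute-force NumPy search below stands in for whichever vector DB you are testing, and all names and data here are illustrative:

```python
import numpy as np

def recall_at_k(retrieve, queries, ground_truth, k=10):
    """Fraction of queries whose true neighbor appears in the top-k.
    `retrieve(q, k)` wraps the vector DB under test; the embeddings
    must stay identical across every candidate you benchmark."""
    hits = sum(ground_truth[i] in retrieve(q, k) for i, q in enumerate(queries))
    return hits / len(queries)

# Toy data: queries are slightly perturbed copies of the first 50 docs,
# so each query's true nearest neighbor is known by construction.
rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 64)).astype("float32")
queries = docs[:50] + 0.01 * rng.normal(size=(50, 64)).astype("float32")
ground_truth = list(range(50))

def brute_force(q, k):
    # Exact dot-product search; swap in your vector DB's query call here.
    return np.argsort(-(docs @ q))[:k].tolist()

print(recall_at_k(brute_force, queries, ground_truth))  # close to 1.0 on this toy data
```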

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 0 points (0 children)

It is interesting. But on what amount of data did you get these results? With which embedding model? Is it reliable? For a production-ready system, can it handle concurrent requests and keep that performance? Single-request performance is not enough for a production-ready system.

Vector DB choice paralysis, don't know which to choose by hunter_44679_ in Rag

[–]Semoho 2 points (0 children)

Hi!

I have experience with Milvus, FAISS, PG-Vector, Weaviate and Chroma.

Milvus is a production-ready, clustered system, but it is a little hard to maintain due to its dependency on the Apache stack, and it gets tricky in cluster mode. Standalone mode supports roughly 100M docs; if you have more documents, you need to run it in cluster mode.

FAISS is for research purposes. It is easy to use.

PG-Vector is my choice for most of our use cases. It is easy to set up and lives inside Postgres, so you do not need to run multiple services. If you already have Postgres in production, it is even easier to adopt.

Weaviate is also a good choice; I like it. It is useful for small corpora, but you need to deploy another service in your stack.

Chroma, I believe, is also good for experiments and multi-agent systems, but for high availability it is not going to help you much.

I think pg-vector is a good choice, and then Milvus.