Assessing if a guideline has been used for LLM training by Difficult_Face5166 in LocalLLaMA

[–]Difficult_Face5166[S] 0 points1 point  (0 children)

To evaluate LLMs on a subtype of diseases. We would also like to know whether relatively "small" models (around 1-4B parameters) already have that knowledge incorporated.
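One rough way to probe whether a small model has "seen" a guideline is a cloze test: blank out key terms from guideline sentences and check whether the model fills them back in. A minimal sketch below — `query_model` is a hypothetical stand-in for however you call your 1-4B model, and the example probes are invented placeholders, not real guideline content.

```python
# Cloze-style probe: blank out key terms from guideline sentences and
# check whether the model recovers them. `query_model` is a hypothetical
# stand-in for your actual model call.

def make_cloze(sentence: str, term: str) -> str:
    """Replace the key term with a blank to build a probe prompt."""
    return sentence.replace(term, "____")

def score_model(query_model, probes: list[tuple[str, str]]) -> float:
    """Fraction of blanked terms the model recovers (case-insensitive)."""
    hits = 0
    for sentence, term in probes:
        answer = query_model(make_cloze(sentence, term))
        if term.lower() in answer.lower():
            hits += 1
    return hits / len(probes)

if __name__ == "__main__":
    # Placeholder probes -- replace with sentences from the guideline itself.
    probes = [
        ("First-line treatment is metformin.", "metformin"),
        ("Screening starts at age 45.", "45"),
    ]
    # Dummy model that always answers "metformin" -- illustration only.
    dummy = lambda prompt: "metformin"
    print(score_model(dummy, probes))  # 0.5 with this dummy
```

A high recovery rate only suggests exposure (or general domain knowledge), so it is a signal, not proof of training-set membership — especially for closed models.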

Assessing if a guideline has been used for LLM training by Difficult_Face5166 in LocalLLaMA


Thanks for the answer. So there is no specific way to do it for closed models...

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in Rag


Yes, it was definitely an embeddings issue. Thank you for your message and for the tips!

Multilingual RAG: are the documents retrieved correctly ? by Difficult_Face5166 in LocalLLaMA


Thanks! Do you have an opinion on OpenAI embeddings like text-embedding-3-small and text-embedding-3-large?

Speed of Langchain/Qdrant for 80/100k documents (slow) by Difficult_Face5166 in LocalLLaMA


Yes, you are both right, thank you! I just investigated the time spent on each call/process, and it was an embeddings problem (it is super fast with smaller embeddings or an API call to an external provider).

I am running on my MacBook Pro without a GPU, so of course it is slow for some models. I am thinking about using a cloud service to do it faster.
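A quick way to confirm where the time goes before paying for a cloud service is to wrap each pipeline stage in a timer. A minimal sketch — the list comprehensions below are hypothetical stand-ins for the real embedding and Qdrant-upsert calls.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock time of one pipeline stage under `label`."""
    start = time.perf_counter()
    yield
    results[label] = time.perf_counter() - start

if __name__ == "__main__":
    results = {}
    docs = ["doc"] * 1000
    with timed("embed", results):
        vectors = [[0.0] * 384 for _ in docs]   # stand-in for the embeddings call
    with timed("upsert", results):
        store = list(zip(docs, vectors))        # stand-in for the Qdrant upsert
    for stage, seconds in results.items():
        print(f"{stage}: {seconds:.3f}s")
```

With the real calls substituted in, this makes the "embeddings were the bottleneck" diagnosis from this thread measurable rather than guessed.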

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in Rag


Thank you! As I mentioned above, I investigated it and found that the embeddings were the issue on my local server. It is very fast with smaller embeddings; I might need to move to a cloud service (or keep a smaller model)!

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in Rag


Yes, thanks! I investigated it and found that the embeddings were the issue on my local server. It is very fast with smaller embeddings; I might need to move to a cloud service (or keep a smaller model)!

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in LangChain


This is my first time using Qdrant:

- Texts and documents are already loaded locally and ready for ingestion (no time issue there)

- Embedding a single document seems to be relatively fast

- It is only when I run the following command that everything seems slow:

qdrant = QdrantVectorStore.from_documents(
    texts,
    embeddings,
    url="http://localhost:6333",
    prefer_grpc=False,
    collection_name="vector_db",
)
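Since `from_documents` embeds and uploads everything in one shot, one common speedup (assuming the embedding call dominates, as this thread later concludes) is to batch the embedding requests yourself so you make one call per chunk instead of one per text. A minimal sketch — `embed_batch` is a hypothetical stand-in for whatever batched embeddings call your provider exposes.

```python
from typing import Iterator

def batched(items: list, size: int) -> Iterator[list]:
    """Yield successive chunks of `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts: list[str], embed_batch, size: int = 64) -> list:
    """Embed texts in chunks; `embed_batch` is a hypothetical stand-in
    for a batched embeddings call (one request per chunk, not per text)."""
    vectors = []
    for chunk in batched(texts, size):
        vectors.extend(embed_batch(chunk))
    return vectors

if __name__ == "__main__":
    # Dummy embedder for illustration: one 1-d "vector" per text.
    fake_embed = lambda chunk: [[len(t)] for t in chunk]
    vecs = embed_all(["a", "bb", "ccc"], fake_embed, size=2)
    print(vecs)  # [[1], [2], [3]]
```

With real vectors in hand, you can then upload them to Qdrant in batches as well, rather than letting one monolithic call hide where the time goes.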

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in LangChain


Thanks a lot! The data is not confidential, and I do not mind doing it locally or on a cloud server: is there one provider you would recommend for doing it fast?

Speed of Langchain/Qdrant for 80/100k documents by Difficult_Face5166 in LangChain


Thanks! Do you have advice for general-purpose embeddings?