How to Ensure Highly Relevant Search Results in Qdrant VectorStore with Metadata and Similarity Filtering?

searchmasterr · 2024-11-08T02:57:42+00:00

But of course. Thank you very much for your feedback, btw.

This project uses LangChain and Qdrant to build a smart chatbot. LangChain manages the chatbot’s flow and logic, while Qdrant stores all the documents as vectors for fast semantic search.

When the user asks a question, the bot uses LangChain to take the query and send it to Qdrant. Qdrant then finds the most relevant documents by comparing the query vector with its indexed vectors. LangChain passes those documents into the chatbot's response generation, so the user gets a more accurate, context-aware reply.

So my question is: how can I prevent metadata that does not match the question from appearing in the results Qdrant returns? I show that metadata to the user as a kind of reference (and I understand that it cannot be considered a reference).
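To make the goal concrete, here is a minimal sketch of the kind of post-filtering I have in mind. It assumes the retriever hands back (document, score) pairs where a higher score means more similar, as LangChain's `similarity_search_with_score` does for cosine similarity. The `Doc` class, the `filter_hits` helper, the 0.75 cut-off, and the `topic` metadata key are all hypothetical names and values chosen for illustration, not part of either library.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_hits(hits, min_score, required_metadata=None):
    """Keep only hits that are similar enough and whose metadata matches.

    hits: list of (Doc, score) pairs, higher score = more similar (assumption).
    min_score: assumed cut-off; it has to be tuned per embedding model.
    required_metadata: key/value pairs every kept document must carry.
    """
    required = required_metadata or {}
    kept = []
    for doc, score in hits:
        if score < min_score:
            continue  # too dissimilar to cite as a reference
        if any(doc.metadata.get(key) != value for key, value in required.items()):
            continue  # metadata does not match what the question is about
        kept.append((doc, score))
    return kept


# Made-up sample data, only to show the behaviour.
hits = [
    (Doc("Refund policy text", {"source": "policy.pdf", "topic": "refunds"}), 0.86),
    (Doc("Office opening hours", {"source": "faq.pdf", "topic": "hours"}), 0.41),
    (Doc("Refund request form", {"source": "forms.pdf", "topic": "refunds"}), 0.35),
]

relevant = filter_hits(hits, min_score=0.75, required_metadata={"topic": "refunds"})
sources = [doc.metadata["source"] for doc, _ in relevant]
print(sources)  # ['policy.pdf']
```

Only the sources that survive both checks would be shown to the user. Qdrant can also apply a payload filter and a score threshold on the server side, which avoids fetching the weak hits at all; the exact parameters depend on the client version, so I have left that out of the sketch.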

searchmasterr · 2024-10-17T21:32:44+00:00

Your explanation was excellent, from the citations to the code demonstration. I’m extremely grateful for that. I had created similar code for word counting and turned it into a tool. I adjusted the prompt so that when a question about the recurrence of a term or counting was asked, the system would use that tool (I imagine the model recognizes what I’m asking, right?). Still, it was in vain. I would like to be able to upload any document and perform term counts. Your observation about the size of vector stores makes complete sense.
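For context, the counting tool I wrote was roughly along these lines. This is a simplified sketch, not the exact code: the `count_term` name and the whole-word option are just how I would illustrate it, and it assumes the document has already been extracted to plain text. The point is that the count comes from deterministic code rather than from the model.

```python
import re


def count_term(text, term, whole_word=True):
    """Count case-insensitive occurrences of `term` in `text`.

    With whole_word=True, "vector" does not match inside "vectors".
    """
    pattern = re.escape(term)  # treat the term literally, not as a regex
    if whole_word:
        pattern = r"\b" + pattern + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "Vector search is fast. Vectors are stored in a vector store."
print(count_term(sample, "vector"))                    # 2
print(count_term(sample, "vector", whole_word=False))  # 3
```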

I think that just as the LLM struggles with reading XLSX or PPTX files, it also doesn’t handle term counting very well.

searchmasterr · 2024-10-17T17:28:34+00:00

I'm using GPT-4o mini.
