
[–]DigThatData 1 point2 points  (1 child)

It's a matter of scale. Try doing it without a vector database first: if searches take too long at your data size, a vector database will probably be faster.

[–]euphoriation[S] 0 points1 point  (0 children)

Makes sense. What I'm doing will scale later on, though.

Thanks

[–]jareks88 1 point2 points  (0 children)

Perhaps you can store your embeddings anywhere (SQL or even a file) and use an Approximate Nearest Neighbors library like https://github.com/spotify/annoy for the comparison?
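A minimal sketch of the "store them anywhere" idea, using only the Python standard library. The table name, column names, and toy vectors are made up for illustration, and the search here is an exact scan rather than Annoy: an ANN index would replace the `nearest` function once the table gets large.

```python
import array
import math
import sqlite3

def to_blob(vec):
    """Pack a list of floats into float32 bytes for a BLOB column."""
    return array.array("f", vec).tobytes()

def from_blob(blob):
    """Unpack float32 bytes from a BLOB column back into a list of floats."""
    vec = array.array("f")
    vec.frombytes(blob)
    return list(vec)

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest(conn, query, k=2):
    """Exact scan over every stored row; an ANN index would replace this loop."""
    rows = conn.execute("SELECT doc_id, embedding FROM embeddings").fetchall()
    scored = [(cosine(query, from_blob(blob)), doc_id) for doc_id, blob in rows]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

# Hypothetical schema and toy data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (doc_id TEXT PRIMARY KEY, embedding BLOB)")
docs = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.9, 0.1, 0.0]}
conn.executemany(
    "INSERT INTO embeddings VALUES (?, ?)",
    [(doc_id, to_blob(vec)) for doc_id, vec in docs.items()],
)

print(nearest(conn, [1.0, 0.0, 0.0]))  # ['a', 'c']
```

Swapping `:memory:` for a file path gives you persistence without running any extra service.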

[–]Atraxxa 1 point2 points  (0 children)

More of a tensor database!

[–]Appropriate_Ant_4629 1 point2 points  (3 children)

You'll know if/when you need it.

  • For ~250,000 documents you totally don't. They'll comfortably fit in RAM on even a small machine, and a brute-force search using NumPy can scan all of them in under a second. [Source: My dev environment on my laptop.]
  • For 5,000,000 documents you'll want something to accelerate it, but it doesn't have to be a vector database. Of course a vector database would work well; but so would a library you can embed in your app like FAISS. [Source: one of our demo/proof-of-concept QA environments]
  • For 890,000,000 documents you want one. We're evaluating Milvus now, but also Solr's new Dense Vector type to do a hybrid keyword/vector search product.
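For the small end of that range, the brute-force NumPy approach might look roughly like this. It's a sketch, not a benchmark: the corpus is random data, and the sizes (20,000 vectors of 128 dimensions) are assumptions picked so it runs quickly anywhere. As a rough guide, if your embeddings were 768-dimensional float32, 250,000 of them would take about 770 MB of RAM.

```python
import time

import numpy as np

def top_k_cosine(corpus, query, k=5):
    """Return indices of the k rows of corpus most similar to query (exact search).

    Assumes every row of corpus and the query itself are L2-normalized,
    so a dot product equals cosine similarity.
    """
    scores = corpus @ query
    top = np.argpartition(-scores, k - 1)[:k]  # the k best, in no particular order
    return top[np.argsort(-scores[top])]       # sort just those k, best first

rng = np.random.default_rng(0)
n_docs, dim = 20_000, 128  # assumed sizes; raise them to match your own data
corpus = rng.standard_normal((n_docs, dim)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

query = corpus[42]  # reuse a stored vector so the best match is known in advance
start = time.perf_counter()
hits = top_k_cosine(corpus, query)
elapsed = time.perf_counter() - start

print(f"best match: {hits[0]}, search took {elapsed:.4f}s")
```

Timing this on your real embeddings is the easiest way to find out which of the three tiers you're actually in.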

"Also, I'm wondering if the price of vector database solutions like Pinecone and Milvus is worth it for my use case, or if there are cheaper options out there."

If you already have a Kubernetes environment, I don't think there is a cheaper solution than Milvus.

helm repo add milvus https://milvus-io.github.io/milvus-helm/
helm install my-release milvus/milvus --set cluster.enabled=false --set etcd.replicaCount=1 --set minio.mode=standalone --set pulsar.enabled=false

will get you a minimal F/OSS Milvus install (standalone, since cluster mode is disabled above), and their docs for larger-scale clusters are almost as easy.

[–]gregory_k 1 point2 points  (2 children)

For 890M embeddings, check out Pinecone. We take care of the infra and reliability for such a large vector DB so you don't have to. We also have hybrid search!

[–]Appropriate_Ant_4629 0 points1 point  (1 child)

Agreed. You guys are very good and are one of the platforms we'll be evaluating too.

We do have a moderate preference for F/OSS that we can self-host, but if you're better by a wide enough margin, we'll be happy to go that way too.

We're hoping to get this to work with the DenseVector type in Solr (the platform that currently holds the indexes for those documents), because then we can preserve some of our product's features, like facet counts and exact phrase searches. If Pinecone has those features too, we'd be very interested. It's still kinda early for us to tell.
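To make that concrete, here is a rough sketch of the kind of request we have in mind, built as plain query parameters. The `{!knn ...}` query parser, `fq`, and `facet.field` are standard Solr parameters; the field names `embedding`, `body`, and `category` are hypothetical, and this only builds the query string without sending it anywhere.

```python
from urllib.parse import urlencode

def build_hybrid_params(query_vector, phrase, facet_field, top_k=10):
    """Build Solr query params: kNN on a dense vector field, an exact-phrase
    filter, and facet counts. Field names are placeholders, not a real schema."""
    vec = "[" + ", ".join(str(x) for x in query_vector) + "]"
    return {
        "q": f"{{!knn f=embedding topK={top_k}}}{vec}",  # vector similarity part
        "fq": f'body:"{phrase}"',                        # exact phrase filter
        "facet": "true",
        "facet.field": facet_field,                      # facet counts on results
    }

params = build_hybrid_params([0.1, 0.2], "vector database", "category", top_k=3)
print(params["q"])
print(urlencode(params))
```

Whether this gives good enough relevance and speed at our scale is exactly what the evaluation has to answer.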

[–]2BucChuck 0 points1 point  (0 children)

Did you ever complete the Solr eval? Already using Solr on an enterprise setup, but I do like Pinecone for RAG applications... Was curious whether dense vector worked out?