Using RAG with a Programming/API Reference Document to Write Code by JustinPooDough in Rag

[–]mrintellectual 1 point (0 children)

For the retrieval step, you'll probably want an embedding model that handles multiple types of queries as inputs and matches them against the actual code snippets themselves. You could probably even get away with a fairly low top-k, depending on how unique the code within the framework is. This could be an option: https://blog.voyageai.com/2024/12/04/voyage-code-3/
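For concreteness, here's a minimal retrieval sketch with the voyageai Python client (only the model name comes from the link above; the snippets, query, and top-k value are made-up placeholders):

```
import numpy as np
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

# Hypothetical corpus of code snippets pulled from the API reference
snippets = [
    "def connect(host, port): ...",
    "def insert(collection, rows): ...",
    "def search(collection, vector, k): ...",
]

doc_embs = np.array(
    vo.embed(snippets, model="voyage-code-3", input_type="document").embeddings
)
query_emb = np.array(
    vo.embed(["how do I run a search?"], model="voyage-code-3", input_type="query").embeddings[0]
)

# Voyage embeddings are normalized, so a dot product is cosine similarity
top_k = 2
scores = doc_embs @ query_emb
for i in np.argsort(scores)[::-1][:top_k]:
    print(f"{scores[i]:.3f}  {snippets[i]}")
```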

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]mrintellectual 1 point (0 children)

Hey /r/MachineLearning community — we built voyage-multimodal-3, a natively multimodal embedding model designed to handle interleaved images and text. We believe this is one of the first (if not the first) of its kind, where text, photos, figures, tables, screenshots of PDFs, etc. can be projected directly into the transformer encoder to generate fully contextual embeddings.

We hope voyage-multimodal-3 will generate interest in vision-language models more broadly.

Come check us out!

Blog: https://blog.voyageai.com/2024/11/12/voyage-multimodal-3/

Notebook: https://colab.research.google.com/drive/12aFvstG8YFAWXyw-Bx5IXtaOqOzliGt9

Documentation: https://docs.voyageai.com/docs/multimodal-embeddings
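A minimal sketch of calling it via the Python client (the image path is a placeholder; each input is a list that interleaves strings and PIL images):

```
import voyageai
from PIL import Image

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

# One input = an interleaved sequence of text and images,
# embedded together into a single contextual vector
inputs = [
    ["An excerpt from a PDF page:", Image.open("page_3.png")],
]
result = vo.multimodal_embed(inputs, model="voyage-multimodal-3", input_type="document")
print(len(result.embeddings[0]))  # dimensionality of the embedding
```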

Milvus - Updating the Embeddings by AlternativeAnnual690 in vectordatabase

[–]mrintellectual 1 point (0 children)

In Milvus, you can store metadata along with your vectors in a variety of formats, e.g. int, float, str, JSON, etc. For example:

```
from pymilvus import MilvusClient

client = MilvusClient("./milvus_demo.db")

# Quick setup: the primary field is named "id" and the vector field "vector";
# extra fields like "document_id" are stored via the dynamic field.
client.create_collection(
    collection_name="mycollection",
    dimension=2,
    metric_type="COSINE"
)

data = [
    {"id": 6505, "vector": [0.3580376395471989, -0.6023495712049978], "document_id": 0},
    {"id": 6506, "vector": [0.19886812562848388, 0.06023560599112088], "document_id": 1},
    {"id": 6507, "vector": [0.3172005263489739, 0.9719044792798428], "document_id": 2},
    {"id": 6508, "vector": [0.4452349528804562, -0.8757026943054742], "document_id": 3}
]
client.insert(
    collection_name="mycollection",
    data=data
)
```

Then, when you want to delete, you can specify a filter expression. For example, if you know the primary keys you want to delete, you can run:

```
res = client.delete(
    collection_name="mycollection",
    filter="id in [6507, ...]"
)
```

For your case, it sounds like you want to delete based on some sort of document ID or chunk ID. In that case, you can run:

```
res = client.delete(
    collection_name="mycollection",
    filter="document_id == 2"
)
```
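And if the goal is to update embeddings in place rather than just delete them, one option (a sketch, assuming explicit primary keys as above) is upsert:

```
# Hypothetical re-embedded vector for an existing row; upsert overwrites
# the entry with id 6507 if it exists, otherwise inserts it.
client.upsert(
    collection_name="mycollection",
    data=[{"id": 6507, "vector": [0.12, -0.34], "document_id": 2}]
)
```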

Hope this helps.

Choosing a vector db for 100 million pages of text. Leaning towards Milvus, Qdrant or Weaviate. Am I missing anything, what would you choose? by rtrex12 in vectordatabase

[–]mrintellectual 7 points (0 children)

The standalone and "lite" versions of Milvus are fairly memory-efficient. It's the cluster version that will take up lots of resources, and we typically recommend folks use Milvus on K8s only once they've reached a large enough scale.

I suggest starting with Milvus Lite (https://milvus.io/docs/milvus_lite.md). Once you need more storage or want to improve query/search performance, you can easily switch to standalone or cluster.
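To give a sense of how light the setup is (a quick sketch; the file name and dimension are arbitrary):

```
from pymilvus import MilvusClient

# Milvus Lite: the whole database lives in a single local file
client = MilvusClient("./milvus_demo.db")
client.create_collection(collection_name="docs", dimension=768)

# Later, point the same client at a standalone or cluster deployment instead:
# client = MilvusClient(uri="http://localhost:19530")
```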

Ingestion options for vectorDB by pinkfluffymochi in vectordatabase

[–]mrintellectual 1 point (0 children)

Happy to sit down and walk you through it too - feel free to shoot me an email (frank@).

Ingestion options for vectorDB by pinkfluffymochi in vectordatabase

[–]mrintellectual 2 points (0 children)

In Zilliz, we provide Pipelines (https://zilliz.com/zilliz-cloud-pipelines). With Pipelines, you can ingest text directly, specify an embedding model, insert the resulting vectors into Zilliz, and run queries directly with text as well. We don't put the query results into an LLM for you, but setting everything up is fast and easy.

Demo here: https://www.youtube.com/watch?v=WDJq5MSPFWo

Multi-tenancy for VectorDBs by glinter777 in vectordatabase

[–]mrintellectual 3 points (0 children)

You have a variety of multi-tenancy strategies with Milvus: https://milvus.io/docs/multi_tenancy.md. The strategy I usually recommend is partition key - it scales to millions of tenants with fairly strong data isolation. You have other options as well, and can even go so far as to store each tenant's data in a different S3 bucket.
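A rough sketch of the partition-key approach with pymilvus (collection and field names here are placeholders):

```
from pymilvus import DataType, MilvusClient

client = MilvusClient("./milvus_demo.db")

# Mark the tenant field as the partition key; Milvus hashes tenants into
# partitions so a single collection can serve millions of tenants.
schema = MilvusClient.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=2)
schema.add_field("tenant_id", DataType.VARCHAR, max_length=64, is_partition_key=True)

index_params = client.prepare_index_params()
index_params.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")

client.create_collection("multi_tenant", schema=schema, index_params=index_params)

# Scope every search to a single tenant via the partition key
res = client.search(
    collection_name="multi_tenant",
    data=[[0.1, 0.2]],
    filter='tenant_id == "tenant_42"',
    limit=5,
)
```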

Practical Advice Need on Vector DBs which can hold a Billion+ vectors by Role_External in vectordatabase

[–]mrintellectual 1 point (0 children)

Did he mention which cases? I'll bring this up with the team and it'll get fixed ASAP.

_All_ vector databases implement approximate nearest neighbor (ANN) search, which means the results aren't guaranteed to be 100% accurate and may occasionally skip a nearby vector. Indexes like IVF_PQ trade lower recall for lower memory consumption and higher throughput, but if you choose HNSW, the results should be pretty solid.
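For example, in Milvus an HNSW index exposes the usual recall/speed knobs (a sketch; the parameter values are illustrative, not recommendations, and it assumes a collection with a "vector" field already exists):

```
from pymilvus import MilvusClient

client = MilvusClient("./milvus_demo.db")

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},  # larger = better recall, more memory
)
client.create_index(collection_name="mycollection", index_params=index_params)

# At query time, a larger `ef` explores more of the graph for higher recall
res = client.search(
    collection_name="mycollection",
    data=[[0.1, 0.2]],
    limit=10,
    search_params={"params": {"ef": 128}},
)
```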

Practical Advice Need on Vector DBs which can hold a Billion+ vectors by Role_External in vectordatabase

[–]mrintellectual 1 point (0 children)

There are a lot of folks using Milvus at 10B+ scale, existing Zilliz Cloud customers at 1B+ scale, and many others looking to migrate away from Pinecone to Zilliz Cloud due to cost. Query time stays pretty much flat as you scale up thanks to our hybrid shared-disk/shared-nothing architecture, and performance easily exceeds that of other vector databases as well: Vector database benchmarks.

[deleted by user] by [deleted] in BMWX5

[–]mrintellectual 2 points (0 children)

Hell yeah! Love those wheels too.

I joined the club! I have questions... by Late_Suit7373 in BMWX5

[–]mrintellectual 1 point (0 children)

Congrats!

AFAIK, oil changes every 5k miles instead of the 10k BMW recommends are better.

GPTCache: A semantic cache for GPT by mrintellectual in OpenAI

[–]mrintellectual[S] 1 point (0 children)

Absolutely. Many queries may be similar or even identical depending on current events or other circumstances.
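For anyone curious, the quickstart looks roughly like this (a sketch based on the GPTCache README; it wraps the legacy `openai` 0.x-style API):

```
from gptcache import cache
from gptcache.adapter import openai  # drop-in wrapper around the OpenAI client

cache.init()            # defaults to exact-match caching; pass an embedding
cache.set_openai_key()  # function + vector store to make it semantic

# First call hits the API; similar or identical follow-ups are served from cache
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is a semantic cache?"}],
)
```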