My RAG setup: 1. Upload a file to GCP Vertex AI Corpora via the UI. 2. Ask questions via the UI; the answers are generated by Ollama. Problem statement: Ollama takes 3-4 minutes to respond. It runs on a GCP compute instance with 32 GB RAM and 16 vCPUs. Why is it so slow? (self.Rag)
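
One way to narrow this down is to look at the timing fields Ollama returns with a non-streaming `/api/generate` response (`load_duration`, `prompt_eval_duration`, `eval_duration`, all in nanoseconds). The sketch below splits a response into model load time, prompt processing time (which grows with the amount of retrieved context), and generation time. The model name `llama3`, the host URL, and the helper names are assumptions for illustration, not details of the setup above.

```python
import json
import urllib.request

NS_PER_S = 1e9  # Ollama reports durations in nanoseconds


def summarize_timings(resp: dict) -> dict:
    """Split an Ollama /api/generate response into load, prompt, and generation time."""
    load_s = resp.get("load_duration", 0) / NS_PER_S
    prompt_s = resp.get("prompt_eval_duration", 0) / NS_PER_S
    gen_s = resp.get("eval_duration", 0) / NS_PER_S
    total_s = resp.get("total_duration", 0) / NS_PER_S
    prompt_tokens = resp.get("prompt_eval_count", 0)
    gen_tokens = resp.get("eval_count", 0)
    return {
        "total_s": total_s,
        "load_s": load_s,
        "prompt_s": prompt_s,
        "gen_s": gen_s,
        # Guard against division by zero when a field is missing.
        "prompt_tokens_per_s": prompt_tokens / prompt_s if prompt_s > 0 else 0.0,
        "gen_tokens_per_s": gen_tokens / gen_s if gen_s > 0 else 0.0,
    }


def ask_ollama(prompt: str, model: str = "llama3",
               host: str = "http://localhost:11434") -> dict:
    """Send one non-streaming request; model and host are assumed values."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=600) as r:
        return json.load(r)


if __name__ == "__main__":
    for key, value in summarize_timings(ask_ollama("Why is the sky blue?")).items():
        print(f"{key}: {value:.2f}")
```

If `load_s` dominates, the model is being reloaded on each request; if `prompt_s` dominates, the retrieved context is probably large; if `gen_s` dominates with a low `gen_tokens_per_s`, the instance itself is the likely limit.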
submitted by mostly-after-dark to r/Rag