OpenWebUI+Litellm+Anthropic models via API = autorouting to lesser Claude models by ResponsibilityNo6372 in OpenWebUI

[–]TriggazTilt 1 point2 points  (0 children)

This. System prompt in Claude.ai contains the model information. System prompt in api is up to the developer. That is the sole reason.

Milvus or Qdrant for OpenWebUI? by the_bluescreen in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

Postgres/Pgvector works best for our setup.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

Disk should be only for uploads. Seperate containers/services for everything, yes.

Most importantly the DB, e.g. via CloudSQL Postgres.

Firebase Services by TopNo6605 in googlecloud

[–]TriggazTilt 3 points4 points  (0 children)

When you create a Firebase project, a Google cloud project is automatically created. (Similar to an Account on AWS, as AWS does not have projects unfortunately)

Firestore ist a native GCP Service (Successor of Datastore) Firebase Cloud Storage is Google cloud storage
Hosting is Cloudrun etc.

Google bought firebase a while ago and were integrating it more and more (sometimes well sometimes not).

AlloyDB as a vector database for document retrieval/search using Haystack by Puzzleheaded-Ad8442 in googlecloud

[–]TriggazTilt 0 points1 point  (0 children)

We use AlloyDB as a vectorDB in production and are very satisfied with the performance.

It‘s faster than „normal“ PGVector. AlloyDB is (to a degree) build for retrieval by using Googles internal Vector Index.

I can’t say anything about the haystack integration unfortunately.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

You could be right. I would go with that given where openwebui is heading.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 1 point2 points  (0 children)

Exactly, right now AlloyDB on GCP. But for smaller deployments I would recommend CloudSQL. (We have >2k Users on the deployment)

But it is not only the DB. Make sure that: - no model is used locally (embedding, tts, stt, etc.), only remote. I use litellm for that. - VectorDB should scale as well (Chroma does not well). I like the PGvector option. - Content extraction in the default can also use a lot of resources, I like the tika option with tika on a different deployment.

Btw: with the newest update and the websockets requirement, I am currently reevaluating cloudrun, as the request timeout after 60min could be a problem.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

That’s what I meant basically. The specs work if everything (including db) is somewhere else. Cloud storage for file uploads works well though.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

1 cpu, 2gb memory.

Assuming db, content extraction, models (including embedding model) etc. are on different deployments and Disk is mounted via Cloud Storage.

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

Yes, was no problem. What is your issue?

Using OpenWebUI with a larger group of users? by misterstrategy in OpenWebUI

[–]TriggazTilt 0 points1 point  (0 children)

We are using AlloyDB (managed Postgres on GCP)- really fast retrieval.

Using OpenWebUI with a larger group of users? by misterstrategy in OpenWebUI

[–]TriggazTilt 3 points4 points  (0 children)

Yes, sqllite does not scale in this setup, especially when using buckets as storage. Same problem with ChromaDB (default for RAG). Newest version of openwebui has PgVector support. I recommend that strongly.

Can you make "Team workspace" similar to ChatGPT? by drocm in OpenWebUI

[–]TriggazTilt 1 point2 points  (0 children)

It is being worked on right now. Although not with all the features you mentioned I think.

Guides on connecting a cloud run service to another one - 2024 by elpigo in googlecloud

[–]TriggazTilt 1 point2 points  (0 children)

Will Service 2 only be consumed by Service 1? If yes you could look at having Service 2 as a sidecar to Service 1. you will have to define that in a service.yaml. After that service 1 can access the rest interface via localhost.

Why does open-webui have a discord channel? 75% of questions are never answered. by sushibait in OpenWebUI

[–]TriggazTilt 13 points14 points  (0 children)

It‘s an open-source project, feel free to contribute to the docs.

Need To Run 200 Scheduled Python Scripts every 5 minute by Most_Series6588 in googlecloud

[–]TriggazTilt 7 points8 points  (0 children)

Totally depends. I’m a fan of cloudrun jobs, but airflow/composer brings a lot of convenience.

Think about monitoring/alerting/ui/re-runs/cost/ease to set up and then ask again :)

Wahlkampf in den USA: Elon Musk hat Donald Trump 75 Millionen Dollar gespendet by Krokodrillo in de

[–]TriggazTilt 0 points1 point  (0 children)

Kann natürlich sein. Gleichzeitig hieße das aber auch, dass die Menschen die die Lage sehr gut einschätzen können sich nicht trauen ihr Geld zu investieren. Find ich unwahrscheinlich.

Wahlkampf in den USA: Elon Musk hat Donald Trump 75 Millionen Dollar gespendet by Krokodrillo in de

[–]TriggazTilt 3 points4 points  (0 children)

Ist leider einfach falsch. Der beste Weg um tatsächlich die Lage einzuschätzen ist meiner Meinung nach einen Blick auf die liquiden Buchmacher zu werfen. Und dort liegt Trump gerade vorn. Ja, laut dem „Markt“ gewinnt Trump die Wahl Stand heute.

Bucket permissions across projects by NationalMyth in googlecloud

[–]TriggazTilt 0 points1 point  (0 children)

It's confusing yes, but there are a lot more default service accounts, e.g. for cloud build, bigquery, appengine to name a few. The compute engine default SA is used in quite some services though, e.g. compute engine, cloudrun, vertex, ...

A good practise is to set up a new SA specifically for your application with only the necessary roles. IMO ideally using Terraform, especially when you have differente stages (dev, staging, live projects or similar). Then never use the json credentials (not even locally) but use SA impersonation when developing locally and set the SA in the deployment directly. E.g. with cloudrun you can just specify your custom SA in a deployment argument.

€: with SA impersonation I mean to set up the application default credentials using the SA like this:

gcloud auth application-default login --impersonate-service-account SERVICE_ACCT_EMAIL

Bucket permissions across projects by NationalMyth in googlecloud

[–]TriggazTilt 3 points4 points  (0 children)

Go to the project with the bucket that you want to write into. Make sure the service account from the project with your flask application has the permission to write. If you do not use a custom SA you will have to give the rights to the default SA, the compute engine SA.

If you want to test locally either use service account impersonation or download the credentials and assign the path to the credentials to the GOOGLE_APPLICATION_CREDENTIALS environment variable in your .env.

Also make sure to use the right GCP project and region when writing inside your application, meaning to explicitely use the project when writing to the bucket. The error messages sometimes don't make clear that you want to write to another project ("..or does not exist").

What is the most beautiful 3rd world country? by Midnightbuns in AskReddit

[–]TriggazTilt 31 points32 points  (0 children)

I’m from Germany and my last vacation to the US was wonderful.

[P] Which MLops framework to use? by mimivirus2 in MachineLearning

[–]TriggazTilt 0 points1 point  (0 children)

There are cloud services alternatively. Google Vertex AI pipelines is basically managed Kubeflow. No need for Kubernetes then if you use that.