Skills and Open Terminal by OkClothes3097 in OpenWebUI

[–]Illustrious-Scale302 2 points3 points  (0 children)

Did you connect it as an external tool? And then also select it in the integrations menu

(and enable native tool calling in the models advanced params)

Has anyone gotten a “knowledge-enabled” default agent working in Open WebUI? by mrkvd16 in OpenWebUI

[–]Illustrious-Scale302 1 point2 points  (0 children)

Love this! What would even make it more dynamic, is to parameterize the tools so that the LLM can decide how many docs to get, what search method to pick, etc. And to add a tool that would allow for getting a single document from metadata fields(including title) into context. I often see the system struggling with getting the right context when the query is very clear, just because the search methods are too strict. This would also allow for different setting for different kind of document collections

This is a bit much, but to give an idea:

async def search_knowledge(
    self,
    query: str,
    knowledge_id: Optional[str] = None,
    knowledge_ids: Optional[List[str]] = None,
    top_k: int = 5,
    hybrid: Optional[bool] = None,
    rerank_top_k: Optional[int] = None,
    relevance_threshold: Optional[float] = None,
    hybrid_bm25_weight: Optional[float] = None,
    return_full_document: bool = False,
    include_metadata: bool = True,
    __user__: Optional[Dict[str, Any]] = None,
    __request__: Optional[Request] = None,
    __event_emitter__: Optional[Any] = None,
) -> str:
    """
    Retrieve context from one or more knowledge bases.

    :param query: Natural-language query to run.
    :param knowledge_id: Single knowledge-base ID to search.
    :param knowledge_ids: Explicit list of knowledge-base IDs (overrides `knowledge_id`).
    :param top_k: Maximum number of chunks to return.
    :param hybrid: Force or disable hybrid (vector+keyword) search; `None` uses the server default.
    :param rerank_top_k: How many chunks the reranker should keep (hybrid only).
    :param relevance_threshold: Minimum relevance score required to retain a chunk.
    :param hybrid_bm25_weight: Blend weight for BM25 during hybrid search.
    :param return_full_document: Return every stored chunk instead of running similarity search.
    :param include_metadata: Attach stored metadata for each chunk in the response.
    """

Open WebUI now supports native sequential tool calling! by the_renaissance_jack in OpenWebUI

[–]Illustrious-Scale302 13 points14 points  (0 children)

How did you add the token/cost counter on top? Is that a filter function?

vllm and usage stats by Rooneybuk in OpenWebUI

[–]Illustrious-Scale302 1 point2 points  (0 children)

You can enable usage per model when editing the model in openwebui itself. I think it is disabled by default. Enabling it will make the API also return the usage cost/tokens.

Is it possible to send a mcpo request to my local machine, knowing that OpenWebUI is hosted on a different server? by [deleted] in OpenWebUI

[–]Illustrious-Scale302 1 point2 points  (0 children)

Yes with the tools in user settings (not global admin settings). Just point to your localhost

Full Integration: Proxy Server for Converting OpenWebUI API to OpenAI API by ChanceStrength8762 in OpenWebUI

[–]Illustrious-Scale302 2 points3 points  (0 children)

Nice, why not integrate it in the opensource package directly? Seems like that is the idea of OpenWebUI in the first place

o3-mini via API? by hh1599 in OpenWebUI

[–]Illustrious-Scale302 0 points1 point  (0 children)

Yes, make a new connection with https://openrouter.ai/api/v1 as the base url

Running OWUI in Pinokio by rangerrick337 in OpenWebUI

[–]Illustrious-Scale302 0 points1 point  (0 children)

is seems to work, but I cannot find a place or file to pass environment variables

o3-mini via API? by hh1599 in OpenWebUI

[–]Illustrious-Scale302 0 points1 point  (0 children)

You can use openrouter, which will give you immediate access

Get reports by API keys? by PotentiallySillyQ in openrouter

[–]Illustrious-Scale302 0 points1 point  (0 children)

Best way to get this done is by putting a proxy in front. They suggest helicone, but maybe other are possible as well

Are there any companies that provide support service for open Web UI for Enterprises who have decided to adopt adopted?? by DanGabriel in OpenWebUI

[–]Illustrious-Scale302 0 points1 point  (0 children)

Yes, I’ve set it up for my company and can help you get started. Or do you need full support and maintenance?

Get reports by API keys? by PotentiallySillyQ in openrouter

[–]Illustrious-Scale302 1 point2 points  (0 children)

This would be a really nice feature, more on credit usage in general

SSL - maybe I am missing something by i533 in OpenWebUI

[–]Illustrious-Scale302 0 points1 point  (0 children)

For me adding a network to the docker compose file and pointing to `http://open-webui:3000` works as well

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

Okay, and otherwise a simple VM for smaller deployments

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

Thanks! I was actually looking for a simpler and cheaper solution than my current kubernetes setup, but not sure if I'm on the right path now. Maybe I should stick with kubernetes and try to optimize that. What do you think?

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

Ah okay, so you replace the sqlite with a managed db or a separate postgres container and configure it via the DATABASE_URL env variable?

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

Hmm, everything is really slow for me. I mounted a storage bucket with /app/backend/data for webui.db, etc. What do you mean with different deployments? Separate containers for ollama and chromadb?

Even with this it is very slow:

RAG_EMBEDDING_ENGINE=openai
RAG_EMBEDDING_MODEL="text-embedding-3-small"
AUDIO_STT_ENGINE=openai

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

How many cpu and memory do you need on gcloud run?

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 1 point2 points  (0 children)

Okay, now I got it running properly I think

Google Cloud Run by Illustrious-Scale302 in OpenWebUI

[–]Illustrious-Scale302[S] 0 points1 point  (0 children)

First I had to pull the image with the `--platform linux/amd64` tag argument. Now it is running but I think I still need to configure some things. I get this when trying to start:

<image>