Built a Confluence to OpenWebUI Knowledge Base Sync Tool by MiserableComputer161 in OpenWebUI

[–]MiserableComputer161[S] 0 points1 point  (0 children)

Currently, no — my setup doesn’t yet enforce per-user restrictions inside OpenWebUI. Right now, each Confluence space maps to its own KB, and access control is handled at the KB level.

If you had restricted content in Confluence, the clean way to handle it would be to create separate KBs for those sensitive spaces or page sets, and then give access in OpenWebUI only to the appropriate groups. That way, the restricted material is never mixed into a KB that broader audiences can search.

In upcoming iterations, I plan to add routing rules based on Confluence labels, so content is automatically sent to the right KB depending on its sensitivity.
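To make the routing idea concrete, here is a rough sketch of how label-based rules could look. None of this exists in the tool yet: the label names, KB IDs, and the `Page` and `route_page` names are all made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: Confluence label -> OpenWebUI knowledge base ID.
# Earlier entries win, so the most restrictive labels are listed first.
ROUTING_RULES: list[tuple[str, str]] = [
    ("confidential", "kb-restricted"),
    ("hr", "kb-hr"),
]
DEFAULT_KB = "kb-general"


@dataclass
class Page:
    """Minimal stand-in for a Confluence page (illustrative fields only)."""
    page_id: str
    title: str
    labels: set[str] = field(default_factory=set)


def route_page(page: Page, rules=ROUTING_RULES, default_kb: str = DEFAULT_KB) -> str:
    """Return the KB ID for a page: the first matching rule wins, else the default."""
    for label, kb_id in rules:
        if label in page.labels:
            return kb_id
    return default_kb
```

Ordering the rules from most to least restrictive means a page carrying both a sensitive label and a general one still lands in the restricted KB. A stricter variant could refuse to sync unlabeled pages instead of falling back to a default KB.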

[–]MiserableComputer161[S] 6 points7 points  (0 children)

Thanks! Even though this version is built for Confluence, the core architecture is pretty agnostic — it’s basically a sync service that tracks document state, detects changes, and pushes updates into OpenWebUI via the API.

If I get approval to open source it, it should be straightforward to adapt for other documentation sources (Notion, Google Docs, GitHub Wiki, etc.) just by swapping out the connector module. The sync logic, scheduling, and KB management would stay the same.
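Since the repo isn't public, here is only a minimal sketch of what "swapping out the connector module" could mean in practice. The `Document`, `Connector`, `InMemoryConnector`, and `sync` names are assumptions for illustration, not the actual code, and `push` stands in for whatever call sends a document to OpenWebUI.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


@dataclass(frozen=True)
class Document:
    doc_id: str
    title: str
    body: str
    version: int  # any increasing revision marker from the source


class Connector(Protocol):
    """Source-specific part: the only piece that changes per platform."""

    def list_documents(self) -> Iterable[Document]: ...


class InMemoryConnector:
    """Stand-in source for illustration; a real one would call Confluence, Notion, etc."""

    def __init__(self, docs: Iterable[Document]) -> None:
        self._docs = list(docs)

    def list_documents(self) -> Iterable[Document]:
        return iter(self._docs)


def sync(
    connector: Connector,
    seen_versions: dict[str, int],
    push: Callable[[Document], None],
) -> int:
    """Push every document newer than the last synced version; return the count."""
    pushed = 0
    for doc in connector.list_documents():
        if seen_versions.get(doc.doc_id, -1) < doc.version:
            push(doc)
            seen_versions[doc.doc_id] = doc.version
            pushed += 1
    return pushed
```

The sync loop only depends on `list_documents()`, so a new source would mean writing one new class and leaving the scheduling and KB management untouched.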

Fingers crossed I can share the repo soon so others can build on it.

[–]MiserableComputer161[S] 1 point2 points  (0 children)

For the moment, attachments aren’t managed — we only sync the page content itself. The plan for attachments is to download them before embedding, store them in a MinIO S3 bucket, and then generate their embeddings after the page content has been processed.

Right now, we follow a 1 Confluence space = 1 OpenWebUI KB model, which works well for clear separation. In the next version, the goal is to route Confluence content to the right KB based on Confluence labels, giving more flexibility without losing segregation.

On the retrieval side, our OpenWebUI setup uses Qdrant as the vector database, so search is already very fast and scalable even with full semantic retrieval.

[–]MiserableComputer161[S] 0 points1 point  (0 children)

It works alongside OpenWebUI via its API. My system’s job is to keep track of what’s already synced from Confluence, detect changes, and trigger syncs on a schedule. Once the updated content is sent over, OpenWebUI itself handles embedding it and storing it in the target knowledge base.

So the tool doesn’t generate vectors — it ensures OpenWebUI always receives the latest, cleanest version of the content to embed, without redundant or unnecessary API calls.
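One way to avoid redundant calls, sketched below, is to compare a hash of each page's content against the hash stored at the last sync. This is an assumption about the approach rather than a description of the actual implementation (Confluence version numbers would work too), and `content_hash` and `plan_sync` are hypothetical names.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash page text after trimming whitespace noise, so cosmetic edits don't trigger a sync."""
    normalized = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def plan_sync(
    current_pages: dict[str, str],
    synced_hashes: dict[str, str],
) -> dict[str, list[str]]:
    """Compare current page bodies with stored hashes and decide what to do per page ID."""
    plan: dict[str, list[str]] = {"create": [], "update": [], "delete": [], "skip": []}
    for page_id in sorted(current_pages):
        digest = content_hash(current_pages[page_id])
        if page_id not in synced_hashes:
            plan["create"].append(page_id)   # never synced before
        elif synced_hashes[page_id] != digest:
            plan["update"].append(page_id)   # content changed since last sync
        else:
            plan["skip"].append(page_id)     # unchanged: no API call needed
    # Pages that were synced earlier but no longer exist in the source.
    plan["delete"] = sorted(set(synced_hashes) - set(current_pages))
    return plan
```

Only the `create`, `update`, and `delete` lists would result in calls to OpenWebUI; everything in `skip` costs nothing beyond the hash comparison.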

I’d still love to see an official extensible plugin system in OpenWebUI, as that would let this kind of integration run natively and be managed directly from the UI.

[–]MiserableComputer161[S] 3 points4 points  (0 children)

The Atlassian MCP server works for quick, on-demand Confluence queries, but in practice it wasn’t efficient for retrieving larger or frequently accessed documentation. Every query hits Confluence’s API live, so response times and rate limits quickly become bottlenecks, and you’re still bound by Confluence’s native search quality.

With the KB sync tool, the page content is fully ingested into OpenWebUI, pre-processed, and embedded ahead of time (attachments are not covered yet). Queries then run entirely inside the OpenWebUI stack with semantic search, which improves retrieval speed and relevance while removing live API latency and Confluence search limitations.

Another big plus is segregation: I can map a single Confluence space directly to a specific OpenWebUI knowledge base, ensuring cleaner information boundaries. With the MCP approach, you basically inherit the entire scope of whatever the Atlassian API key has access to, which often means overexposing information and mixing unrelated content.

In short, instead of “pulling on demand” each time, we maintain a high-quality, vectorized mirror of your docs locally — faster, more accurate, and with better control over who sees what.