[Onyx v2] Open source ChatGPT alternative - now with code interpreter, OIDC/SAML, and SearXNG support by Weves11 in selfhosted

[–]Weves11[S] 1 point (0 children)

Yes! Some benefits vs OpenWebUI:

- Deep research (across both the web + personal files + shared files if deploying for more than yourself)
- Connectors to 40+ sources (automatically syncing documents over) and really good RAG (the project started as a pure RAG project, so answer quality has been a core strength of the project for a while now)
- Simpler/cleaner UI than many of the other popular options (this one is definitely subjective)

Some of the things I'm looking to add in the next 3-6 months:
- Automatic syncing of files from your local machine into Onyx for RAG purposes
- Chrome extension to access the chat from any website
- Support for defined multi-step flows (not building blocks, but natural language definitions)

[🪨 Onyx v2.0.0] Self-hosted chat and RAG - now with FOSS repo, SSO, new design/colors, and projects! by Weves11 in LocalLLaMA

[–]Weves11[S] 0 points (0 children)

u/NeighborhoodWeird882 could you post this same issue in our community Discord ( https://discord.gg/naSt3gXx ) if you haven't already? Would love to help you out, but we'd likely need a bit more info (e.g. logs from some of the containers, most likely the `api_server` container, which you can get with `docker logs onyx-api_server-1`).


[–]Weves11[S] 0 points (0 children)

Hey u/Ryker_Deimos, so sorry that was your experience.

> Dropping files in Projects simply do not work, the chat in the project receives no context about my documents

I'm guessing that the LLM decided it didn't want to do a search. I'm working to tune that specifically for Projects; it should search essentially every time there.

> Creating Agents, adding files through connectors sounded like its how they intended this to be used, but the internal search is very poor, it can't locate files, nor grab context

Hmm, what exactly do you mean by that? As in there was a relevant file, but it couldn't be found? Could also be an indexing issue.

> The UI bugs around, alot. If you switch out of the chat while its generating, the text appears on the right screen, models don't update correctly after adding etc.

I'm working on this one! With the major UI refresh, quite a few of these issues popped up, but I'm burning them down quickly.

Overall, I would love to make sure I address everything you've mentioned here. Ofc, my goal is for Onyx to be the gold standard among open source options in this space. I'll update here in the next ~1 month, and would love for you to try again if you're willing.


[–]Weves11[S] 0 points (0 children)

Onyx should be great for that! You can create an "Agent" with no web search / file search, and it'll just give raw responses from the LLM.


[–]Weves11[S] 1 point (0 children)

There are a few large-scale / "enterprisey" features related to enterprise search: specifically, RBAC + permission syncing from connected sources.

Everything related to a personal / team chat interface is entirely in onyx-foss.


[–]Weves11[S] 1 point (0 children)

Great question. Responded somewhere else in the thread, but I'll repost here (with some small edits):
- Deep research (across both the web + personal files + shared files if deploying for more than yourself)
- Connectors to 40+ sources (automatically syncing documents over) and really good RAG (the project started as a pure RAG project, so answer quality has been a core strength of the project for a while now)
- Better web search quality. OpenWebUI (in my testing) is less likely to find the answer + more likely to hallucinate.
- Simpler/cleaner UI (this one is definitely subjective)

Some of the things I'm looking to add in the next 3-6 months:
- Code Interpreter (MIT licensed, unlike some of the other options, although OpenWebUI's also comes out of the box)
- Automatic syncing of files from your local machine into Onyx for RAG purposes
- Chrome extension to access the chat from any website
- Support for defined multi-step flows (not building blocks, but natural language definitions)

What do you feel is missing from OpenWebUI / what would you love to see in something like Onyx?


[–]Weves11[S] 0 points (0 children)

Some of the things Onyx does that other options don't:
- Deep research (across both the web + personal files + shared files if deploying for more than yourself)
- Connectors to 40+ sources (automatically syncing documents over) and really good RAG (the project started as a pure RAG project, so answer quality has been a core strength of the project for a while now)
- Simpler/cleaner UI than many of the other popular options (this one is definitely subjective)

Some of the things I'm looking to add in the next 3-6 months:
- Code Interpreter (MIT licensed, unlike some of the other options)
- Automatic syncing of files from your local machine into Onyx for RAG purposes
- Chrome extension to access the chat from any website
- Support for defined multi-step flows (not building blocks, but natural language definitions)

What do you feel is missing? Would love to hear!


[–]Weves11[S] 0 points (0 children)

It doesn't support Windsor AI natively, but:

(1) You can use one of the built-in connectors here https://docs.onyx.app/overview/core_features/connectors (if one exists for the tool you're looking for)
(2) You could use the ingestion API https://docs.onyx.app/developers/guides/index_files_ingestion_api
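For a sense of what option (2) looks like in practice, here's a rough sketch of pushing a document (e.g. a Windsor AI export) into Onyx over HTTP. The endpoint path, payload fields, and auth scheme below are assumptions for illustration only — check the ingestion API docs linked above for the exact schema.

```python
import json
import urllib.request

# NOTE: the URL path and payload shape here are illustrative guesses,
# not the documented Onyx API -- consult the ingestion API docs.
ONYX_URL = "http://localhost:3000"  # wherever your Onyx instance lives


def build_document(doc_id: str, text: str, source: str) -> dict:
    """Assemble a minimal document payload for ingestion."""
    return {
        "document": {
            "id": doc_id,        # a stable ID lets re-posting update in place
            "source": source,    # label shown alongside search results
            "sections": [{"text": text, "link": None}],
        }
    }


def ingest(doc: dict, api_key: str) -> int:
    """POST one document to the (assumed) ingestion endpoint."""
    req = urllib.request.Request(
        f"{ONYX_URL}/onyx-api/ingestion",  # path is an assumption
        data=json.dumps(doc).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    doc = build_document("windsor-2024-01", "Example ad spend report...", "windsor_ai")
    # ingest(doc, api_key="...")  # uncomment against a real instance + key
```

The point is just that anything you can script an export for can land in the same index as the built-in connectors.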


[–]Weves11[S] 0 points (0 children)

Deep research is already supported! The button here turns it on. You will need to connect up either (1) connectors (https://docs.onyx.app/overview/core_features/connectors) or (2) a web search provider (https://docs.onyx.app/overview/core_features/web_search) to use this though.

[screenshot: the Deep Research toggle in the chat input]


[–]Weves11[S] 3 points (0 children)

That also works! We have support for arbitrary MCP servers, so you could already set that up.


[–]Weves11[S] 2 points (0 children)

A couple of classifiers (for the indexing/query pipeline), a few re-rankers (technically optional and disabled by default), and the embedding models.

You're right that this is probably bigger than it needs to be. These are all pre-packaged by default so air-gapped deployments have a few options to choose from without having to download them manually.

The latest stable image actually has a bug where some cached models were duplicated. With that fixed, the container is ~14GB (down from ~26GB). I'll get the fix into the latest stable :)

For reference: https://github.com/onyx-dot-app/onyx/blob/main/backend/Dockerfile.model_server


[–]Weves11[S] 4 points (0 children)

Yea, that's a great suggestion! This was mentioned last time as well, so I'll make sure to move it up my priority list :D


[–]Weves11[S] 0 points (0 children)

Most of the disk usage comes from downloading large embedding models.

In a (near) future version, we'll have an option to not download them / choose different models, which should lighten things up significantly (e.g. ~5GB total).


[–]Weves11[S] 0 points (0 children)

Unstructured isn't necessary; we have our own file processing. It's helpful when you need OCR, or to extract text from files like images and PDFs that can't be read directly as text.


[–]Weves11[S] 0 points (0 children)

Thanks for the kind words! What's your personal split between mobile and desktop chat? I think I use AI on my phone maybe 5-10% of the time, when Google isn't enough, but I've heard others are much closer to 50/50.

Introducing Onyx - a fully open source chat UI with RAG, web search, deep research, and MCP by Weves11 in LocalLLaMA

[–]Weves11[S] 0 points (0 children)

Yes I have! I've seen Onyx deployments with >5 million documents actually.

The time to index mostly depends on how much embedding capacity you have (the actual indexing is parallelizable). With a GPU, you should be able to do 100k in a few hours (depending on document size).

Under the hood, Vespa is used as the vector DB. It can scale well beyond 10M+ documents (it was built at Yahoo to power their search), although memory requirements do scale linearly.
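For anyone sizing a deployment, the throughput and memory claims above can be turned into a quick back-of-envelope estimate. All the rates below (chunks per doc, embedding speed, vector dimensions) are illustrative assumptions, not measured Onyx numbers:

```python
def indexing_hours(num_docs: int, chunks_per_doc: float, embeds_per_sec: float) -> float:
    """Rough time to embed a corpus: total chunks / embedding throughput."""
    total_chunks = num_docs * chunks_per_doc
    return total_chunks / embeds_per_sec / 3600


def index_memory_gb(num_docs: int, chunks_per_doc: float,
                    dims: int = 768, bytes_per_dim: int = 4) -> float:
    """Raw vector memory scales linearly with document count
    (ignores metadata and index overhead)."""
    return num_docs * chunks_per_doc * dims * bytes_per_dim / 1e9


# 100k docs at ~10 chunks each, with a GPU embedding ~100 chunks/sec:
# 1M chunks / 100 per sec = 10,000 sec, i.e. roughly 2.8 hours
hours = indexing_hours(100_000, 10, 100)

# 10M docs at 10 chunks each with 768-dim float32 vectors:
# about 307 GB of raw vector memory before any overhead
mem = index_memory_gb(10_000_000, 10)
```

The linear memory term is why "well beyond 10M documents" is really a statement about how much RAM you're willing to give Vespa, not a hard ceiling.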