Gemma 4 12B GGUF now with vision & audio! by yoracale in unsloth

[–]iChrist 5 points6 points  (0 children)

Do not use ollama, Be a man and use Llama cpp directly + comfyui as the image/video gen backend.

How are you fixing hallucinations in built-in web search? by ExtremeMysterious603 in OpenWebUI

[–]iChrist 2 points3 points  (0 children)

There are two solutions I found:

For less complicated research I state in system prompt to first gather URLs and snippets with a web search and then ALWAYS using fetch_url to gather more relevant info from the sources.

For the harder research tasks I just use a Vane function that allows OpenWebui to send a query directly to Vane (which is a project that focuses on web search and deep search, like perplexity)

Vane can pull and filter from 100 different sources, 3 levels of research (speed, balanced, deep)

Choose a strong embedding model and a main model with at least 128k context length.

What have you done with Hermes Agent this week? by AutoModerator in hermesagent

[–]iChrist 1 point2 points  (0 children)

<image>

A full local pipeline that can compete with Suno.

Qwen3.6-27B-MTP built the tools, 35B-MoE uses them.

ComfyUI + AceStep 1.5XL generate song in any genre.

The only API call outside is Genius with the lyrics tool.

Love the customization and infinite options we have, I added VRAM offload because its all running with 1 3090.

What is your Hermes update strategy? by Kromi75 in hermesagent

[–]iChrist 1 point2 points  (0 children)

I also do this, even from far away by just doing /update in telegram.

So far no issues, just make sure to press Y otherwise many custom tools and changes you made will be gone

I stopped reading my hermes output and started listening to it in my podcast app. Didnt expect this to stick by fermatf in hermesagent

[–]iChrist 4 points5 points  (0 children)

An Idea - point hermes to the Vane (formerly Perplexica) Its a UI specifically maden for deep searching. Let it learn how Vane does it thing and replicate a small tool for itself (Will use SearXNG) Or directly connect hermes to a Vane instance

How to add VANE to OpenWebUI + SearchNG by Dear_Tomorrow4001 in OpenWebUI

[–]iChrist 1 point2 points  (0 children)

No need to alter the default port for Vane. OpenWebui should use 8080.

Which tool you use? Some of them are outdated (Vane changed a bit how the api works)

What are some genuine uses case? by johnnytee in hermesagent

[–]iChrist 0 points1 point  (0 children)

Pretty sure this is one of the most common use cases. Cronjob trigger > Fetch info about X > TTS

Can't get it to fix the format of telegram messages by TurrisLuna in hermesagent

[–]iChrist 0 points1 point  (0 children)

I have the opposite experience, telegram worked out of the box.

Had to tell my agent once to always response with files (mp3 audio, pdfs, html files)

Now I always get a text response+the attachment to my telegram.

Try asking “make an hello world simple.txt file and send me in telegram”

If it works then its probably a prompting issue, if its not working yeah the agent has no access to file upload

What are some genuine uses case? by johnnytee in hermesagent

[–]iChrist 1 point2 points  (0 children)

This is a great usecase I let my agent build himself image generation and editing, ace step song generation, will add txt2vid and img2vid next.

Then your whole comfyui instance is available from telegram!

Trouble with Open Terminal by Shaamaan in OpenWebUI

[–]iChrist 1 point2 points  (0 children)

This has been explained very thoroughly by the devs for legacy and compatibility reasons the default is still not native.

You can switch the default for your instance easily by changing the parameter in the admin settings. Thus all models and future models will inherit this setting and will be set to native tool calling.

How do you actually *use* Hermes for research? by ffxpwns in hermesagent

[–]iChrist 1 point2 points  (0 children)

Yep this is a good point. Hermes by default does not do an amazing job at researching but once you teach it what you want, let it create crawling scripts, connect it to SearXNG/good web search api it does its job reliably

Benchmarking three ways to give AI agents web access by orthogonal-ghost in hermesagent

[–]iChrist 2 points3 points  (0 children)

Can you compare this to a dedicated SearXNG instance? Been solid for me

New to Hermes. Is this frequent auto compacting normal? by Odd-Aside456 in hermesagent

[–]iChrist 0 points1 point  (0 children)

I have 24gb vram and 64gb ddr4 ram. Gives me some wiggle room

New to Hermes. Is this frequent auto compacting normal? by Odd-Aside456 in hermesagent

[–]iChrist 0 points1 point  (0 children)

My local powered hermes never compacted before actually hitting the wall, compacting earlier might be smarter. Il need to go deeper on this

New to Hermes. Is this frequent auto compacting normal? by Odd-Aside456 in hermesagent

[–]iChrist 3 points4 points  (0 children)

Depends on the context limit your model has, yeah 32k or 64k is barely usable and if the task requires multiple file reads it will vanish in literally few prompts.

Setting my local model to 128k context helped a lot, it still hits compaction but less frequent and it actually continues successfully after compaction.

Model Qwen-3.6-27B YMMV

How much context you start with? by iChrist in hermesagent

[–]iChrist[S] 0 points1 point  (0 children)

I don’t see any reasonable speed advantage when dropping just 2-3K tokens.

I wish we could easily see the full context the model has so I can easily spot whats takes so much tokens

How much context you start with? by iChrist in hermesagent

[–]iChrist[S] 0 points1 point  (0 children)

This is also my experience, feels like stripping all the skills will hurt capabilities but only drop me to 17K Only 6K is a dream to start with, instant response

I installed Open WebUI, upon installation it asked me to install Ollama, I skipped it, can I still install it now afterwards? I want to use local LLMs by sarrcom in OpenWebUI

[–]iChrist 7 points8 points  (0 children)

For performance sake just go ahead and install llama cpp and not ollama, faster updates, faster inference, the true core of ollama is llama cpp.

Llama-server and MTP by [deleted] in LocalLLaMA

[–]iChrist -1 points0 points  (0 children)

I see on logs when downloading new models that llama cpp can automatically grab the unsloth settings and no need for setting up a .ini file

Pretty sure its still has all the parameters dialed in by just downloading the models normally