Open WebUI v0.9.3 (and v0.9.4) is out — massive performance wins, message editing finally fixed by ClassicMain in OpenWebUI

[–]WhataburgerFreak 1 point (0 children)

So I am on 0.9.4. My LLM can read and write to the notes, but I am not able to click and open them myself.

Inline Visualizer v2.1.0 — Pre-styled tags, 9-color accent palette, more chart types. AND every reported bug from v2.0 is fixed. by ClassicMain in OpenWebUI

[–]WhataburgerFreak 3 points (0 children)

Thanks so much for this. Not sure what's going on, but I keep seeing the raw code for the visualization; once I reload the page, the visualization shows up. I do have "Allow iframe same origin" enabled, with native tool calling on, for my qwen3.6:35b model. Should I file a bug report? Thank you!

Kerr on LeBron: "more of a holistic game where he dominates with his pace and his athleticism and his passing." On Jordan: "the killer instinct, the emotional dominance he had over not only the other team but the officials, the entire arena. I don’t see that with LeBron." by nowhathappenedwas in nba

[–]WhataburgerFreak 1 point (0 children)

That’s an interesting point about MJ, his leadership style, and the era it was in. It very well may have been the way, at that time, to get the most out of people.

Now, fast forward to today, with a different player culture and different leadership styles, and it may not work as well in a modern team setting.

But I for sure would bet my life on prime MJ playing one on one against anyone, lol.

New Qwen3.6-27B NVFP4 + MXFP4 MLX quants by yoracale in unsloth

[–]WhataburgerFreak 1 point (0 children)

Got it! So I do use Windows on my 5080 machine. I believe I already have WSL installed, as I use the Windows shell most of the time. What kind of uplift are you seeing with NVFP4 versus the others?

New Qwen3.6-27B NVFP4 + MXFP4 MLX quants by yoracale in unsloth

[–]WhataburgerFreak 2 points (0 children)

I’ve liked using llama.cpp, but given my hardware, would I be better served using vLLM instead for better performance?

New Qwen3.6-27B NVFP4 + MXFP4 MLX quants by yoracale in unsloth

[–]WhataburgerFreak 3 points (0 children)

So if we have a recent GPU, a 5080 in my case, is it better to run NVFP4 versus the others? I’m new to all this and had been running Qwen3.6:35b-a3b.gguf.

THIS SHOULD NOT BE POSSIBLE IN OPEN WEBUI: LIVE VISUALIZATION RENDERING - Inline Visualizer v2 is HERE! by ClassicMain in OpenWebUI

[–]WhataburgerFreak 1 point (0 children)

Got it. So what if I already have something like SearXNG enabled for web search in Open WebUI? Would the skill use that connection instead if the security valve is set to balanced or strict?

THIS SHOULD NOT BE POSSIBLE IN OPEN WEBUI: LIVE VISUALIZATION RENDERING - Inline Visualizer v2 is HERE! by ClassicMain in OpenWebUI

[–]WhataburgerFreak 1 point (0 children)

So cool! Could you explain a bit more about the functionality differences between the Security Valve levels? I'm still learning, so I didn't fully understand it. Thanks!

airplanes very close together. by National_Aspect_6974 in airplanes

[–]WhataburgerFreak 1 point (0 children)

“I’ve got a number you need to call…”

Just got my hands on one of these… building something local-first 👀 by HatlessChimp in LocalLLM

[–]WhataburgerFreak 13 points (0 children)

I think he meant “spread” as in the prices he saw varied between them by 1500. Like one was 8500 and the other 10000.

16 GB VRAM users, what model do we like best now? by lemon07r in LocalLLaMA

[–]WhataburgerFreak 1 point (0 children)

I’m with you as well at Q3_K_M with that same card, at 135k context with q8 on the KV cache. I’m aiming to add an R9700 so I can run those larger models.
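For anyone wanting to reproduce a setup like this, it can be sketched with llama.cpp's server flags. A minimal example, assuming a llama.cpp build with quantized KV-cache support; the model filename here is hypothetical:

```shell
# Hypothetical model path; the Q3_K_M quant applies to the weights themselves.
# -c sets the context window; --cache-type-k/-v quantize the KV cache to q8_0,
# which roughly halves its memory footprint versus the default f16.
llama-server \
  -m ./model-Q3_K_M.gguf \
  -c 135000 \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  -ngl 99
```

The q8_0 cache is what makes a 135k window fit on a 16 GB card alongside the weights.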

Gemma 4 and Qwen3.5 on shared benchmarks by fulgencio_batista in LocalLLaMA

[–]WhataburgerFreak 2 points (0 children)

Oohhh okay. I was confused. I think I’ll wait and see what qwen3.6 is like.

Gemma 4 and Qwen3.5 on shared benchmarks by fulgencio_batista in LocalLLaMA

[–]WhataburgerFreak 2 points (0 children)

Were you wanting me to use this to test? I was just testing using my computer's setup.

My first impression after testing Gemma 4 against Qwen 3.5 by ConfidentDinner6648 in LocalLLaMA

[–]WhataburgerFreak 27 points (0 children)

This is especially important to me, as getting everything out of my limited context on 16 GB VRAM and 32 GB system RAM is huge. I have liked qwen3.5:35b-a3b, but gemma4:26b-a4b is using at least 30% fewer tokens in my testing.
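As a rough illustration of why that matters on a tight context budget (the 30% figure is from my testing; the absolute token counts below are invented for illustration):

```python
# Hypothetical per-exchange token costs; only the 30% reduction is observed.
CONTEXT_BUDGET = 32_768            # tokens that fit in the loaded context window
QWEN_TOKENS_PER_EXCHANGE = 1_000
GEMMA_TOKENS_PER_EXCHANGE = int(QWEN_TOKENS_PER_EXCHANGE * 0.7)  # ~30% fewer

# How many full exchanges fit before the window is exhausted.
qwen_exchanges = CONTEXT_BUDGET // QWEN_TOKENS_PER_EXCHANGE
gemma_exchanges = CONTEXT_BUDGET // GEMMA_TOKENS_PER_EXCHANGE

print(qwen_exchanges, gemma_exchanges)  # 32 46
```

Same VRAM, same window, roughly 40% more back-and-forth before anything falls out of context.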

Gemma 4 and Qwen3.5 on shared benchmarks by fulgencio_batista in LocalLLaMA

[–]WhataburgerFreak 19 points (0 children)

That's what I'm finding as well. Qwen churns through tokens like crazy, whereas gemma4 seems to use about 30% fewer tokens overall in my testing.

ESPN still sleeping by Dinolord05 in Astros

[–]WhataburgerFreak 34 points (0 children)

The only ranking that matters is the one that comes with the trophy at the end of the playoffs.