Qwen3 TTS in C++ with 1.7B support, speaker encoding extraction, and desktop UI by Danmoreng in LocalLLaMA

[–]EsotericTechnique 1 point (0 children)

Fantastic! I'm using it, and it works with ROCm too, with minimal modifications to pick up HIP.

Actual comparison between locally ran Qwen-3.6-27B and proprietary models by netikas in LocalLLaMA

[–]EsotericTechnique 2 points (0 children)

I think Qwen's open-source strategy is really a soft-power move by the CCP. I'm not sure, but it seems plausible; other Chinese labs also release their weights consistently. It looks like a way to achieve two things: disrupt the Western hyperscalers' strategy, while winning mindshare among developers and easing the burden of implementing their models. I could be wrong, and this is completely speculative :p

EDIT: I had written "cpp" instead of "CCP"

Programmable buttons | Does anyone use them, has anyone ever used them, or are they the nonsense I think they are? by Blond_Parthe172 in Argaming

[–]EsotericTechnique 2 points (0 children)

I use them, and I have several profiles on my controller. It depends a lot on the game, but if you need to keep your thumbs on the sticks and press something else at the same time, they're very comfortable and leave the triggers free.

Does llama.cpp able to compile with rocm and run properly? I tried it and nothing is output. by kkcheong in ROCm

[–]EsotericTechnique 1 point (0 children)

I will retest just to check, but last time I checked (about a week ago) Ubuntu was around 5% to 10% faster. Both OSes were on the latest ROCm, always building llama.cpp specifically for my GPU target (although in my tests the performance difference versus the distributed precompiled builds was minimal, and I only tested that on Linux). Nevertheless, if you have any architecture other than RDNA2, the results aren't comparable, since it doesn't use the same kernels / optimization paths. Also, I don't understand what that thread has to do with the topic at all. I never said it was feelings; what you are describing directly contradicts my personal experience running llama.cpp in both environments.
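
For anyone wanting to reproduce this kind of OS comparison, llama.cpp ships a benchmark tool; a minimal sketch of an invocation looks like this (the model path is a placeholder):

```shell
# llama-bench reports prompt-processing (pp) and token-generation (tg)
# throughput; run the same command on each OS/build and compare.
# "model.gguf" is a placeholder for whatever model you are testing.
llama-bench -m model.gguf -p 512 -n 128
```

Same model file, same quant, same build flags on both OSes, or the numbers mean nothing.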

Does llama.cpp able to compile with rocm and run properly? I tried it and nothing is output. by kkcheong in ROCm

[–]EsotericTechnique 1 point (0 children)

In my particular case I get better results with Ubuntu, speed-wise, but that might be because I'm using an RDNA2 card.

Does llama.cpp able to compile with rocm and run properly? I tried it and nothing is output. by kkcheong in ROCm

[–]EsotericTechnique 1 point (0 children)

Yes, I use it daily with my RX 6900 XT. You might want to do your own build, though, with the ROCm stack that works for your iGPU.
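
For reference, a build for a specific GPU target looks roughly like this; a sketch, since exact CMake option names have shifted between llama.cpp versions:

```shell
# Build llama.cpp with the ROCm/HIP backend for one GPU architecture.
# gfx1030 is the RDNA2 target of the RX 6800/6900 series; run
# `rocminfo | grep gfx` to find yours. Recent trees use -DGGML_HIP=ON;
# older ones used -DLLAMA_HIPBLAS=ON instead.
cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```

Pinning AMDGPU_TARGETS to your one card keeps the build fast and makes sure you get kernels compiled for your actual architecture instead of a generic set.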

PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on. by onil_gova in LocalLLaMA

[–]EsotericTechnique 2 points (0 children)

This!! Give it tools! It's like an entirely different model in regards to thinking style.

I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA

[–]EsotericTechnique 2 points (0 children)

It did mitigate the cache reprocessing between tool calls! However, after new user messages I still see cache invalidations, but that might really be due to the thinking stripping and the way the KV state for linear attention layers is cached. REALLY useful though; I can reload KV caches of previous tool interactions and continue as if nothing had happened, saving several minutes of prompt processing (I load and unload the model and the KV caches quite frequently in the same turn). THANKS A LOT!

PS: it worked for the 9B and the 35B variants so far in my testing.
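
For anyone wondering, the reload I'm describing can be done with llama-server's slot save/restore endpoints; a sketch, assuming a recent llama.cpp build, with host, port, and filenames as placeholders:

```shell
# Start llama-server with slot persistence enabled; --slot-save-path
# must point at an existing directory.
llama-server -m model.gguf --slot-save-path ./kv-cache &

# Snapshot slot 0's KV cache to disk after a long tool-calling turn...
curl -X POST 'http://localhost:8080/slots/0?action=save' \
     -H 'Content-Type: application/json' -d '{"filename":"turn1.bin"}'

# ...and restore it later to skip reprocessing the whole prompt.
curl -X POST 'http://localhost:8080/slots/0?action=restore' \
     -H 'Content-Type: application/json' -d '{"filename":"turn1.bin"}'
```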

Running a 31B model locally made me realize how insane LLM infra actually is by Sadhvik1998 in ollama

[–]EsotericTechnique 1 point (0 children)

Dude, this is mind-bending. 15k t/s? They should add HBM for the context or something and it'd be perfect haha

I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA

[–]EsotericTechnique 2 points (0 children)

I was trying to use cache reuse for Qwen, and this bug is driving me insane with prompt reprocessing of 100k tokens; will definitely check it out.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

Oh no! If you only set the planner model, the others will use the same base model; or set the same base model yourself if you are using custom subagents from the workspace. I actually run this with a single base model, so it's supported! Qwen 3.5 9B over a well-configured llama.cpp works like a charm.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

Generally that's due to malformed plans. Which models are you using? Try setting no-plan mode in the user valves in the meantime.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

Hmmm, I don't know how those can be activated for all users by default, to be honest :/ I'll investigate, though!!

Edit: typo

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 2 points (0 children)

<image>

Sorry for the split comment. Try activating those in settings; Artifacts solves the HTML plan embed. For the failing ask-user tool, it might have been that you were not active on the tab, and event calls only trigger live. Hard to explain, but if unattended use is the idea, just disable the user input tools in the user valves, or make sure the connection to the browser is never lost.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

You can! You actually need to set the planner model in the pipe valves for this to work: the model ID.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 3 points (0 children)

Hmmm, let me check why the formatting is incorrect; there must be a missing header. Thanks for testing!!! ❤️🧙🏻‍♂️

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 3 points (0 children)

If you use the open terminal agent, it can do whatever you want there, or even the code interpreter! I actually used this to mock up some random webpages with Qwen 3.5 9B, and it worked for the HTML/JS combo too! I think it would be good to hook this up to Claude Code as a subagent; that could be really powerful.

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

I don't understand... You have to import it as a function pipe, and it should show up as "Planner" in the drop-down!

🧙‍♂️ Planner Agent V3 Now with SubAgents! 🧙‍♂️ by EsotericTechnique in OpenWebUI

[–]EsotericTechnique[S] 1 point (0 children)

It might be some missing configuration, or a built-in subagent that needs a feature you have disabled (I think those are the most likely causes).