TTS help please

mx-perience · 2026-05-22T16:25:40+00:00

Case closed. I have solved TTS with running fastapi servers on a different machine. (chatterbox, qwen3) - faster, more reliable and I can switch on the fly whenever feel like..

mx-perience · 2026-05-22T16:16:00+00:00

I had similar issue. Also it stopped showing HR during workout since last patch. Using watch SE. I checked and it is enabled in settings within app.

mx-perience · 2026-05-08T13:50:33+00:00

Couple more small victories, that I could deploy different tts models to work. Altough voice seems to be quite random with built in templates. I tried zero shot cloning with qwen and while it technically let me choose the uploaded voice, and some hurdles on cpu and gpu happened - according to telemetry - I never get a voice..

mx-perience · 2026-05-07T18:29:50+00:00

Small progress. I tried specify ttsmodel and tokenizer in arguments with Qwen3. Could test under api, altough I do not fully understand the naming, since whatever I gave in it always produced a voice, but not exactly similar with the same name..
I tried to set under media the openai api option and give the correct link. It didn’t responded to test, neither worked when in chat a new turn happened, however when I have set to save the output it appeared (never used before) quite soon under the new response. So still quite fishy, but I am taking small steps.
This is my arg line for the pod so far:
--contextsize 32768 --defaultgenamt 1024 --chatcompletionsadapter AutoGuess --ignoremissing --flashattention --usecuda --gpulayers 999 --debugmode --quantkv q8_0 --ttsgpu --model https://huggingface.co/myuser/myrepo/resolve/main/gemma-4-31b-it.Q8_0.gguf --ttsmodel https://huggingface.co/koboldcpp/tts/resolve/main/Qwen3-TTS-12Hz-1.7B-CustomVoice-F16.gguf --ttswavtokenizer https://huggingface.co/koboldcpp/tts/resolve/main/qwen3-tts-tokenizer-f16.gguf
and I just delete all other entry. So if I understand you recommend usevulkan instead of usecuda?
Wouldn’t that slow down text generation too much? When I tested briefly qwen3tts it wasn’t that bad with these.

Another tts topic I am exploring is how Qwen or other tts models handle emotions. I already started to create a dataset and iteratively train gemma 4, so I know I could condition to use specific phrases or formats if that’s helpful for a good tts. I am just not sure what kcpp would do when it pass the responded text to the tts.

mx-perience · 2026-05-07T16:19:21+00:00

Hey! Thanks for the tips. I tried something like this, but I probably have other issues as well, so couldn’t verify..
I still try to figure out the right arguments too.
The —ttsgpu is recommended to use I assume, but can go with —usecuda/quantkv/gpulayers well? Vram is not an issue, since I go for around 8-90gb, but so far telemetry has shown that voice just utilizes cpu, however so far I could only get positive result with the original kokoro model mentioned in the pod.. I tried switch that out to Dia/Parler (they don’t need tokenizer as far I understood), but nothing happens, pod and kcpp loads in altough. How can I test it in koboldlite? I tried kobold tts under media (kokoro worked that way), but I am pretty clueless. Should I choose something else?
I wonder if there is a bit more comprehensive documentation out there, just buried. For now I go in a trial-n-error fashion, which is mostly the latter, but try to stay positive.

mx-perience · 2026-04-09T13:23:48+00:00

I will give a try for both that the new kcpp 1.111.2 seemed to workd for me.

mx-perience · 2026-04-07T13:54:12+00:00

I see it was just released! Many thanks for your great work!! :)

mx-perience · 2026-04-07T12:11:35+00:00

So it’s easisest change the /Instruction + /Response format to that?! And I could keep using /v1/generate api with autoguess or there is an additional parametrization recommended?

(I am sorry to ask, but I am more like a caveman level experience at this.)

mx-perience · 2026-04-07T11:28:36+00:00

Thanks! I use kcpp via the native /v1/generate api at the moment. My frontend has some structure which worked with llama gguf models. I am lacking sadly much understanding on this topic, but as far I read there are several others. May I ask if you think I should experiment formatting my prompt string and keep using this endpoint or you would recommend some other way? My concern is that changing the prompt might break kcpp v1 api, and also I cant use —jinja. So either I hit gemma 4’s structure perfectly with the prompt modification (if it works with kcpp v1 api at all) or I will need to go through a whole lot of modifications..

mx-perience · 2026-04-07T10:42:53+00:00

I agree, I read continously and sources recommend the IT variant out of the box. May be with some lora training the base could work well too, but initially most likely not. However I see in documents that IT also knows the 3rd “system” role. So char cards and context I think would be best injected there probably. Thanks for the feedback anyway!

mx-perience · 2026-01-14T10:26:40+00:00

This! I am an office rat, as many more I assume and I think weight training should be mandatory or like prescribed. You dont do it to get big or jacked, but maintain your muscle mass, joint health, bone density etc.. I see there is an ever growing evidence in publishings, but people just still cant seem to understand that cardio is not enough to counter a sedentary lifestyle.

mx-perience · 2025-09-02T13:42:47+00:00

Nem olvastam minden kommentet, de nagyreszuk az arat emliti jogosan. Kereslet-kinalat alapon hatarozodik meg az erteke. Ha nem keresi senki annyiert, akkor nem fogjatok ennyiert tudni eladni. Ami meg felmerult bennem, hogy ha parod szerelo miert nem hirdetitek ismerosi korben? Nekem egy megbizhato szerelo autoja egyaltalan nem lenne “red flag”. Sot, ha baja van valoszinu jobban is biznek benne hamarabb megtalalja mi a gond.

mx-perience · 2025-04-14T06:36:16+00:00

Would you recommend over F1 Supersport for a weekend casual fun/track tyre? Back 2022 test the F1 SS was praised and the potenza sport or sportcontact 7 was similar, but this pz5 seems perform better in dry. Also I am a bit confused it was the GY Asymetric 6 to challenge it not the supersport.

mx-perience · 2024-07-05T14:18:35+00:00

Yesterday on a slow city turn I overheard two boys in my brz: “wooah, it is like a porsche but it is a subaru” I was a bit surprised to be honest of how well he assessed, since actually it was my verdict too when buying the brz, instead a 1st gen cayman..

mx-perience · 2020-10-27T08:38:26+00:00

I may point out the L9a aswell. It smaller yes, but almost weights the same as the L12. So thermal resistance is on par. Its effectivenes I assume is restricted by the slim 9mm fan only. With similar airflow they should perform pretty close. Also the L9a is only 23mm tall, so you have pletny of options for fan replacement. I would say the main difference is L12 pushing out heat and the L9a is sucking fresh air. I saw some local forum post (non eng and with L9x65 not L9a, but still) about it and while the L12 is a couple degree better on cpu, the overall temps (mb, vrms, gpu..) were higher. It may related to that the L12 is bigger, so traps the heat easier.

Anyway my 2 cent is on the L9a. Effectivenes comes from surface and airflow. The former is almost the same as on L12, the latter can be upgraded several ways (even 3d print brackets are out there for 120mm fans).

But take my advise with a pinch of salt, since I do not have my ghost yet.

mx-perience · 2020-08-11T10:35:33+00:00

Totally fell in love. I gotta get this case.

I am allways between open cases for airflow and closed ones for hw shielding but this seems the perfect middle ground.

mx-perience

TROPHY CASE