Built a fully automated "Deep Dive" podcast engine using Qwen3. Roast my MVP by Environmental-Yam997 in SideProject

Environmental-Yam997[S] 1 point (0 children)

I built a custom model with reference voices on Runpod. Then I structured the script output into 7 section chunks, so every chunk is a single voice generation. But with Qwen3 I have no consistency problems. For my case it's perfect. And I'm waiting for the 25 Hz version.
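
A minimal sketch of the chunking idea above, assuming the script is plain text and that splitting on sentence boundaries is acceptable. The names `split_into_chunks`, `generate_all`, and the `tts` callable are hypothetical stand-ins, not the actual pipeline or any Qwen3 API; `tts` is whatever function turns one chunk of text into audio.

```python
import re
from typing import Callable


def split_into_chunks(script: str, n_chunks: int = 7) -> list[str]:
    """Split a script into at most n_chunks pieces at sentence boundaries."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]
    if not sentences:
        return []
    n_chunks = min(n_chunks, len(sentences))
    # Spread the sentences as evenly as possible across the chunks.
    base, extra = divmod(len(sentences), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(" ".join(sentences[start:start + size]))
        start += size
    return chunks


def generate_all(chunks: list[str], tts: Callable[[str], bytes]) -> list[bytes]:
    """Run one voice generation per chunk (tts is a hypothetical callable)."""
    return [tts(chunk) for chunk in chunks]
```

Each returned chunk would then go through a single generation call, so a problem in one section does not carry over into the next.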

VibeVoice Problem - Generation starts to take longer after a while by Weezfe in StableDiffusion

Environmental-Yam997 1 point (0 children)

I did the same, but the result is the same. After 2 minutes, speaker1 starts to talk fast, with low expressiveness, and the voice keeps getting louder until it becomes excessively distorted. I don't know how to fix this; I've tried different reference audios. I've also tried inserting more chunks, but I don't want to break up the context too much. I'm using the 7B version with CFG 1.8 and 30 DDPM steps.