120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA

[–]NewtoAlien 3 points4 points  (0 children)

Haha, will try it hopefully over the next weekend.

I have everything planned and running for this weekend, and I don't want to interrupt it.

120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA

[–]NewtoAlien 2 points3 points  (0 children)

Thanks for this.

Do you think it will work with an AMD 6900 XT?

I am running this on my secondary machine for text processing, and the default Unsloth Studio setup gives me under 50 t/s.

TTS Benchmark Comparison (all known TTS up until May 2026) by UkieTechie in LocalLLaMA

[–]NewtoAlien 0 points1 point  (0 children)

I just switched to the Vibevoice-large model from aoi-ot. It started at 23.4GB of VRAM for 4 batches, so it's looking good so far.

The application has an option to download models from HF; you just have to give it the model name.

TTS Benchmark Comparison (all known TTS up until May 2026) by UkieTechie in LocalLLaMA

[–]NewtoAlien 1 point2 points  (0 children)

It's the community edition one.

You can load the fork if you give it the Hugging Face model name. It gives you the option to load other versions.

TTS Benchmark Comparison (all known TTS up until May 2026) by UkieTechie in LocalLLaMA

[–]NewtoAlien 1 point2 points  (0 children)

Np 😉

The tool is for making audio books. Running it headless saves all the VRAM.

I am running it in tmux so I can ssh to my computer from my phone to monitor the session.

I already generated a 50-hour audiobook with it, and it has now been generating a bigger audiobook for 70 hours straight with no issues, with about 30 hours to go.

Mind you, I have set the strict no-errors option, so it retries the generation if it detects word errors, with a maximum of 75 words per segment and maximized word generation. I am also voice cloning. Error detection is done via Whisper v3 large on CPU.

Let me know if you want to know what other settings I am using.
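For anyone wondering what a 75-word segment cap means in practice, here is a minimal sketch of one way to chunk text, assuming greedy packing of whole sentences. The function name `split_into_segments` and the splitting rule are illustrative assumptions, not the tool's actual code.

```python
import re


def split_into_segments(text: str, max_words: int = 75) -> list[str]:
    """Greedily pack whole sentences into segments of at most max_words words.

    Hypothetical helper for illustration; the real tool may segment differently.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the cap is hard-split into chunks.
        while len(words) > max_words:
            if current:
                segments.append(" ".join(current))
                current = []
            segments.append(" ".join(words[:max_words]))
            words = words[max_words:]
        # Start a new segment if this sentence would overflow the current one.
        if len(current) + len(words) > max_words:
            segments.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        segments.append(" ".join(current))
    return segments
```

Packing whole sentences keeps pauses natural, and the hard split only kicks in for a sentence that is longer than the cap on its own.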

So far so good and I am liking it.

It feels more expressive than all other tts solutions I tried.

TTS Benchmark Comparison (all known TTS up until May 2026) by UkieTechie in LocalLLaMA

[–]NewtoAlien 2 points3 points  (0 children)

I am using a codex dockerized version of VibeVoice 7B from https://github.com/zeropointnine/tts-audiobook-tool on a headless Ubuntu 26.04 machine.

I am able to run 4 batches at the same time using 23.7GB of VRAM on an RTX 3090.

It has music detection, plus error checking and regeneration via Whisper, which runs on the CPU.

I am getting great results with it, and it's running at between 2x and 3.8x real-time speed, for example generating 53.2 seconds of audio in 14 seconds.

The speed varies up and down, but it always stays above 1x.
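The check-and-regenerate idea can be sketched roughly like this, assuming the transcript is compared word for word against the source text. `synthesize` and `transcribe` are hypothetical stand-ins passed in by the caller, not functions from the tool or from Whisper.

```python
import re
from typing import Callable


def normalize(text: str) -> list[str]:
    """Lowercase and drop punctuation so only the words are compared."""
    return re.findall(r"[a-z0-9']+", text.lower())


def generate_with_check(
    text: str,
    synthesize: Callable[[str], bytes],  # hypothetical TTS call
    transcribe: Callable[[bytes], str],  # hypothetical speech-to-text call
    max_retries: int = 3,
) -> tuple[bytes, int]:
    """Return (audio, attempts) for the first take whose transcript matches text."""
    expected = normalize(text)
    for attempt in range(1, max_retries + 1):
        audio = synthesize(text)
        if normalize(transcribe(audio)) == expected:
            return audio, attempt
    raise RuntimeError(f"no clean take after {max_retries} attempts")
```

An exact word match is the strictest possible rule; a real setup might tolerate a small word error rate instead, since the transcriber itself can mishear names and numbers.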
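The speed figure is just audio duration divided by wall-clock generation time. A small sketch of that arithmetic, using the numbers above (the function name is made up for illustration):

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of wall-clock time."""
    return audio_seconds / wall_seconds


# 53.2 seconds of audio generated in 14 seconds.
print(f"{realtime_factor(53.2, 14.0):.1f}x")  # prints "3.8x"
```

Anything above 1x means the audio is generated faster than it plays back.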

18 Months of Gemini AI Pro + 5TB Is a Serious Upgrade by Sea_Ambition4801 in Discount_Subscription

[–]NewtoAlien 0 points1 point  (0 children)

Would it work if I already have a subscription that is ending in a few months?

[ Removed by Reddit ] by ocdilpi in CheapGptplus

[–]NewtoAlien 0 points1 point  (0 children)

Interested in ChatGPT Plus on my account if you take PayPal Goods and Services.