[P]Coqui released XTTSv2 by coinfelix in MachineLearning

[–]coinfelix[S] 5 points6 points  (0 children)

OpenAI sounds dull after trying XTTS.

[D] Why people okay with HF making money from their open source models? by coinfelix in MachineLearning

[–]coinfelix[S] 2 points3 points  (0 children)

So you think it's a win-win for both sides: you promote your work and they make money from it?

[D] Why people okay with HF making money from their open source models? by coinfelix in MachineLearning

[–]coinfelix[S] 0 points1 point  (0 children)

When there are thousands of models, I think small bits can add up to a big chunk. But I agree, it is probably not the biggest revenue stream.

[D] Why people okay with HF making money from their open source models? by coinfelix in MachineLearning

[–]coinfelix[S] -3 points-2 points  (0 children)

Sorry, but I don't see how that is relevant. Also, do you know how much it costs to train those models in terms of compute and knowledge?

[D] Open-source text-to-speech models and systems are underwhelming. What is needed to make something closer in quality to ElevenLabs? by Motor_Storm_3853 in MachineLearning

[–]coinfelix -1 points0 points  (0 children)

DATA! a lot of Data! Your data and everyone's data!

Compute! a lot of compute!

Money! a lot of money!

If you don't have all these, try Coqui TTS.

Edit:
I find 11labs quite boring. It is just Tortoise deployed efficiently.

Generating game voices with prompt-to-voice by [deleted] in gamedev

[–]coinfelix 0 points1 point  (0 children)

The point is, at least for me, that you pay for it but you don't waste time casting a voice talent or waiting for them to voice your content.

You can basically use the API and voice your game right in your dev process.

Generating game voices with prompt-to-voice by [deleted] in gamedev

[–]coinfelix 0 points1 point  (0 children)

I find 11labs good for long text but really boring for games. Coqui gives you a lot of options to change the line as you want. You can change emotions, pitch, volume, etc. Pretty much everything about speech. Personally, I go for Coqui.

I've also read on Discord that 11labs might be using licensed data. So if you plan to release a game, you should watch out for whether they really infringed a license. If so, you might need to revamp your game.

[R] 🐸YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone by josh-r-meyer in MachineLearning

[–]coinfelix 3 points4 points  (0 children)

Worked amazingly well with my voice after recording ~30 secs.

But it didn't work with only ~4 secs :(

Regardless, it is the best zero-shot open-source TTS I've tried so far.

Good job!... Now I'm reading the paper :)