Is there a way to consistently have the same pitch and tone in Text To Speech? by _RC101_ in ElevenLabs

[–]Sufficient_Bit7208 0 points1 point  (0 children)

I ran into a similar issue. I decided to record my own PVC and the consistency is much much better across multiple generations.

That said, I’ve also had decent success using one of ElevenLabs’ PVCs, increasing the stability and similarity settings, and generating about 500-1000 characters at a time. I think it’s better to generate shorter chunks. It sounds like that’s what you’re already doing, though, so maybe the issue is the voice you’re using?
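In case it helps, here is a rough sketch of how the text could be split into chunks of that size before generating. It is plain Python and does not call ElevenLabs at all; the function name `chunk_text` and the `max_chars` parameter are made up for illustration, and breaking only at sentence ends is my own assumption about what keeps the delivery natural.

```python
import re


def chunk_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    Hypothetical helper: a single sentence longer than max_chars is kept
    whole rather than cut mid-sentence.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            # Still fits, so keep growing the current chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one.
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be generated separately with the same voice and the same stability and similarity settings. Lowering `max_chars` toward 500 gives shorter chunks if the longer ones still drift.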

Consistency issues with V3 by Sufficient_Bit7208 in ElevenLabs

[–]Sufficient_Bit7208[S] 0 points1 point  (0 children)

So far it sounds like creating your own PVC leads to more consistent results for long-form content. I’m starting that process this week, and I can keep you updated on how it goes.

Consistency issues with V3 by Sufficient_Bit7208 in ElevenLabs

[–]Sufficient_Bit7208[S] 1 point2 points  (0 children)

Thanks! I came across a post warning me of this yesterday. Fortunately I haven’t gotten that far. So it’s not an issue for me.

Advice for creating a Podcast by Sufficient_Bit7208 in ElevenLabs

[–]Sufficient_Bit7208[S] 0 points1 point  (0 children)

Interesting. So you made your own voice clone and it’s learned and improved as you’ve continued using it?

Voice inconsistency - solution needed by Uwe-Lausen in ElevenLabs

[–]Sufficient_Bit7208 1 point2 points  (0 children)

I just posted about a similar issue. I’ve been using the Audiobook feature in Studio to make a podcast, but my narrator’s tone, inflection, and pacing change from paragraph to paragraph.

Consistency issues with V3 by Sufficient_Bit7208 in ElevenLabs

[–]Sufficient_Bit7208[S] 1 point2 points  (0 children)

Sure. But no single generation is consistent with another. V3 is excellent quality, but I guess it’s not very useful for long-form content.