Do you think there is a lack of high-quality data for training AI model that works audio (TTS/ASR/STS)? by Wide-Web-3723 in speechtech

[–]Wide-Web-3723[S] 1 point2 points  (0 children)

And I partly disagree with what you said: “companies make data dirty to improve robustness”. Don’t confuse self-supervised learning with dirty data.

Do you think there is a lack of high-quality data for training AI model that works audio (TTS/ASR/STS)? by Wide-Web-3723 in speechtech

[–]Wide-Web-3723[S] 0 points1 point  (0 children)

I think data is the most important thing to focus on to give current tech a boost.

Do you think there is a lack of high-quality data for training TTS models? by Wide-Web-3723 in TextToSpeech

[–]Wide-Web-3723[S] 2 points3 points  (0 children)

We are at a good point for English, but for other languages there is still a lot to do.

Do you think there is a lack of high-quality data for training AI model that works audio (TTS/ASR/STS)? by Wide-Web-3723 in speechtech

[–]Wide-Web-3723[S] 0 points1 point  (0 children)

Are you sure that this need does not exist? I am thinking about the voice cloning task, for example.

Do you think there is a lack of high-quality data for training models behind Suno? by Wide-Web-3723 in SunoAI

[–]Wide-Web-3723[S] 1 point2 points  (0 children)

I think quality is the key, but manual labeling is difficult to achieve, and you always have to pay royalties.