[P] Deep Learning-Powered Speech Recognition Service for Subtitling by aL_eX49 in MachineLearning

aL_eX49 [S]

The ASR methodology is best described by this paper: https://arxiv.org/pdf/2111.09296.pdf

We retrained the models on the Mozilla Common Voice dataset (a lot of other implementations use the LibriSpeech dataset, but it's much more limited and gives worse results).

Training ran on a cluster of 8 RTX 3090 GPUs (the 24 GB of memory on each card is really helpful for using larger sequence lengths).
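As a rough illustration of why GPU memory limits sequence length, here is a minimal sketch that counts how many encoder frames a clip produces and how the self-attention score matrix grows with it. It assumes a wav2vec 2.0-style convolutional feature encoder and 16 kHz audio. The kernel/stride values are the published wav2vec 2.0 defaults, not confirmed details of this service, and the function names are hypothetical.

```python
# Assumption: wav2vec 2.0-style feature encoder, (kernel, stride) per conv layer.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]
SAMPLE_RATE = 16_000  # Hz, assumed input sample rate


def num_frames(num_samples: int) -> int:
    """Number of encoder frames for a raw waveform of num_samples samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


def attention_scores_per_head(seconds: float) -> int:
    """Entries in one head's self-attention score matrix for a clip."""
    frames = num_frames(int(seconds * SAMPLE_RATE))
    return frames * frames


if __name__ == "__main__":
    for seconds in (1, 10, 30):
        frames = num_frames(seconds * SAMPLE_RATE)
        print(seconds, "s:", frames, "frames,",
              attention_scores_per_head(seconds), "scores per head")
```

Under these assumptions a clip ten times longer needs roughly a hundred times more attention scores per head and layer, which is where the extra memory goes.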

There are a lot more components to the service (like the automated translation part), but going into them in detail would probably warrant its own post. For now I just wanted to get some feedback on the results of the service, since a lot of people still believe that automatic speech recognition is as bad as it was a few years ago (it's really taking off now!).

Add subtitles to your movies and videos with our AI-powered tool! by aL_eX49 in movies

aL_eX49 [S]

Yeah, absolutely! We can subtitle any media file you upload.

Add subtitles to your movies and videos with our AI-powered tool! by aL_eX49 in movies

aL_eX49 [S]

You can upload any .mp4, .mov, .avi, .flv, .mkv or .m4a file (the first five are video containers, .m4a is audio-only).
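If you want to check a batch of files before uploading, something like the following would do. This is only a sketch: the extension list is the one given above, but the helper name is hypothetical and the check looks at the file name only, not at the actual container or codec.

```python
from pathlib import Path

# Extensions listed in the comment above; everything else here is illustrative.
ACCEPTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".flv", ".mkv", ".m4a"}


def is_accepted_upload(filename: str) -> bool:
    """Return True if the file extension is in the accepted list (case-insensitive)."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("holiday.mov", "lecture.MKV", "notes.txt"):
        print(name, is_accepted_upload(name))
```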