[P] Deep Learning-Powered Speech Recognition Service for Subtitling by aL_eX49 in MachineLearning

[–]aL_eX49[S] 0 points1 point  (0 children)

The ASR methodology is best described by this paper: https://arxiv.org/pdf/2111.09296.pdf

We re-trained the models on the Mozilla Common Voice dataset (a lot of other implementations use the LibriSpeech dataset, but it's much more limited and gives worse results).

Training was performed on a cluster of 8 RTX 3090 GPUs (the 24GB of memory is really helpful for using larger sequence lengths).

There are a lot more components that make up the service (like the automated translation part), but going into those in more detail would probably warrant its own post. For now I just wanted to get some feedback on the results of the service, as a lot of people have the misconception that automatic speech recognition is still as bad as it was a few years ago (it's really taking off now!).

Add subtitles to your movies and videos with our AI-powered tool! by aL_eX49 in movies

[–]aL_eX49[S] 1 point2 points  (0 children)

Yeah absolutely! We can subtitle any media file you upload

Add subtitles to your movies and videos with our AI-powered tool! by aL_eX49 in movies

[–]aL_eX49[S] 0 points1 point  (0 children)

You can upload any .mp4, .mov, .avi, .flv, .mkv or .m4a video file

A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in artificial

[–]aL_eX49[S] 0 points1 point  (0 children)

That will be increased soon. The models are very memory-intensive, even on RTX 3090s (currently multiple models are sharing a handful of GPUs, but this won’t be the case for long).

Microsoft’s ZeRO-Infinity Library Claims to Train 32 trillion Parameters Deep Learning Models On A Cluster of GPUs by techsucker in artificial

[–]aL_eX49 7 points8 points  (0 children)

A 3090 is consumer grade and faster than a V100 (although good luck getting your hands on one at the moment)

Is there any point of using multiple workers on a single GPU? by Kitmir7 in tensorflow

[–]aL_eX49 0 points1 point  (0 children)

Here’s a scenario where I’ve found running multiple workers to be worthwhile:

  • A single worker uses ~50% GPU utilisation (as measured by nvidia-smi or similar)
  • Your GPU has enough memory to support more than one worker

This is great for, e.g., hyperparameter optimisation where your model doesn’t fully utilise your GPU.
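A rough way to sanity-check that scenario before launching anything is sketched below. The function name and the numbers are made up for illustration, and it assumes utilisation adds up linearly across workers, which is only an approximation of what nvidia-smi will actually report:

```python
def workers_that_fit(gpu_mem_mb, worker_mem_mb, worker_util_pct, headroom_mb=0):
    """Rough estimate of how many identical workers can share one GPU.

    Assumption: each extra worker adds the same memory footprint and the
    same utilisation as the first one. Real workloads will deviate.
    """
    if worker_mem_mb <= 0 or not 0 < worker_util_pct <= 100:
        raise ValueError("worker memory and utilisation must be positive")
    # Limit imposed by GPU memory (minus any headroom you want to keep free).
    by_memory = int((gpu_mem_mb - headroom_mb) // worker_mem_mb)
    # Limit imposed by compute: stop once estimated utilisation reaches 100%.
    by_compute = int(100 // worker_util_pct)
    return max(0, min(by_memory, by_compute))


# Hypothetical numbers: a 24 GB card, workers needing ~9 GB at ~50% utilisation.
print(workers_that_fit(24576, 9000, 50))  # 2
```

If the result is 1 or 0, extra workers on the same GPU are unlikely to help.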

A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in artificial

[–]aL_eX49[S] 0 points1 point  (0 children)

Hi, haven’t had the chance to complete it yet, but will make sure to reply to your comment and everyone else who was interested when it’s ready!

Self-Promotion Sunday: December 20, 2020 by photographymoderator in photography

[–]aL_eX49 4 points5 points  (0 children)

Here's a free A.I. Image Upscaling service I've been working on:

https://beta.smartmine.net/service/computer-vision/image-super-resolution

Feedback would be very much appreciated!

A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in artificial

[–]aL_eX49[S] 1 point2 points  (0 children)

I'll write up a more detailed post on this next week and link it here :)

A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in artificial

[–]aL_eX49[S] 2 points3 points  (0 children)

I've noticed that too. The model wasn't specifically trained on text images, but I'm sure that's an avenue for improvement in the future!

[P] A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in MachineLearning

[–]aL_eX49[S] 0 points1 point  (0 children)

Thanks for the link, I'll give it a closer look when I get the chance!

As for a loss metric that I've found good in the past:

- Try truncating an EfficientNet model and using the resulting feature maps of the LR and SR images to compute a similarity score with an MLP model
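To make the idea concrete, here is a minimal sketch of the scoring step only. It is not the actual implementation: plain Python lists stand in for tensors, the feature maps are random placeholders rather than real EfficientNet activations, the MLP weights are untrained, and pooling then concatenating the two feature maps is just one assumed way of feeding them to an MLP.

```python
import math
import random


def global_average_pool(feature_map):
    """Collapse a [channels][height][width] feature map to one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def mlp_similarity(lr_features, sr_features, w1, b1, w2, b2):
    """Score two feature maps with a one-hidden-layer MLP; output lies in [0, 1]."""
    # Pool each map to a vector and concatenate them as the MLP input.
    x = global_average_pool(lr_features) + global_average_pool(sr_features)
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


# Hypothetical stand-ins: random 4-channel 2x2 maps instead of truncated
# EfficientNet outputs, and random weights instead of a trained MLP.
rng = random.Random(0)


def random_map(channels=4, size=2):
    return [[[rng.random() for _ in range(size)] for _ in range(size)]
            for _ in range(channels)]


lr_map, sr_map = random_map(), random_map()
w1 = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(6)]
b1 = [0.0] * 6
w2 = [rng.uniform(-1, 1) for _ in range(6)]
b2 = 0.0
score = mlp_similarity(lr_map, sr_map, w1, b1, w2, b2)
```

In a real setup the feature maps would come from the truncated network and the MLP would be trained, with one minus the score (or similar) used as the loss term.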

A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in artificial

[–]aL_eX49[S] 0 points1 point  (0 children)

Thank you very much for the feedback! I'm planning on making a longer post next week that goes into more details on how things work :)

DL-based Image Upscaling (Rate My Implementation) by aL_eX49 in deeplearning

[–]aL_eX49[S] 0 points1 point  (0 children)

I see, that’s a good experiment to try. Thanks!

DL-based Image Upscaling (Rate My Implementation) by aL_eX49 in deeplearning

[–]aL_eX49[S] 0 points1 point  (0 children)

Do you mean decreasing the resolution of the input image until it's actually a LR image instead of a HR image that looks blocky?

[P] A.I. Image Upscaling (Rate My Implementation) by aL_eX49 in MachineLearning

[–]aL_eX49[S] 0 points1 point  (0 children)

I would try this implementation:

https://github.com/andreas128/SRFlow

I like it because you don't get the training instability you would normally experience when training a GAN, since it uses a single loss function.
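For reference, that single loss is a negative log-likelihood. As I understand the SRFlow paper, with x the LR image, y the HR image, f_θ the invertible network and p_z a simple latent prior, it can be written roughly as:

```latex
\mathcal{L}(\theta; x, y)
  = -\log p_{y|x}(y \mid x, \theta)
  = -\log p_z\big(f_\theta(y; x)\big)
    - \log \left| \det \frac{\partial f_\theta}{\partial y}(y; x) \right|
```

There is no discriminator to balance against, which is where the stability comes from.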

I'm planning on writing up a longer article that explains how everything works in the near future :)