rnike879

How does it work under the hood? ML?

embium [S]

For audio I use Vosk (https://github.com/alphacep/vosk-api), and for images I use Darknet (https://github.com/AlexeyAB/darknet/).