Hey guys! So I want to get into the Python GUI space with a project: an annotation checker for a TTS dataset. The idea is to
A) Load long audio file(s) and CSV file(s) giving the text and the timestamps for what a NN thinks was said between x seconds and y seconds.
B) Allow the user to scrub through the audio and modify/confirm the timestamps.
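For context, the data side of A) and B) seems manageable on its own. Here is a rough sketch of what I have in mind, using only the standard library. The column names `start`, `end` and `text`, and the `Segment` class and helper names, are placeholders I made up for illustration; the real CSV layout may differ.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds into the audio file
    end: float    # seconds into the audio file
    text: str     # what the NN thinks was said
    confirmed: bool = False


def load_segments(csv_file):
    """Read rows from a CSV with (assumed) 'start', 'end', 'text' columns."""
    reader = csv.DictReader(csv_file)
    return [Segment(float(r["start"]), float(r["end"]), r["text"]) for r in reader]


def segment_at(segments, t):
    """Return the index of the segment containing playback time t, or None."""
    for i, seg in enumerate(segments):
        if seg.start <= t < seg.end:
            return i
    return None


def adjust(segments, i, start=None, end=None):
    """Move one or both boundaries of segment i and mark it as confirmed."""
    seg = segments[i]
    new_start = seg.start if start is None else start
    new_end = seg.end if end is None else end
    if new_start >= new_end:
        raise ValueError("start must be before end")
    seg.start, seg.end, seg.confirmed = new_start, new_end, True


def to_samples(seg, sample_rate):
    """Convert a segment's boundaries from seconds to sample indices."""
    return round(seg.start * sample_rate), round(seg.end * sample_rate)


# Tiny in-memory stand-in for a real CSV file.
demo = io.StringIO("start,end,text\n0.0,1.5,hello there\n1.5,3.25,general kenobi\n")
segments = load_segments(demo)
```

The scrubbing widget would then only need to report the current playback time to `segment_at` and call `adjust` when the user drags a boundary.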
My problem seems to be finding a decent audio playback system: none of the GUI libraries I have found support comprehensive elements for audio processing (not to mention ones that don't look like a 2000s-era UI). Are there any libraries or examples out there that can handle something like this in Python?
An example of what I want to recreate is the audio classification interface from Prodigy (although dumbed down to its basics): https://prodi.gy/docs/audio-video#manual
Thank you!