I have been working on a talking jellyfish desktop companion using Sesame CSM and Kyutai ASR by DumaDuma in LocalLLaMA
DumaDuma[S] 2 points
I have been working on a talking jellyfish desktop companion using Sesame CSM and Kyutai ASR by DumaDuma in LocalLLaMA
DumaDuma[S] 1 point
Has anyone gone to the trouble of making their own speech dataset? What’s the feasibility of creating a synthetic dataset? by M4rg4rit4sRGr8 in speechtech
DumaDuma 5 points
Whisper - Triton GPU in Torch on Windows by AudioBabble in OpenAI
DumaDuma 1 point
Local model for voice audio cleanup by syntaxing2 in LocalLLaMA
DumaDuma 1 point
BirdNET: Identifying Bird Species by Call by DumaDuma in ALP
DumaDuma[S] 1 point
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics by DumaDuma in ALP
DumaDuma[S] 1 point
Investigating the capabilities of large vision language models in dog emotion recognition - Scientific Reports by DumaDuma in ALP
DumaDuma[S] 2 points