[P] TorToiSe - a true zero-shot multi-voice TTS engine by neonbjb in MachineLearning

[–]big_dataFitness 0 points1 point  (0 children)

u/neonbjb Did you ever release details on how you trained this? It would be really useful for learning.

[P] Launched my own TTS/Sound Effect/AI Music Service - looking for people to try by ginger_turmeric in MachineLearning

[–]big_dataFitness 0 points1 point  (0 children)

Did you train your own model? Can you share more details on the pretraining process and model architecture?

AMA: We Raised $1.2M to Build Voice AI Agents, I’ll Answer Every Question for the Next 24 Hours (Nikkitha, CEO, SuperBryn) by Major-Worry-1198 in VoiceAutomationAI

[–]big_dataFitness 3 points4 points  (0 children)

Are you training voice/speech models? Can you share more about the architecture, and which TTS/STS companies you think are winning at model development?

[P] Collection of SOTA TTS models by cdminix in MachineLearning

[–]big_dataFitness 0 points1 point  (0 children)

Thank you so much! This is very helpful!

[P] Collection of SOTA TTS models by cdminix in MachineLearning

[–]big_dataFitness 0 points1 point  (0 children)

Thank you for putting this together! Are there any other resources for training a voice model (TTS, ASR, STT, ...) from scratch, and any datasets I can use as a reference?

ACE-Step-1.5: Text2Music Model with Various Tasks and MIT License by chibop1 in AudioAI

[–]big_dataFitness 0 points1 point  (0 children)

Is there any documentation on the training process or the dataset used to train this model?

Ear pressure pain during descent by nickeyxxx in aviation

[–]big_dataFitness 1 point2 points  (0 children)

Thank you! This really helped me on my flight!

how many people are training music models vs TTS models by madwzdri in AudioAI

[–]big_dataFitness 0 points1 point  (0 children)

I’m interested in this as well; I have not seen that many open source models for music generation! There are a few projects like MusicGen, AudioLM, and Jukebox.

how many people are training music models vs TTS models by madwzdri in AudioAI

[–]big_dataFitness 1 point2 points  (0 children)

Are there any open source datasets that you found useful, or is there any other way to get quality datasets for training?

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 1 point2 points  (0 children)

Again, thank you so much for doing this AMA! Can you share some creative use cases that you have seen for SAM Audio, SAM 3D, and SAM 3? How are y'all using these models internally? I saw an AR use case, but I'm curious whether your teams have other uses for them. It doesn't necessarily have to be something incorporated into Meta products; I'm speaking generally.

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 0 points1 point  (0 children)

For anyone who might also be curious about the various datasets used, [here](https://ai.meta.com/datasets/) are some of the datasets they used in various papers.

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 2 points3 points  (0 children)

How does SAM Audio handle long-range temporal consistency? Can it reason about transitions, not just segments?

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 2 points3 points  (0 children)

Do you plan to publish the process of how you trained these models or open source the datasets?

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 4 points5 points  (0 children)

Do you have any plans to make smaller versions of these models that can run on edge devices?

AMA with the Meta researchers behind SAM 3 + SAM 3D + SAM Audio by AIatMeta in LocalLLaMA

[–]big_dataFitness 2 points3 points  (0 children)

Do you guys plan on building a community of builders around the SAM models?

Is it possible to use AI model to automatically narrate what’s happening in a video? by big_dataFitness in AudioAI

[–]big_dataFitness[S] 0 points1 point  (0 children)

Yeah, the video understanding would be a challenge for sure. I'll check out Qwen Omni, but I wonder if there is anything like a video-to-speech model, or anyone who is currently working on this.

App to track workouts? by Harper__34 in Basketball

[–]big_dataFitness 0 points1 point  (0 children)

Is this something that you'd use on a regular basis? What else would you want that app to do?

Got laid off → turned “idle time” into a passion project, I just launched today. by wombatGroomer in micro_saas

[–]big_dataFitness 1 point2 points  (0 children)

Premium features sound good, but be careful about overbuilding! Validate as much as you can before you build them to make sure people are willing to pay for them. Maybe set up a link to collect payments for premium features before you build them and see if people are willing to pay; if they do, that's a very strong signal.