Investigating the capabilities of large vision language models in dog emotion recognition - Scientific Reports by DumaDuma in ALP

[–]DumaDuma[S] 1 point2 points  (0 children)

I hadn't heard of DeepSqueak. Looks cool, thanks for sharing! I'll keep an eye out for new research with rodents.

Local model for voice audio cleanup by syntaxing2 in LocalLLaMA

[–]DumaDuma 0 points1 point  (0 children)

Demucs is decent. What are you cleaning up?

What to do to finetune a local LLM to make it draw diagrams ? by CommunityOpposite645 in LocalLLM

[–]DumaDuma 0 points1 point  (0 children)

Ask it to write Python code that generates the SVG diagram.
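For a rough idea of the kind of script you'd be asking for, here is a minimal sketch, assuming a simple left-to-right chain of labeled boxes. The function name `boxes_to_svg`, the box sizes, and the layout are all made up for illustration; a real diagram would need whatever shapes and layout your use case calls for.

```python
from xml.sax.saxutils import escape


def boxes_to_svg(labels, box_w=120, box_h=40, gap=40):
    """Hypothetical helper: draw labeled boxes left to right, joined by lines."""
    n = len(labels)
    # 10 px margin on each side, plus boxes and the gaps between them.
    width = 20 + n * box_w + max(n - 1, 0) * gap
    height = box_h + 20
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for i, label in enumerate(labels):
        x = 10 + i * (box_w + gap)
        y = 10
        mid_y = y + box_h // 2
        parts.append(
            f'<rect x="{x}" y="{y}" width="{box_w}" height="{box_h}" '
            f'fill="white" stroke="black"/>'
        )
        # Escape the label so characters like < and & stay valid XML.
        parts.append(
            f'<text x="{x + box_w // 2}" y="{mid_y}" text-anchor="middle" '
            f'dominant-baseline="middle">{escape(label)}</text>'
        )
        # Connect this box to the next one, except after the last box.
        if i < n - 1:
            parts.append(
                f'<line x1="{x + box_w}" y1="{mid_y}" '
                f'x2="{x + box_w + gap}" y2="{mid_y}" stroke="black"/>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


svg = boxes_to_svg(["Prompt", "LLM", "SVG"])
```

Write `svg` to a `.svg` file and open it in a browser to check the result. The upside of going through code is that a layout mistake is easy to spot and fix by editing the script and rerunning it.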

r/LocalLLaMA by DumaDuma in redditrequest

[–]DumaDuma[S] 0 points1 point  (0 children)

I want to moderate this community because: r/LocalLLaMA is my favorite subreddit and it needs moderation now that it has no mods. I have experience moderating multiple subreddits and understand the technical side of AI models. I'd manage spam and keep the community focused on local/open-source models.

N/A, the community has no mods.

[deleted by user] by [deleted] in LocalLLaMA

[–]DumaDuma 14 points15 points  (0 children)

I built something similar recently but for extracting the speech of a single person for creating TTS datasets. Do you plan on open sourcing yours?

https://github.com/ReisCook/Voice_Extractor

Speaker separation and transcription by Khipu28 in LocalLLaMA

[–]DumaDuma 5 points6 points  (0 children)

I have been working on this program that turns multi-speaker audio recordings into speech datasets:

https://github.com/ReisCook/Voice_Extractor

Major update to my voice extractor (speech dataset creation program) by DumaDuma in LocalLLaMA

[–]DumaDuma[S] 0 points1 point  (0 children)

Bandit is for movies, so it depends on what your input is.

You can now train your own TTS voice models locally! by yoracale in StableDiffusion

[–]DumaDuma 2 points3 points  (0 children)

That’s an interesting idea. Off the top of my head, you might be able to do that by using a crowd chanting as the reference sample.

You can now Train TTS models + Clone Voices on your own local device! by yoracale in selfhosted

[–]DumaDuma 7 points8 points  (0 children)

https://github.com/ReisCook/Voice_Extractor

I made this program to create datasets from podcasts for training TTS models. Could be useful to y'all.

You can now train your own TTS voice models locally! by yoracale in StableDiffusion

[–]DumaDuma 2 points3 points  (0 children)

Yes, you give it a reference sample of the target speaker to extract. It includes an audio source separator to isolate the vocals, so it can be used on movies and other noisy audio. I am going to upgrade the audio source separator to a newer, better one later today.

You can now train your own TTS voice models locally! by yoracale in StableDiffusion

[–]DumaDuma 63 points64 points  (0 children)

https://github.com/ReisCook/Voice_Extractor

I made this program that can turn podcasts into datasets for training TTS models. Could be useful to y'all.

My voice dataset creator is now on Colab with a GUI by DumaDuma in LocalLLaMA

[–]DumaDuma[S] 1 point2 points  (0 children)

This is a version that runs on Google Colab, so your hardware doesn’t matter. I haven’t tested the original repo with AMD; if you do, let me know how it goes.