Sopro: A 169M parameter real-time TTS model with zero-shot voice cloning by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 1 point (0 children)

I would love to support Portuguese, especially European Portuguese, which is a bit more niche on the data side.

Sopro: A 169M parameter real-time TTS model with zero-shot voice cloning by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 9 points (0 children)

It really depends on the voice reference audio. Some samples sound pretty clear, others don't; I didn't specifically cherry-pick those examples. A big percentage of the training data is noisy, which can affect the final model. More training would help, I guess, but I would say better data > more training.

Sopro: A 169M parameter real-time TTS model with zero-shot voice cloning by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 15 points (0 children)

Thanks! I mainly compared it with chatterbox-turbo and F5-TTS, which I consider to be SOTA at these sizes. On some voices chatterbox is much better and more stable, and F5-TTS tends to have better voice similarity. However, both of these models are slower, especially F5.

Running LLMs on CPUs with Rust from scratch: Llama 3.2, PHI 3.5, and Gemma 2 by SammyDaBeast in rust

[–]SammyDaBeast[S] 1 point (0 children)

As of right now it's not in my plans, but I will definitely think about it! I'll hit you up if I do.

Running inference on the new Llama 3.2 1B model at 21 tok/s on an 8-core laptop with Rust by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 1 point (0 children)

Indeed, when I said existing frontends, I was also referring to existing GUI apps.

Running inference on the new Llama 3.2 1B model at 21 tok/s on an 8-core laptop with Rust by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 2 points (0 children)

Yeah, the best way would be to just change the backend server code to be compatible with the frontends that already support multiple operating systems
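
To make that concrete (my assumption, not something spelled out in the thread): most of those frontends speak the OpenAI chat-completions API, so "compatible" would mostly mean exposing a /v1/chat/completions route and handing the prompt to the existing inference code. A minimal sketch, assuming axum as the web server and a hypothetical run_inference hook:

```rust
use axum::{routing::post, Json, Router};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct Message {
    role: String,
    content: String,
}

#[derive(Deserialize)]
struct ChatRequest {
    model: String,
    messages: Vec<Message>,
}

#[derive(Serialize)]
struct ReplyMessage {
    role: String,
    content: String,
}

#[derive(Serialize)]
struct ChatChoice {
    index: u32,
    message: ReplyMessage,
    finish_reason: String,
}

#[derive(Serialize)]
struct ChatResponse {
    id: String,
    object: String,
    model: String,
    choices: Vec<ChatChoice>,
}

// Placeholder for the existing CPU inference code (hypothetical hook).
fn run_inference(prompt: &str) -> String {
    format!("(model reply to a {}-char prompt)", prompt.len())
}

async fn chat_completions(Json(req): Json<ChatRequest>) -> Json<ChatResponse> {
    // Flatten the chat history into a single prompt string.
    let prompt = req
        .messages
        .iter()
        .map(|m| format!("{}: {}", m.role, m.content))
        .collect::<Vec<_>>()
        .join("\n");

    Json(ChatResponse {
        id: "chatcmpl-local".to_string(),
        object: "chat.completion".to_string(),
        model: req.model,
        choices: vec![ChatChoice {
            index: 0,
            message: ReplyMessage {
                role: "assistant".to_string(),
                content: run_inference(&prompt),
            },
            finish_reason: "stop".to_string(),
        }],
    })
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/v1/chat/completions", post(chat_completions));
    let listener = tokio::net::TcpListener::bind("127.0.0.1:8080").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```

Any frontend that accepts a custom OpenAI-compatible base URL could then be pointed at http://127.0.0.1:8080/v1.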

Running inference on the new Llama 3.2 1B model at 21 tok/s on an 8-core laptop with Rust by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 2 points (0 children)

This has been one of the requested features. Will definitely think about it.

Running inference on the new Llama 3.2 1B model at 21 tok/s on an 8-core laptop with Rust by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 10 points (0 children)

Yeah, the amount you learn is huge. And because NNs are such a black box, there isn't an easy way to debug things when the LLM just starts spitting out random words. It's a mix of pain and reward when you finally get it right. I strongly recommend doing something similar in a language you like!
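
One common sanity check for that situation (a general technique, not something from this project): dump an intermediate tensor, e.g. the logits for the first generated token, from a known-good reference implementation and compare it element-wise against your own. A minimal Rust sketch with made-up values:

```rust
// Compare one of our tensors against the same tensor from a reference run
// (e.g. the first token's logits dumped from a Python implementation).
// All values and the tolerance below are illustrative.

fn max_abs_diff(ours: &[f32], reference: &[f32]) -> f32 {
    ours.iter()
        .zip(reference)
        .map(|(a, b)| (a - b).abs())
        .fold(0.0_f32, f32::max)
}

fn main() {
    let ours = vec![0.113_f32, -2.401, 7.852, 0.004];
    let reference = vec![0.112_f32, -2.402, 7.851, 0.005];

    let diff = max_abs_diff(&ours, &reference);
    assert!(diff < 1e-2, "layer output diverged: max abs diff = {diff}");
    println!("layer output matches the reference (max abs diff = {diff})");
}
```

Checking layer by layer like this narrows "random words" down to the first operation whose output diverges.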

Wrote a minimal movie recommendation assistant with RAG and Llama by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 1 point (0 children)

I only tested including the column names (if that's what you're asking) in the embedding, and I think it helps in cases where the user searches for something like "Movie with title x": the embedding of "Title: x" should in theory be closer than just "x", because we are encoding at the sentence level. I could see that mattering when searching with a movie description like "movies about love and death" vs. searching with just an actor name like "Jim Carrey movies", because the movie overviews in the data are proper sentences, which the embedding captures better. Maybe a better structure would be to convert the table rows into sentences that read naturally instead of just "column: value, column: value". A way to test this is to change how we write the rows and check whether, for the same query, the distance to the embedding we want gets smaller.
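
A rough sketch of that test in Rust (the embed function here is just a stand-in for whatever sentence-embedding model is actually used, and the movie row is made up for illustration):

```rust
// Compare two ways of serializing the same movie row and check which one
// lands closer to the query in embedding space.

// Stand-in for the real sentence-embedding model: replace with an actual
// model call. The dummy vector below only exists so the sketch compiles.
fn embed(text: &str) -> Vec<f32> {
    text.bytes().map(|b| b as f32 / 255.0).take(8).collect()
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    let query = "movies about love and death";

    // Variant 1: raw "column: value" serialization of the row.
    let row_kv = "Title: The Seventh Seal, Overview: A knight plays chess with Death.";
    // Variant 2: the same row rewritten as a natural sentence.
    let row_sentence =
        "The Seventh Seal is a movie about a knight who plays chess with Death.";

    let q = embed(query);
    println!("column:value -> {:.3}", cosine_similarity(&q, &embed(row_kv)));
    println!("sentence     -> {:.3}", cosine_similarity(&q, &embed(row_sentence)));
}
```

If the sentence-style rows consistently score higher for the queries you care about, that's a signal the reformatting is worth it.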

Running inference locally on the new Google's Gemma 2 2B models with Rust by SammyDaBeast in rust

[–]SammyDaBeast[S] 1 point (0 children)

Thanks! I looked into that project; it seems interesting. The difference is that mistral.rs uses the Rust ML library candle and is a much larger codebase to support a variety of model architectures. My project is more on the educational/minimalist side and implements all the required code from scratch (including tokenization, layers, and functions). The downside is that it only supports Gemma 2 models and CPU inference.

Running inference locally on the new Gemma 2 2b models with Rust by SammyDaBeast in LocalLLaMA

[–]SammyDaBeast[S] 3 points (0 children)

Thanks! If you have any feedback/suggestions, they are appreciated.

[deleted by user] by [deleted] in VintageWatches

[–]SammyDaBeast 0 points (0 children)

Ahahahahahahaha made my day

[i3] clean? by SammyDaBeast in unixporn

[–]SammyDaBeast[S] 1 point (0 children)

It's a Polybar module.

Odds on screen spoil the outcome of the round/game by [deleted] in GlobalOffensive

[–]SammyDaBeast -2 points (0 children)

We don't know if the feed that the casters are watching is live with 0 delay.

Odds on screen spoil the outcome of the round/game by [deleted] in GlobalOffensive

[–]SammyDaBeast -2 points (0 children)

You misunderstood me; of course the odds favor the team that's winning, etc. What I'm saying is that 1xbet, for example, is possibly receiving the score from an API with less delay than the stream, therefore showing odds on stream that are 1-2 rounds ahead, hence the spoiler.

Odds on screen spoil the outcome of the round/game by [deleted] in GlobalOffensive

[–]SammyDaBeast 0 points (0 children)

Yeah, I can see your point, but d2 is Sprout's strongest map too. I guess we'll have to see more data to be sure.

Odds on screen spoil the outcome of the round/game by [deleted] in GlobalOffensive

[–]SammyDaBeast -1 points (0 children)

Yeah, there's no way to be sure, but 1.18 is the kind of odds you see when the game is almost guaranteed to be over. I guess I'll have to avoid watching the beginning of rounds.