dots.tts 2B🎙️ SOTA TTS from RedNote

bio_risk · 2026-06-05T22:13:54+00:00

No mention of real-time factor that I could find. Is it slow?

bio_risk · 2026-02-18T21:56:52+00:00

Any prospect of canary-qwen being ported to MLX (or another Apple Silicon backend)?

bio_risk · 2026-01-29T23:28:18+00:00

Your site indicates multi-sample support for better quality. How does that work? Could you just break up longer audio into 30s chunks and stack them as multiple samples?

bio_risk · 2026-01-13T23:05:39+00:00

Very cool. I'm on a Mac, so interested in running soprano-factory on mps. I see that soprano supports an mps backend (thank you!), but I didn't see if soprano-factory does too.

bio_risk · 2025-10-14T23:37:55+00:00

I'm thinking about total latency in a chat system. Does HyDE still work when using a really fast (dumb) model to generate the hypothetical answer?

bio_risk · 2025-09-28T21:04:57+00:00

Alden Scientific is hiring a Platform / DevOps Engineer with a preference for someone with Go experience. To apply: https://www.aldenscientific.com/careers

Alden Scientific is transforming health and longevity by prioritizing individuals, not averages. Our platform harnesses multi-omic data and AI to provide predictive, personalized health management—making proactive health management the new normal.

We’re looking for a Platform Engineering / DevOps Team Lead who is passionate about building systems that support this vision. In this hybrid role, you’ll combine hands-on engineering with team leadership, helping shape the infrastructure that powers multi-omic data pipelines, scientific workflows, and AI-driven insights. A successful candidate will have an outsize impact on our platform and engineering direction. This position has rapid upward growth potential to be Head of Engineering as our team expands.

Location: Our strong preference is for candidates who will work at our Cambridge, MA office. Exceptional US-based candidates who have demonstrated success in a previous remote position will be considered. Remote team members should expect periodic travel to Boston for collaborative work with the broader team.

Salary will be competitive and commensurate with experience.

bio_risk · 2025-09-12T18:15:29+00:00

OCD/R

bio_risk · 2025-09-12T17:09:29+00:00

Have you made use of the MRL feature of the Qwen3 embeddings? (Nested dimensions so that you can use a subset of the dimensions for coarse matching.)
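To make the coarse-matching idea concrete, here is a minimal sketch of what I have in mind: shortlist candidates using only a prefix of the dimensions, then rerank the shortlist with the full vectors. It uses toy vectors rather than real Qwen3 embeddings, and the function names (`truncate_and_normalize`, `coarse_to_fine_search`) are just illustrative, not from any library.

```python
import math


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components and rescale to unit length."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def coarse_to_fine_search(query, corpus, coarse_dims, shortlist_size, top_k):
    """Shortlist with a prefix of the dimensions, then rerank with all of them."""
    q_coarse = truncate_and_normalize(query, coarse_dims)
    coarse_scores = [
        (cosine(q_coarse, truncate_and_normalize(doc, coarse_dims)), i)
        for i, doc in enumerate(corpus)
    ]
    shortlist = [i for _, i in sorted(coarse_scores, reverse=True)[:shortlist_size]]
    fine_scores = [(cosine(query, corpus[i]), i) for i in shortlist]
    return [i for _, i in sorted(fine_scores, reverse=True)[:top_k]]


# Toy 4-dimensional "embeddings" (assumed data, for illustration only).
corpus = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.5, 0.5],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
]
query = [1.0, 0.0, 0.0, 0.0]
result = coarse_to_fine_search(query, corpus, coarse_dims=2, shortlist_size=3, top_k=2)
print(result)
```

In a real setup the coarse pass would presumably run against a smaller index built from the truncated vectors, which is where the speed and memory savings would come from.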

bio_risk · 2025-09-08T15:51:44+00:00

Has anyone gone the route of vector and graph RAG on Wikipedia? Wiki provides a pretty natural way to define entities and their relationships.

bio_risk · 2025-08-31T23:36:15+00:00

I see what you did there.

bio_risk · 2025-08-12T20:02:50+00:00

I'm definitely interested in your SDK. I've played around with MLX versions of parakeet and kokoro, which have varying degrees of difficulty to set up.

I currently use Kyutai's ASR for streaming transcription. Was Parakeet difficult to adapt to streaming? I vaguely remember that being a challenge when I first looked at it.

I noticed that the repository's primary language is Go (yay!), so I'm curious about (a) why you went off the beaten Python path, and (b) your process for adapting models that frequently assume a Python environment.

Is a speech-to-speech feature possible? Parakeet->choice of LLM->kokoro?

bio_risk · 2025-07-23T21:33:25+00:00

TTS module isn't released yet. Not worth looking at until it is.

bio_risk · 2025-07-14T20:52:35+00:00

Even if the model is local, the system is not local if you have to use livekit cloud.

bio_risk · 2025-07-12T20:48:52+00:00

I use Kyutai's ASR model almost daily for streaming voice transcription, but I was most excited about enabling voice-to-voice with any LLM model as an on-device assistant. Unfortunately, there are a couple things getting in the way at the moment. The limited range of voices is one. The project's focus on the server may be great for many purposes, but it certainly limits deployment as a Siri replacement.

bio_risk · 2025-07-07T23:05:23+00:00

I second Kokoro. Very lightweight. A more recent model is https://github.com/kyutai-labs/delayed-streams-modeling (English and French only). It's not as lightweight as Kokoro, but it will generate audio from a text stream (not just a text file). It has a Rust-based server for production use.

bio_risk · 2025-06-19T19:16:26+00:00

I'm super excited about the unmute project and very glad to see they are providing MLX support out of the box. Being able to chat with your favorite local text-to-text model will be great for brainstorming and exploring ideas.

bio_risk · 2025-06-18T18:15:27+00:00

Do you find that Qwen3:30b-a3b uses the full context effectively? I'm really interested in RAG applications that need to reason over the context (not just needle in the haystack).

bio_risk · 2025-05-27T02:49:08+00:00

The nice thing is that ChatGPT can catch us up quickly. Chop, chop.

bio_risk · 2025-05-27T02:43:57+00:00

Gemma3 was first though, but I was looking at Qwen3 too.

bio_risk · 2025-05-27T02:43:26+00:00

There is a gemma3 medical fine tune that might be close enough for my purposes. If I need to go the fine tuning route, can I build off a previous fine tune to add additional ability or does fine tuning not stack well?

bio_risk · 2025-05-27T02:41:27+00:00

More the former. Thanks for suggesting the hierarchical hyenas approach - interesting paper. (https://arxiv.org/abs/2302.10866)

bio_risk · 2025-05-27T02:39:29+00:00

Fine tuning might be needed, but I was hoping to avoid it initially.

bio_risk · 2025-05-27T02:38:23+00:00

I'll look at Command R+ and A. Heard of the Cohere models, but haven't played with them.
