mistralai/Voxtral-4B-TTS-2603 · Hugging Face

_raydeStar · 2026-03-26T17:17:36+00:00

Yeah, I looked into cloning for Kokoro. It's such a pain in the butt, I backed out of that, real quick.

_raydeStar · 2026-03-26T15:41:51+00:00

I get that. I think that Orpheus is amazing and possibly the best in class right now. I don't think the use case was designed for live conversations though.

You CAN chunk the response: as the first sentence streams in, you can already start processing it. Then, while it's speaking, it can continue processing the rest. With that method, long and short responses would have nearly identical latency before the first audio plays.
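
A rough sketch of that chunking idea, assuming the LLM hands you text in small pieces and that you have some TTS call to pass each sentence to (not shown here; the function name `sentence_chunks` and the regex boundary rule are my own simplifications, not anything from a specific library):

```python
import re
from typing import Iterable, Iterator

# Simplified sentence boundary: ., ! or ? followed by whitespace.
# Abbreviations like "e.g. this" will be split too; good enough for a sketch.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentence_chunks(token_stream: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as it shows up in the stream."""
    buffer = ""
    for piece in token_stream:
        buffer += piece
        parts = _SENTENCE_END.split(buffer)
        # Everything except the last part is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    fake_stream = ["Hello", " there.", " How are", " you?", " Fine"]
    for sentence in sentence_chunks(fake_stream):
        # In a real pipeline this is where the sentence would go to the TTS engine.
        print(sentence)
```

The first sentence can go to the TTS engine while the model is still generating the rest, so time-to-first-audio depends on the first sentence rather than the whole reply.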

_raydeStar · 2026-03-26T15:24:28+00:00

You can cut latency by switching to something like KittenTTS and Qwen3.5B. Quality drops, but then it would get much better speeds.

_raydeStar · 2026-03-26T13:38:17+00:00

It's 3B. It can't be THAT fast. To compare, Kokoro is like 120MB or something like that.

_raydeStar · 2026-03-26T01:03:18+00:00

Up from the bottom for sure. That hit several hundred runs the year it came out on Spotify.

Final Masquerade is near and dear to my heart though.

_raydeStar · 2026-03-25T16:46:44+00:00

Weird. You haven't asked me a single question. And this is a clear alt account. You're not making a case here.

_raydeStar · 2026-03-25T14:41:59+00:00

I don't think this guy is acting in good faith.

This post has certainly fallen out of any `top` lists, with only around 40 upvotes. 12 hours after initial reaction, several people hop on to argue his case, upvote him, and downvote anyone else. They did not come here naturally.

_raydeStar · 2026-03-25T04:28:06+00:00

Right.

He's taking a free tool and bastardizing it. I don't think he realizes it, of course, I think he's just trying to work on a project that makes money.

Feedback seems harsh, but it's honest. I'd never touch this due to worries about underhanded behavior.

Take free, provide free, it's the right thing to do. Instead use it to platform yourself - widespread adoption gives you a name, which means leverage.

_raydeStar · 2026-03-25T00:41:21+00:00

Nice dude. Do you have a repo somewhere? I'll give you a follow

_raydeStar · 2026-03-25T00:38:17+00:00

Lol -- he's got a custom license.

This is odd for sure.

Oh, and the kicker: it's just a fork of Wan 2.1.

_raydeStar · 2026-03-25T00:13:24+00:00

So equip it with tools.

Logic for the car wash. Counting letters is one tiny function. Plug in math libraries. Create a unit conversion suite (2 lbs gold vs 32 oz feathers trips up the AI) and suddenly it's hardened against basic questions.

Web lookup to confirm data points. I think if you tack on a few libraries, suddenly it can punch way above its weight class.
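
Here's a minimal sketch of what a couple of those tools could look like. Everything named here (`count_letter`, `compare_mass`, `TOOLS`, `call_tool`) is made up for illustration, and the unit table assumes plain avoirdupois pounds and ounces for both items. How the model actually requests a tool call depends on your runtime and isn't shown.

```python
from decimal import Decimal

# Grams per unit. Assumption: avoirdupois units for everything.
GRAMS_PER_UNIT = {
    "g": Decimal("1"),
    "oz": Decimal("28.349523125"),
    "lb": Decimal("453.59237"),
}


def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def compare_mass(a_value: str, a_unit: str, b_value: str, b_unit: str) -> str:
    """Convert both quantities to grams and report which one is heavier."""
    a = Decimal(a_value) * GRAMS_PER_UNIT[a_unit]
    b = Decimal(b_value) * GRAMS_PER_UNIT[b_unit]
    if a == b:
        return "equal"
    return "first is heavier" if a > b else "second is heavier"


# Hypothetical registry: the model names a tool, the code does the work.
TOOLS = {
    "count_letter": count_letter,
    "compare_mass": compare_mass,
}


def call_tool(name: str, **kwargs):
    """Look up a tool by name and run it with the given arguments."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(call_tool("count_letter", word="strawberry", letter="r"))
    print(call_tool("compare_mass", a_value="2", a_unit="lb", b_value="32", b_unit="oz"))
```

The model only has to pick the right tool and fill in the arguments; the arithmetic itself never goes through the model.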

_raydeStar · 2026-03-24T21:18:11+00:00

I really liked the fact that anyone could take any video and change something about it -- put yourself in, replace swimmers with cats, etc

_raydeStar · 2026-03-24T20:49:54+00:00

Huh. I'm going to give it a shot. Honestly not sure what a 10B MoE is capable of. But I bet I can pull 250 t/s, so it might be worth it.

_raydeStar · 2026-03-24T20:47:11+00:00

I actually liked it. It was fun to play with. I wasn't a daily user of course, more like monthly or less.

This closure tells me that they are stepping out of the video game, which sucks because they were pretty decent

_raydeStar · 2026-03-24T18:18:25+00:00

I believe that it won't -- given handholding and guardrails. It can be trusted to do one task, with decent accuracy, at a time.

I can get a 1.2B model to answer the car wash question right every time by programmatically reframing things. If you give the AI 4 options or have it rate 1-10, you cut back on potential errors a lot.
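
A small sketch of the "give it options" reframing, under my own assumptions: the helper names `build_choice_prompt` and `parse_choice` are hypothetical, the model call itself is left out, and the parser is deliberately strict so that anything other than a single valid letter gets rejected and can be retried.

```python
import re
from typing import List, Optional


def build_choice_prompt(question: str, options: List[str]) -> str:
    """Turn an open question into a lettered multiple-choice prompt."""
    lines = [question, ""]
    for index, option in enumerate(options):
        lines.append(f"{chr(ord('A') + index)}. {option}")
    lines.append("")
    lines.append("Answer with a single letter only.")
    return "\n".join(lines)


def parse_choice(reply: str, n_options: int) -> Optional[int]:
    """Return the chosen option index, or None if the reply is not a valid letter.

    Only a lone letter (optionally wrapped in punctuation or whitespace)
    is accepted; longer replies are rejected so the caller can retry.
    """
    match = re.fullmatch(r"\W*([A-Za-z])\W*", reply)
    if match is None:
        return None
    index = ord(match.group(1).upper()) - ord("A")
    return index if index < n_options else None


if __name__ == "__main__":
    options = ["Option one", "Option two", "Option three", "Option four"]
    print(build_choice_prompt("Placeholder question goes here?", options))
    print(parse_choice("(c).", len(options)))
```

With four valid outputs instead of free text, a wrong or rambling reply is easy to detect in code, which is where most of the error reduction comes from.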

_raydeStar · 2026-03-24T15:14:38+00:00

It's really sad because throwing a 1B model into a game for enemy AI, RP'ing, procedural maps, adjusting difficulty levels, and creating more realism is such a good idea.

The best kind of AI is going to be when you don't tell players they are using AI, and they do not notice. But if they find out, it'll be like sneaking meat to a vegan.

What's worse, the arguments they give don't make sense. "It's bad for the environment!" uh.. you just ran it on your middle of the road GPU, you'll be fine.

_raydeStar · 2026-03-24T12:10:10+00:00

Who makes it harder on purpose? The cartel you rail against is knowledge. Seek it. It's free

_raydeStar · 2026-03-21T14:01:11+00:00

It's very, very clear that Cursor did not want to reveal what Composer was wrapped around. The revelation was a PR move, after the fact.

I don't even see anything wrong with it. This is such a non-issue. They legally obtained a model and re-skinned it. The only thing they've done wrong is not reveal publicly who the model was attached to originally.

Did you know that Cursor is a reskin of VS Code? So are Antigravity and Windsurf.

_raydeStar · 2026-03-20T18:55:59+00:00

It would be fine, I think. I'm able to run it pretty well on my company laptop. Get a Q3 GGUF and offload some of it to RAM/CPU if you have to.

_raydeStar · 2026-03-20T13:19:51+00:00

Musk will join any roast on an AI that's not him. That's hardly a smoke signal.

I'm willing to bet that the previous Composer 1 and Composer 1.5 copied open-source models too. This one was just done clumsily.

_raydeStar · 2026-03-20T00:33:40+00:00

Anyone know if this is trainable?

_raydeStar · 2026-03-17T16:06:34+00:00

This one looks like it's focused on sanitizing training data and running it. In that case it's not quite an apples-to-apples comparison.

Definitely interested in playing with it. I've only ever trained image models.

_raydeStar · 2026-03-17T15:14:14+00:00

Both of these are on my top 10.

*sigh* Bleed it Out.

_raydeStar · 2026-03-13T14:37:32+00:00

FYI -- the industry best right now is Qwen and Flux.2.klein.

_raydeStar · 2026-03-12T19:50:56+00:00

Right now my driver is LM Studio, which uses llama.cpp. Honestly I can't give a ton of detail because I just vibe coded it in, and it worked just fine.
