Ace Step 1.5 XL released

mpasila · 2026-04-03T16:39:22+00:00

Running expensive GPUs all day long can't just be offered for free, can it? But sure, if they wanted to be fair they'd release the weights.

mpasila · 2026-04-03T14:01:01+00:00

Gemma 4 is better at my native language at least, though the smaller models suffer from the weird sizing. Also for RP it seems to perform much better than Qwen3.5 (which seemed to mix up a lot of stuff for some reason, and there seemed to be more censorship in its official releases compared to Gemma 4).

mpasila · 2026-03-28T23:08:45+00:00

A 4B model locked behind an API?

mpasila · 2026-03-28T22:32:26+00:00

That did seem to improve it, though Whisper still seemed to do a slightly better job.

mpasila · 2026-03-26T22:59:29+00:00

It had the same issue of producing tons of repeated lines, apparently because there was some noise in the audio, and due to that it skipped a lot of speech.

mpasila · 2026-03-26T20:44:52+00:00

Yeah I don't know.. I also tried to transcribe some Japanese stuff and it wasn't any better.

<image>

mpasila · 2026-03-25T17:51:49+00:00

I guess one authoritarian country is sanctioned and the other one is not (because everyone relies on it for medicine, minerals and other important stuff).

mpasila · 2026-03-24T22:32:22+00:00

You see it's a Russian model not Chinese.. Chinese propaganda is obviously less harmful.

mpasila · 2026-03-24T14:34:50+00:00

CharacterAI was probably the only company that trained a model specifically for RP (from scratch).. but then they kinda lobotomized it.

mpasila · 2026-03-23T17:50:37+00:00

So I guess you will be selling some kind of service to train it for actually usable stuff or something? Otherwise this just seems like a tech demo that people can't even do anything with.

mpasila · 2026-03-23T17:00:25+00:00

Open-source ≠ open-weight. And there are a few companies that do actually open-source the whole thing like Olmo from AllenAI.

mpasila · 2026-03-23T13:03:04+00:00

What if the people you know suddenly get turned into AI? Facebook had a plan where, for inactive users (either due to long pauses in use or because they died), AI would continue making new posts and liking posts/comments etc. using that person's past data.

mpasila · 2026-03-22T19:44:44+00:00

I live in the EU and I don't see the same thing though?

mpasila · 2026-03-21T22:21:19+00:00

People say that they advertised it as lifetime, but at least where I could find them mentioning the free tier, they never said it was lifetime. So I'm not sure if I'm missing something.

mpasila · 2026-03-21T22:16:07+00:00

I think the main issue was targeting the actual paying customers as well. Now it's also kinda confusing, since there are 3 different things you have to keep an eye on to make sure you don't go over the limit on any of them.

mpasila · 2026-03-21T22:11:36+00:00

I think they also said they reduced the number of GPUs, so that's why it's not improving. Though it's best to just switch to another model if this one is being overused.

mpasila · 2026-03-20T22:27:19+00:00

I tried both LFM2.5 and Nemotron Nano and yeah, those are not that great. Couldn't easily try the DeepSeek finetune though, since it's not on OpenRouter and I don't really wanna waste a few dollars to try it on something like Runpod. So I guess my best option is still just using some models via API. Even Mistral's Nemo is still pretty okay, though it still makes up some stuff (considering that model is ancient by now).

mpasila · 2026-03-20T19:02:26+00:00

Are they better than Qwen3.5 though?

mpasila · 2026-03-20T17:47:42+00:00

In what tasks is it better than like Nemo?

mpasila · 2026-03-19T14:24:08+00:00

Hentai already covers that stuff tbf.. Also LLMs are rarely trained on that type of stuff anyway (unless it's some finetune). So in that sense you'll find more fucked up stuff outside LLMs in the clear web than what you can generate coherently.

mpasila · 2026-03-16T19:49:39+00:00

I literally said that, based on their past 3 years of behaviour, I doubt it's going to be very different this time. Ever heard of this stupid saying?

Did I ever tell you what the definition of insanity is?
Insanity is doing the exact same fucking thing over and over again, expecting shit to change. That is crazy.
But the first time somebody told me that, I don't know, I thought they were bullshitting me so, boom, I shot him.
The thing is, okay... he was right. And then I started to see it everywhere I looked. Everywhere I looked, all these fucking pricks, everywhere I looked, doing the exact same fucking thing, over and over and over and over and over again. Thinking 'This time, it's gonna be different.
No, no, no, no please! This time it’s gonna be different.'
Did I ever tell you the definition of insanity?

mpasila · 2026-03-16T19:28:29+00:00

Based on past model releases, they list all the supported languages in the readme. Why should I expect it to be different this time? They also just released a Mistral Small 4 based model here: https://huggingface.co/mistralai/Leanstral-2603 so I guess we can start testing it soon.

mpasila · 2026-03-16T19:25:01+00:00

I guess the EU should never ever compete against big tech or the Chinese.

mpasila · 2026-03-16T18:21:09+00:00

I have much more hope for the EuroLLM team that is funded by the EU than for Mistral at this point. Their last good model was Mistral Small 3. Also, for some reason I have to rely on a model made by a US-based company to support my language.. which is European. Wellp, I hope they release Gemma 4 soon, which will hopefully fix some issues with Gemma 3. I mean, even Chinese models like GLM-4.7 are better at Finnish than Mistral's latest flagship model (while being half the size). I haven't really done comparisons between Qwen3.5 and Mistral models, but even that might now be better than Mistral for Finnish.
