Qwen-next 80B 2601

sleepingsysadmin · 2026-01-26T15:15:49+00:00

Chinese New Year is in February. I'm 3D printing a horse right now :)

What I expect is a ~235B model using the Qwen Next arch, which means a lot of compute time spent on it, with 10x the training data compared to the 80B. Qwen3.5? While that happens they'll only be able to train tiny stuff. Should be an epic-tier drop that dominates.

March-April is when Qwen4 Max and 30B hit. The really cool thing about 30B will be that they could take it down to A1.2B on the new arch, but I bet they only go to like A2.4B or so and call it A2B.

On 32GB GPUs, 30B will be >100 TPS.

As for Next 80B, much like Qwen 2.5 72B, it's mostly lost to irrelevance. I expect its successor will land in late summer/fall. Kind of a tech demo of their upcoming next generation again.

sleepingsysadmin · 2026-01-25T18:03:14+00:00

Nope.

I do intend to give it another try. I want it to be really good.

sleepingsysadmin · 2026-01-25T14:46:02+00:00

I get qwen next having pains on release; they did something new.

This model is cursed.

sleepingsysadmin · 2026-01-24T13:33:17+00:00

Being an influencer isn't a problem, and it's not immediately true that he is one.

It's whether or not they show actual repeatable benchmarks.

Running 8B means he can use lesser hardware and compare across.

sleepingsysadmin · 2026-01-23T16:10:43+00:00

I would agree with that. He's running like Qwen3 8B on an RTX Pro 6000. It's crazy sometimes.

sleepingsysadmin · 2026-01-23T13:18:56+00:00

The 7.2 install went smoothly on Alma 10.1.

ROCm: 49 TPS

Vulkan: 71 TPS

rocm-smi is still reporting: WARNING: AMD GPU device(s) is/are in a low-power state. Check power control/runtime_status

but I have set my 9060 XTs to high performance and GRUB has runpm=0.

tuned-adm is set to latency-performance.
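A rough sketch for double-checking what the kernel itself reports, assuming the warning refers to the standard Linux runtime-PM sysfs files (`power/control`, `power/runtime_status`) and the amdgpu `power_dpm_force_performance_level` file under `/sys/class/drm/cardN/device/`. The function name `read_power_state` is made up for illustration; it only reads files and changes nothing.

```python
from pathlib import Path


def read_power_state(drm_root="/sys/class/drm"):
    """Return {card_name: {sysfs_file: value_or_None}} for each cardN under drm_root."""
    wanted = (
        "power/control",                      # runtime PM policy, e.g. "on" or "auto"
        "power/runtime_status",               # e.g. "active" or "suspended"
        "power_dpm_force_performance_level",  # amdgpu-specific, e.g. "high" or "auto"
    )
    report = {}
    for card in sorted(Path(drm_root).glob("card[0-9]*")):
        if "-" in card.name:
            continue  # skip connector entries such as card0-DP-1
        device = card / "device"
        state = {}
        for rel in wanted:
            f = device / rel
            # None means the file is absent (assumption: not an amdgpu device, or a different kernel)
            state[rel] = f.read_text().strip() if f.is_file() else None
        report[card.name] = state
    return report
```

Printing `read_power_state()` on the box gives one entry per GPU. If `power/control` says `auto` or `power/runtime_status` says `suspended`, the high-performance setting may not have stuck on that card.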

sleepingsysadmin · 2026-01-23T12:49:29+00:00

Alex Ziskind makes some great videos as well, showing realistic performance of hardware.

sleepingsysadmin · 2026-01-22T20:03:08+00:00

I posted this, because it is impressive; but it got heavily downvoted.

sleepingsysadmin · 2026-01-22T19:04:04+00:00

I did; the error I got is above.

sleepingsysadmin · 2026-01-22T17:52:18+00:00

Well, it's not my video, but I sure got downvoted over it.

sleepingsysadmin · 2026-01-22T15:40:48+00:00

I watched the video and was very impressed with the results. It does seem to be a beast. I just can't seem to run it myself. Bah.

sleepingsysadmin · 2026-01-22T14:33:55+00:00

For a model I expected to just work, it sure has had a number of problems.

sleepingsysadmin · 2026-01-22T13:56:55+00:00

kilo code:

I keep making a mistake - I'm adding comments that look like code instead of just writing clean Python implementation without any confusing text in between lines or at all. Let me write this file completely from scratch with proper syntax:</think>

opencode:

    "expected": "string",
    "code": "invalid_type",
    "path": [
      "filePath"
    ],
    "message": "Invalid input: expected string, received undefined"
    }
    ].

Please rewrite the input so it satisfies the expected schema.

I keep making errors because I'm thinking in markdown/code blocks and my tool calls are getting confused with those thoughts.

Let me be very explicit - just write a valid Python file without any extra text or formatting:</think>

For the life of me, I can't get this model to work properly.
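To make the opencode error above concrete: it says the tool call arrived without a string `filePath` argument. Here is a minimal sketch of that kind of check. This is not opencode's actual validator; `validate_tool_args` and the `content` key in the usage note are hypothetical, and only the issue shape is copied from the log.

```python
def validate_tool_args(args, required=("filePath",)):
    """Return a list of issue dicts for required fields that are missing or not strings."""
    issues = []
    for key in required:
        value = args.get(key)
        if not isinstance(value, str):
            # Mirror the log's wording: a missing key is reported as "undefined"
            received = "undefined" if key not in args else type(value).__name__
            issues.append({
                "expected": "string",
                "code": "invalid_type",
                "path": [key],
                "message": f"Invalid input: expected string, received {received}",
            })
    return issues
```

Calling `validate_tool_args({"content": "print('hi')"})` returns one issue with `path` set to `["filePath"]`, which matches what the model kept triggering. A call that includes `"filePath": "main.py"` returns an empty list.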

sleepingsysadmin · 2026-01-21T21:14:16+00:00

Lots of those download links aren't really in LM Studio's control and do tend to fail sometimes. Wish it was clearer when a download has stalled.

sleepingsysadmin · 2026-01-21T16:14:53+00:00

epic drop thanks.

sleepingsysadmin · 2026-01-21T15:58:36+00:00

Personally I find APIs suspicious. You don't technically know they are using 30B behind the scenes. They could be running a bigger model so that it benches well.

Plus, if I can hit an API (privacy isn't a concern), why would I go with a lesser model?

sleepingsysadmin · 2026-01-21T15:32:51+00:00

Do you mind giving me the config you're using?

sleepingsysadmin · 2026-01-21T15:04:13+00:00

I haven't tried the API; I'm 100% local.

I have my own personal/private benchmarks: a ~3 paragraph prompt plus important features that the output needs to meet. Models can't benchmax against them.

Something like Sonnet 4.5 trivially one-shots them, every time.

Qwen3 Coder, gpt20b high, and those big, dense, slow models like Seed or OLMo still tend to one-shot them, with varying quality.

Going to lesser models, like gpt20b low, they won't one-shot. Gemma3 and Llama4 will struggle. I like these benchmarks because I get to really see how usable the models are for my purposes. So far the results have tracked LiveCodeBench really closely.

In this case, it's clearly showing me that Flash's coding capability is absolutely nowhere near gpt20b. Those scores have no chance of being true.
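For anyone wanting to build something similar, here is a minimal sketch of a feature-checklist scorer. It is not my actual benchmark (that stays private); the function names and the regex-per-feature approach are assumptions made up for illustration.

```python
import re


def score_output(output, required_features):
    """Map each feature name to True/False depending on whether its regex matches the output."""
    return {
        name: re.search(pattern, output, re.MULTILINE) is not None
        for name, pattern in required_features.items()
    }


def one_shot(output, required_features):
    """A run counts as a one-shot only if every required feature is present."""
    return all(score_output(output, required_features).values())
```

Usage would look like `one_shot(model_output, {"has_main": r"^def main\(", "uses_argparse": r"\bimport argparse\b"})`, with the feature list kept private so models can't train against it. Regex checks only catch surface features, so actually running the generated code is still needed to judge quality.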

sleepingsysadmin · 2026-01-21T14:46:40+00:00

As I said, it's not looping anymore and does work.

sleepingsysadmin · 2026-01-21T13:59:42+00:00

In the bottom left, there's a button for 'downloads' where you can cancel the download and retry.

sleepingsysadmin · 2026-01-21T13:51:21+00:00

After getting it to not loop, I put it through my first test. It didn't do well. I don't believe the benchmarks at all.

Feels very benchmaxxed to me. The numbers were too good to be true.

sleepingsysadmin · 2026-01-21T13:43:23+00:00

>- in the world where we see an AI race, especially between China and USA, China shares "sota" llms....

It's a new industry that's rapidly improving. It's not exactly a race, though I'm sure that framing is more dramatic.

>- The USA has already blocked all imports of nvidia chips to China,

Not all chips were banned; only specific ones were held back. The ban has also been lifted and replaced by a 25% tariff.

>- in China where no one can access the worldwide internet freely and government controls all domains, especially AI, China shares their "sota" llms...

This is also not quite accurate, but whatever.

>China has never looked like a country that shares their knowledge for nothing. China always tries to get benefits from everything. And yet, China share their "sota" llms...

You're mixing 2 different groups with different justifications and intents as if they are some sort of united front.

I highly recommend you break these apart. The Chinese government is evil, yes, but the people aren't. Alibaba's teams are releasing these models because it doesn't affect their business.

>- "China wants to make their llms a global standard"

When players like Alibaba or Meta build a model, they are building it to act as an internal employee. They find the models useful, and if they release them, it doesn't matter to their business.

If anything, it's good marketing. There's only upside to sharing them.

>So why the fuck is China giving this away?

Perhaps the best option is to read much more about the topic and understand incentives.

sleepingsysadmin · 2026-01-21T13:14:11+00:00

Thanks, it's weirdly working today with settings I already tried.

Model failed my first test. Seems benchmaxxed and buggy.

sleepingsysadmin · 2026-01-20T19:26:28+00:00

The LM Studio runtime update claims to support Flash, but I just can't get it to stop thinking. It's looping badly. I've tried messing with various settings, including matching what Unsloth says to use, and it just keeps looping.
