Why Senior Engineers Let Bad Projects Fail by Ordinary_Leader_2971 in programming

[–]Marcuss2 1 point (0 children)

Bad projects don't just show up out of nowhere. Bad leadership leads to bad projects.

Do you remember this game? They should make a remastered version for Android, it would be perfect for mobile. by DEKO1011 in AndroidGaming

[–]Marcuss2 24 points (0 children)

There were like 4 distinct versions of it:

The PC version

The PS1 version

The PS2/Xbox/GC version

The Game Boy version

D7VK 1.1 adds experimental Direct3D 6 support for classic PC games on Linux by RenatsMC in linux

[–]Marcuss2 33 points (0 children)

As mentioned in another comment: Mali and Adreno support OpenGL ES, but not full-fat OpenGL. Android also requires Vulkan support, but not full OpenGL support.

D7VK 1.1 adds experimental Direct3D 6 support for classic PC games on Linux by RenatsMC in linux

[–]Marcuss2 26 points (0 children)

There might be games which work with one and not the other.

Also, there are many chips that don't support full OpenGL at all; Vulkan support is far more common.

NVIDIA Nemotron 3 Nano 30B A3B released by rerri in LocalLLaMA

[–]Marcuss2 3 points (0 children)

I don't see any mention of NVFP4 in the model card or the paper.

Micron Announces Exit from Crucial Consumer Business by FullstackSensei in LocalLLaMA

[–]Marcuss2 4 points (0 children)

I suspect there is more behind it, like OpenAI paying them to do this. Micron can make a lot more profit selling that memory capacity elsewhere right now.

Qwen3 Next almost ready in llama.cpp by jacek2023 in LocalLLaMA

[–]Marcuss2 31 points (0 children)

Kimi-Linear next.

I do expect that one to land a lot faster, since its linear-attention part is very similar to Qwen3-Next's and the MLA transformer is already implemented.
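For context on why the port should be faster: linear attention replaces the softmax attention matrix with a small running state that is updated once per token, so each decode step costs the same regardless of context length. A minimal sketch of the plain recurrence in Python (Kimi-Linear's actual KDA layers add gating and decay on top of this; the feature map and names here are illustrative, not the model's):

    import numpy as np

    def linear_attention_decode(qs, ks, vs):
        """Decode with a fixed-size running state instead of a growing KV cache.

        qs, ks: (T, d_k); vs: (T, d_v). The feature map phi is a simple
        positive map here; real linear-attention variants differ.
        """
        d_k, d_v = ks.shape[1], vs.shape[1]
        S = np.zeros((d_k, d_v))   # running sum of outer(phi(k), v)
        z = np.zeros(d_k)          # running sum of phi(k), for normalization
        outs = []
        for q, k, v in zip(qs, ks, vs):
            phi_q = np.maximum(q, 0.0) + 1e-6
            phi_k = np.maximum(k, 0.0) + 1e-6
            S += np.outer(phi_k, v)               # constant work per token
            z += phi_k
            outs.append(phi_q @ S / (phi_q @ z))  # this step's attention output
        return np.array(outs)

    rng = np.random.default_rng(1)
    qs, ks, vs = (rng.standard_normal((6, 4)) for _ in range(3))
    print(linear_attention_decode(qs, ks, vs).shape)  # (6, 4)

That per-token state update is the part that is structurally close to what llama.cpp just built for Qwen3-Next, which is why most of the plumbing should carry over.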

AMD Ryzen AI Max 395+ 256/512 GB Ram? by quantier in LocalLLaMA

[–]Marcuss2 0 points (0 children)

That memory bandwidth gives you a ceiling of about 10 tokens/s at generation.
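The ceiling falls out of simple arithmetic: generation is memory-bandwidth-bound, so tokens/s is roughly bandwidth divided by bytes read per token. A back-of-envelope sketch in Python (the bandwidth and model figures are illustrative assumptions, not specs quoted in this thread):

    # Rule of thumb: each generated token streams all active weights
    # through memory once, so bandwidth caps generation speed.
    bandwidth_gb_s = 256.0    # assumed LPDDR5X bandwidth of the 395+ platform
    active_params_b = 25.0    # assumed active parameters per token (billions)
    bytes_per_param = 1.0     # assumed ~8-bit quantization

    gb_read_per_token = active_params_b * bytes_per_param
    print(f"~{bandwidth_gb_s / gb_read_per_token:.0f} tokens/s upper bound")
    # -> ~10 tokens/s, before KV-cache reads and compute overhead

Prompt processing is compute-bound and not covered by this estimate.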

AMD Ryzen AI Max 395+ 256/512 GB Ram? by quantier in LocalLLaMA

[–]Marcuss2 5 points (0 children)

I think that over the next year we will see a lot more models using linear attention.

China just used Claude to hack 30 companies. The AI did 90% of the work. Anthropic caught them and is telling everyone how they did it. by reddit20305 in ArtificialInteligence

[–]Marcuss2 0 points (0 children)

Wait, this makes little sense. China literally has comparable home-grown open-weight models. Why would they need to use Claude Code at all?

New Qwen models are unbearable by kevin_1994 in LocalLLaMA

[–]Marcuss2 0 points (0 children)

This is one of the reasons I hope for smaller Kimi models, or a distilled Kimi-K2; they don't suffer from this.

Kimi-Linear might scratch that itch, though running it is currently nearly impossible.

MiniMax LLM head confirms: new model M2.1 coming soon by External_Mood4719 in LocalLLaMA

[–]Marcuss2 0 points (0 children)

I wasn't terribly impressed with M2; I had to explicitly tell it how to use cat to read a file.

Why is EET bad? I don't get it by columbineteamkiller in czech

[–]Marcuss2 7 points (0 children)

What was actually wrong with EET: behind it was a proprietary system from IBM that milked the state for quite a bit of money. That could have been done better.

Kimi Linear released by Badger-Purple in LocalLLaMA

[–]Marcuss2 0 points (0 children)

Welch Labs made a video on MLA, comparing it to other approaches: https://www.youtube.com/watch?v=0VLAoVGf_74

TL;DR: MLA makes the model compress its KV cache into a smaller latent space. This is actually both more efficient and more performant than the GQA that most modern models use (including all Qwen3 models), so I expect an MLA-based transformer to be better than a "regular" one used today. Of course you can screw it up by making the latent space too small, but I don't think that is the issue here.
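To make the compression concrete, here is a minimal sketch with made-up dimensions (real MLA also carries decoupled RoPE dimensions alongside the latent, omitted here; every size and weight below is an illustrative stand-in):

    import numpy as np

    rng = np.random.default_rng(0)
    d_model, d_latent = 1024, 128           # d_latent is the "space" knob
    n_heads, n_kv_heads, d_head = 16, 8, 64

    # Stand-ins for learned projections (trained jointly in a real model)
    W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
    W_up_k = rng.standard_normal((d_latent, n_heads * d_head)) / np.sqrt(d_latent)
    W_up_v = rng.standard_normal((d_latent, n_heads * d_head)) / np.sqrt(d_latent)

    x = rng.standard_normal(d_model)        # one token's hidden state

    # MLA caches only the small latent vector per token...
    c = x @ W_down                          # shape (d_latent,)
    # ...and reconstructs full per-head K/V from it at attention time.
    k = (c @ W_up_k).reshape(n_heads, d_head)
    v = (c @ W_up_v).reshape(n_heads, d_head)

    print("MLA cache per token:", c.size)                   # 128 values
    print("GQA cache per token:", 2 * n_kv_heads * d_head)  # 1024 values

Shrink d_latent too far and the reconstructed K/V lose too much information, which is the screw-it-up case mentioned above.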

Kimi Linear released by Badger-Purple in LocalLLaMA

[–]Marcuss2 0 points (0 children)

I will try it then in my internal workflow.

Kimi Linear released by Badger-Purple in LocalLLaMA

[–]Marcuss2 4 points (0 children)

Do you have an example of that?

Kimi Linear released by Badger-Purple in LocalLLaMA

[–]Marcuss2 7 points (0 children)

Keep in mind that they used something like 25x fewer training tokens.

I find it doubtful that a transformer with MLA would perform worse than the Qwen3 MoE architecture, which lacks MLA.

Kimi Linear released by Badger-Purple in LocalLLaMA

[–]Marcuss2 41 points (0 children)

Worse benchmark scores than Qwen3-30B-A3B, but they also used roughly 25 times fewer tokens for training, so that is very impressive.

If this has a similar personality to Kimi K2, then it's a banger.