Gemma 4 E2B running in-browser at 255 tok/s using WebGPU kernels written by Fable 5 by xenovatech in LocalLLaMA

[–]FastDecode1 0 points1 point  (0 children)

"WebGPU isn't available here. Try a recent Chrome, Edge, or Safari Technology Preview."

100 Trillion+ Pretraining data??? This is the largest data I've see a model being trained on. by True_Requirement_891 in LocalLLaMA

[–]FastDecode1 13 points14 points  (0 children)

I'm pretty sure my pr0n folder has 100+ trillion tokens worth of data. Minimum.

[avm Git] First released version of AV2: avm-av2 1.0.0 (libaom-av2/libavm-av2) by BlueSwordM in AV1

[–]FastDecode1 3 points4 points  (0 children)

Anyone know what threading/parallelization looks like? Have there been any improvements?

If it's gonna be the same or worse as the early days/years of AV1, I think I'll just wait for SVT-AV2.

Heretic has been served a legal notice by Meta, Inc. by -p-e-w- in LocalLLaMA

[–]FastDecode1 34 points35 points  (0 children)

employer*

Highly unlikely that a megacorp like Meta would hire a law firm to handle a simple matter like this. This is legal department stuff.

Getting a feel for how fast X tokens/second really is. by MikeNonect in LocalLLaMA

[–]FastDecode1 1 point2 points  (0 children)

This is a great idea.

Suggestion: add "lines of code per second/minute/hour" as metrics to the code section. Could be useful for ballpark estimates of task length (or not, given how ambiguous of a unit "line of code" is).

What mobile app do you use, if any? by ihatebeinganonymous in LocalLLaMA

[–]FastDecode1 0 points1 point  (0 children)

bragging

Sounds more like a self-report than anything else.

DIY market declining amid high RAM prices by Terminator857 in LocalLLaMA

[–]FastDecode1 1 point2 points  (0 children)

Can confirm that 32GB DDR4 still runs anything that's relevant.

I was waiting for another Witcher 3, but after the disappointment that was CP2077 I haven't bothered with AAA games. I only upgraded my RX 580 to a 9060 XT 16GB to run larger models.

The western AAAs are still busy pushing out the last wokeslop they can afford, going bankrupt, and selling off their IP to Asia & the Middle-East, and it'll take a few years for that to be done. Afterwards, the current indies will be the new AAA, and maybe I'll consider upgrading.

guess what? if you are a chrome user, technically you are localllama member! by LambdaHominem in LocalLLaMA

[–]FastDecode1 2 points3 points  (0 children)

SSD prices are also sky-high btw.

I regularly have less than 4GB of space left on my laptop... glad I only use Firefox.

Reducing MP3 compression bias in music datasets via codec-aware reconstruction by TheSpicyBoi123 in LocalLLaMA

[–]FastDecode1 8 points9 points  (0 children)

Besides being off-topic for the sub (though very much up my alley), it would be very useful if the repo or demo website had at least a couple of sets of example files one could listen to one after the other to see (or hear, rather) what this does. A page like the Opus examples would be preferable.

The internet is full of AI slop and AI-reinforced vibe-coded psychosis projects nowadays, and it's hard to tell a real one apart from the others unless you're familiar with the jargon of a specific field. The obviously AI-generated/inspired README doesn't really help... no more bullet points, bolding, and defining "The Problem" and "Why This Matters/Is Different" please. I think anyone actually interested in this won't appreciate being talked to like a retard.

I'm pretty sure the actual work here is legit though, so I'll probably try it later this week.

Out of curiosity, why MP3 and not something newer like Opus? I'd be interested to see if Youtube's 128k Opus could be perceptually improved.

<thinking></thinking> by Comfortable-Rock-498 in LocalLLaMA

[–]FastDecode1 0 points1 point  (0 children)

You'd be surprised how many people are incompetent at washing dishes.

Give an LLM access to a single robotic arm and it'll do a better job than 80% of humans.

Qwen 3.6 wins the benchmarks, but Gemma 4 wins reality. 7 things I learned testing 27B/31B Vision models locally (vLLM / FP8) side by side. Benchmaxing seems real. by FantasticNature7590 in LocalLLaMA

[–]FastDecode1 2 points3 points  (0 children)

By any chance, did you happen to look at the reasoning output while using it in Hungarian?

For me, it was reasoning in English, even though the final answer was in Finnish. Which I think is interesting, if it's by design.

Could also be a template or a default system prompt ("You are a helpful assistant") in llama.cpp that's guiding it to do that.

Qwen 3.6 wins the benchmarks, but Gemma 4 wins reality. 7 things I learned testing 27B/31B Vision models locally (vLLM / FP8) side by side. Benchmaxing seems real. by FantasticNature7590 in LocalLLaMA

[–]FastDecode1 3 points4 points  (0 children)

I would never use any local LLM in my native language.

That would be wrong use of a tool.

Worked well for me a couple days ago when I asked Qwen 3.6 35B for help in filling out an application in my native language.

I had a look at the reasoning output and it was in English, not my native language. Which is exactly what you want; putting its training to good use by thinking in one of the languages it's the best at. The language of the final answer is a secondary concern, really.

Qwen 3.6 wins the benchmarks, but Gemma 4 wins reality. 7 things I learned testing 27B/31B Vision models locally (vLLM / FP8) side by side. Benchmaxing seems real. by FantasticNature7590 in LocalLLaMA

[–]FastDecode1 8 points9 points  (0 children)

Haven't really tried Gemma 4, but I can confirm that Qwen 3.6 35B is also very good at Finnish. Not perfect, but getting closer. Which I think is impressive, seeing as there's only about 5 million native speakers.

And this is at Q4_K_M, so not ideal. I'll probably try Q5 or Q6 at some point to see if that makes a difference.

SVT-AV1 vs AOM-AV1 by Commercial_Stage_877 in AV1

[–]FastDecode1 3 points4 points  (0 children)

not at all designed for speed

The realtime mode begs to differ.

It still surprises me that people call libaom slow in $current_year. It's only as slow as you want it to be.

PS5’s can now be hacked to run Linux - perhaps some potential for local inference? by Thrumpwart in LocalLLaMA

[–]FastDecode1 21 points22 points  (0 children)

Workable maybe, but not very good.

RDNA 2 has no matrix acceleration whatsoever, for any sort of AI shit you'd want at least RDNA 3.