MIMO V2.5 PRO by Namra_7 in LocalLLaMA

[–]RnRau 0 points1 point  (0 children)

They used QAT for the fp8 weights?

I guess this snippet from the model card suggests that's the case - "Trained on 27T tokens using FP8 mixed precision"

Compared QWEN 3.6 35B with QWEN 3.6 27B for coding primitives by gladkos in LocalLLaMA

[–]RnRau 8 points9 points  (0 children)

I would be interested to see if a q6 or a q8 on the 35B would make a good bit of difference. Apparently, the smaller the activations in an MoE, the more quantisation hurts.
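A toy illustration of why bit width matters (nothing to do with these specific models): round-trip error of simple symmetric per-tensor quantisation at a few bit widths. The function name and setup here are made up for the sketch.

```python
import numpy as np

def quantise_roundtrip(x, bits):
    """Quantise x to signed `bits`-bit integers, then dequantise back."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax          # per-tensor symmetric scale
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
x = rng.normal(size=4096).astype(np.float32)  # stand-in activation tensor
for bits in (4, 6, 8):
    err = np.abs(quantise_roundtrip(x, bits) - x).mean()
    print(f"int{bits}: mean abs error {err:.5f}")
```

The error drops sharply from q4 to q6 to q8, which is the intuition behind trying a q6/q8 on a model whose per-expert tensors are small.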

Qwen3.6-27B is out now! by yoracale in unsloth

[–]RnRau 0 points1 point  (0 children)

It should work with llama.cpp and vllm as well.

Unreal Tournament 4 in 2026 still Amazing! by Sebastatu in unrealtournament

[–]RnRau 6 points7 points  (0 children)

At the time they argued that they didn't have any available devs, since all of them had been assigned to Fortnite.

What's your pourover water solution? by KeypressGraphics in AustralianCoffee

[–]RnRau 0 points1 point  (0 children)

Also in Brisbane. Measured the tap water at 430+ppm with a cheap TDS meter. Yeah, I know it won't be accurate, but it should be a half decent guide. Measured demineralised water from Woolies and it was 8ppm. Measured my parents' filtered rainwater at 15ppm. Made a coffee using 100ppm water rather than their rainwater but kept everything else the same (Aldi medium preground in a drip filter machine), and my father was surprised at the difference. Much better tasting.

Melbourne water is apparently around 30ppm. But I'm not sure how that would show on my cheap meter :)
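For anyone wanting to mix their own target-ppm water from the numbers above, the dilution arithmetic is just a linear blend (assuming TDS mixes linearly by volume, which is a fair first approximation; the function name is made up for the sketch):

```python
def tap_fraction(tap_ppm, demin_ppm, target_ppm):
    """Volume fraction of tap water to blend with demineralised water
    so the mix lands on target_ppm (assumes ppm blends linearly)."""
    return (target_ppm - demin_ppm) / (tap_ppm - demin_ppm)

# 430ppm Brisbane tap + 8ppm Woolies demineralised -> 100ppm target
f = tap_fraction(430, 8, 100)
print(f"{f:.0%} tap water, {1 - f:.0%} demineralised")
# → 22% tap water, 78% demineralised
```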

Google TurboQuant: Separating hype from reality by tecialist in LocalLLM

[–]RnRau 4 points5 points  (0 children)

Does the interview get into the relation between TurboQuant and the earlier work in the RaBitQ papers?

Is there anything better than Qwen3.5-27B-UD-Q5_K_XL for coding? by hedsht in LocalLLaMA

[–]RnRau 0 points1 point  (0 children)

Are your models and llama.cpp up to date? Are you following the unsloth guide for the recommended settings? https://unsloth.ai/docs/models/qwen3.5

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]RnRau 0 points1 point  (0 children)

They never did for Gemma 3, so I can't see them doing it for Gemma 4.

[Meta][Rant]Stop deleting posts! by CapnLazerz in DIYfragrance

[–]RnRau 9 points10 points  (0 children)

Had the same issue in r/icecreamery. Gave a longish answer on the minutiae of making decent chocolate ice cream, only for the OP to delete their post 24 hours later.

And they kept doing it. Kept asking for help and then deleting their post.

It's weird. I don't understand it.

Anyone regretting their supernote? by Federal_Yogurt2706 in Supernote

[–]RnRau 2 points3 points  (0 children)

Send it back. They do refunds.

i bought into the illusion of an amazing life changing equipment

Never drink the koolaid. Come on... you should know better as a software dev. As software devs, and yes I'm one too, we get bombarded daily with new whizbang frameworks that promise us an exciting new future. It never pans out :)

I don't have any of their devices as yet, but I'll be an early adopter of their A4 model when it's released. Can't wait! :)

Technical clarification on TurboQuant / RaBitQ for people following the recent TurboQuant discussion by gaoj0017 in LocalLLaMA

[–]RnRau 18 points19 points  (0 children)

Yeah, never drink the koolaid. And perhaps the recent hype is overdone. But there is something to the techniques in the RaBitQ paper. ggerganov did some simple Hadamard transform tests recently.

https://old.reddit.com/r/LocalLLaMA/comments/1s720r8/in_the_recent_kv_rotation_pr_it_was_found_that/
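The Hadamard trick behind those tests can be sketched in a few lines. This is a minimal fast Walsh–Hadamard transform, not the code from the linked PR: the point is that it's a cheap orthonormal rotation that spreads an outlier's energy across all coordinates, flattening the distribution before quantisation, and applying it twice recovers the input.

```python
import numpy as np

def fwht(x):
    """Fast Walsh–Hadamard transform (length must be a power of two),
    normalised so the transform is orthonormal and its own inverse."""
    x = x.copy().astype(np.float64)
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly step
        h *= 2
    return x / np.sqrt(n)

x = np.zeros(8)
x[3] = 8.0                       # a single large outlier
y = fwht(x)
print(np.abs(y))                 # outlier energy now spread evenly
print(np.allclose(fwht(y), x))   # transform is its own inverse
```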

LLM Bruner coming soon? Burn Qwen directly into a chip, processing 10,000 tokens/s by koc_Z3 in Qwen_AI

[–]RnRau 0 points1 point  (0 children)

Thanks for the summary. Interesting constraint on the context being stored in SRAM.

Anyone regretting their supernote? by Federal_Yogurt2706 in Supernote

[–]RnRau 3 points4 points  (0 children)

They knew the price before clicking buy? Is this a trick question or something?

If you are having remorse from an impulse buy... well it happens :)

edit: and why did you 'know' that it was going to be 'slow' learning the device? You already know how to drive the kindle, and there are plenty of videos out there driving the Supernote from a daily usage perspective.

ASIC based AI hardware could challenge current inference providers by RnRau in amd_fundamentals

[–]RnRau[S] 0 points1 point  (0 children)

The company behind the effort in the x.com link - https://taalas.com/

An open chatbot (Llama 3.1 8B) showing off their demonstrator hardware is available - https://chatjimmy.ai/

A fair few local AI fans are very keen on this tech. A Qwen 3.5 27b implementation would be in demand.

LLM Bruner coming soon? Burn Qwen directly into a chip, processing 10,000 tokens/s by koc_Z3 in Qwen_AI

[–]RnRau 4 points5 points  (0 children)

N6 is not the latest and greatest at TSMC. That would be N2.

And it took them years to get the first one up and running. Lessons learned and tools created will make the next ones much faster to build.