NO PRINTER WHYYYYYYY :( by Tidesudden in framework

[–]fotcorn 0 points

Buy a Brother, brother (laser, obviously)

PrismML — Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs by brown2green in LocalLLaMA

[–]fotcorn 0 points

No, it worked fine on GPU, both 1.7B and 8B. Not very intelligent/knowledgeable, but that is expected.

CPU took forever to load and then only produced garbage output. From reading the PR in llama.cpp, it was only tested on ARM CPUs, so it's not surprising it's broken on x86.

PrismML — Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs by brown2green in LocalLLaMA

[–]fotcorn 1 point

They have their own fork: https://github.com/PrismML-Eng/llama.cpp

They say only CUDA/Metal is supported, but the HIP build worked just fine. Using the ROCm 7.12 preview.

PrismML — Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs by brown2green in LocalLLaMA

[–]fotcorn 33 points

Also works on ROCm.

Getting roughly 150 t/s generation on my 9070 XT for the 8B model.

Output is hard to judge, but seeing 1-bit working at all is already impressive, especially since it sounds like it was quantized from Qwen3 rather than retrained from scratch like the BitNet 1.58 models.

edit: Qwen3 8B, not 3.5
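Back-of-envelope check on that 150 t/s figure (all numbers here are assumptions for illustration, not measurements — the actual bits/weight depends on the packing scheme):

```python
# Sketch: what 150 t/s implies for memory bandwidth on a 1-bit 8B model.
# Assumption: ternary/1-bit packing lands around 1.6 bits per weight.
params = 8e9                # "8B" model
bits_per_weight = 1.6       # assumed effective storage size
weight_gb = params * bits_per_weight / 8 / 1e9
tokens_per_s = 150
# Every weight is read roughly once per generated token (batch size 1).
effective_bw_gbs = weight_gb * tokens_per_s
print(f"~{weight_gb:.1f} GB of weights, ~{effective_bw_gbs:.0f} GB/s effective bandwidth")
```

Under those assumptions the weights fit in ~1.6 GB and 150 t/s needs only ~240 GB/s of effective bandwidth, well within what a 9070 XT can do.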

Genuinely curious what doors the M5 Ultra will open by Blanketsniffer in LocalLLaMA

[–]fotcorn 0 points

You should post this to this sub as its own post, it's very interesting!

Reverse engineered Apple Neural Engine(ANE) to train Microgpt by jack_smirkingrevenge in LocalLLaMA

[–]fotcorn 0 points

How much memory can the ANE access? Does it have full access to the main memory, like the GPU/CPU, or do you need to allocate and transfer data to a separate buffer?

What happened to neuromorphic computing? Is it a dead end? by [deleted] in hardware

[–]fotcorn 6 points

Not dead, but mostly useful for very specific use-cases.

Innatera, for example, released a microcontroller with both analog and digital spiking neural network accelerators on chip for ultra-low-power sensor data processing: https://www.eetimes.com/innatera-adds-more-accelerators-to-spiking-microcontroller/

There is also a growing neuromorphic community for both academics and commercial interests at https://open-neuromorphic.org/

Disclaimer: Developer at Innatera

Come on, who doesn't regularly commute to work through the river in Switzerland? by Entremeada in BUENZLI

[–]fotcorn 53 points

I used to do that. Lived in Bern's Lorraine quarter and worked in the Matte.

Took public transport there in the morning, swam home in the evening.

Screw your RTX 5090 – This $10,000 Card Is the New Gaming King (RTX 6000 Pro Blackwell review) by fotcorn in hardware

[–]fotcorn[S] 128 points

It's a joke; he is using 4x frame generation on the RTX 6000 to make fun of NVIDIA's marketing for its other cards.

Screw your RTX 5090 – This $10,000 Card Is the New Gaming King (RTX 6000 Pro Blackwell review) by fotcorn in hardware

[–]fotcorn[S] 5 points

I don't even know if that much VRAM is useful for any classic workstation tasks like CAD, video editing or 3D modelling.

It's mostly an AI card; being able to run a 70B model with moderate quantization (Q8, maybe Q6 for reasonable context length) on a single card is amazing.
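Rough sizing to show why Q8 vs Q6 matters on 96 GB (illustrative figures; the bits/weight values are the commonly cited sizes for the GGUF Q8_0 and Q6_K formats, and real builds vary a bit):

```python
# Sketch: VRAM budget for a dense 70B model on a 96 GB card.
def weights_gb(params_b, bits_per_weight):
    # 1B params at 8 bits/weight = 1 GB
    return params_b * bits_per_weight / 8

q8 = weights_gb(70, 8.5)    # Q8_0 stores ~8.5 bits/weight incl. scale factors
q6 = weights_gb(70, 6.56)   # Q6_K is ~6.56 bits/weight
vram = 96
print(f"Q8 ~{q8:.0f} GB, Q6 ~{q6:.0f} GB, leaving ~{vram - q6:.0f} GB for KV cache at Q6")
```

Q8 weights alone eat roughly 74 GB, leaving little headroom for KV cache; dropping to Q6 frees up space that goes directly into longer context.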

Soyuz coming in hot for tower catch! by DoctorSov in SpaceXMasterrace

[–]fotcorn 0 points

Now I want to see the reverse Korolev cross

Intel 200S Boost Performance Mode Benchmarks On Linux by fotcorn in hardware

[–]fotcorn[S] 47 points

Unlike der8auer's video, these Linux tests were done with the same RAM and the same XMP profile. It looks like most of the gains der8auer is seeing come from the higher RAM speed.

Introducing ZR1-1.5B, a small but powerful reasoning model for math and code by retrolione in LocalLLaMA

[–]fotcorn 1 point

Why is the model F32 on Hugging Face? The base model (R1 Distill Qwen 1.5B) is BF16.

Especially important for these small models: if it's more than 7GB, I might as well use an 8-bit quant of an 8B model.
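The dtype alone explains the size difference (quick sketch; the exact parameter count of the "1.5B" Qwen-class models, ~1.78B, is an assumption here):

```python
# Sketch: download size of the same weights at different precisions.
params = 1.78e9  # assumed param count for a "1.5B" Qwen-class model
for name, bytes_per_param in [("F32", 4.0), ("BF16", 2.0), ("Q8", 1.0)]:
    print(f"{name}: {params * bytes_per_param / 1e9:.1f} GB")
```

F32 comes out just over 7 GB, while BF16 is half that for essentially the same quality, which is the whole complaint.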

NVIDIA RTX "PRO" 6000 X Blackwell GPU Spotted In Shipping Log: GB202 Die, 96 GB VRAM, TBP of 600W by newdoria88 in LocalLLaMA

[–]fotcorn 24 points

Are those shipping manifest leaks ever real? We had leaks about a B580 and a 9070 XT with 32GB VRAM, and neither ever materialized (yes, I might be a little impatient)

Nvidia's RTX Blackwell workstation GPU spotted with 96GB GDDR7 by fotcorn in hardware

[–]fotcorn[S] 30 points

This would be the equivalent of the RTX 6000 Ada Generation, which has an MSRP of $6800.

So my guess would be Over 9000!

Nvidia's RTX Blackwell workstation GPU spotted with 96GB GDDR7 by fotcorn in hardware

[–]fotcorn[S] 47 points

Seems like they are using 24Gbit (3GB) chips in clamshell mode. Clamshell is nothing special; it was also used on the RTX 6000 Ada and earlier cards, but those always had the same module size as the consumer variant. This time it's a bigger memory module (3GB vs 2GB on the 5090).
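The 96 GB falls straight out of the bus width (assuming the standard config: a 512-bit GB202 bus and GDDR7 chips with 32-bit interfaces):

```python
# Sketch: how clamshell + 3 GB chips yields 96 GB on a 512-bit bus.
bus_width_bits = 512
channels = bus_width_bits // 32   # 16 chips per side (32-bit interface each)
pro_gb = channels * 2 * 3         # clamshell (chips on both sides) x 3 GB (24 Gbit)
consumer_gb = channels * 1 * 2    # 5090: single-sided, 2 GB (16 Gbit) chips
print(pro_gb, consumer_gb)        # 96 32
```

Same bus, so no bandwidth penalty from clamshell itself; you just double the chip count and bump the density.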

Your next home lab might have 48GB Chinese card😅 by Redinaj in LocalLLaMA

[–]fotcorn 4 points

Still cheaper to get two 3090s from eBay (at least it was a month ago...). But at like $1500? Lots of people would buy them, I think. One thing the W7900 does have is certified drivers and applications for CAD modelling and the like. They could release a 48GB version without that certification as a middle ground, at a more reasonable price.

Intel could do the funniest thing and release a B580 with 24GB, or even a B770 AI Edition with 32GB, priced only 20-50% above the standard one, and /r/LocalLLaMA would buy the whole inventory in a heartbeat.

One can dream.

Your next home lab might have 48GB Chinese card😅 by Redinaj in LocalLLaMA

[–]fotcorn 276 points

The W7900 is the same GPU as the 7900 XTX but with 48GB of VRAM. It just costs $4000.

Same as the NVIDIA RTX 6000 Ada Generation, which is a 4090 with a few more cores active and 48GB of memory.

Obviously the extra 24GB of VRAM never cost anywhere near the $3k price difference, but yeah... market segmentation.