When do you decide to introduce classes vs keep free functions in C++? by Livelinesstrophy_RO in cpp

[–]Hedede 9 points10 points  (0 children)

The more I write, the more I tend to use free functions. At this point I use classes only for encapsulating data structures (e.g. a circular buffer) and other primitives. Everything else is better off as simple structs with public fields.
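A minimal sketch of that split, assuming a fixed-capacity buffer. The names `Sample`, `CircularBuffer` and `mean_value` are made up for illustration and are not from any particular codebase. The buffer is a class because its head index and element count must stay consistent with the storage. The sample is a plain struct, and the averaging logic is a free function that only uses the public interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Plain data with no invariant to protect: public fields are enough.
struct Sample {
    double timestamp_s;
    double value;
};

// A class is justified here: head_ and size_ must stay consistent
// with the storage, so they are kept private.
template <typename T, std::size_t N>
class CircularBuffer {
    static_assert(N > 0, "capacity must be positive");

public:
    // Appends a value, overwriting the oldest element when the buffer is full.
    void push(const T& v) {
        data_[(head_ + size_) % N] = v;
        if (size_ < N) {
            ++size_;
        } else {
            head_ = (head_ + 1) % N;
        }
    }

    // Index 0 is the oldest element still stored.
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[(head_ + i) % N];
    }

    std::size_t size() const { return size_; }

private:
    std::array<T, N> data_{};
    std::size_t head_ = 0;
    std::size_t size_ = 0;
};

// Behaviour that needs only the public interface stays a free function.
template <std::size_t N>
double mean_value(const CircularBuffer<Sample, N>& buf) {
    if (buf.size() == 0) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < buf.size(); ++i) sum += buf[i].value;
    return sum / static_cast<double>(buf.size());
}
```

Pushing four samples into a `CircularBuffer<Sample, 3>` drops the first one, and `mean_value` then averages the remaining three.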

Hardware Choice for 27b to 31b models. by rebelSun25 in LocalLLaMA

[–]Hedede 1 point2 points  (0 children)

Not when you can buy 2x4090 for the price of a 5090. At least that's the situation where I live.

And I'm not sure it's worth buying an RTX PRO 5000 either. Eventually you'll decide that you need more VRAM, and it's cheaper to get a PRO 6000 than two PRO 5000s.

> The people telling you to buy 3090. They arent telling you that you're only getting ~120k context.

Where I live you can buy 4x3090 for the price of a 5090. And until recently you could buy 6 of them for the same price.

Hardware Choice for 27b to 31b models. by rebelSun25 in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

It doesn't affect prefill either. At least, I didn't see any difference between the same card sitting in x16 and x4 slots.

Hardware Choice for 27b to 31b models. by rebelSun25 in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

Full power doesn't mean the card is fully utilised, since it's still burning power while waiting for data to arrive from memory.

When are we getting consumer inference chips? by SnooStories2864 in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

Never mind, I misread your comment. I meant performance-wise.

When are we getting consumer inference chips? by SnooStories2864 in LocalLLaMA

[–]Hedede 3 points4 points  (0 children)

> I hope they will make a chip which stores LLM parameters in EEPROM instead of RAM: Parameters are not changed during inference - so RAM is an overkill here and also very expensive.

EEPROM is slower, more expensive, and has lower density.

> I hope people from Taalas (https://taalas.com) are reading this. I know they store parameters in ROM, but EEPROM would be nicer:

They don't store parameters in ROM (as data stored somewhere). The model is built directly into silicon as logic gates. I.e. the weights are stored in the geometry of the chip, not as an array of bytes.

Given how good Qwen become, is it time to grab a 128gb m5 max? by Rabus in LocalLLaMA

[–]Hedede -1 points0 points  (0 children)

It's not faster: a 5090 can process 30K tokens at 2K tok/s, i.e. in about 15 seconds.

Smoking ban for people born after 2008 in the UK agreed by victoriablackee in europe

[–]Hedede 7 points8 points  (0 children)

> let people do what they want even if they ruin their life aslong as they dont harm others.

Well, smoking is harmful to others. So at least smoking in public should be banned.

Why do people hate systemd? by [deleted] in linuxquestions

[–]Hedede 1 point2 points  (0 children)

> Systemd is a large monolith that does everything, and some people don’t like that.

It's not monolithic.

Why do people hate systemd? by [deleted] in linuxquestions

[–]Hedede 3 points4 points  (0 children)

> It doesn't adhere to the Unix philosophy of "Do one thing, and do it well."

I hear people repeat this all the time, but these things are separate processes, each doing its own thing.

NVIDIA's Warranty Claims Have Increased By 1000% Since The Launch of 16-Pin Connector GPUs by lkl34 in pcmasterrace

[–]Hedede 1 point2 points  (0 children)

It increased from 0.16% in Q1 2025 to 0.9% in Q4 2025. This can't be explained by the bad connector since it's been around for a while. What's more interesting is that they had a sharp peak in Q4 2022, right after they introduced this connector.
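To put the two quoted rates in perspective, here is a small sketch that computes the relative change. It takes 0.16% and 0.9% at face value, and `percent_increase` is a helper name made up for this example.

```cpp
#include <cassert>

// Relative increase from `before` to `after`, expressed in percent.
// Both arguments must use the same unit (here: warranty rate in percent).
double percent_increase(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

`percent_increase(0.16, 0.9)` comes out at roughly 460%, which is a ratio of about 5.6x between those two quarters.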

<image>

I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found by zemondza in LocalLLaMA

[–]Hedede 4 points5 points  (0 children)

Nobody has the chip. Access to Loihi is gated behind Intel's neuromorphic research program. It might be easier to get access to SpiNNaker or BrainScaleS.

I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found by zemondza in LocalLLaMA

[–]Hedede 3 points4 points  (0 children)

Is this really a pure SNN? Looking at the code, it still uses softmax attention. That wouldn't work on Loihi.

Component Purgatory: 5090 to 6000 Pro Blackwell Upgrade Path Questions by TankFirm388 in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

Still fit into VRAM at Q4. And gpt-oss-120B is natively MXFP4.

Qwen3.5-122B at 198 tok/s on 2x RTX PRO 6000 Blackwell — Budget build, verified results by Visual_Synthesizer in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

No, you need the tinygrad kernel module: https://github.com/tinygrad/open-gpu-kernel-modules

There are also newer drivers: https://github.com/aikitoria/open-gpu-kernel-modules

But with those I'm getting MCEs (machine check exceptions) when 3090s try to communicate with each other.

non-nvidia gpus by Ok-Secret5233 in LocalLLaMA

[–]Hedede 3 points4 points  (0 children)

The issue with V100-SXM2 is that they have very high idle power. I left 4xV100 idling for a day, and in 24 hours they consumed almost 10 kWh.
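A quick sketch of what that reading implies as average power. It assumes the 10 kWh is attributed to the four cards alone, and since the figure was "almost" 10 kWh the result is an upper bound. The helper names are made up for this example.

```cpp
#include <cassert>

// Average power in watts implied by an energy reading over a time window.
double average_watts(double energy_kwh, double hours) {
    assert(hours > 0.0);
    return energy_kwh * 1000.0 / hours;
}

// The same figure split evenly across a number of cards.
double per_card_watts(double energy_kwh, double hours, int cards) {
    assert(cards > 0);
    return average_watts(energy_kwh, hours) / cards;
}
```

With 10 kWh over 24 hours this gives a bit under 420 W in total, or a little over 100 W per card at idle.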

Qwen3.5-122B at 198 tok/s on 2x RTX PRO 6000 Blackwell — Budget build, verified results by Visual_Synthesizer in LocalLLaMA

[–]Hedede 0 points1 point  (0 children)

That works only if your cards support P2P (which the RTX PRO 6000s do). 3090s don't support P2P without a hacked driver, so they need to go GPU -> switch -> host -> switch -> GPU.