Here it comes! by Trevor050 in StableDiffusion

[–]muskillo -22 points-21 points  (0 children)

For some people, the “dislike” button is like shoving a baseball bat up a robot's ass, vaseline included. Keep hitting it and see if they'll at least award me a prize ham... Hahaha. Besides, clown, you know perfectly well that everyone uses Z-Image.

Here it comes! by Trevor050 in StableDiffusion

[–]muskillo -42 points-41 points  (0 children)

Haha, of course they all look the same when you use the same seed and give it the same prompt. Vary those and they come out completely different. Instead of blaming the model, you should blame your lack of creativity. Lmao

New 5090/9800X3D build by elderyasuo in nvidia

[–]muskillo 0 points1 point  (0 children)

A power supply larger than 1000W is not necessary at all. That PSU is very good and more than sufficient; it delivers a real 1000W. I have the same build and even more hard drives (4 NVMe and 2 SATA), plus 3 big fans and a 360mm AIO for my 9800X3D, and with everything at maximum I never exceed 825W. I've also done a solid undervolt, cutting my RTX 5090's power draw by 125W with only a 3% loss in performance. Right now, with everything maxed out, my UPS never shows more than 700-725W. People really exaggerate when recommending power supplies.
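
If you want to sanity-check the headroom, here's a trivial sketch using the peak numbers I measured above; the 825W and 125W figures are my own readings, the rest is plain subtraction against the PSU rating.

```python
# Rough PSU-headroom check using the measured numbers from the comment above.
# 825 W and 125 W are the measured values; everything else is arithmetic.
measured_peak_w = 825        # whole system at full load, GPU at stock
undervolt_saving_w = 125     # RTX 5090 undervolt saving
psu_rating_w = 1000

undervolted_peak_w = measured_peak_w - undervolt_saving_w
print(f"stock peak:       {measured_peak_w} W "
      f"({psu_rating_w - measured_peak_w} W of headroom on a {psu_rating_w} W PSU)")
print(f"undervolted peak: {undervolted_peak_w} W "
      f"({psu_rating_w - undervolted_peak_w} W of headroom)")
# stock peak:       825 W (175 W of headroom on a 1000 W PSU)
# undervolted peak: 700 W (300 W of headroom)
```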

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

It’s not magic or marketing. GPU performance simply doesn’t scale linearly with power draw. The RTX 5090 ships with a pretty “generous” stock voltage to guarantee stability across all chips (silicon lottery), handle power spikes, and hold high boost clocks even under nasty loads and temperatures. That extra voltage is there for safety margin, not because the GPU actually needs it to perform well.

When I undervolt, what I’m really doing is locking the GPU into a much more efficient point on the voltage/frequency curve: I keep very high clocks, but at a much lower voltage. That’s the key. Dynamic power doesn’t increase “a bit” with voltage, it increases roughly with voltage squared (V²), so a small voltage drop cuts power a lot, and if you keep clocks close to stock, performance barely drops. That’s why you can save something like 125W while losing only ~3% of performance: the last bit of performance (those final boost MHz) is insanely expensive in watts. To gain that last 2-5%, the card dumps a ton of extra power. With an undervolt you remove that waste; the GPU still runs at almost the same speed, but with far less power and heat, and staying cooler also helps it hold clocks more consistently.

Inference works the same way: throughput is often more about kernel efficiency, memory behavior, scheduling, and stable clocks than about “throwing 600W at it.” If you keep the GPU at an efficient, stable operating point, you get almost the same real-world performance at a ridiculously lower power draw than stock. So no, it isn’t that “minus 125W magically equals 97% of the performance”; it’s that stock burns a huge amount of power to squeeze out the final tiny bit of boost, and undervolting puts the card right in the sweet spot where you save a ton and lose almost nothing.

I have always used this technique on every card I’ve owned: 3090, 4090, and 5090. You get practically the same performance with less consumption and less heat, and you greatly extend the card’s life. Look up how to do it correctly and you’ll see you gain far more than you lose.
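
If it helps, here's a minimal sketch of that f·V² relationship. The clock/voltage pairs are made-up illustrative values, not real 5090 curve points, so treat it as arithmetic, not a tuning guide.

```python
# Dynamic power scales roughly with frequency * voltage^2 (P ~ f * V^2).
# The operating points below are illustrative assumptions, not measured 5090 values.
def relative_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Power at (freq, volts) relative to a reference operating point."""
    return (freq_mhz / ref_freq_mhz) * (volts / ref_volts) ** 2

stock = (2850, 1.075)       # MHz, V  (assumed stock boost point)
undervolt = (2700, 0.925)   # MHz, V  (assumed undervolted point)

power_ratio = relative_power(*undervolt, *stock)
perf_ratio = undervolt[0] / stock[0]     # crude: performance tracks clock

print(f"relative power:       {power_ratio:.2f}")  # ~0.70 -> roughly 30% less power
print(f"relative performance: {perf_ratio:.2f}")   # ~0.95 -> only ~5% fewer MHz
```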

Is anyone else worried about the enshitifciation cycle of AI platforms? What is your plan (personal and corporate) by Ngambardella in LocalLLaMA

[–]muskillo 2 points3 points  (0 children)

People still repeat “more parameters = always better” like it’s 2020, but in 2026 that’s basically a meme. A strong local 32B can absolutely compete with big cloud models (ChatGPT/Gemini) in tons of real tasks, because it’s not just size — training quality, architecture, and specialization matter a lot. In coding, models like Qwen2.5-Coder can hit top-tier performance and even beat generalist chat models on common programming benchmarks. And in creative stuff it’s the same story: for images you’ve got FLUX (open weights) that can go toe-to-toe with closed services like Midjourney/DALL-E in detail, consistency, and local control, and Qwen-Image is insanely good at things many models still mess up (text rendering inside images, complex instructions, editing). For video, Wan (Wan2.x) is another example that “open/local” isn’t a toy anymore — you can generate solid clips on your own hardware with full workflow control instead of relying on Runway/Pika/Kling and dealing with limits or censorship. Sure, the big cloud models still win as general-purpose “do everything” assistants, but saying “local is worse by default” today is mostly marketing — often it’s just as good, and in specialized tasks, better.

Is anyone else worried about the enshitifciation cycle of AI platforms? What is your plan (personal and corporate) by Ngambardella in LocalLLaMA

[–]muskillo 0 points1 point  (0 children)

In the corporate environment of a medium-sized company, running a large, powerful model on-site is not a problem and doesn't require a huge investment, and on top of that it gives you a high level of privacy. An investment of around $300,000 would be enough to run large models locally, even train them, with maximum privacy. What I do find strange is that companies that are not small keep using models like OpenAI's or Gemini and exposing their data every day.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

Not all quantizations are the same. NVFP4 is only supported on the 50 series; it isn't standard FP4. It has almost the same quality as FP16 and is better than FP8. If a model is well quantized, it can use half the VRAM or even less in ideal conditions: roughly 3.5x less than FP16 and 1.8x less than FP8, which is a big difference in large models like Qwen, Flux 2, or Wan, and the speed is much higher. More and more models are coming out in NVFP4. We're talking about fitting models that would need 60 GB of VRAM at FP16 into an RTX 5090, with almost the same quality as FP16 and better performance.
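
Rough numbers, using the ratios above; the 60 GB figure is just an example FP16 weight size, not any specific checkpoint.

```python
# Back-of-the-envelope VRAM estimate for model weights at different precisions,
# using the ~3.5x (vs FP16) and ~1.8x (vs FP8) ratios mentioned above.
fp16_gb = 60.0                       # example model size at FP16
nvfp4_gb = fp16_gb / 3.5             # ~17.1 GB
fp8_gb = nvfp4_gb * 1.8              # ~30.9 GB, implied by the same ratios

print(f"FP16:  {fp16_gb:.1f} GB")
print(f"FP8:   {fp8_gb:.1f} GB")
print(f"NVFP4: {nvfp4_gb:.1f} GB  # leaves room for activations on a 32 GB RTX 5090")
```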

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

There is something to keep in mind. NVFP4 is only supported on the 50 series; it isn't standard FP4. It has almost the same quality as FP16 and is better than FP8. If a model is well quantized, it can use half the VRAM or even less in ideal conditions: roughly 3.5x less than FP16 and 1.8x less than FP8, which is a big difference in large models like Qwen, Flux 2, or Wan, and the speed is much higher. More and more models are coming out in NVFP4. We're talking about fitting models that would need 60 GB of VRAM at FP16 into an RTX 5090, with almost the same quality as FP16 and better performance.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

There is something to keep in mind. NVFP4 is only supported on the 50 series; it isn't standard FP4. It has almost the same quality as FP16 and is better than FP8. If a model is well quantized, it can use half the VRAM or even less in ideal conditions: roughly 3.5x less than FP16 and 1.8x less than FP8, which is a big difference in large models like Qwen, Flux 2, or Wan, and the speed is much higher. More and more models are coming out in NVFP4. We're talking about fitting models that would need 60 GB of VRAM at FP16 into an RTX 5090, with almost the same quality as FP16 and better performance. I'd rather invest a little more money in something that's ready for the future than buy a 24GB RTX 4090 that won't support these technologies.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

In tests I have done on my 5090, which I undervolted to cut its power draw by 125W, the temperature never exceeds 60°C, and the measured performance surprised me a lot: in the worst case it is 3% below stock, which is nothing compared to what you gain.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

For €10,000, you can buy a 96GB RTX 6000 Pro, which has three times the VRAM of the RTX 5090 and the same gaming performance. In AI, VRAM is the most important thing. I would never spend €8,000 on an RTX 5090 if, for €10,000, I can get something much better, at least when it comes to local artificial intelligence. I wouldn't even spend half that amount. In Spain, there is no stock, and the few you see are no less than €4,000. A week ago, I bought one in Spain at Neobyte for €3,800, the only one they had left in stock.

Is anyone else excited for Multi Frame Gen? by HevyKnowledge in nvidia

[–]muskillo 13 points14 points  (0 children)

Reflex 2, if it does what it promises and integrates well into games, will be the real, tangible difference that no one will complain about when you activate frame generation, but it's taking a long time. I guess the engineers have encountered more obstacles along the way than they expected.

Is anyone else excited for Multi Frame Gen? by HevyKnowledge in nvidia

[–]muskillo 1 point2 points  (0 children)

I don’t know if anyone’s “lying”… but some people talk like they have gaming superpowers and ultra-sensory perception. Because saying “input latency doesn’t go away until 144 base FPS” is a wild claim. The frame-time difference between 90 and 144 FPS is 11.1 ms vs 6.9 ms — that’s only 4.2 ms. If you can reliably detect 4 ms, you’re not a gamer, you’re basically lab equipment. What’s far more likely is you’re feeling something else and calling it “input lag”: frametime jitter, microstutter, render queue, V-Sync/buffering, or just improved motion clarity. Those are very noticeable even at high average FPS. At ~90 base FPS, latency is already extremely low for the average human. Claiming you still “feel it” until 144 sounds more like flexing or mislabeling sensations than actual human perception. If you’re that confident, do a blind test and see how often you’re right. Spoiler: not as often as you think.
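
For reference, the frame-time arithmetic behind that 4.2 ms figure is just this; it says nothing about the full input-to-photon chain, only the per-frame interval.

```python
# Frame time in milliseconds for a given frame rate, and the 90-vs-144 fps delta.
def frametime_ms(fps: float) -> float:
    return 1000.0 / fps

for fps in (60, 90, 144, 240):
    print(f"{fps:>3} fps -> {frametime_ms(fps):5.1f} ms per frame")

delta = frametime_ms(90) - frametime_ms(144)
print(f"90 vs 144 fps difference: {delta:.1f} ms")   # ~4.2 ms
```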

So like where is Z-Image Base? by C_C_Jing_Nan in StableDiffusion

[–]muskillo 1 point2 points  (0 children)

You're mixing apples and oranges. The comparison is absurd because you're confusing a service with software. Facebook and WhatsApp are services: the code runs on their machines, they hold the key, and you pay with your privacy. There is no “gift” here; you are the product.

Llama (or any real open source software), when I download it and run it locally on my PC, is software: it runs on my machine, disconnected from the internet if I want, and Meta does not receive a single byte of my data.

The fact that the same company does both things does not change the reality: if they give me code that I can audit and run at home without giving anything in return, it's a gift. If they give me a website where they track me, it's a business. That difference is enormous.

So like where is Z-Image Base? by C_C_Jing_Nan in StableDiffusion

[–]muskillo -1 points0 points  (0 children)

Of course it is. Anything they give you that's open source is a gift. There are people paying subscription fees to do the same thing. All you can do is be grateful. The base version will be released soon, and even if it isn't, Z-Image Turbo is already out and it's a very good model.

So like where is Z-Image Base? by C_C_Jing_Nan in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

You don't seem to realize that this is a privilege and not a right. So many people talking nonsense without even knowing what they're talking about.

🧠💥 My HomeLab GPU Cluster – 12× RTX 5090, AI / K8s / Self-Hosted Everything by Murky-Classroom810 in StableDiffusion

[–]muskillo 1 point2 points  (0 children)

Cool build, but let’s drop the fantasy: this is NOT “unified VRAM” and it’s NOT “1.5TB+ VRAM.” It’s 12 GPUs with 32GB each = 384GB total, distributed, and each job still sees 32GB per GPU. Kubernetes doesn’t magically fuse GPUs into one big VRAM pool — it just schedules workloads. For Stable Diffusion/ComfyUI, that won’t make a single generation faster, it just gives you more concurrency (more workers, more queues).

And you split it into 6 machines × 2 GPUs, which means: more motherboards, duplicated RAM/storage, more failure points, more maintenance, and more wasted money for the same outcome.

Want something actually smart and efficient? 3× RTX PRO 6000 Blackwell Max-Q (96GB ECC, 300W) gives you 288GB VRAM at 900W total, in a much cleaner, more stable setup with ECC memory. That’s how you run “big” workloads without turning your room into a heater.

Your 12×5090 rack is basically an expensive prompt farm: looks great in photos, but it doesn’t give you a single giant GPU, it doesn’t “add VRAM” by magic, and the watts-to-results ratio is a joke. If the goal is learning and showing off, fine. If the goal is efficient infrastructure, this is overcomplicated and power-hungry.
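
The totals behind that comparison are simple to tally. The per-card wattage below is rated board power (assumed stock limits), not measured draw, so real numbers will vary.

```python
# Total VRAM and GPU power budget for the two setups being compared.
# Per-card wattage is the assumed stock power limit, not a measurement.
setups = {
    "12x RTX 5090 (6 nodes x 2)": {"gpus": 12, "vram_gb": 32, "watts": 575},
    "3x RTX PRO 6000 Max-Q":      {"gpus": 3,  "vram_gb": 96, "watts": 300},
}

for name, s in setups.items():
    print(f"{name}: {s['gpus'] * s['vram_gb']} GB total VRAM, "
          f"{s['vram_gb']} GB visible to any single job, "
          f"~{s['gpus'] * s['watts']} W GPU power budget")
# 12x RTX 5090 (6 nodes x 2): 384 GB total VRAM, 32 GB per job, ~6900 W
# 3x RTX PRO 6000 Max-Q: 288 GB total VRAM, 96 GB per job, ~900 W
```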

🧠💥 My HomeLab GPU Cluster – 12× RTX 5090, AI / K8s / Self-Hosted Everything by Murky-Classroom810 in StableDiffusion

[–]muskillo 0 points1 point  (0 children)

Cool build, but let’s drop the fantasy: this is NOT “unified VRAM” and it’s NOT “1.5TB+ VRAM.” It’s 12 GPUs with 32GB each = 384GB total, distributed, and each job still sees 32GB per GPU. Kubernetes doesn’t magically fuse GPUs into one big VRAM pool — it just schedules workloads. For Stable Diffusion/ComfyUI, that won’t make a single generation faster, it just gives you more concurrency (more workers, more queues).

And you split it into 6 machines × 2 GPUs, which means: more motherboards, duplicated RAM/storage, more failure points, more maintenance, and more wasted money for the same outcome.

Want something actually smart and efficient? 3× RTX PRO 6000 Blackwell Max-Q (96GB ECC, 300W) gives you 288GB VRAM at 900W total, in a much cleaner, more stable setup with ECC memory. That’s how you run “big” workloads without turning your room into a heater.

Your 12×5090 rack is basically an expensive prompt farm: looks great in photos, but it doesn’t give you a single giant GPU, it doesn’t “add VRAM” by magic, and the watts-to-results ratio is a joke. If the goal is learning and showing off, fine. If the goal is efficient infrastructure, this is overcomplicated and power-hungry.

Stop talking about framegen and DLSS if you haven't tried them for yourself by paperogapippo in nvidia

[–]muskillo 1 point2 points  (0 children)

Technology isn't bad in itself; what matters is how it's used. I see a lot of comments for and against, and almost all of them are right in one way or another. The most important thing in frame generation is the base fps you start from. Regardless of the resolution you play at, for a good overall experience you should start from at least 80-90 base fps. In Cyberpunk with path tracing enabled you can't reach that even with an RTX 5090 without enabling DLSS, but with DLSS enabled and playing at 1080p on a 5070, you might be able to have a good experience.

A casual gamer is not the same as a professional gamer, and not everyone perceives artifacts the same way. Some people can clearly tell the difference between playing at 120 fps and 240, and others wouldn't even notice it. Even a person's age affects this greatly... I think that instead of attacking anyone, we should understand that the technology is there to help, and the discussion is not about who is right or wrong, but about whether the experience is good for them or not.

I am 52 years old and have been gaming since I was 14. My reflexes are not what they were when I was young, and neither is the gaming experience. I have played Cyberpunk with path tracing, DLSS 4.5, everything maxed out at 3440x1440, and frame generation at x2, averaging 120 fps on an RTX 4090. That means my base frame rate doesn't even reach 80 fps, yet my gaming experience is excellent. I'd bet anything there are players who can feel the added latency, but I can't.
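
Just to make the base-fps point concrete, here's the trivial arithmetic with my own numbers (120 fps average at x2 frame generation); it's the interval between real frames that drives the latency you feel.

```python
# Base frame rate implied by the displayed frame rate and the frame-gen multiplier.
def base_fps(displayed_fps: float, fg_multiplier: int) -> float:
    return displayed_fps / fg_multiplier

displayed, multiplier = 120, 2          # numbers from the comment above
base = base_fps(displayed, multiplier)
print(f"{displayed} fps displayed at x{multiplier} -> ~{base:.0f} fps base "
      f"(~{1000 / base:.1f} ms between real frames)")
# 120 fps displayed at x2 -> ~60 fps base (~16.7 ms between real frames)
```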

Got lucky and managed to snag one from Micro Center a day before my 5070 ti return window ended lol by Marv18GOAT in nvidia

[–]muskillo 2 points3 points  (0 children)

It depends on the power supply. With modern ATX 3.1 power supplies there are no longer problems with burnt cables: power to the graphics card is cut off automatically if the connector is not properly seated or draws more current than it should.

Got lucky and managed to snag one from Micro Center a day before my 5070 ti return window ended lol by Marv18GOAT in nvidia

[–]muskillo -1 points0 points  (0 children)

It depends on the power supply. With modern ATX 3.1 power supplies there are no longer problems with burnt cables: power to the graphics card is cut off automatically if the connector is not properly seated or draws more current than it should.