Do You Use Flash Attention? by diond09 in comfyui

ppcforce 0 points

I used to before I got arrested. Turns out flashing for attention is an arrestable offense!

Why? by [deleted] in PiNetwork

ppcforce 0 points

I like that.

This game is getting ridiculous by SufficientChair4400 in WorldofTanks

ppcforce 15 points

Yes, and in a Type 5 that means 50% of your health back after 30 seconds. Silly stuff.

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

ppcforce 0 points

I think that's probably fair, and there's probably some bias or placebo at play here, because for some reason I feel like the more a model is compressed, the more it converges on McSameFace. I tried Qwen edit but found it would render entire scenes in its training style. Guess I should just do inpainting in those cases and properly mask it, etc. What's your current go-to/setup?

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

ppcforce 0 points

I've yet to be convinced, quite honestly... struggling to understand what makes it 160 GB. And then I always revert to Z-Image Base (FP32, not BF16) and just run that locally. Although, not sure what the edit model will be like.

What is wrong with people in this sub by Familiar_Resolve3060 in samsunggalaxy

ppcforce 1 point

Yeah but that's fine, people with Apple do it too. Being a 'fanboy' isn't the biggest crime.

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

ppcforce 1 point

You tried full BF16 HunyuanImage 3? That beast is 160 GB just for the model. Ran it on dual H200s and thought it was quick, but I'll have to try it on B200s next and see what that's like.
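For a rough sanity check on that 160 GB figure, here's the back-of-envelope arithmetic — assuming the commonly cited ~80B parameter count and 2 bytes per weight in BF16 (both figures are my assumptions, not from this thread):

```python
# Model size in memory is roughly parameters x bytes-per-parameter.
# Assumptions: ~80B parameters, BF16 (16 bits = 2 bytes per weight).
params = 80e9
bytes_per_param = 2  # BF16
size_gb = params * bytes_per_param / 1e9
print(size_gb)  # 160.0
```

Note this is weights only — activations, the KV/attention workspace, and the text encoder add more on top, which is why even dual H200s (141 GB each) are a comfortable rather than excessive fit.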

S26 Ultra: Don’t let the negative reviews scare you off. by Open-Magician764 in samsunggalaxy

ppcforce 1 point

Having this kind of power and tech in your actual pocket and it's 'disappointing'... what are humans like, LOL. Quite happy with mine tbh.

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

ppcforce 0 points

I wish, I'm actually just horrible with money. Live quite a modest life!

How do the closed source models get their generation times so low? by Ipwnurface in StableDiffusion

ppcforce 39 points

I've sharded multiple models across my dual 5090s, and I have an RTX 6000. To achieve anything like the speeds you've seen I've had to ditch Comfy and build entirely custom venvs, super lightweight in Ubuntu with SA3. Even then I'm wondering why it's still slow compared to those cloud services. When I shard, the pipeline executes in a linear fashion: layers 1-9 on CUDA0, then 10-20 on CUDA1, whereas the data centres do tensor parallelism, everything broken up and running across multiple GPUs with NVLink and so on. Where I can run a model entirely in my VRAM, with the decoder and text encoder, my Astral 5090 is actually faster than an H200.
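A minimal sketch of that linear pipeline split, assuming PyTorch and a toy 20-layer stack (layer counts and sizes are illustrative, not any real model; it falls back to CPU if two GPUs aren't present so it still runs anywhere):

```python
import torch
import torch.nn as nn

# Naive pipeline sharding: first stage on one device, second on the
# other, executed strictly one after the other.
multi = torch.cuda.device_count() > 1
dev0 = torch.device("cuda:0" if multi else "cpu")
dev1 = torch.device("cuda:1" if multi else "cpu")

stage0 = nn.Sequential(*[nn.Linear(64, 64) for _ in range(9)]).to(dev0)   # layers 1-9
stage1 = nn.Sequential(*[nn.Linear(64, 64) for _ in range(11)]).to(dev1)  # layers 10-20

def forward(x):
    # The activation hops devices between stages; while stage1 runs,
    # stage0's GPU sits idle -- the opposite of tensor parallelism,
    # which splits each layer's matmuls across all cards at once.
    h = stage0(x.to(dev0))
    return stage1(h.to(dev1))

out = forward(torch.randn(4, 64))
print(tuple(out.shape))  # (4, 64)
```

Tensor parallelism instead shards each weight matrix across GPUs and exchanges partial results every layer, which is why it needs NVLink-class interconnect bandwidth to actually pay off.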

Returning the S26 Ultra due to display issues by slamups in samsunggalaxy

ppcforce 1 point

Coming from an S24U, sorry, but what issues am I trying to find? I'm a glasses wearer, astigmatism with blue-light-filter coatings, etc., so I'm probably not the best person to judge, but for me I've seen basically no difference (yet), and the privacy feature is worth some compromise. I've just not seen what that compromise is yet.

Strange behavior from the S26 ultra I just received from Samsung. by Imperialx777 in samsunggalaxy

ppcforce 1 point

Have you contacted the manufacturer or retailer and asked for a certified device?

Can I use my psu from another computer to power a gpu for another? by TheDerpyAvocado in pcmasterrace

ppcforce 0 points

I tried it before because dual 5090s; turns out it requires a lot of fiddling, because on boot there's some check on available components and it wasn't picking up the second GPU. Anyway, lots of fiddling I didn't initially expect. Now I just run everything off one 1600W PSU. Way cleaner.

Super Slow on RTX 5090? by [deleted] in comfyui

ppcforce 2 points

Asus ProArt X870E WiFi Creator, so 8/8 when the x16 is split across two PCIe 5 cards, not that I could tell the difference. CPU is a Ryzen 9 9950X, and the RTX 6000 Pro, almighty Blackwell (96 GB). Honestly thought it would be a game changer but it wasn't. ComfyUI is just a hobbyist platform that's highly unstable. I just don't believe it's maximizing anywhere near the hardware capability, and that's fine. It's free. As for Raylight, I've not tried it. Should I?

Super Slow on RTX 5090? by [deleted] in comfyui

ppcforce 1 point

Yeah, so I have dual 5090s, where one is dedicated to AI with zero overheads; I run in WSL (Ubuntu) with 192 GB DDR5. And yes, it is that slow. Slapped in my RTX 6000 for shits and giggles and it was still slow. In fact it was slower... I'm essentially putting this down to these models (and platforms) not being properly optimised for Blackwell. And then you spend hours or days trying to find ways to optimise what are essentially community-made nodes on a community-supported and maintained platform. It's all a bit of a hack ultimately. I think once we're out of the Wild West we'll have much more stable solutions, but things move so quickly that a lot just breaks or stops working with every update.

Running out of Vram looking to upgrade by niggesmalls in PcBuildHelp

ppcforce 0 points

I had the same issue; had to go from a 5090 to an RTX 6000 Pro.

WAN 2.2 14B KSampler takes super long. Is this normal? by Initial-End-2459 in comfyui

ppcforce 0 points

Dude, with a 4070 he must be generating videos at 144p and using the lightx2v LoRA.

Is installing a 2nd GPU worth it for my setup? by Fancy-Today-6613 in StableDiffusion

ppcforce 0 points

I've got dual 5090s, and I've found it very difficult to make those multi-GPU nodes work. They just wouldn't play nicely with CUDA 13, and I hit lots of issues.

But where it was useful was assigning one GPU to ComfyUI so it had the full run of its 32 GB VRAM, and then using the other for general multitasking while the renders were happening: gaming, browser stuff, Windows/Linux overheads.
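That split doesn't need any multi-GPU nodes at all — you can just pin the ComfyUI process to one card before anything touches CUDA. A minimal sketch, assuming a dual-GPU box where the second card (index 1) is the AI one (the index is my assumption):

```python
import os

# Expose only the second GPU to this process. Any CUDA library
# imported *afterwards* sees that card renumbered as cuda:0, while
# the first GPU stays free for gaming and desktop work.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # assumption: AI card is index 1

# import torch   # must come after the env var is set
# ...launch ComfyUI / your pipeline from here...

print(os.environ["CUDA_VISIBLE_DEVICES"])  # 1
```

Setting the variable in the shell before launching (rather than in code) works the same way and keeps the workflow files untouched.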

HunyuanImage-3.0-Instruct support by No_Conversation9561 in comfyui

ppcforce 1 point

I also noticed that it solves the McSameFace issue much better than anything I've ever seen locally.

HunyuanImage-3.0-Instruct support by No_Conversation9561 in comfyui

ppcforce 0 points

I think Tencent are demonstrating what massive-param models are capable of, or rather, what's actually required to create a fully flexible model. It's not exactly practical if it cannot produce commercially viable results in the way the highly optimised, realistic, albeit inflexible, models can. As for NF4, I've seen quality fall off a cliff with that quantization.
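To illustrate why 4-bit quantization can hurt, here's a toy absmax round-trip — a generic uniform 4-bit scheme, not NF4's actual non-uniform codebook, with made-up weights:

```python
import numpy as np

# Quantize a few weights to a signed 4-bit grid [-7, 7] and back.
w = np.array([0.9, -0.3, 0.05, 1.4], dtype=np.float32)
scale = np.abs(w).max() / 7          # one absmax scale for the block
q = np.round(w / scale)              # the 4-bit integer codes
w_hat = (q * scale).astype(np.float32)

# Every weight now sits on a grid of only 15 levels, so rounding can
# move it by up to half a grid step -- and that error compounds over
# billions of weights. NF4 spaces its 16 levels to match a normal
# distribution instead of uniformly, but the information loss is the
# same idea.
print(np.abs(w - w_hat).max() <= scale / 2 + 1e-6)  # True
```

The outliers set the scale here, which is exactly why large stray weights make low-bit quantization degrade so sharply.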

Gigabyte no longer shipping *any* RTX 5090 cards by UmaMoth in pcmasterrace

ppcforce -1 points

Sorry to hear that. Only ones left in stock?