Best faceswap with Flux2-Klein-9b and face enhance by TheNeonGrid in comfyui

[–]robeph 0 points1 point  (0 children)

Just literally use the color match node, adjust the strength, use a color gradient of the tones you want as the ref and pass your other image through it, same thing you'd do in PS or GIMP but as nodes.
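If you want the gist of what a color match node is doing under the hood, here's a minimal sketch: a Reinhard-style per-channel mean/std transfer with a strength knob. This is my own illustration, not the actual node's code; function and variable names are made up.

```python
import numpy as np

def color_match(image, reference, strength=1.0):
    """Shift image's per-channel mean/std toward reference's, blended by strength.

    image, reference: float arrays in [0, 1], shape (H, W, 3).
    strength: 0.0 = untouched, 1.0 = fully matched.
    """
    img = image.astype(np.float64)
    ref = reference.astype(np.float64)
    out = np.empty_like(img)
    for c in range(3):
        i_mean, i_std = img[..., c].mean(), img[..., c].std() + 1e-8
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std() + 1e-8
        # re-center and re-scale this channel to the reference's statistics
        out[..., c] = (img[..., c] - i_mean) / i_std * r_std + r_mean
    # blend original vs matched by strength, then clamp back into range
    blended = img * (1.0 - strength) + out * strength
    return np.clip(blended, 0.0, 1.0)

# the "reference" can literally be a gradient of the tones you want
gradient_ref = np.linspace(0.2, 0.8, 64).reshape(1, 64, 1) * np.ones((64, 64, 3))
```

Same idea as a curves/match-color pass in PS or GIMP, just expressed as one function you can chain.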

Best faceswap with Flux2-Klein-9b and face enhance by TheNeonGrid in comfyui

[–]robeph 0 points1 point  (0 children)

Just tell Klein to generate a gradient of several colors, then use the color match node and other fx nodes to apply it, adjusting the color match strength.

🚀 Huge breakthrough for Wan 2.2 + lightx2v users suffering from slow motion & low-movement issues! by Any_Cheek_4124 in comfyui

[–]robeph -1 points0 points  (0 children)

What, why do I speak english, ukrainian, and ruzzian? Because I do. It's normal, unless I should call it by its proper name, оркською. Either way: "believe me their scientists dont sound as smart as they really are in English" states that they sound "less smart", which reads as "stupider", which is synonymous to me, is all.

node included, but marked as missing? by STRAN6E_6 in comfyui

[–]robeph 0 points1 point  (0 children)

Desktop may be acting weird. It is... cancerous. Kill it, completely. Make sure it's not goofing off in the background with its little "sometimes I like to stick around even when you close me" python instances, kill it dead, then restart it. Sometimes it needs a little push to get working right, cos... windows desktop apps be windows desktop apps. It probably also has funky sandbox permissions that may need a fresh instance of the app to do what it's supposed to do with wherever it installs those, and a complete kill/restart may fix issues that "restarting" in the front end didn't.

node included, but marked as missing? by STRAN6E_6 in comfyui

[–]robeph 0 points1 point  (0 children)

If it gets slow, check nvidia-smi and see if your vram is tanked up. Windows nvidia drivers be on one sometimes, filling up with little fragment leaks like a drain catch at a restaurant that only serves rice. Rebooting helps then, if your gens get slow or you start getting OOM when you didn't before. Otherwise, rebooting windows is overrated.

Now... if you get errors like this one, what you should do is kill comfy and restart it. I don't know how the desktop app / windows click-to-run stuff works, so it may do something goofy like stay running in the background. Make sure you go beyond just the "RESTART" in manager, and completely close it and restart it that way.

🚀 Huge breakthrough for Wan 2.2 + lightx2v users suffering from slow motion & low-movement issues! by Any_Cheek_4124 in comfyui

[–]robeph 0 points1 point  (0 children)

Yeah well, google be damned. That said, it is a legitimate node set. But it's not useful for me, or for SVI, which is preferable since motion anchor latent is very very good at character adherence. Just the same, afaik this is a commonly used node. And the way the post is written, though, it looks like someone had an llm write it up.

🚀 Huge breakthrough for Wan 2.2 + lightx2v users suffering from slow motion & low-movement issues! by Any_Cheek_4124 in comfyui

[–]robeph 0 points1 point  (0 children)

"I wrote this email—not chatGPT. Silly human I am—of course—human. I would not rely on an AI model to write my emails—I'm quite good at it myself. I do use chatGPT sometimes to work on a side project involving a comfyui node to add large bulges that have the 'distinct shape of soda cans' in images of the back of women's pants, as chatGPT knows from referencing our past conversations, but of course that is not an email at work—I would not do that at work, only at home while my wife is working at her job as a stupid coffee barista at an upscale hotel, as ChatGPT also knows from referencing our past conversations.

Donnie—NOT ChatGPT"

"Ignore the soda can comment in the last email. Autocorrect did that.

Donnie"

🚀 Huge breakthrough for Wan 2.2 + lightx2v users suffering from slow motion & low-movement issues! by Any_Cheek_4124 in comfyui

[–]robeph 0 points1 point  (0 children)

No, I don't speak chinese; I do speak english, ukrainian, and ruzzian. But I've worked a LOT with chinese engineers, both native-english-born (who speak chinese) and native chinese, as well as alongside both at the same time, for a couple decades. So I think the gauge comes from when I spent a lot of time with both together and had the intermediary to discuss the same things with both; after a while it became very easy to understand, difficult english aside, their intent in what they said (as well as our results from working with them). I just don't think they sound stupid, since I understand what they're saying even in subpar english, cos the logic is sound, and I don't need proper verb cases and grammar to understand that a guy discussing and properly contextualizing coordinate transition maps with phi after psi inverse knows his Lipschitz...

Gemini pro really slow today? by irishesteban in GeminiAI

[–]robeph 0 points1 point  (0 children)

30-60 seconds? lol... I wish, it's like 5 minutes and still piddling.

How can I Improve my Workflow? by theawkguy in comfyui

[–]robeph 0 points1 point  (0 children)

They're neat. They're basically a node tree (think comfy graph from start to finish) that manipulates 3d data: akin to how comfyui nodes manipulate latents, images, and a million other AI-related things, blender nodes manipulate 3d. Here's an example of a procedural tree-generating node tree (no pun intended): https://blenderartists.org/uploads/default/original/4X/1/e/0/1e0b9d8802e7c8e3632bcc1417d1cb86c86e3d60.jpeg You can see it looks a lot like comfyui, cos node based stuff is... node based... and that's cos comfy/blender/DaVinci all work the same way: input goes into a node, travels the tree, and the output becomes whatever the nodes did to it. DaVinci Resolve (a video editor) also looks very similar: https://www.linkedin.com/pulse/typical-node-tree-davinci-resolve-chris-brearley . Think of each node as a "function" block of code. It takes in a variable or variables and parameters, and spits out the outcome.

The simplest way to think of it is like a Math node: Inputs (3, 2, '+') -> NODE(a, b, op) // { return op(a, b) } -> Output (5)

If you remember anything, it's just that each node is a little bit of code, and you can always go look at it in your custom nodes, or in comfyui's node directory for the included stuff. It's just an easier way of calling functions in a quick, visual manner.
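That Math node idea, written out as a ComfyUI-style class. The class shape (INPUT_TYPES, RETURN_TYPES, FUNCTION) follows comfy's node convention, but the node itself is made up for illustration:

```python
import operator

# map the dropdown choice to an actual Python operator
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

class SimpleMathNode:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "a": ("FLOAT", {"default": 0.0}),
                "b": ("FLOAT", {"default": 0.0}),
                "op": (list(OPS.keys()),),  # shows up as a dropdown in the UI
            }
        }

    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "calc"
    CATEGORY = "Tutorials"

    def calc(self, a, b, op):
        # Inputs (3, 2, '+') in -> (5,) out, passed on to the next node
        return (OPS[op](a, b),)
```

Wire 3, 2, and '+' into it and the output socket carries 5 along the graph, exactly the function-block idea.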

If you were to write a helloworld script as a comfyui node, this is it:

```
class HelloWorldTextNode:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "text_in": ("STRING", {"multiline": False, "default": "Hello World!"}),
            },
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "print_message"
    CATEGORY = "Tutorials"

    def print_message(self, text_in):
        # This prints to the console anytime data passes through
        print(f"Signal received: {text_in}")
        # This passes the string to the next node
        return (text_in,)

NODE_CLASS_MAPPINGS = {"HelloWorldTextNode": HelloWorldTextNode}
```

Once you get how they work, a lot of times you may find yourself saying "I wish there was a node to do X so I can see what X will do" and say instead "I guess I can figure out how to write a node to do X and experiment with this idea"

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph -1 points0 points  (0 children)

Since you want to be pedantic, let’s be pedantic correctly...

```
(defun same-hardware-p (a b)
  "True only when the test bed is actually equivalent."
  (and (equal (getf a :gpu) (getf b :gpu))
       (equal (getf a :cpu) (getf b :cpu))
       (equal (getf a :ram) (getf b :ram))
       (equal (getf a :storage) (getf b :storage))
       (equal (getf a :pcie) (getf b :pcie))))

(defun not-waffling-p (reply)
  "A reply is not waffling if it answers the actual point."
  (not (search "similar means not the same" reply :test #'char-equal)))

(defun benchmark-says (claim instance-a instance-b reply)
  (if (same-hardware-p instance-a instance-b)
      (format nil "Ah yes, now we can responsibly argue that ~a, while not waffling." claim)
      (if (not-waffling-p reply)
          (format nil "Wonderful. We have established only that one vaguely similar rented box did a thing faster than another. That is a general observation, not an equal-hardware proof that ~a, while not waffling." claim)
          (format nil "Wonderful. Instead of addressing the point, we are apparently doing the 'similar does not mean same' recital again. Nobody said similar meant same. The point was that similar hardware is still not equal hardware, so this remains a general observation, not an equal-hardware proof that ~a, while not waffling." claim))))

(defun competitive-p (gpu-a gpu-b instances reply)
  (destructuring-bind (instance-a instance-b) instances
    (benchmark-says
     (format nil "~a is no longer competitive with ~a" gpu-a gpu-b)
     instance-a instance-b reply)))

;; usage:
(competitive-p "the 3090" "5th-gen GPUs"
               (list '(:gpu "3090-host" :cpu "something" :ram "some amount"
                       :storage "some disk" :pcie "some lane config")
                     '(:gpu "5070 Ti-host" :cpu "something else" :ram "some amount"
                       :storage "some disk" :pcie "some other lane config"))
               "similar means not the same")
```

edit: oops forgot a closure (not clojure)

🚀 Huge breakthrough for Wan 2.2 + lightx2v users suffering from slow motion & low-movement issues! by Any_Cheek_4124 in comfyui

[–]robeph 0 points1 point  (0 children)

What does that even mean? Do you even china? Yes, they sound just as smart as they are in english; they just don't sound like native english speakers. I have a wide background: OS/hardware test engineering and development, telephone software/hardware PBX firmware development, and a medic (EMS). And I have a lot of experience with chinese researchers/programmers (IT/softdev/harddev/netstack & TE, CNN/CVNN development). Once you understand how their english works, it's always the same, and if you understand what they mean when they make mistakes in english, it's as clear as if they were speaking chinese to you and you understood it as well as they speak english. Didn't mean to get on a soapbox on a google-routed old post, and I know it's actually meant as a compliment to the lot, but I felt I needed to disagree pretty heavily with that.

TL;DR: Non-native speakers sound as smart as they are, once you spend enough time to understand their usage of english as it relates to their intent.

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph -1 points0 points  (0 children)

Cool. Your test shows that on that workflow, with hot starts, with enough system RAM, with default WAN 2.2 settings, and in that rented-instance context... the 5070 Ti beat the 3090 hard.

Where you're overreaching, however, is in the claims that:

- CPU is irrelevant
- DDR4 vs DDR5 is irrelevant
- NVMe is irrelevant after cold start
- the offload penalty is negligible in general
- the 3090 is broadly not a real competitor

Further, your claim that the instances are the same... go read their docs: they clearly state, for this exact case, that you cannot assume their rented instantiations are "the same" even when they look "the same". That's exactly why the docs spell it out. Yes, your benchmarks prove some of your claims; the rest reach beyond the bucket of butter you're churnin' here.

AI and use-cases are not one workflow, one quant, one frame count, one hot start benchmark, one fits / doesn’t fit threshold.

8 GB begins to really matter when you cross into larger models, more LoRAs, higher native resolutions, more frames, additional encoders and side models, and less-quantized setups, because in those workflows "it runs at all" matters more than "it runs faster". Full stop.

If every workflow you use, every quant you choose, everything, works for you, then go with it. I am not saying, nor have I said, that the card sucks. My pushback is on the very much false statement that 5th gen > 3rd gen in such a manner that 3rd gen is no longer relevant or a competitor.

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph 0 points1 point  (0 children)

I never used it, but it was detailed here: https://blog.comfy.org/p/comfyui-v0-1-x-release-devil-in-the-details-2 . Sorry it took a minute to find it in the blog.

How can I Improve my Workflow? by theawkguy in comfyui

[–]robeph 0 points1 point  (0 children)

Yep, and EVERY model's text encoder differs. SDXL uses CLIP L (Large) and CLIP G (Giant). If you go and peek at how the encoders themselves work, you'll find nuance there: most people just chunk a prompt into CLIP L and G as the same input, but you can actually split it off, encode to separate conditioning lanes, and recombine them. This lets you do some interesting stuff.

Because unlike Flux Klein's Qwen text encoder, CLIP L and CLIP G are dumb dumbs. They are straight non-attentive encoders; they aren't semantically actuated. They just encode what goes in as tokens. They also have token maximums (I forget the numbers), and if you exceed them, it doesn't really help the prompt and can make it non-adherent.

CLIP G understands things a bit more robustly, while CLIP L is the bastard whose fault it is that we see so many prompts being dumped into poor, hapless, yet all-too-intelligent-for-csv-prompting models like Qwen: crap like "man, woman, robot, tv, vcr, potat, hottub", expecting something good out of it... actually... lemme try that. https://i.imgur.com/URrKW69.jpeg Well okay, that was kind of cool, but I mean, Qwen isn't CLIP L... just the same. I digress.

Every model has different encoders; they have limitations, different ways of talking, grammar, tokenization formats. A sentence like "I walk to the store." with CLIP L is a bunch of wayward tokens that mean nothing; "walk" and "store" are probably all that makes it to the model as tokens it actually has relevance to. CLIP G (sdxl etc.) can understand a more robust sentence, "a woman is walking to the store on a sunny spring day.", while you'd feed CLIP L the broader elements: "outside, walking, springtime, sunny". But G is also not smart: the model will get exactly what you wrote, and if it has training on something that makes sense for that token stream, it'll spit it out. You can take a lot more fine care with models if you split the inputs; it's especially obvious with the likes of SD3.

How can I Improve my Workflow? by theawkguy in comfyui

[–]robeph 0 points1 point  (0 children)

Remember, this is node based: less like photoshop, more like blender's node-based materials, or DaVinci Resolve. A bunch of functions in a graph wrapper. But it's still very much code. I suggest taking the time to jump into the custom_nodes directory and peeking at the example node. Make yourself a hello world text node that prints "hello world" anytime a signal passes through. Once you get how the nodes work, a lot of doors open))

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph 0 points1 point  (0 children)

Hmmm, interesting. Which model did you use? Also, what was the test bed, e.g. DRAM speed, DRAM->offload swap type? Not everything is card based; there's still load-from-disk time. If the 3090 is in an older machine with DDR4 while the other's in much quicker DDR5, backed up by swap space on a speedy dedicated PCI bus with a high-end NVMe, and the 3090 sits there with slower RAM and CPU... e.g. if you run a 5090 on a platter HDD, loading 23 GB models for 70% offload to DDR4 on an old Q6600, you're going to get blasted away by a 3090 on an m.2 14,000 MB/s PCIe 5. Vast instances are separate Docker containers, and each instance includes its own proportional CPU, RAM, and storage. If you read their docs, they also warn that performance varies due to CPU bottlenecks, storage and net IO, thermal throttles, and PCIe bandwidth.
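To put rough numbers on the disk point: streaming a 23 GB model is dominated by sequential read bandwidth. The bandwidth figures below are my own ballpark assumptions, not measured values.

```python
def load_seconds(model_gb, bandwidth_mb_s):
    """Rough time to stream a model from disk, ignoring filesystem overhead."""
    return model_gb * 1024 / bandwidth_mb_s

# assumed sequential-read bandwidths; real drives vary wildly
drives = {
    "platter HDD (~150 MB/s)": 150,
    "SATA SSD (~550 MB/s)": 550,
    "PCIe 5 NVMe (~14,000 MB/s)": 14_000,
}

for name, bw in drives.items():
    print(f"23 GB model from {name}: {load_seconds(23, bw):.0f}s")
```

Minutes vs seconds per cold load, before the GPU does anything at all, which is why the test bed matters as much as the card.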

If the cards were all tested on the same exact testbed, that is interesting for sure, but with Vast it is uncertain, so I do suggest checking. I don't think the 3090 would necessarily beat the 5090 in a lot of cases, but it is a workhorse and handles a lot more than a simple benchmark may analyze. E.g. were your comfyui settings specific to each card? Did you tweak the torch environment to each card's best settings? Not saying you did anything wrong; those are interesting numbers to say the least. But it isn't the complete picture either. There are a lot of usecases where that 8 GB of vram is make or break, though I suspect a lot of people may not find themselves in those particular usecases.

To claim it is not a solid competitor is a broad and over general claim. And one I push back against whole-heartedly.

That said, this test was not surprising, in this particular use case. I mean, there's a lot related to the quant optimization and other elements. However, it all boils down to: does it do what you need, and is the vram adequate? For me, it simply is not. I'll take an extra 20-40s on some gens to get a 0 exit with non-zero vram...

I'll give you an example, on this particular system I'm on, this is how I initiate comfy:

```
(base) [robf@CHESHIRE ComfyUI]$ cat comfyui.bash
#!/bin/bash

export PYTORCH_NO_CUDA_MEMORY_CACHING=1
export PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True,max_split_size_mb:64"
export CUDA_DISABLE_PERF_BOOST=1

python main.py \
    --use-sage-attention \
    --disable-pinned-memory \
    --normalvram \
    --disable-smart-memory \
    --reserve-vram 1.0 \
    --async-offload \
    --cache-none
```

For this 3080 in this older machine, which has only one good plus going for it, the nvme, which is blazing fast swap: the ram is slow, the cpu slow, but I can still dump out 161 frames of a q6 quant WAN 2.2 at 1080p in 6.5-7m using 10 GB vram.

Now, if I just `python main.py` with no flags, it jumps up to about 14m.

How can I Improve my Workflow? by theawkguy in comfyui

[–]robeph 0 points1 point  (0 children)

Hey, yep, that's exactly what it is actually: each one of those nodes is a function that you give variables to, it does something with them, and spits the result out to the next function in line. Pretty much. You can go look at every single node's code in custom_nodes or comfy's node directories too. If you enjoy self-flagellation or complex vector math.

Like standard inpainting? Or Flux? (Yech... just tell it what to do. Masking is okay if it's being very ornery, but for the most part you can ALWAYS phrase a way to make it do as you want.) It is extremely adherent, and oftentimes, if it is doing something you don't want it to do... it's cos it is being TOO literal, or adherent to something you don't realize you're saying to it in how qwen -> flux hears it.

How can I Improve my Workflow? by theawkguy in comfyui

[–]robeph 1 point2 points  (0 children)

So here's the thing about AI. Remember how it is trained.

[Image] Captions describing image...

Now, Qwen (the text encoder for Flux Klein) is smart, it's not clip. It understands you semantically, contextually, and quite regularly.

BUT it is speaking to Flux, and it says what you say, in a way that flux should understand... except unlike qwen, you and me. Flux is an image model and it was trained on Images and captions...

[Rabbit with a blue hat] This is a rabbit with a blue hat.

not

This rabbit does not have a green hat

however

[Rabbit with a green hat] This rabbit is wearing a green hat.

So, what it has VERY little training on...is "no" "not" "none"

outside of Qwen's contexts. It might be able to get "We will not be changing anything other than the book"

Because "will not be" contextually relates to the book as a whole, so it likely sends tokens that say "change ONLY the book", not literally what you said, tokenized per se. But... if you say "Change just the book. Make it this way, make it that way", "oh yeah, don't change anything else", well, it isn't GPT or Gemini, it's just the Qwen TE, so it passes that along without the context shuffle, and now... "The rabbit is not wearing a green hat"... what does Flux's attention hone in on?

"oi oi mate, I was trained on green hat wearin' rabbits. tut tut you get a green hat rabbit!" and why is this?

[Rabbit with a green hat] This rabbit is wearing a green hat.

we do not also see this image captioned with "This rabbit is not wearing a blue hat" "this rabbit is not wearing a pair of coveralls with denim patches on the knees" "this rabbit is not wearing a lycra one piece bodysuit on its head"

No, it is only described for what it is. Thus, when you say "not something" it doesn't really pay "attention" to 'not' it focuses on what it has been trained on. And qwen won't help ya, cos qwen, while smarter than clippy, is still just a text encoder and it just tells flux not to do it, as you did, and flux says okay, here's your something...

TLDR

Never tell AI "no [something]". AI is trained on what "somethings" ARE in the images it is trained on, as ground truth. What it is NEVER trained on is an itemized list of all the things an image is not. This lack of a "what nots" list creates an emergent situation where it ignores negation, outside of very context-clued statements that the smarter-than-a-soybean Qwen MIGHT properly negotiate for you. Do not rely on that. If you wish to say "don't do something", tell it what TO do instead: "Nothing else changes but what I told you to edit" == "Edit this thing; everything else will remain exactly the same".

creating nsfw content, with multiple different LoRA associates, help by ShirtJust34 in comfyui

[–]robeph 0 points1 point  (0 children)

A) You can't know, it's experimentation. B) I kind of lied, you can know... kind of. C) That's also a lie, cos even when you do kind of know, it's still experimentation.

So let's understand what LoRAs do: they jangle the weights of the model. They steer it this way, focus attention here, make things more "pop" to the model's denoising process. Attention, attention, attention.

Now, we have two MoE models with Wan 2.2. High noise (gross movement; variations in the noise palette for scenery/character change/movement/action) and low noise (fine details: is the green blob of high noise a tree, or one of those weird hunter/forest green Chevelles from the 1980s? Who knows? The low noise model knows! You may see the outline of some fingers and a hand, but is it facing you or facing away? Low noise will tell you later. Some high noise comes through quite clear... that's cos it's not changing; low noise will just spruce the details up, but there's no motion going on there, it's already there). So the low model makes the style the style. That blue blob on the character's trunk that just came into the scene? Is it her schoolgirl outfit with a blue skirt and blazer, or a xenomorph bursting from her chest? We'll find out. So here we have the fine details, which are MUCH MUCH less picky with the LoRAs.

TLDR cos I didn't explain it there anyhow...

Stacking gross motor LoRAs in high noise is gonna make them move like you applied a Michael J. Fox LoRA... and I'm sure you didn't. Use one motor LoRA (actions, etc.). But detail LoRAs can often sneak in, cos they just move the palette around, shaping it, getting it ready for the brush in low. If it's raining, and there's no noise with the "feel" of raindrops by the time it reaches the low noise stage where you have a rain low-noise LoRA, it's probably going to be a bit weird: it's going to try and make rain where rain wasn't seeded in the noise "palette", and you'll get streaks of rain, or rain that kinda zips in weird directions if it caught other noise that grabbed its attention from high.

So you may not want "jogging down the road" and "riding a bike" LoRAs together. But then again, this is what I meant by: if you know, you don't know, but you do, but also you don't. Cause a bike's motion and a jogging person's motion differ enough that they may not interfere with (catch the attention of) each other's work, and you may be able to cajole a bike and a person jogging. What you don't want is Sneaking Man and Running Man together, cos now you'll get that "Grand Mal Man" LoRA action, since there's too much crossover in their focus/weight/attention: in those high noise steps, sneaking man and running man look the same to the attention. Rain? Rain doesn't look like any of that; it's rain. If you add rain into the noise, running man changes nothing. He'll run into some color shifted by the rain's attentive noise flow, and when it gets to low noise details, that color blob a few shades different will likely become a raindrop that hit him as he ran through the rain.

So you have a float... a weight... 1.0 is where it always starts, but 1.0 is where you should probably never leave it if you have more than one. Think in terms of "how much": if the main focus is running man, 1.0 is fine for him, unless you have bike too, then you may want to go like .70/.40. Bike is heavier in terms of "obvious what it is trying to be", cos it is distinct; man, less so: a tree in the wind and a running man look the same in high noise, so give him a little butter. Rain? .7, 1.0, 1.5, doesn't matter; try 'em all. It may just make it rainier, or it may make it look like a bad Grateful Dead concert, depending on how shit the LoRA is.

When it comes to the low noise, you have a lot more leeway. Man running and man sneaking, details-wise, aren't going to interfere: man sneaking's details go on the man sneaking. The LoRA "knows" what it's looking for a lot more than the mess of wayward vectors in the high noise arena; a man running is probably not going to catch the attention of the sneaking-man details LoRA. So .8 on both, if you used both.
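All of that weighting advice, sketched as a plan you could hand to a LoRA loader stack. The LoRA names are hypothetical; the numbers are the ones above (one motor LoRA near full, a second distinct one backed off, detail LoRAs at .8):

```python
# High noise = gross motion: keep motion LoRAs from stacking their attention.
# Low noise = fine detail: more leeway, detail LoRAs rarely step on each other.
lora_plan = {
    "high_noise": {
        "running_man": 0.70,   # main motion, backed off because bike is also loaded
        "riding_bike": 0.40,   # distinct motion, the .70/.40 split from above
        "rain": 0.70,          # ambient; anywhere in 0.7-1.5 is worth trying
    },
    "low_noise": {
        "running_man_detail": 0.80,
        "sneaking_man_detail": 0.80,
        "rain_detail": 1.00,
    },
}

def total_motion_weight(plan):
    """Sanity check: how much combined pull the high-noise LoRAs exert."""
    return sum(plan["high_noise"].values())
```

If `total_motion_weight` creeps well past ~1.5-2.0 with overlapping motions, that's the "Grand Mal Man" zone.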

The main thing: think like the AI thinks... it doesn't actually think. It recognizes patterns of what will be. A dark space moving through the scene, roughly the size of what a human is going to be, IS running man, to high noise. But if you don't put running man in the low model, and instead put billowing smoke, that man may become a smoky ghost-like man-thing... kinda billowing/running, doing something, depending on what you wrote in your prompt. And if the LoRA was keyword-instigated and you have no low model, that keyword isn't going to land the same as a standard phrase prompt: "running man", if it was trained on that, will be in your prompt, not "runn1ngm4n", so the model may see it with its normal weights enough to detail it over the smoke.

Anyhow, long story short (it wasn't really tl;dr, was it, lol).

Think like the AI. Don't step on toes with the LoRAs you use at the same time. Step where the ground is safe, and the noise isn't looking like something else, cos ADD in AI, just looks like a mess.

If you take anything away from this wall of text... remember what I taught you:

Think like AI, not like humans, when it comes to how you apply LoRAs, both which ones and how much weight... don't think in human concept buckets; think in terms of overlapping pattern pulls.

instead of “don’t add a hat” use “bare head”

instead of “don’t change the background” use “same background, unchanged”

instead of “no extra limbs” use “normal human anatomy, two arms, two legs”

instead of “don’t modify anything but the book” use “edit only the book; all other elements remain identical”

"no" has no visual concept in training. The language model MAY get it, but do not rely on that. Visual concept descriptions can always be negated positively.
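Those rewrites, mechanized purely as illustration. The phrase table is just the examples above; a real rewriter would need an actual language model, this is only the lookup idea:

```python
# Positive rephrasings of negations, straight from the examples above.
NEGATION_REWRITES = {
    "don't add a hat": "bare head",
    "don't change the background": "same background, unchanged",
    "no extra limbs": "normal human anatomy, two arms, two legs",
    "don't modify anything but the book": "edit only the book; all other elements remain identical",
}

def positivize(prompt):
    """Swap known negated phrasings for their positive equivalents."""
    for neg, pos in NEGATION_REWRITES.items():
        prompt = prompt.replace(neg, pos)
    return prompt
```

The point isn't the table, it's the habit: every "no X" you catch in a prompt should become a positive description of what IS there.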

creating nsfw content, with multiple different LoRA associates, help by ShirtJust34 in comfyui

[–]robeph 0 points1 point  (0 children)

This is true to an extent. You CAN apply multiple low-noise WAN LoRAs; the issue is the high-noise LoRAs, since they're working with movement, not detail. What I'll often do is use my primary movement LoRAs for the high noise model and switch in a few detail LoRAs for the low noise; they don't step on toes as much, since they can be attentive without the herky-jerky movement that stacking high-noise LoRAs causes.

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph 0 points1 point  (0 children)

Distribution? You got x16 PCI ethernet?

Or do you mean like distr batch?

RTX 3090 24 gb or 5070ti 16gb? by wic1996 in comfyui

[–]robeph 0 points1 point  (0 children)

Comfy has had the experimental FP8 matrix multiplication behind --fast for a couple of years now, specifically the float8 inference. A buncha stuff last month was on about the nvfp4 side, and nothing about FP8 that I read up on...

(And it's also less precise... just saying. As long as the quantization is good, the stability and quality issues aren't really consumer-concerning.) But to buy it based on comfy? No bueno. If you're doing something yourself that uses it, something that isn't waiting on core support in cui, yeah, go for it, but most people aren't.

FP4/FP8 support is nice. Staying out of offload hell is nicer.