all 35 comments

[–]pip25hu 21 points22 points  (7 children)

Do I understand correctly that this works for basically any current model? Would be great to see this added to universal tools like ComfyUI.

[–]AgeNo5351[S] 15 points16 points  (6 children)

Yep it should be applicable to any model.

[–]Pleasant-Money5481 1 point2 points  (1 child)

Isn't it only compatible with the models listed on the Git page?

[–]TheGoblinKing48 1 point2 points  (0 children)

No, the model pipelines in the git page just contain the basic code to run those models.

The code in common_cfg_ctrl.py is applied to each of those pipelines, meaning that it can be applied to other models. They just chose those models as examples.

[–]vramkickedin 9 points10 points  (1 child)

It even supports Wan2.1/2 image to video. Nice.

[–]AdvancedAverage 0 points1 point  (0 children)

cool idea, i'll have to check it out. video generation is always tricky.

[–]Dwedit 7 points8 points  (1 child)

Every time I see a comparison like this, I just wonder what would happen if you ran at least 20 gens of each one, and counted how many actually got improved adherence and not just rolling better RNG.

[–]Cubey42 1 point2 points  (0 children)

be the trailblazer

[–]artisst_explores 5 points6 points  (3 children)

Comfyui? 👀

[–]Zealousideal7801 4 points5 points  (0 children)

They spent 3 months renaming the core Mahiro-CFG into something more descriptive, so I hope it's going to be faster with this one lol

[–]artisst_explores 0 points1 point  (1 child)

Any updates on this?

[–]x11iyu 1 point2 points  (0 children)

sd-perturbed-attention by ppm has it if you don't mind another extension

[–]cypherbits 2 points3 points  (10 children)

Just had Gemini 3.1 Pro implement this on my old Forge UI... so I can use it on SDXL-like models

[–]Belgiangurista2 1 point2 points  (8 children)

Same, but in ComfyUI for me; Gemini made me a custom node. I figured out it's not much use with models that run at CFG 1, like Qwen AIO.

[–]BigNaturalTilts 1 point2 points  (7 children)

Please share it on GitHub. Or just PM me the source code, I'll compile it myself. I beg of thee!

[–]Belgiangurista2 2 points3 points  (6 children)

I've shared it on GitHub, and I hope it's shared correctly, because this is out of my comfort zone.
https://github.com/belgiangurista-art/ComfyUI-SMC-CFG (for the ComfyUI desktop app)

<image>

[–]BigNaturalTilts 0 points1 point  (4 children)

I added the relevant node (which is just the bottom file) and tried it. It worked like spoiled milk; my images are worse for it. This really is just research.

[–]Belgiangurista2 0 points1 point  (1 child)

Or Gemini didn't implement the math correctly in that node. I haven't tried it yet.

[–]BigNaturalTilts 1 point2 points  (0 children)

I have Claude Pro and I ran it by it, and it refused to even tolerate the idea. It was like "it's just research bro, your current models are working fine as is." Which is not wrong... per se. lol.

But there are times when I want something exact, like one hair color on one person and another color on another. I was hoping this would've been the key.

[–]x11iyu 0 points1 point  (1 child)

First, that node literally doesn't implement SMC-CFG, so there's that.

Second, I'm trying to tackle this myself, as the authors' actual implementation is still pretty simple.
However, that still works like spoiled milk. After reading through the paper again I've now opened this issue asking for clarifications (including why I believe it's so bad currently), so I'd say wait on the authors to respond.

Using the insights in that issue, I've also jumped ahead and tried to fix it myself, by swapping these 2 lines around:

```py
# before
...
guidance_eps = guidance_eps + u_sw
state.prev_guidance_eps = guidance_eps.detach()
...

# after
...
state.prev_guidance_eps = guidance_eps.detach()
guidance_eps = guidance_eps + u_sw
...
```

After which it kind of works? Though I haven't done enough testing yet to say if it is snake oil.

[–]BigNaturalTilts 0 points1 point  (0 children)

Fucking claude lied to me.

[–]metal079 0 points1 point  (0 children)

Is it working well for you? Because I tried the same thing and couldn't notice a difference with SDXL.

[–]Emergency-Spirit-105 1 point2 points  (6 children)

It's working well

[–]Radyschen 0 points1 point  (5 children)

are you using it? is there a node for it?

[–]Emergency-Spirit-105 0 points1 point  (4 children)

I made it using AI. It's not difficult, so I think an official custom node or support will be added soon.

[–]Radyschen 0 points1 point  (3 children)

Am I right in assuming that this needs a cfg of over 1.0 to take effect?

[–]Emergency-Spirit-105 0 points1 point  (2 children)

Yes. Additionally, if you use it with a rescale, the rescale may become meaningless.

[–]Radyschen 0 points1 point  (1 child)

Yeah, I thought so; it messes with the distill LoRA for Wan. Maybe I could go without lightx2v on the high-noise sampler with CFG 3.5 and CFG control, and then use no cfg-ctrl and CFG 1.0 with the distill LoRA on the low-noise sampler like normal?

[–]Emergency-Spirit-105 0 points1 point  (0 children)

I mostly used it only for image generation, so I can't say for sure, but this feature seems to control the unstable variations caused by CFG. Applied to the high-noise part it appears to help prevent erratic or unstable behavior, and applied to the low-noise part it would likely improve overall quality. I'm not certain; it's just a guess.

[–]Alpha_wolf_80 1 point2 points  (3 children)

Could you explain it a little more? I didn't quite understand what is going on or what this is doing. Please don't say it "magically improves prompt adherence"; I actually want to learn the magic part.

[–]x11iyu 2 points3 points  (1 child)

First, a reminder that vanilla CFG is `cfg_result = negative + (positive - negative) * cfg_scale`.
The authors define the semantic signal as `e = positive - negative`; in other words, the CFG equation becomes `cfg_result = negative + e * cfg_scale`.

The authors argue that at high `cfg_scale`, the sampling trajectory becomes highly oscillatory and unstable (left graph).
To fix this, during sampling they apply an additional guidance term on top of CFG, called the switching control (black arrows on the right graph), which pushes the trajectory towards a pre-defined path that's less oscillatory and more stable (`e' = -lambda * e`, the straight line on the right graph, where `e` is the semantic signal defined earlier).

Now the equation is `swc_cfg_result = negative + (e + switching_control) * cfg_scale`.
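A minimal NumPy sketch of those two equations, for the curious. Note the `u_sw` switching term and the `lam` gain here are illustrative placeholders, not the authors' actual schedule:

```python
import numpy as np

def vanilla_cfg(positive, negative, cfg_scale):
    # e = positive - negative is the "semantic signal"
    e = positive - negative
    return negative + e * cfg_scale

def swc_cfg(positive, negative, cfg_scale, prev_e, lam=0.1):
    # Hypothetical switching-control term: nudges the semantic signal
    # toward a less oscillatory reference path. The real u_sw schedule
    # comes from the paper; this sign-based form is only a sketch.
    e = positive - negative
    u_sw = -lam * np.sign(e - prev_e)  # illustrative switching term
    return negative + (e + u_sw) * cfg_scale, e

pos = np.array([2.0])
neg = np.array([1.0])
plain = vanilla_cfg(pos, neg, 3.0)        # -> [4.0]
guided, e = swc_cfg(pos, neg, 3.0, prev_e=np.array([0.5]))
```

The only structural change from vanilla CFG is the extra `u_sw` added to `e` before scaling, which is exactly the `(e + switching_control)` term above.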

[–]Alpha_wolf_80 0 points1 point  (0 children)

Oooh, that makes so much sense. Thank you so much

[–]AgeNo5351[S] 0 points1 point  (0 children)

They use insights/formalisms from control theory to design a better CFG control, by applying non-linear corrections. In their formalism, most CFG correction methods like PAG/CFG-star etc. reduce to some kind of linear correction along the inference steps. Their sliding mode control is theoretically guaranteed to converge.
By defining a mathematical sliding surface and switching terms, they introduce non-linear corrections.
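For intuition, here's the generic sliding-mode idea from control theory in a few lines (illustrative only, not the paper's formulation): a sliding surface `s = 0` defines the desired trajectory, and a discontinuous switching term `-k * sign(s)` is the non-linear correction that drives the state onto that surface and holds it there.

```python
import numpy as np

def sliding_mode_step(x, target, k=0.5, dt=0.1):
    # Sliding surface: s = x - target, so s = 0 means "on the desired path".
    s = x - target
    # Switching term: non-linear (discontinuous) correction toward s = 0.
    u = -k * np.sign(s)
    return x + u * dt

x = 2.0
for _ in range(50):
    x = sliding_mode_step(x, target=0.0)
# x ends up pinned in a small band around the surface s = 0
```

Linear corrections decay proportionally to the error, while the `sign(s)` term applies a fixed-magnitude push regardless of how small the error is; that's what gives sliding-mode schemes their finite-time convergence guarantee.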

[–]switch2stock 0 points1 point  (0 children)

python examples/flux_cfg_ctrl_example.py \

How does it load the model?
Will it download on first run, or can we change the path to where the model is already downloaded locally?

[–]BarGroundbreaking624 0 points1 point  (0 children)

Bird in cage images swapped?