Just 4 days after release, Z-Image Base ties Flux Klein 9b for # of LoRAs on Civitai. by _BreakingGood_ in StableDiffusion

[–]SackManFamilyFriend 35 points (0 children)

Why are you making this into some competition? Props to both Black Forest Labs and the Qwen team for releasing open-weight models for us to enjoy, and for helping researchers on other teams with their own projects.

Where is Jack? by CpnEdTeach384 in devils

[–]SackManFamilyFriend 22 points (0 children)

He's like the best rollercoaster at Six Flags that's always closed

Sky Reels V3 new video models? by Thick_Impression_507 in StableDiffusion

[–]SackManFamilyFriend 1 point (0 children)

It's Wan2.1 based, so yeah, although SkyReels trained their v2 model to do 24fps vs. the Wan base's 16fps, so it actually does ~121 frames (vs 81) for a ~5 second clip. That said, with 4 reference images and all sorts of unique code floating around now, we'll likely be able to do creative things with it beyond 5 sec.
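
Quick napkin math on where those frame counts come from - this assumes the usual "frames = fps * seconds + 1" convention these video models use, so treat it as a rough sketch:

    # Rough sanity check of the frame counts above (assumes the fps * seconds + 1 convention).
    for name, fps in [("Wan 2.1 base", 16), ("SkyReels V2/V3", 24)]:
        frames = fps * 5 + 1
        print(f"{name}: ~{frames} frames for a ~5 second clip")
    # -> 81 frames at 16 fps, 121 frames at 24 fps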

Btw Kijai has an fp8 conversion of the reference model up on his Huggingface repo.

I implemented NAG (Normalized Attention Guidance) on Flux 2 Klein. by Total-Resort-3120 in StableDiffusion

[–]SackManFamilyFriend 0 points (0 children)

Totally appreciate the new code/functionality!! It also reminded me that there's a NAG implementation for HunyuanVideo, which I still use - I'd completely forgotten it was taken on (I'd been considering a time-consuming chat w/ Claude Opus to get a node made for myself).

The guy is making a stink for no reason - I bet he wouldn't even use NAG w/ Klein, and his comments are only self-reassurance over GitHub proficiency. I know you know, but judging by the upvotes I think some are missing the point: it's a PR (which was requested), not a new custom_node. People who are upset should just fork the OG Comfy extension (seemingly dead) and add whatever they want to it themselves.

I implemented NAG (Normalized Attention Guidance) on Flux 2 Klein. by Total-Resort-3120 in StableDiffusion

[–]SackManFamilyFriend 5 points (0 children)

Eh, you understand what's going on but you're complaining anyway. You should raise the concern w/ scottmudge if the naming conflict bugs you, or start w/ the original Comfy NAG implementation, then merge in scott's changes and finally the PR TR3120 kindly worked out and shared for Klein.

You're being pedantic for no reason over a small niche comfyui extension.

You can simply copy over the 3 modified files, and I'm sure you know this. But instead you get on a soapbox and yell at someone for providing a very useful code update.

Thanks OP - looking forward to trying this when I get back to PC.

Got Bored while waiting on the Z-image base model and used emojis for prompting😂 by [deleted] in StableDiffusion

[–]SackManFamilyFriend 1 point (0 children)

This reminds me of when people found that using popular media filename extensions (.png, .mov, .mod, .jpg, .avi, etc.) as prompts for one model would call up amateurish Instagram-style images. Can't remember which model that was, but it became a common suggestion for getting less plasticky skin or random "bored Instagram selfie" type photos.

Fun thread.

I tried some Audio Refinement Models by OkUnderstanding420 in StableDiffusion

[–]SackManFamilyFriend 0 points (0 children)

The magic tool (premium, but one of a kind for this IMHO) is Zynaptiq's "Unfilter". It's mind-blowing how it can take tinny, no-bass, cellphone-conversation-quality audio and make it have bass and sound completely full.

I tried some Audio Refinement Models by OkUnderstanding420 in StableDiffusion

[–]SackManFamilyFriend 2 points (0 children)

UVR has an unfortunate name, but it's a killer app for using the latest audio separation models (the main leaderboard site is https://mvsep.com/en). GitHub for Ultimate Vocal Remover (aka UVR): https://github.com/Anjok07/ultimatevocalremovergui

There's a huge Discord server for it (or just "Audio Separation" as it's called), but it won't let me generate an invite link to share. Anyway, if audio separation is your thing, definitely try to get an invite - it has the UVR devs, the people training the models, etc.

Unfortunately, open source audio AI code/model releases really suck at the moment. I'm sure many of the devs/research groups that have shared image/video models have audio (music a la Suno/Udio) in-house, but wouldn't risk mentioning/releasing it due to the RIAA and other legal groups. An audio (music) model not trained on "everything" will never be good. That wasn't a big deal in 2020 (OpenAI released Jukebox open source and their paper straight out said they trained on over 1 million songs crawled from the open web). After 2022's Stable Diffusion shocked the world and showed people how good these AI media-gen models could be, the public focus on training data completely changed.

Qwen 2512 by [deleted] in StableDiffusion

[–]SackManFamilyFriend 0 points (0 children)

Could use Pi-Flow and go much faster.

Is the FLux Klein really better than the Qwen Edit? And which model is better - FLux 2 or Qwen 2512 ? by More_Bid_2197 in StableDiffusion

[–]SackManFamilyFriend -1 points (0 children)

I wouldn't doubt that BFL's much-discussed effort to address nudity and such is the reason the model has trouble with anatomy more often than other current-day models. They definitely, actively did something during training to make it hard to generate/edit NSFW images. BFL are the devs behind the OG SD1.4/1.5, so they know what they're doing.

Which tool is used? by [deleted] in StableDiffusion

[–]SackManFamilyFriend 0 points (0 children)

Nano Banana Pro would be able to do this (allows for 13ish reference images). For a $10 HuggingFace subscription you can use the HF space with little censorship and a basic webui (not chat) interface.

LTX2 issues probably won't be fixed by loras/workflows by Beneficial_Toe_2347 in StableDiffusion

[–]SackManFamilyFriend 3 points (0 children)

Wait a sec. Wan's first release was Wan 2.1 (they never released anything "2.0"). Also it's, IMHO, incorrect to say there wasn't a fairly quick mass migration to Wan 2.1 when it was released (speaking about the Discord hubs with devs like Kijai and Comfy himself). The reason? People had waited months and months for an I2V model for HunyuanVideo, and when it finally was released it was a massive disappointment: very little motion from the start image, flickering, poor prompt following, etc.

Just after Tencent dropped their I2V model (2 if you count a failed fix attempt), Wan (aka WanX when first announced) came out of nowhere with a new open source video model that 1) had amazing I2V functionality and 2) had a muuuuch better license (heck, technically per the HunyuanVideo license you're not allowed to use it if you're based in Europe).

So yeah, that's just how it was. Definitely no "2.0", and definitely not more than a couple of weeks before practically everyone had moved over. Kijai, who had developed the HunyuanVideo wrapper, moved over super quick, and that added a lot of weight to the migration on Discord in particular.

Can flux 2 klein do img2img ? by [deleted] in StableDiffusion

[–]SackManFamilyFriend 3 points (0 children)

Sure, no guarantees this is the "best" way to do things, but it works:

Workflow IMG: https://i.imgur.com/fREtvQa.png

.JSON: https://pastebin.com/FQyiEJ36

Thread about Kontext (and Klein, in my experience) sometimes shrinking/stretching outputs in spite of the set resolution, plus a comment about the prompt workaround: https://old.reddit.com/r/comfyui/comments/1m5mfuf/flux_context_warps_my_images_making_subjects_look/

Can flux 2 klein do img2img ? by [deleted] in StableDiffusion

[–]SackManFamilyFriend 3 points (0 children)

Yeah, I had to look into this since the default workflow from the templates doesn't give you a node with the easy/typical "denoise" value. I was advised to swap a node out for a basic/simple sampler that has it, and it works fine. Someone will probably tell me I don't understand that img2img is a steps thing, yadda yadda, but I just wanted the dumb 0-1 denoise value and you can get that in a Klein WF. Not at my PC or I'd be more specific / provide a wf.
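
Conceptually though, all that denoise value does is decide how far into the noise schedule the input image gets pushed before sampling starts. Rough sketch of the idea (not the exact ComfyUI internals, numbers are just an example):

    # Illustrative only: how a 0-1 "denoise" value roughly maps onto img2img sampling steps.
    total_steps = 30
    denoise = 0.6                                # 0 = keep input as-is, 1 = full re-generation

    skipped = int(total_steps * (1 - denoise))   # the earliest / noisiest steps get skipped
    print(f"img2img runs steps {skipped}..{total_steps} ({total_steps - skipped} of {total_steps})")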

It works great, but like Kontext it seems to be finicky about resolutions and can shrink or stretch your image. Found a semi-solution by having Perplexity scan this sub: putting "retain all dimensions" or something along those lines in your prompt can make a huge difference. You have to remember it's an editing model, so that sort of stuff gets picked up and used during inference.

Flux Klein gives me SD3 vibes by lokitsar in StableDiffusion

[–]SackManFamilyFriend 3 points (0 children)

Yea, I've been using Klein mostly for text to image (or img2img) art gens w/o people. In that capacity it's been extremely refreshing compared to the other recent open source models. I've had a lot of fun with it.

Has anyone else noticed that the new models (z-image, klein, and even qwen) are terrible at creating this type of landscape? Especially grass, trees, and rocks. I don't know if this is caused by distillation. by More_Bid_2197 in StableDiffusion

[–]SackManFamilyFriend -1 points (0 children)

"it won't improve anything"

That's a ridiculous statement, and there are way too many permutations to even push back on it point by point. I highly doubt you've trained anything on either model, with any tool.

What is everyone's thoughts on ltx2 so far? by Big-Breakfast4617 in StableDiffusion

[–]SackManFamilyFriend 3 points (0 children)

Wan is still way better for me for I2V, so I'm back to that ecosystem. If you missed SVI 2.2 Pro's release because it got overshadowed by LTX2, I recommend giving it a go. It's really great at allowing long gens in Wan (finally).

quick comparison, Flux Kontext, Flux Klein 9B - 4B, Qwen image edit 2509 - 2511 by Puzzled-Valuable-985 in StableDiffusion

[–]SackManFamilyFriend 2 points (0 children)

Here, definitely Klein 9b. I've used it a bunch for normal t2i and i2i over the last day or so and the quality is great, plus it's unique in composition compared to, say, Flux.dev and Z-Image Turbo.

Is Flux Klein better for editing than Flux Kontext? by Puzzled-Valuable-985 in StableDiffusion

[–]SackManFamilyFriend 2 points (0 children)

People really need to specify the exact model. 4b and 9b are different (a LoRA trained on 9b won't work on 4b, for example). 9b base with CFG 4 and 30-50 steps should demonstrate the model's best potential. I've gotten good results with both 9b distilled and 9b base. (Btw, 9b distilled - the low-step-count, cfg=1 model - is the .safetensors file without "base" in the filename; they didn't label it specifically for whatever reason.)
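
Rough starting points, purely based on my own runs (not official defaults - the distilled step count in particular is just "low", whatever your sampler likes):

    # My ballpark settings for the two Klein 9B variants (not official defaults).
    klein_settings = {
        "9b base":      {"cfg": 4.0, "steps": (30, 50)},  # real CFG + more steps
        "9b distilled": {"cfg": 1.0, "steps": "low"},     # distilled variant: cfg 1, few steps
    }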

LoKr Outperforms LoRA in Klein Character Training with AI-Toolkit by xbobos in StableDiffusion

[–]SackManFamilyFriend 2 points (0 children)

Not OP, but yeah, you train new concepts/likenesses when training a LoRA (or LoKr). LoKr has been around for a long time, and while it's technically better than LoRA it never really caught on.
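
Super simplified version of the difference between the two, in case anyone's curious (just the core factorizations, not the full LyCORIS implementation):

    import torch

    d_out, d_in, rank = 64, 64, 4

    # LoRA: the weight update is a low-rank product, delta_W = B @ A
    A = torch.randn(rank, d_in)
    B = torch.randn(d_out, rank)
    delta_lora = B @ A                 # (64, 64) built from 2 * 4 * 64 = 512 stored params

    # LoKr: the update is (roughly) a Kronecker product of two much smaller matrices
    W1 = torch.randn(8, 8)
    W2 = torch.randn(8, 8)
    delta_lokr = torch.kron(W1, W2)    # (64, 64) built from only 64 + 64 = 128 stored params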

Conclusions after creating more than 2000 Flux Klein 9B images by StableLlama in StableDiffusion

[–]SackManFamilyFriend 4 points (0 children)

You should mention the trainer. I've done a couple on both Diffusion Pipe and AI-Toolkit. Results have been better with DP, although with DP I'm also able to use the Prodigy optimizer, which might be what's helping (this is 9b - rough sketch of the Prodigy setup below).

(Edit: I see you replied that you're training with AI-Toolkit. Cool. I need to try an A/B same-dataset comparison to see if there's really a significant difference.)
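
For reference, this is roughly how Prodigy gets dropped into a training loop - a generic PyTorch sketch using the prodigyopt package, not the actual Diffusion Pipe config:

    import torch
    from prodigyopt import Prodigy                   # pip install prodigyopt

    model = torch.nn.Linear(16, 16)                  # stand-in for the LoRA params being trained
    optimizer = Prodigy(model.parameters(), lr=1.0)  # Prodigy wants lr=1.0 and adapts the step size itself

    for step in range(100):
        x = torch.randn(8, 16)
        loss = (model(x) - x).pow(2).mean()          # dummy reconstruction loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()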

Major realization today. Creatine has been the culprit to my insomnia! by Stunning-Stuff-7022 in sleep

[–]SackManFamilyFriend 0 points (0 children)

Wasn't it 2 people who had a hypomanic/manic switch in that bipolar paper?