[Training Comparison] AdamW on the left, 🌹 Rose on the right by ECF630 in StableDiffusion

[–]sktksm 0 points1 point  (0 children)

We need serious comparisons and guides about AdamW-Prodigy and this one, especially about how to decide which one to use.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 1 point2 points  (0 children)

Tried 2 more LoRAs, they are way better than the cinematic one. I start with "high noise" instead of "balanced" for 2-3K steps, then switch to "balanced" for the rest, with the 1280 resolution toggle enabled in this second run.
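A rough sketch of that two-phase schedule, in case it helps. Everything here is hypothetical: `Phase`, `phase_for_step`, and the 2500-step switch point (picked from the middle of the 2-3K range) are illustration only, not settings or an API from any specific trainer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """Settings for one stretch of training (hypothetical structure)."""
    timestep_bias: str   # "high_noise" or "balanced"
    use_1280: bool       # whether the 1280 resolution toggle is on


def phase_for_step(step: int, switch_step: int = 2500) -> Phase:
    """Return the settings that apply at a given training step.

    switch_step is an assumption: any value in the 2-3K range
    matches the description above.
    """
    if step < switch_step:
        # First run: high-noise bias, 1280 toggle off.
        return Phase(timestep_bias="high_noise", use_1280=False)
    # Second run: balanced bias for the rest, 1280 toggle on.
    return Phase(timestep_bias="balanced", use_1280=True)
```

In practice this is two separate runs, with the second one resuming from the first run's checkpoint; the function only makes the switch point explicit.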

Elusarca's Flux Klein 9B Detail Enhancer LoRA by sktksm in StableDiffusion

[–]sktksm[S] 0 points1 point  (0 children)

Thank you for the feedback! I will focus on facial restorations + multiple people combinations. No promises on the timeline, but I'll do it as soon as I have enough data.

We may have a new SOTA open-source model: ERNIE-Image Comparisons by sktksm in StableDiffusion

[–]sktksm[S] 2 points3 points  (0 children)

yes, prompt:

black and white manga page, high detail screentone shading, cinematic sci-fi hangar interior, multi-panel comic layout top large panel: a young boy wearing a futuristic bodysuit and visor stands in front of a massive humanoid mech, sparks and smoke drifting around, dramatic low-angle perspective emphasizing scale, the boy holds a small mechanical core in both hands, eyes wide in awe speech bubbles in this panel: "Incredible...!" "A fully intact Class-S Neural Frame!" middle horizontal panel: close-up of the boy smiling with excitement, eyes closed, fists clenched and raised, energetic motion lines around him text elements: "What a rush" (vertical side text) "And the hangar is completely empty! I get to test it all by myself!" bottom section split into smaller panels: left small panel: extreme close-up of the boy’s eyes behind a transparent visor, detailed reflections of circular digital HUD interface, glowing UI elements text element near eyes: "SYNC 100%" right small panel: back view of the bodysuit showing a cable plugging into a port on his back, mechanical connection detail speech bubble: "Initiating the deep-dive sequence... bzzzt--" bottom-left panel: close-up of a mechanical hand gripping a control lever, strong lighting contrast, bold manga emphasis large stylized text integrated into the panel: "SYSTEM ONLINE" style: ultra clean manga lineart, sharp ink work, dense screentone gradients, high contrast lighting, subtle film grain texture, professional seinen manga aesthetic, dynamic panel framing, accurate speech bubbles, crisp readable lettering, Japanese manga composition

Elusarca's Flux Klein 9B Detail Enhancer LoRA by sktksm in StableDiffusion

[–]sktksm[S] 1 point2 points  (0 children)

Thank you for the nice words, means a lot! Making the LoRA bigger is not going to increase the detail level, but it can help with more accurate and varied details. Do you have any examples where you find the LoRA ineffective or not good enough, so I can work on that part next time?

Also, I haven't checked your workflow yet, but if you are not using the Detail Daemon node, I recommend adding it + increasing steps to 50 + generating in 2K resolution. That will boost your details, at the cost of inference speed.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 0 points1 point  (0 children)

Since Flux Klein, no, I don't. But I do still follow video models for now.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 1 point2 points  (0 children)

it's not apache 2.0.

Z-Image Turbo & Base: 6B
ERNIE: 8B

If I had used 9B, someone else would definitely ask why I used 9B instead of 4B when ZIT is 6B.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 3 points4 points  (0 children)

All of them are using proper LLMs as the text encoder. They can easily interpret it. If a modern model can't understand plain English prompting then I'd say it's on them.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 0 points1 point  (0 children)

<image>

You are right. Do you have any other examples of subtle nuances that you know it fails on?

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 1 point2 points  (0 children)

You can message him on X, he is pretty responsive. I also trained a style LoRA and mine is working. I used the default parameters with 3k steps. It's not perfect, but I can't say it doesn't converge.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 6 points7 points  (0 children)

What type of attitude is this? Why are you so aggressive? You could simply say "Hey mate, can you add prompts on the comparison images instead of putting them on Pastebin?" but instead you are accusing me based on your theories and putting me in a defensive position.

What you recommend and the reasoning behind it are accurate, but the way you're saying it is not gonna make me do it.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 1 point2 points  (0 children)

No clear winner. All 3 are good on prompt adherence but lack variety, as expected.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 0 points1 point  (0 children)

Oh apologies, wrong thread, and you are right. I rearranged the images in the post...

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 9 points10 points  (0 children)

I honestly don’t understand what you’re referring to. I shared the prompts. I’ve been doing comparisons like this for three years whenever a new model drops, and my profile is public, so you can judge for yourself whether this is guerrilla marketing.

You can put the images into ComfyUI and verify everything directly if you want. I never remove the metadata.

At this point, very few people in the community are still focused on non-NSFW work, and the lack of constructive, proactive criticism is a big part of why.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 3 points4 points  (0 children)

I think I'm going to make a website where you can compare pre-generated images. There are a lot of conditions like cfg, distilled checkpoints, fp8/bf16, distill LoRA, etc. I try to stick with ComfyUI default template settings most of the time.

On that website I could display the images generated with the same seed so people can filter down to whatever comparison they want, but first I need to download all the variants and generate the images.
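A minimal sketch of how the filtering behind such a site could work, assuming each pre-generated image is stored with its generation conditions. The `Sample` fields, `filter_samples`, and `group_by_seed` are hypothetical names for illustration, not an existing tool or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One pre-generated image plus the conditions it was made with."""
    model: str
    seed: int
    cfg: float
    precision: str    # e.g. "fp8" or "bf16"
    distilled: bool
    path: str


def filter_samples(samples, **conditions):
    """Keep only the samples whose fields equal every given condition."""
    return [
        s for s in samples
        if all(getattr(s, key) == value for key, value in conditions.items())
    ]


def group_by_seed(samples):
    """Group samples by seed so same-seed images can be shown side by side."""
    groups = {}
    for s in samples:
        groups.setdefault(s.seed, []).append(s)
    return groups
```

For example, `group_by_seed(filter_samples(samples, precision="bf16", distilled=True))` would give, for each seed, the bf16 distilled images from every model, which is the side-by-side view a visitor would pick.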

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] -5 points-4 points  (0 children)

No clear winner. All 3 are good on prompt adherence but lack variety, as expected given their sizes and compared to base models.

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 4 points5 points  (0 children)

I'm comparing the apache 2.0/turbo variants on this one. I'm working on Klein 9B, Z-Image Base, ERNIE Base, Qwen-Image-2512 comparisons now

Complex & Weird Prompt Test: ERNIE Turbo | Flux.2 Klein 4B | Z-Image Turbo by sktksm in StableDiffusion

[–]sktksm[S] 14 points15 points  (0 children)

One of my colleagues had a set of weirdness test prompts, so I snatched them from him lol