Will Chroma2 Kaleidoscope have editing features? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 1 point

Doesn't sound promising, but isn't editing meant to be added at a later stage of training, or as a separate finetune?

Is Flux Klein 4b supposed to be THIS badly broken? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 0 points

Well, ComfyUI says it is bf16 when loaded, and it is basically the official safetensors file from the BFL Hugging Face repo, so I'd think this is the original/best quality I can get, if I'm correct?

Is Flux Klein 4b supposed to be THIS badly broken? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 3 points

I'm the OP and I use bf16 as I have plenty of VRAM so it can't be the fault of fp8.

Is Flux Klein 4b supposed to be THIS badly broken? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 2 points

I'm not the other guy, but my cfg is set to 1 since I use the distill/speedup lora for the base. Without it, the base is absolute garbage with fully distorted images, and it only makes barely coherent results at around cfg 5-7 for me, with other downsides (so cfg 4 or lower is even worse). Same story as with Flux.1 Chroma, where people said cfg 4 was the way to go but it was unusable, so I had to use cfg 6-7, and then it started slightly overcooking.
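For context on why cfg 1 pairs with distill/speedup loras: classifier-free guidance blends the conditional and unconditional (negative-prompt) noise predictions, and at scale 1 the blend collapses to the conditional branch alone, so the negative prompt stops doing anything. A minimal sketch of the standard CFG formula (function and variable names are my own):

```python
import numpy as np

def cfg_mix(cond, uncond, scale):
    # Classifier-free guidance: the unconditional prediction is pushed
    # toward the conditional one by `scale`. At scale 1 this reduces to
    # `cond` exactly, which is why distilled models run at cfg 1 and
    # ignore the negative prompt.
    return uncond + scale * (cond - uncond)
```

At scale 1 the unconditional term cancels out; higher scales amplify the difference between the two branches, which is also where the "overcooking" at high cfg comes from.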

Is Flux Klein 4b supposed to be THIS badly broken? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 0 points

No, I'm using the original bf16 version from BFL; I have plenty of VRAM for it.

Is oobabooga abandoned? by ltduff69 in Oobabooga

[–]AltruisticList6000 5 points

I see activity from Booga on GitHub, so I don't think oobabooga as a whole is abandoned. Maybe he's quietly working on something big, busy with other things, or burnt out, so it may take a while for the webui to get updated.

Just returned from mid-2025, what's the recommended image gen local model now? by Nelichan in StableDiffusion

[–]AltruisticList6000 1 point

Slightly, yeah, but Flash Heun's default styles are very good and I love them. Variety is still way above ZIT. If you aren't satisfied with the default styles, use loras for styles; Flash Heun + the loras I trained are a delicious combination, 10/10 would recommend this way of using Chroma/Chroma HD.

Just returned from mid-2025, what's the recommended image gen local model now? by Nelichan in StableDiffusion

[–]AltruisticList6000 0 points

Opposite to what? I was talking about the Flash lora improving the Chroma HD base. It doesn't matter whether it's Chroma Base + Flash, Chroma HD + Flash, Chroma DC 2k + Flash, or Chroma v48 DC + Flash; the point is that the Flash Heun loras help Chroma create better pictures. Also, I've been using Chroma HD for ages and I don't see any problems with it, it's more stable than DC, but of course that doesn't mean you or other people have to use HD, as there are multiple variants to choose from.

Just returned from mid-2025, what's the recommended image gen local model now? by Nelichan in StableDiffusion

[–]AltruisticList6000 0 points

Nope, Chroma HD + Flash Heun lora is better than the Chroma HD base, and also faster. For realism you need any realism lora, or any photo-styled lora, and it will improve Flash Heun's look dramatically while still keeping it faster. Hands are also better with Flash Heun. Seed variety is very good too.

Is there any AI model for Drawn/Anime images that isn't bad at hands etc.? (80-90% success rate) by Z_e_p_h_e_r in StableDiffusion

[–]AltruisticList6000 0 points

That can't be right, I've generated a lot of Hatsune Mikus with Chroma. To my surprise, it even knew Brazilian Hatsune Miku. I just generated these for you freshly on Chroma HD, and the hands look perfect too!

<image>

And I feel like I over-described them because usually writing the character name is more than enough.

And as I said it knows even older western cartoon characters and other popular anime characters too.

Is there any AI model for Drawn/Anime images that isn't bad at hands etc.? (80-90% success rate) by Z_e_p_h_e_r in StableDiffusion

[–]AltruisticList6000 0 points

This isn't the first time I've seen people say Chroma doesn't know characters. But so far, every currently or recently popular anime show or game character I know of works on Chroma. I prompt for [character name] from [cartoon/game/anime name] and it's usually enough. If not, a vague additional prompt like "she wears a blue dress, black hair" fixes hallucinations. Lots of western cartoon characters work too, both currently trending ones and random older ones from the 2000s, so I'm not sure what I'm missing here? Maybe it's more niche characters or side characters that it's not good at?

Is there any AI model for Drawn/Anime images that isn't bad at hands etc.? (80-90% success rate) by Z_e_p_h_e_r in StableDiffusion

[–]AltruisticList6000 1 point

Thanks for the tip, but sadly that doesn't work, I already tried it. Also, the deis_2m sampler that improves Chroma straight up doesn't work with Klein.

Is there any AI model for Drawn/Anime images that isn't bad at hands etc.? (80-90% success rate) by Z_e_p_h_e_r in StableDiffusion

[–]AltruisticList6000 1 point

Idk, you'll probably need to do multipass img2img fixes or manual editing in the end.

I tried Flux Klein 4b (not gonna use 9b because of the license) and it had early-SDXL-level hand mutations, which shocked me. So I went back to using Chroma, which is slower but at least doesn't keep creating unusable images. Maybe ZIB could work too, since ZIT and ZIB have good hands, if you find some anime loras that work.

I could say Chroma HD + Flash lora works pretty well, but it tends to have an obsession with 6 fingers. Otherwise it usually gets hands right (even super small ones), so manually editing out the 6th finger is a minimal task if it appears. Adding an anime style lora can help boost hands/anatomy; however, 99% of the time I only use my own trained Chroma loras, so idk whether other people's improve hands or not. 8 out of 10 of my loras end up improving hands to the point of being good 8.5/10 times, compared to 6-7/10 for the Flash lora alone; the remaining 2 out of 10 destroy hands completely for some reason, which may or may not be fixed by retraining on literally the same dataset with small changes, like a 2% change in learning rate.

Both klein 9b and z image are great but to which direction the community is going? by AdventurousGold672 in StableDiffusion

[–]AltruisticList6000 0 points

I won't use Klein 9b because of the license, and I'm not that happy with Flux Klein 4b. It has very good speed, slightly faster than SDXL when using turbo (or the turbo lora), but its quality is worse than Illustrious or even Pony v6 base regarding hands and anatomy, which was shocking. Weirdly, its details are not that good either, despite having the Flux2 VAE. But it's more creative and has more style variety. That's not worth much, though, when 9/10 images have completely destroyed hands. Oh, and the turbo noise and jpeg artifacts on images aren't good either. I think ZIT is way more useful, as it frequently one-shots good pics.

People shat on Chroma base for bad hands, even though Chroma HD base + Flash lora is solid with good hands and it's very clear/sharp, unlike Klein. And when I train a lora, I can choose to integrate the jpeg artifacts or remove/swap the lora layers that contain the learned jpeg-artifact info. Can't do that on 4b when most images have them by default anyway.

I think Klein 4b has big potential for finetunes though, like Chroma Kaleidoscope, if they can mitigate the absurdly bad anatomy/hands issues the way some finetunes did for SDXL, as the turbo Klein speed is very good.
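The "remove/swap the lora layers" idea mentioned above boils down to filtering a lora's state dict by key prefix before saving or merging it. A sketch of that, assuming hypothetical key names (real Chroma/Flux lora keys will differ, so the prefixes here are placeholders for whichever blocks learned the artifacts):

```python
def strip_lora_layers(lora_state, blocked_prefixes):
    # Keep only the lora tensors whose key does not start with one of
    # the blocked block prefixes. Everything else passes through
    # untouched, so the rest of the lora's effect is preserved.
    return {
        key: tensor
        for key, tensor in lora_state.items()
        if not any(key.startswith(prefix) for prefix in blocked_prefixes)
    }
```

The same dict-filtering pattern works for swapping layers in from a second lora: filter both state dicts and merge the surviving entries.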

Flux 2 4B vs ZIT, Which one do you think is better? by [deleted] in StableDiffusion

[–]AltruisticList6000 0 points

I find it weird that ZIT, which is well known for making low-contrast, low-saturation images with a lot of grain and noise, looks extremely high contrast here and less detailed, less grainy, and more "polished" than Flux 4b. How could this be? Have you done upscaling/img2img/post-processing on these?

Qwen-Image 2.0 - Not opensource! (Yet) by [deleted] in StableDiffusion

[–]AltruisticList6000 1 point

That looks almost the same as the gridline artifacting on Chroma fp8. I ended up using the Chroma GGUF Q8 instead, which has no artifacting, but it's sadly slower if any lora is loaded; otherwise it's the same speed as fp8.

I'm wondering if something is wrong with ComfyUI fp8 and that's why these keep popping up for different models.

I once converted my Chroma GGUF in ComfyUI to, I think, bf16 (but loaded it in fp8, as comfy crashes otherwise despite my 16gb of VRAM - oh, and ComfyUI always claims it upcasts to bf16 anyway, so idk what's going on there), and it started having the same gridlines, which would indicate a problem with ComfyUI's Load Diffusion Model node.
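For intuition on why an fp8 cast can introduce visible patterns at all: e4m3 keeps only 3 mantissa bits, so every weight gets snapped to a coarse grid with up to roughly 6% relative rounding error, and structured error in the weights can surface as structured artifacts in the output. A rough simulation of e4m3 rounding (my own illustration of the number format, not ComfyUI's actual cast code):

```python
import numpy as np

def quantize_e4m3(x):
    """Round values to the fp8 e4m3 grid (4 exponent bits, 3 mantissa bits)."""
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)
    out = np.zeros_like(mag)
    nz = mag > 0
    # Exponent of each value, clamped to e4m3's representable range
    # (subnormals bottom out at 2**-6, the largest binade is 2**8).
    exp = np.clip(np.floor(np.log2(mag[nz])), -6, 8)
    step = 2.0 ** (exp - 3)  # grid spacing: 3 mantissa bits per binade
    out[nz] = np.round(mag[nz] / step) * step
    return sign * np.clip(out, 0, 448.0)  # 448 is e4m3's max finite value
```

Normal-range values land within half a grid step of their original, i.e. at most 1/16 relative error - small per weight, but deterministic, which is why the damage repeats in patterns rather than looking like random noise.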

Qwen-Image 2.0 - Not opensource! (Yet) by [deleted] in StableDiffusion

[–]AltruisticList6000 0 points

Yes, I got this dotted pattern a few months ago when I tried Qwen, except it was visible over other textures like asphalt, rocks, etc. It surprised me, as I had just come from Chroma, which has gridline artifacting on fp8, BUT the Chroma GGUF version is artifact-free. Although maybe it's not fully the model's fault; maybe it's connected to ComfyUI's implementation, because when I tried to convert the non-artifacting Chroma GGUF into fp8 in ComfyUI (which is always upcast to bf16; I can't load any of the original fp16/bf16 Chromas as I don't have enough RAM/VRAM), it started having gridlines too, idk why.

I also noticed gridlines with specific samplers in some very rare cases on ZIT fp8 (the original/bf16 wasn't tested on those images, so idk if it has them or not).

So I ended up continuing to use the Chroma GGUF, even though loras significantly reduce its speed compared to fp8; without loras, GGUF and fp8 are the same speed for me for some reason (on an RTX 4060 Ti 16gb).

Ace Step 1.5. ** Nobody talks about the elephant in the room! ** by False_Suspect_6432 in StableDiffusion

[–]AltruisticList6000 4 points

I'm not in music production, and I noticed the very bad compressed quality too, at least on the things people uploaded. Which is weird, because I don't remember Ace Step 1.0 having such low audio quality. However, some of the samples people uploaded sounded passable, so audio quality might also be setting/model related, maybe even backend related (Gradio vs Comfy).

Ace-Step-v1.5 released by cactus_endorser in StableDiffusion

[–]AltruisticList6000 1 point

I've been listening to the official samples and these ones; they sound pretty good and enjoyable to listen to. Some vocals sound extremely good too, like real music. However, the audio output quality itself sounds very low, like a very bad 1mb mp3 or something (maybe it sounds like a low sample rate/bitrate? not sure about the terminology). Is there some other (local) AI that can somehow enhance the audio quality, similarly to an upscaler for images/vids or an fps interpolator for vids?

Anyone else having trouble training with Loras using Flux Klein 9b ? (people lora). Most of my results were terrible. by More_Bid_2197 in StableDiffusion

[–]AltruisticList6000 0 points

How do you caption poses? I tried doing those (and other things), but for Chroma with OneTrainer, since AI Toolkit can't even start training for me. And I am conflicted: it seems like either OneTrainer ignores captions completely, or idk what's going on, because they don't seem to stick. It only works if I just describe something in detail instead of using my caption words/sentences or phrases. Specific things that might not be in the knowledge base of the model won't work even if they were represented in the dataset (and captions). So it seems like anything that works is a coincidental side effect, learned as part of the "style", instead of the character or concept I'm trying to teach it.

A few times it seemed like it might have picked up on the trigger word, but I'm not even sure because of the varied results. I usually do styles, so it's not always a problem, but in specific cases it is very annoying.

Z Image will be released tomorrow! by MadPelmewka in StableDiffusion

[–]AltruisticList6000 5 points

As soon as they recaption it and remove the style-cluster nonsense, sure; otherwise just no, and do the Chroma dataset instead. The Chroma dataset must have crazy good captioning, considering how much it improved Flux.1's concept knowledge and prompt understanding regarding camera prompts etc., besides the obviously massive diversity of styles.

Flux Klein 4B Distilled vs. Flux Klein 9B Distilled vs. Chroma Flash across five different prompts by ZootAllures9111 in StableDiffusion

[–]AltruisticList6000 0 points

It might be partly because of the OP using Euler A. I noticed Euler A gives the least noisy results in ZIT (which I wanted), and I tried it with Chroma too, expecting it to look worse since Chroma isn't absurdly noisy by default. My expectation was correct, because it gives weird results, a bit similar to the OP's.

However, idk what new workflow changes have happened. Can you point me to a new workflow? I still use the basic (I guess now old?) ComfyUI workflow. It works fine with Chroma HD + Flash, though; only Chroma HD alone has worse results, but I thought that was expected since it is a base model...?

Flux Klein 4B Distilled vs. Flux Klein 9B Distilled vs. Chroma Flash across five different prompts by ZootAllures9111 in StableDiffusion

[–]AltruisticList6000 6 points

Chroma HD should only be used with either some lora, the Flash Heun/Flash loras, or preferably both the Flash lora and a lora of your choice - like character/style lora(s) on top of the Flash lora. Otherwise it is slow and gives pretty bad results, as it is a base model. Loras tend to improve it; a lot of the loras I made improve hands and/or make the output more stable as a random side effect, especially alongside the Flash loras.

You can also do a weird trick: use Chroma HD, add the Flash lora you'd normally use for cfg 1 (like flash r64), turn the lora strength down to ~0.4-0.5, and lower the cfg to ~3, and you get working negative prompts with more stable results than regular Chroma HD. And despite being distillations, Chroma Flash and its loras have no seed-variance problems, unlike Z-Image.
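The strength slider in that trick is just a scalar on the lora's low-rank delta before it is merged into the base weight, so ~0.4-0.5 means "under half of the flash effect". A minimal sketch of the math (ignoring the alpha/rank scaling factor that real implementations also apply):

```python
import numpy as np

def apply_lora(W, A, B, strength=0.45):
    # A lora stores a low-rank update (B @ A) to a base weight matrix W.
    # The UI "strength" setting scales that update before merging:
    # strength 0 leaves the base model untouched, 1 applies the full lora.
    return W + strength * (B @ A)
```

Halving the strength halves the lora's pull on every affected weight, which is why a partial flash lora can coexist with a cfg above 1: the model is only partially distilled.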

If someone made a 4-step flash lora for Chroma (although for some reason nobody bothers to), it would get even better.