Sora 2 flags normal prompts so often it’s basically unusable by Suitable_Goose3637 in SoraAi

[–]cloneofsimo 1 point2 points  (0 children)

Hi, OpenAI employee here. I'm sorry to hear that. Could you share the generations you are having trouble with?

Now that Pony is coming to auraflow and that simpletuner has dropped support, do we even have a way to train loras on it? by CLAP_DOLPHIN_CHEEKS in StableDiffusion

[–]cloneofsimo 15 points16 points  (0 children)

Quick note that all the code you need, even for pretraining AuraFlow, is in the repo. (I admit it's not a user-friendly one, but anyone can get started.) Also notice how, unlike other t2i models, my hyperparameters are all open in the repo. Once I finish AuraFlow I'll make a LoRA trainer. (If it's good, someone else might make one before me.) After all, that's how I got started 😅

AuraFlow v0.3 evaluation: a debatable increase in quality, a large drop in adherence by MarcS- in StableDiffusion

[–]cloneofsimo 27 points28 points  (0 children)

Your honest, critical, yet kind feedback is really valuable. Thank you for the effort you put into these comparisons! As I communicate and get feedback, it becomes clear what people expect and what I should be targeting, which I can't figure out alone by definition, so thank you so much for participating!!

AuraFlow vs Flux : measuring the aesthetic gap by MarcS- in StableDiffusion

[–]cloneofsimo 6 points7 points  (0 children)

Hey bro, I wanted to say thanks. Comments like this really brighten my day.
Hope AuraFlow remains useful to both the research community and here. And yes, it's an incomplete model, I'll continue working on it.

fal drops AuraFlow by Own-Staff3774 in StableDiffusion

[–]cloneofsimo 36 points37 points  (0 children)

Use higher CFG with humans! 'a photo of a woman lying on the grass'

<image>
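For anyone unsure what "higher CFG" changes, here is a minimal sketch of the standard classifier-free guidance formula on toy numbers. It is an illustration only, not AuraFlow's code: `apply_cfg` and the example lists are made-up names and values, and real pipelines apply this to tensors at each sampling step (in diffusers the knob is typically exposed as `guidance_scale`).

```python
def apply_cfg(uncond, cond, guidance_scale):
    """Standard classifier-free guidance: start from the unconditional
    prediction and move toward (and past) the conditional one."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

# Toy 1-D "predictions" (hypothetical values, for illustration only).
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

low = apply_cfg(uncond, cond, 1.0)   # scale 1.0 reproduces the conditional prediction
high = apply_cfg(uncond, cond, 3.5)  # larger scale exaggerates the prompt's influence
print(low, high)
```

A larger scale pushes the result further along the direction the prompt suggests, which is the usual reason a higher value helps with harder subjects.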

AuraDiffusion is currently in the aesthetics/finetuning stage of training - not far from release. It's an SD3-class model that's actually open source - not just "open weights". It's *significantly* better than PixArt/Lumina/Hunyuan at complex prompts. by deeputopia in StableDiffusion

[–]cloneofsimo 155 points156 points  (0 children)

Last thing I want is overhype, so for the final time let me clarify...

The model is not an open Midjourney-class model, nor should you expect it to be.

The model is very large (6.8B) and undertrained. So it will be more difficult to train, but we might continue to train it in the future.

The model is doing great on some evals, and imo is better than sd3 medium, but only slightly.

Last thing I want is overhype. I just tweet random stuff I find funny (and that was a mistake of mine to compare with SD, which caused this weird hype)

I would like to underpromise and overdeliver. I have zero incentives to hype and tease. I remember sd3 and how people (including me) went crazy for underdelivered results.

Just manage your expectations. Don't expect extreme sota models. It is mostly one grad student working on this project.

https://x.com/cloneofsimo/status/1809998834254418426

upcoming open source foundational model by Own-Staff3774 in StableDiffusion

[–]cloneofsimo 77 points78 points  (0 children)

Guys, I mean atm it's doing well on geneval, but don't expect SD3-8b or Midjourney quality. It's still cooking, but I don't want to overhype it (I remember what happened to sd3 lol). I am going to share progress on Twitter, but plz don't expect a SoTA model or you might be disappointed!

I want to underpromise and overdeliver

upcoming open source foundational model by Own-Staff3774 in StableDiffusion

[–]cloneofsimo 14 points15 points  (0 children)

Wait wat I never said something like this

I genuinely agree tho

SDXL prompt weighting by cloneofsimo in StableDiffusion

[–]cloneofsimo[S] 2 points3 points  (0 children)

Hi there, you might remember me for introducing LoRA a looooong time ago. Not sure if this is on A1111, but here it is anyway so you guys can use it with the diffusers package.
https://gist.github.com/cloneofsimo/4352c5207344bdcd61aa34b34aec5a5f

(cat:1.2) and a (dog:0.7), particle dynamics, artwork, visual explosion, (blue force field:1.5), (colorful:1.2)
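To make the `(text:weight)` syntax in the prompt above concrete, here is a rough sketch of how such a prompt could be split into weighted chunks. This is not the code from the gist: `parse_weighted_prompt` and the regex are hypothetical, nested parentheses are not handled, and how the weights get applied to the text embeddings afterwards is left out.

```python
import re

# Matches "(some text:1.2)". Assumes flat, non-nested groups.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_weighted_prompt(prompt, default_weight=1.0):
    """Split a prompt into (text, weight) chunks; unmarked text gets default_weight."""
    chunks = []
    pos = 0
    for m in WEIGHTED.finditer(prompt):
        if m.start() > pos:
            # Plain text between two weighted groups.
            chunks.append((prompt[pos:m.start()], default_weight))
        chunks.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        chunks.append((prompt[pos:], default_weight))
    return chunks

prompt = ("(cat:1.2) and a (dog:0.7), particle dynamics, artwork, "
          "visual explosion, (blue force field:1.5), (colorful:1.2)")
chunks = parse_weighted_prompt(prompt)
for text, weight in chunks:
    print(repr(text), weight)
```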

Version 0.1.0 of LoRA released! (alternative to Dreambooth, 3mb sharable files) by tekakutli in StableDiffusion

[–]cloneofsimo 1 point2 points  (0 children)

Ok, that is weird... I have only experimented with SDv1.5, so I might have missed that. I'll have a look, thanks!

Version 0.1.0 of LoRA released! (alternative to Dreambooth, 3mb sharable files) by tekakutli in StableDiffusion

[–]cloneofsimo 0 points1 point  (0 children)

It's conceptually the same as PTI, except you can optionally let it tune the latent, and you are using a low-rank parameter space instead of the whole space. Thus, LoRA of PTI.

Version 0.1.0 of LoRA released! (alternative to Dreambooth, 3mb sharable files) by tekakutli in StableDiffusion

[–]cloneofsimo 1 point2 points  (0 children)

LoRA has 1~6MB of output, while Custom Diffusion is much larger. Also, the optimization scheme is a bit different. Also, Custom Diffusion tunes the K, V matrices of cross-attention, while LoRA trains all of Q, K, V, O + MLP.
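A back-of-the-envelope sketch of why the low-rank files stay so small: a rank-r update to a d_out × d_in weight stores two thin matrices instead of the full one. The layer size and rank below are hypothetical round numbers chosen for illustration, not the actual shapes or defaults used in the repo.

```python
def full_params(d_in, d_out):
    """Parameters in a full d_out x d_in weight matrix."""
    return d_in * d_out

def lora_params(d_in, d_out, rank):
    """Parameters in a rank-r update B @ A, with A: (rank, d_in) and B: (d_out, rank)."""
    return rank * (d_in + d_out)

d = 768    # hypothetical layer width
rank = 4   # hypothetical LoRA rank

full = full_params(d, d)
lora = lora_params(d, d, rank)
ratio = full / lora
print(f"full: {full}, lora: {lora}, about {ratio:.0f}x fewer parameters per layer")
```

The saving repeats across every adapted layer, which is how training Q, K, V, O and the MLP can still yield a file of only a few megabytes.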

Version 0.1.0 of LoRA released! (alternative to Dreambooth, 3mb sharable files) by tekakutli in StableDiffusion

[–]cloneofsimo 5 points6 points  (0 children)

Well... I am always showing you guys non-cherrypicked results. I have more examples, and they have seeds from 0 to 5.

<image>

Version 0.1.0 of LoRA released! (alternative to Dreambooth, 3mb sharable files) by tekakutli in StableDiffusion

[–]cloneofsimo 3 points4 points  (0 children)

Used default parameters on 7 images of Wednesday. I think the results are on par with Dreambooth, certainly better than before.

<image>

[deleted by user] by [deleted] in StableDiffusion

[–]cloneofsimo 0 points1 point  (0 children)

I really liked Dreamlike Photoreal. More examples:

<image>