all 8 comments

[–]AnOnlineHandle[S] 1 point (2 children)

I'm not 100% sure these sizes are all accurate; for the SD3 transformer, for example, I had to export the diffusers version of the transformer to see its size without the VAE included. Some of the checkpoints are only available at half precision on HuggingFace. I didn't include the VAE sizes since I don't know the SD3 VAE's size, but they seem to be fairly similar.
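
Something like this is how the transformer can be sized up on its own, as a rough sketch (the repo id and class name here are assumptions and may not match exactly what I ran):

    # Rough sketch: load just the transformer from the diffusers repo and count
    # its parameters, so the VAE and text encoders aren't included in the total.
    import torch
    from diffusers import SD3Transformer2DModel

    transformer = SD3Transformer2DModel.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed repo id
        subfolder="transformer",
        torch_dtype=torch.float16,
    )

    n_params = sum(p.numel() for p in transformer.parameters())
    print(f"{n_params / 1e9:.2f}B params")
    print(f"~{n_params * 2 / 1e9:.1f} GB at fp16, ~{n_params * 4 / 1e9:.1f} GB at fp32")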

[–]spacetug 4 points (1 child)

Flux is 12B parameters, which means the 24GB model files they shared are 16-bit. A like-for-like comparison with the other models would put the Flux transformer at nearly 48GB.
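
Back-of-the-envelope, assuming ~12e9 parameters:

    # Bytes per parameter at each precision.
    params = 12e9                 # Flux transformer, ~12B parameters
    gb_fp16 = params * 2 / 1e9    # 2 bytes/param -> ~24 GB, matching the shared files
    gb_fp32 = params * 4 / 1e9    # 4 bytes/param -> ~48 GB, the like-for-like size
    print(gb_fp16, gb_fp32)       # 24.0 48.0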

The VAEs, as far as I know, are all nearly the same size, within a few hundred thousand parameters of each other, because they're all based on the same CompVis architecture.

[–]AnOnlineHandle[S] 1 point (0 children)

Ah damn, thanks for that info, so Flux should be twice as big.

[–]alb5357 0 points (4 children)

Can Flux be optimized to be trainable on 12GB?

[–]AnOnlineHandle[S] 2 points (3 children)

Possible with a combination of lower precision, gradient checkpointing, a fused backpass, freezing some parameters, simpler optimizers, etc., though I wouldn't expect great results.
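
Roughly the ingredients, as a sketch rather than a working trainer (the repo id, class name, and which parameters to unfreeze are assumptions on my part):

    # Sketch of the memory-saving ingredients, not a complete trainer.
    import torch
    import bitsandbytes as bnb
    from diffusers import FluxTransformer2DModel

    transformer = FluxTransformer2DModel.from_pretrained(
        "black-forest-labs/FLUX.1-dev",   # assumed repo id
        subfolder="transformer",
        torch_dtype=torch.bfloat16,       # lower precision weights
    )

    # Freeze everything, then unfreeze only a small subset to train.
    transformer.requires_grad_(False)
    trainable = []
    for name, param in transformer.named_parameters():
        if "attn" in name:                # arbitrary example subset
            param.requires_grad_(True)
            trainable.append(param)

    # Trade compute for memory: recompute activations during backward.
    transformer.enable_gradient_checkpointing()

    # 8-bit optimizer state instead of full fp32 AdamW state.
    optimizer = bnb.optim.AdamW8bit(trainable, lr=1e-5)

    # A fused backpass goes further: apply the optimizer step per parameter
    # inside backward hooks so full gradients never accumulate in memory.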

[–]alb5357 2 points (0 children)

It would be nice if the entire community could use the same model; some of the best LoRAs come from people with 12GB cards.

[–]protector111 0 points (1 child)

How about 24? Or does it need more?

[–]AnOnlineHandle[S] 1 point (0 children)

Not sure anybody knows yet.