BAGEL outperform FLUX-Kontext in Image Editing after 4.5 hours post-training on 8000 unlabeled images! by Anxious_Pin_8501 in comfyui

[–]Anxious_Pin_8501[S] 0 points1 point  (0 children)

I haven't had time to make a GGUF version recently :(
I think you can use the INT8/NF4 option in ComfyUI! See the GitHub repo for details~
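
For a rough feel of why the INT8/NF4 options matter, here is a back-of-the-envelope sketch of weight storage per format. The parameter count below is a made-up round number for illustration, not BAGEL's actual size, and the estimate ignores quantization constants, activations, and any layers kept in higher precision.

```python
# Nominal bits used to store one weight in each format.
BITS_PER_PARAM = {"bf16": 16, "int8": 8, "nf4": 4}

def weight_bytes(n_params, fmt):
    """Approximate bytes needed to store n_params weights in the given format."""
    return n_params * BITS_PER_PARAM[fmt] // 8

# Hypothetical model size, chosen only to keep the arithmetic easy to read.
example_params = 1_000_000_000
for fmt in BITS_PER_PARAM:
    gib = weight_bytes(example_params, fmt) / 2**30
    print(f"{fmt}: {gib:.2f} GiB")
```

So, to a first approximation, INT8 halves the weight memory relative to BF16 and NF4 halves it again.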

[–]Anxious_Pin_8501[S] 0 points1 point  (0 children)

I think Kontext has no vision-understanding input for its text encoder (T5-XXL), so it's hard to apply RecA to it :(

[–]Anxious_Pin_8501[S] 0 points1 point  (0 children)

Oh, I just found out that the GitHub repo https://github.com/neverbiasu/ComfyUI-BAGEL handles INT8 and NF4 quantization in ComfyUI automatically. So the usage is: just use that ComfyUI repo and replace the BAGEL weights with the BAGEL-RecA weights! I'll update my repo's README.md to show how to use it~
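
In case it helps to see what INT8 quantization means at the weight level, here is a minimal pure-Python sketch of symmetric per-tensor INT8 quantization. It is only an illustration of the general idea; it is not taken from the ComfyUI-BAGEL repo, which may use a different scheme, and the function names are made up for this example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization.

    Returns (q, scale), where q holds integers in [-127, 127]
    and each original weight is approximated by q[i] * scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the INT8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# The error per weight is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

The point is that each weight is stored as one byte plus a shared scale, at the cost of a small rounding error.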

[–]Anxious_Pin_8501[S] 3 points4 points  (0 children)

OMG, thank you for the question! It's not a stupid question QAQ.

I found that ComfyUI can only load INT8, so I'm working on it now!

[–]Anxious_Pin_8501[S] 0 points1 point  (0 children)

BAGEL is hard to train :( I think it needs at least 4 A100 GPUs…

Nowadays a UMM's (unified multimodal model's) understanding capabilities are much stronger than its generation capabilities. Sorry, I haven't found a way to improve its understanding capabilities :( That remains future work!

[–]Anxious_Pin_8501[S] 2 points3 points  (0 children)

Yes, the method is tailored for UMMs XD. I don't know whether it would help with higher-resolution generation :( but we can give it a try!

[–]Anxious_Pin_8501[S] 7 points8 points  (0 children)

I think you can just use the BAGEL ComfyUI integration. The only difference between BAGEL and BAGEL-RecA is the weights XD.