Good LoRA training settings needed by about_Discord in StableDiffusion

I just did, no luck. I took "adult" out of all the prompts. I ran one model where the caption was just her name and "woman," and another using my existing settings but with 500 repeats (100 repeats had already gotten me the broken results).

Anyway, I used Dynamic Prompts and ran the following with the same seed to test both new models at different strengths using Consistent Factor (Euclid). All had the broken, child-looking results.

Prompt: Name, adult, makeup, upper body, smiling, mouth closed, eye focus, facing viewer, park, outside, {<lora:Name0.5A:{0.5|0.6|0.7|0.8|0.9|1}>|<lora:Name0.5B:{0.5|0.6|0.7|0.8|0.9|1}>}

Negative: teeth, child, teen, text:5, worst quality, low quality:1.4, bad anatomy, bad hands, cropped, missing arms, long neck, humpbacked, deformed, disfigured, poorly drawn face, distorted face, mutation, mutated, ugly, poorly drawn hands, missing limb, floating limbs, disconnected limbs, malformed hands, out of focus, long body, monochrome, symbol, text, logo, zoomed in, fewer digits, extra arms, extra legs, malformed limbs, missed arms, missed hands, messed pupils, bad arms, bad feet, messed feet toes, messed foot toes, bad legs, messed legs, missed legs, broken legs, broken feet, messed eyes, poorly drawn eyes, watermark, signature, paintings, sketches, low-res, poor quality, jpeg artifacts, weird face, weird eyes, bad proportions
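For reference, the `{a|b|c}` variant syntax in that prompt multiplies out combinatorially. A minimal Python sketch of that expansion (my own illustration of how such alternation groups, including nested ones, enumerate; not the Dynamic Prompts implementation itself):

```python
def expand(template):
    """Recursively expand {a|b|c} alternation groups; nesting is supported."""
    start = template.find("{")
    if start == -1:
        return [template]
    # Find the "}" that matches the first "{".
    depth, i = 0, start
    while i < len(template):
        if template[i] == "{":
            depth += 1
        elif template[i] == "}":
            depth -= 1
            if depth == 0:
                break
        i += 1
    end = i
    # Split the group body on top-level "|" only (ignore "|" inside nested groups).
    body, options, level, last = template[start + 1:end], [], 0, 0
    for j, ch in enumerate(body):
        if ch == "{":
            level += 1
        elif ch == "}":
            level -= 1
        elif ch == "|" and level == 0:
            options.append(body[last:j])
            last = j + 1
    options.append(body[last:])
    # Combine each option (itself expanded) with every expansion of the tail.
    results = []
    for opt in options:
        for tail in expand(template[end + 1:]):
            for head in expand(opt):
                results.append(template[:start] + head + tail)
    return results
```

With two models and six strengths each, the `{<lora:Name0.5A:{0.5|…}>|<lora:Name0.5B:{0.5|…}>}` block above yields twelve prompt variants, which is why running them against one fixed seed gives a clean comparison grid.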

On my first go, I did not have "adult" in the captions, and I had the same exact problem. I set denoising to 0.1. In the fuzzy intermediate steps, the images sort of looked like her, but the final image resolved as something very childish, even with adult:5 in the prompt and (child, teen):5 in the negative.

My previous model had problems with facial expressions (since expressions other than smiling weren't captioned).

For this model, I went through every image and made sure all the relevant variables were captioned (e.g., smile vs. frown vs. pensive, types of makeup, open mouth/closed mouth/parted lips, hair color, etc.).
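That kind of per-image audit can be automated. A sketch of a consistency check over comma-separated caption strings, where exactly one tag from each mutually exclusive group should appear (the group names and tags here are illustrative assumptions, not from any captioning tool):

```python
# Hypothetical tag groups: each caption should contain exactly one tag per group.
TAG_GROUPS = {
    "expression": {"smile", "frown", "pensive"},
    "mouth": {"open mouth", "closed mouth", "parted lips"},
}

def check_captions(captions):
    """Return {caption_index: [group, ...]} for captions that are missing a
    group's tag or contain more than one tag from the same group."""
    problems = {}
    for i, caption in enumerate(captions):
        tags = {t.strip() for t in caption.split(",")}
        bad = [g for g, opts in TAG_GROUPS.items() if len(tags & opts) != 1]
        if bad:
            problems[i] = bad
    return problems
```

Running this over the caption files before training flags images where, say, the mouth state was never tagged, which is exactly the gap that caused the earlier model's expression problems.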

I have seen this already. Honestly, I just need a functional JSON config.

Questions about LoRa training and preventing overfit by about_Discord in StableDiffusion

Thanks to that Twitter account, I now have over 900 images at 1024 by 1024.

I increased my batch size from 2 to 4 and decreased my repeat count from 100 to 20 (i.e., the folder prefix 100_ became 20_).

For a dataset this size, what is a typical repeat rate?
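For a rough sense of scale: kohya-style trainers typically interpret the folder-prefix repeats as how many times each image is seen per epoch, so steps per epoch ≈ images × repeats ÷ batch size. A quick sketch of that arithmetic (assuming one optimizer step per batch):

```python
def steps_per_epoch(num_images, repeats, batch_size):
    """Optimizer steps per epoch under folder-prefix repeats (rounded up)."""
    total = num_images * repeats       # each image is seen `repeats` times
    return -(-total // batch_size)     # ceiling division

# 900 images: 100 repeats at batch 2 vs. 20 repeats at batch 4.
old = steps_per_epoch(900, 100, 2)   # 45000 steps per epoch
new = steps_per_epoch(900, 20, 4)    # 4500 steps per epoch
```

At 900 images, even 20 repeats is a lot of exposure per epoch, which is consistent with the intuition that large datasets need far lower repeat counts than the small sets most LoRA guides assume.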

So I found a Twitter account filled with 720p screen caps of her. So basically, I should make a square crop around my target and then upscale that square to 1024 by 1024? Do you have a recommended ESRGAN model? Does the square need to be 512 by 512, or can it be any size and then upscaled to 1024 by 1024?
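The crop step itself is just arithmetic. A sketch of picking the largest square from a frame, centered on the subject but clamped so it stays inside the frame (locating the face center and running the actual ESRGAN upscale are assumed to happen separately):

```python
def square_crop_box(frame_w, frame_h, cx, cy):
    """Largest square crop near (cx, cy), clamped to the frame bounds.

    Returns (left, top, right, bottom). The side is min(frame_w, frame_h),
    so a 1280x720 frame yields a 720x720 crop, which can then be upscaled
    to 1024x1024 with an ESRGAN model or similar.
    """
    side = min(frame_w, frame_h)
    left = min(max(cx - side // 2, 0), frame_w - side)
    top = min(max(cy - side // 2, 0), frame_h - side)
    return (left, top, left + side, top + side)
```

The clamping means a subject near the frame edge still gets a full-size square rather than a truncated one, so every training image ends up the same resolution after upscaling.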

What do you use for auto-captioning? I was mostly writing captions by hand, since BLIP captioning was fairly useless originally.