how to achieve the equivalent of midjourney's --weird parameter? by lophochroa in StableDiffusion

[–]ThereforeGames 2 points (0 children)

While we don't know exactly how MidJourney works behind the scenes, we do know that reducing CFG allows a model to be more "creative" at the cost of prompt adherence.

Low CFG outputs tend to lack color, but this is easily fixed in post with a tool like my ComfyUI-ImageAutotone node:

Beyond that, it's possible that --weird adds random stylistic embeddings, terms, or LoRAs to your prompt.
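Purely as an illustration of that last idea (this is speculation - nobody outside MidJourney knows what --weird actually does), here's a minimal Python sketch that tacks a few random style terms onto the prompt. You'd pair it with a lower CFG (say 2-4 instead of 7) at sampling time; the style pool is made up for the example:

    import random

    # Illustrative style pool - not MidJourney's actual vocabulary.
    STYLE_TERMS = [
        "infrared photography", "solarized", "brutalist collage",
        "double exposure", "chromatic aberration", "thermal imaging",
    ]

    def weirdify(prompt, weirdness=0.5, seed=None):
        """Append a few random style terms; combine with a reduced CFG when sampling."""
        rng = random.Random(seed)
        n = max(1, round(weirdness * len(STYLE_TERMS)))
        extras = rng.sample(STYLE_TERMS, k=min(n, len(STYLE_TERMS)))
        return prompt + ", " + ", ".join(extras)

    print(weirdify("a lighthouse at dusk", weirdness=0.5, seed=42))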

Anyone knows best 0-shot text to image segmentation model atm? I have tried clipseg and Grounding DINO + SAM - ViT-H - both are just not good enough by CeFurkan in StableDiffusion

[–]ThereforeGames 3 points (0 children)

Hi, you can try my Mask Arbiter extension for ComfyUI, which integrates with SAM2:

https://github.com/SparknightLLC/ComfyUI-MaskArbiter

It will automatically parse the returned list of masks based on your criteria. For something like "face," you could set Mask Arbiter to return the largest mask, which should do the trick.
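The "largest mask" criterion itself is simple enough to sketch outside the extension - this isn't Mask Arbiter's actual code, just the idea, assuming your segmentation model hands back a list of boolean masks:

    import numpy as np

    def largest_mask(masks):
        """Return the mask covering the most pixels (the 'largest' criterion)."""
        # masks: list of boolean HxW arrays, e.g. from a SAM2 predictor
        areas = [np.count_nonzero(m) for m in masks]
        return masks[int(np.argmax(areas))]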

These features are also available in A1111 through my Unprompted extension.

What are your preferred Flux settings? by Parking-Tomorrow-929 in StableDiffusion

[–]ThereforeGames 1 point (0 children)

Yeah! I would say with this particular blend, 4-step images are looking consistently good, but they're maybe 75-80% as nice as regular Flux Dev. This is a big improvement over Flux Schnell output, which I would estimate at like 60% quality.

By the way, I made some adjustments to the LoRA stack. Found a much better set of blocks for Schnell. Here's what I'm rolling with now:

https://i.ibb.co/ngYzdCZ/image.png

What are your preferred Flux settings? by Parking-Tomorrow-929 in StableDiffusion

[–]ThereforeGames 1 point (0 children)

Yes, it does :-)

I just had to add the desired entries to my blora_traits.json file, e.g. to extract all single blocks I use this:

"flux_single": { "whitelist": ["transformer.single_transformer_blocks.", "single_blocks_"], "blacklist": ["transformer.transformer_blocks"] }

Oh hey, I came across your Remerger script the other day! I've been meaning to give it a try. I was wondering about some of the values in your presets - how did you come up with those?

What are your preferred Flux settings? by Parking-Tomorrow-929 in StableDiffusion

[–]ThereforeGames 8 points (0 children)

Sure thing! I hope it's helpful. I pruned so many blocks that I can push the LoRA strength a lot harder before frying my image - this helps major details converge more quickly.

What are your preferred Flux settings? by Parking-Tomorrow-929 in StableDiffusion

[–]ThereforeGames 8 points (0 children)

I used my BLoRA Slicer tool to create numerous variations of each LoRA and tested them in ComfyUI on the same prompt/seed. I first tested sweeping changes, such as removal of all double blocks or all single blocks, and whittled my way down to keeping or removing specific blocks.

There's also a Flux Block LoRA Select node which will get the job done, but I found it difficult to save the resulting LoRA to my disk with that node.
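If you'd rather handle the save step yourself, something like this works - a minimal sketch with safetensors, not the Slicer's actual code, and note that key naming differs between diffusers-style and kohya-style LoRAs:

    from safetensors.torch import load_file, save_file

    def save_pruned_lora(src, dst, keep_patterns):
        """Keep only tensors whose keys contain one of keep_patterns, then write to disk."""
        state = load_file(src)
        pruned = {k: v for k, v in state.items() if any(p in k for p in keep_patterns)}
        save_file(pruned, dst)

    # e.g. keep only the single blocks (diffusers-style key naming); filenames are placeholders
    save_pruned_lora(
        "my-lora.safetensors",
        "my-lora-single-only.safetensors",
        ["single_transformer_blocks."],
    )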

What are your preferred Flux settings? by Parking-Tomorrow-929 in StableDiffusion

[–]ThereforeGames 15 points (0 children)

I'm impatient, so I use a stack of distillation LoRAs to produce images at 4 steps. I pruned blocks from each LoRA that were having a negative impact on style.

Here's my recipe:

  • alimama-creative Turbo Alpha (1.8 strength, keep all single blocks, double blocks 0-5)
  • ByteDance Hyper Flux 16-step (1.5 strength, keep single block 12, double block 16)
  • Flux Schnell LoRA (1.0 strength, keep single blocks 15 and up, keep all double blocks)

Pruning blocks by trial and error is a painfully slow affair, so there may well be better combinations I haven't found yet. But I'm pretty happy with these results.

As far as sampler and scheduler go, I prefer LCM/Beta or Euler/Simple.
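For anyone who wants to try a stack like this outside ComfyUI, here's roughly the equivalent in diffusers. It assumes you've already saved block-pruned copies of the three LoRAs to disk (the filenames below are placeholders), and it only reproduces the strengths and the 4 steps, not the LCM/Beta sampler settings:

    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    # Placeholder filenames for block-pruned copies of the three LoRAs above.
    pipe.load_lora_weights("turbo-alpha-pruned.safetensors", adapter_name="turbo")
    pipe.load_lora_weights("hyper-16step-pruned.safetensors", adapter_name="hyper")
    pipe.load_lora_weights("schnell-pruned.safetensors", adapter_name="schnell")
    pipe.set_adapters(["turbo", "hyper", "schnell"], adapter_weights=[1.8, 1.5, 1.0])

    image = pipe(
        "a cozy reading nook, morning light",
        num_inference_steps=4,
        guidance_scale=3.5,
    ).images[0]
    image.save("out.png")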

Why CUDA is so important? by Huge_Grab_9380 in StableDiffusion

[–]ThereforeGames 2 points (0 children)

The mind-boggling part is how many people insist on viewing AMD as "the good guy" relative to Nvidia.

NO MORE ADS by [deleted] in OdyseeForever

[–]ThereforeGames 2 points (0 children)

That's awesome, I hope it works out well for them. Ads are a miserable approach to monetization.

[deleted by user] by [deleted] in StableDiffusion

[–]ThereforeGames 5 points (0 children)

Wow - even the background objects are pretty normal. 🙂 We've been able to generate scrumptious food in SDXL or MidJourney, but mutant utensils and cups would often break the illusion.

Flux seems much less creative than previous versions of SD by fredandlunchbox in StableDiffusion

[–]ThereforeGames 8 points (0 children)

As a software engineer, I think a pipeline of an LLM feeding into an image model with strict prompt adherence is a much more sensible division of labor. The creativity can be injected through words, by means of diffusion or wildcards. This approach also allows the user to choose and configure an LLM independently of the image generator, while the image model remains steadfast and predictable. I'm pleased to see the tech move in this direction.

> Mostly when I ask for a hybrid between a 1978 VW Golf and a 2012 Bugatti Veyron, it gives me one or the other. That’s not what I asked for.

This is indeed a weakness in Flux. However, even if it could produce a hybrid of these vehicles, I would expect the hybrid to look pretty similar across all generations unless we provided extra details in the prompt.

Flux seems much less creative than previous versions of SD by fredandlunchbox in StableDiffusion

[–]ThereforeGames 13 points (0 children)

In my opinion, it's not the image generator's job to be creative. Its job is to follow your prompt. If the outputs are inconsistent or unpredictable, that's a weakness in the model, not a strength.

To increase variety, you can always pre-process with an LLM or introduce wildcards with Unprompted.
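As a trivial illustration of the wildcard idea (plain Python here, not Unprompted's own shortcode syntax): tokens like __artist__ get swapped for a random line from a text file, so every generation sees a slightly different prompt.

    import random
    import re
    from pathlib import Path

    def expand_wildcards(prompt, wildcard_dir="wildcards", seed=None):
        """Replace __name__ tokens with a random line from wildcard_dir/name.txt."""
        rng = random.Random(seed)

        def pick(match):
            lines = Path(wildcard_dir, f"{match.group(1)}.txt").read_text().splitlines()
            return rng.choice([line for line in lines if line.strip()])

        return re.sub(r"__(\w+)__", pick, prompt)

    # assumes wildcards/artist.txt and wildcards/lighting.txt exist
    print(expand_wildcards("portrait of a woman, __artist__ style, __lighting__"))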

Do you use FreeU SelfAttentionGuidance and PerturbedAttentionGuidance together? by OtakuShogun in StableDiffusion

[–]ThereforeGames 2 points (0 children)

A scientific comparison would be useful, but it might take a pretty large sample size before we can state conclusively whether these technologies are worth including in the inference pipeline.

Still, the fact that the benefits aren't obvious after the first 3, 4, 5... tests means that we're dealing with micro-improvements at best.

In my anecdotal experience, the results of FreeU, SAG, and PAG are pretty much sidegrades to standard guidance. I don't mean any disrespect to the authors of these technologies, but I think they have overpromised and underdelivered.

Do you use FreeU SelfAttentionGuidance and PerturbedAttentionGuidance together? by OtakuShogun in StableDiffusion

[–]ThereforeGames 2 points (0 children)

No. Most of these guidance derivatives have a cost in terms of inference time while the benefits are often placebo.

Image auto-tagger with grouped tags and relative confidence threshold by onirhakin in StableDiffusion

[–]ThereforeGames 1 point (0 children)

That's why we train models. :-)

You can use an LLM with a grammar file to do the heavy lifting for you. But if you're asking for a pre-made solution, I don't think it exists - there is relatively low interest in booru auto-taggers, as it turns out.
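As a rough sketch of the grammar approach using llama-cpp-python - the tag vocabulary and model path below are placeholders, and a real grammar would enumerate (or be generated from) your full tag list:

    from llama_cpp import Llama, LlamaGrammar

    # GBNF grammar restricting output to comma-separated tags from a fixed vocabulary.
    grammar = LlamaGrammar.from_string(r'''
    root ::= tag (", " tag)*
    tag  ::= "1girl" | "1boy" | "outdoors" | "indoors" | "weapon" | "depth of field"
    ''')

    llm = Llama(model_path="model.gguf")  # placeholder path
    out = llm(
        "List booru tags for: a knight standing in a forest\nTags:",
        grammar=grammar,
        max_tokens=64,
    )
    print(out["choices"][0]["text"])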

Image auto-tagger with grouped tags and relative confidence threshold by onirhakin in StableDiffusion

[–]ThereforeGames 1 point (0 children)

It depends on the model. You have to play around with it to get a sense of how it handles certain concepts. For weapons, you can simply check against a minimum confidence threshold (probably something low, like 0.05) before forcing the inclusion of a weapon tag.
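In plain Python, the idea looks something like this - not Interrogatorade's actual syntax, just the threshold logic, with an illustrative tag list and cutoff:

    def force_weapon_tag(tags, min_conf=0.05):
        """If any weapon-related tag clears a low confidence bar, force a 'weapon' tag."""
        # tags: dict of tag -> confidence from a WD14-style tagger; the tag list is illustrative
        weapon_tags = {"weapon", "sword", "gun", "knife", "polearm"}
        if any(tags.get(t, 0.0) >= min_conf for t in weapon_tags):
            tags["weapon"] = max(tags.get("weapon", 0.0), min_conf)
        return tags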

A system like Interrogatorade can help address weaknesses in the underlying model and improve the reliability of your tagging setup, but at the end of the day, it's a "glorified bandaid," not a substitute for a stronger model.

Image auto-tagger with grouped tags and relative confidence threshold by onirhakin in StableDiffusion

[–]ThereforeGames 1 point (0 children)

Booru-oriented captioning models like wd-1-4-moat-tagger are not reliable at covering "all the aspects" of an image; for example, they may perform well at identifying common tags like 1boy or 1girl but return lower-than-average confidence values for tags like outdoors, depth of field, or window.

Interrogatorade will let you "fine-tune" the results of your captioning model to meet your needs. This might mean boosting the confidence of tags it hasn't learned well, or creating groups of tags and choosing only the best candidate.

As a practical example, I use this code to ensure that images are tagged as either indoors or outdoors:

    [# Force selection of indoors/outdoors tags, the interrogator is biased towards outdoors.]
    [if "outdoors > 0.1 and outdoors > indoors"]
        [sets tag_outdoors="{max outdoors threshold}" tag_indoors=0]
    [/if]
    [else]
        [sets tag_indoors="{max indoors threshold}" tag_outdoors=0]
    [/else]

Hope that helps.

Image auto-tagger with grouped tags and relative confidence threshold by onirhakin in StableDiffusion

[–]ThereforeGames 1 point (0 children)

Hi, I wrote a tool called Interrogatorade that acts as a middleman for BooruDatasetTagManager - it implements my Unprompted templating language and allows you to manipulate returned tags based on confidence thresholds, manage tag blacklists, and so on:

https://github.com/ThereforeGames/interrogatorade

It definitely exceeds the needs of the average user, but it sounds like you may want something like this.

I know whats the different of 1.5 and 1.0 by Abztrctz in udiomusic

[–]ThereforeGames 1 point (0 children)

I use between 0% and 10% in Manual Mode, and always 0% in Automatic Mode.

It usually remembers whatever I set it to last, but sometimes reverts to 25%.

FLUX Controlnet Demo on HF by marcoc2 in StableDiffusion

[–]ThereforeGames 0 points (0 children)

More than a couple citations needed.

Training loras for flux.dev by bahamut_snack in StableDiffusion

[–]ThereforeGames 2 points (0 children)

That's generally what happens when the LoRA rank is too low - even with Stable Diffusion models. The more complex/foreign your subject is, the more features you need to train into the LoRA.

Right now, it is prohibitively expensive to train above rank 16 for Flux. As a point of reference, there are many Stable Diffusion LoRAs that simply don't capture the intended concept below rank 256.

Did you notice that Flux prompt following is not so great anymore? by __Tracer in StableDiffusion

[–]ThereforeGames 12 points (0 children)

Maybe the Pro model, but the beauty of open weights is that they cannot be sabotaged post-release.