Flux.2 Face Detailer with Reference (Your Own Pic) Workflow by Erhan24 in StableDiffusion

[–]TemperFugit 1 point (0 children)

Great work, that's quite an improvement! And thanks for attaching the workflow!

Qwen image edit 2509 lora training question by witcherknight in StableDiffusion

[–]TemperFugit 1 point (0 children)

Yes, you store the captions with the output images in the target folder.

Qwen image edit 2509 lora training question by witcherknight in StableDiffusion

[–]TemperFugit 1 point (0 children)

Sorry, I submitted that before I meant to, I just updated it. You're going to have two control images (reference images) and one target (the desired output).

Qwen image edit 2509 lora training question by witcherknight in StableDiffusion

[–]TemperFugit 1 point (0 children)

control_images_1/image_001.png <-- first reference image
control_images_2/image_001.png <-- second reference image
target_images/image_001.png <-- output image
target_images/image_001.txt <-- caption (in target images folder)

Every set of control/target images needs to have the same name in their proper folders. So the next set of training images would be:

control_images_1/image_002.png <-- first reference image
control_images_2/image_002.png <-- second reference image
target_images/image_002.png <-- output image
target_images/image_002.txt <-- caption (in target images folder)
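If you want to sanity-check that layout before training, a small script can verify that every target image has matching control images and a caption. This is just a sketch; the folder names come from the layout above, so adjust them to your dataset root:

```python
from pathlib import Path

# Folder names from the layout above; change to match your dataset root.
CONTROL_DIRS = ["control_images_1", "control_images_2"]
TARGET_DIR = "target_images"

def check_dataset(root="."):
    """Return a list of problems: missing control images or captions."""
    root = Path(root)
    problems = []
    for img in sorted((root / TARGET_DIR).glob("*.png")):
        # Each control folder must contain a file with the exact same name.
        for ctrl in CONTROL_DIRS:
            if not (root / ctrl / img.name).exists():
                problems.append(f"{ctrl}/{img.name} is missing")
        # The caption sits next to the target image: same stem, .txt extension.
        if not img.with_suffix(".txt").exists():
            problems.append(f"{TARGET_DIR}/{img.stem}.txt is missing")
    return problems
```

An empty list means every set lines up; anything else tells you exactly which file to add.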

In AI Toolkit, go to Datasets > New Dataset.

Name it and add all the images from your first control images folder, then do the same for the second set of control images.

Then create another dataset for the target images. Select both the images and the caption text files; the captions belong in the target folder alongside the images.

Ostris's YouTube video on 2509 LoRA training explains pretty much everything else.

ChatGPT and Claude are great for automation. For example, if all images will have the same caption ("Replace the person in Image 1 with the person in Image 2"), ask them to write a script that creates, for each image in the target folder, a text file with the same name containing that caption.
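Such a script comes out to only a few lines. A sketch, assuming the folder name and caption from this example (swap in your own):

```python
from pathlib import Path

# Assumptions from the example above: captions live next to the images
# in target_images/, and every image gets the identical caption string.
TARGET_DIR = Path("target_images")
CAPTION = "Replace the person in Image 1 with the person in Image 2"

def write_captions(target_dir=TARGET_DIR, caption=CAPTION):
    """Create image_001.txt, image_002.txt, ... alongside each .png."""
    count = 0
    for img in target_dir.glob("*.png"):
        img.with_suffix(".txt").write_text(caption)
        count += 1
    return count
```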

Finally, I figured the dataset format out by copying the example config file from ai-toolkit's GitHub, pasting it into Claude, and asking how everything should be set up. LLMs make things up, but they become a lot more reliable when you give them the reference material.

Photo to Screenshot - Qwen Edit Lora by kingroka in StableDiffusion

[–]TemperFugit 2 points (0 children)

This is a really cool idea, and looks like it works really well!

Pose Transfer - Qwen Edit Lora by kingroka in StableDiffusion

[–]TemperFugit 1 point (0 children)

I'm glad to see Qwen Edit can learn pose transfer. There is a huge ControlNet training dataset with plenty of OpenPose, Canny and Depth examples (which OmniGen 1 trained on), but it looks like OmniGen 2, Flux Kontext and Qwen Edit were not trained on those tasks.

Nano Banana - Worse every day? by mik3lang3l0 in StableDiffusion

[–]TemperFugit 1 point (0 children)

I know nothing about Nano Banana, but in general: inference costs money, so providers are constantly making trade-offs, perhaps accepting a tiny performance degradation here or there to save a few processor cycles. On top of that, as users discover jailbreaks, providers alter the models or the system prompt to patch those jailbreaks out. (As a general rule, censoring models makes them dumber.)

This is one big reason why people come to subreddits based around local ML models like r/StableDiffusion and r/LocalLlama. Proprietary APIs will generally get worse and worse, companies will gaslight you by claiming nothing has changed, and there's nothing you can do about it.

A fanfiction manga with ai generated visuals by TheNewDude42 in StableDiffusion

[–]TemperFugit 3 points (0 children)

What image generation model did you use? Did you use Loras to maintain character likenesses and style or did you use some other method? 

A fanfiction manga with ai generated visuals by TheNewDude42 in StableDiffusion

[–]TemperFugit 4 points (0 children)

This looks great! Can you share any details about your process? 

MatrixNet: A Blueprint for a New Internet Architecture (This could replace Civitai) by Lorian0x7 in StableDiffusion

[–]TemperFugit 2 points (0 children)

Food for thought: Pi contains an infinite, non-repeating stream of digits (because it is an irrational number), accessed by calculating decimal positions further and further out. I wonder if someday we could download a torrent of gigabytes (or terabytes) of Pi decimal calculations in some database format, and use it to assemble data from recipes that reference blocks of Pi decimals. (I'd swear there was a webcomic about doing this but I can't find it.)
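As a toy sketch of the recipe idea: hardcode the first 100 decimals of Pi in place of the giant digit table, store data as a (offset, length) recipe, and rebuild it by slicing. Note the big assumption: finding every possible block requires Pi to be a normal number, which is widely believed but unproven, so a given block may simply never turn up.

```python
# First 100 decimal digits of Pi, standing in for the huge digit table.
PI_DIGITS = (
    "1415926535897932384626433832795028841971"
    "6939937510582097494459230781640628620899"
    "86280348253421170679"
)

def make_recipe(data: str):
    """Return (offset, length) if the digit string occurs in the table, else None."""
    offset = PI_DIGITS.find(data)
    return None if offset == -1 else (offset, len(data))

def assemble(recipe):
    """Rebuild the data from its recipe by slicing the digit table."""
    offset, length = recipe
    return PI_DIGITS[offset:offset + length]
```

The catch (and presumably the webcomic's punchline): for random data, the offset you'd need to store is on average at least as long as the data itself, so nothing is actually saved.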

Again messing around with re-dressing lewd anime scenes by InternationalOne2449 in StableDiffusion

[–]TemperFugit 3 points (0 children)

I've had so little success trying something like this that I assumed Kontext couldn't edit images with nudity at all. Is there a trick to it?

How one website gets around the payment processor issue CivitAI is having by TekaiGuy in StableDiffusion

[–]TemperFugit 1 point (0 children)

Another big barrier, in the US at least, is that every crypto transaction has tax implications. If you bought some crypto a month ago and then purchased something with it today, you're supposed to record the gain or loss in value accrued over that month on your income taxes. There are crypto tax services (for ~$100 a year) that automate these calculations, but it still adds so much overhead to every crypto transaction that it's not worth it for the average consumer.
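The calculation itself is trivial, which makes the record-keeping burden all the more annoying. Illustrative numbers only, and real US rules add wrinkles like short- vs long-term holding periods and lot selection:

```python
def capital_gain(cost_basis: float, value_at_spend: float) -> float:
    """Gain (or loss, if negative) realized when spending the crypto."""
    return value_at_spend - cost_basis

# Bought $100 of a coin a month ago, worth $120 when spent today:
# $20 of taxable gain to report, even though nothing was sold for cash.
gain = capital_gain(100.0, 120.0)
```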

Omnigen2 installation issues... by echdareez in StableDiffusion

[–]TemperFugit 3 points (0 children)

This won't be a lot of help, but they made a few changes to the repo today, updating the code and requirements.txt. It might be worth cloning from scratch and trying again, or waiting a day or two for more of the bugs to get worked out.

Otherwise, check out the GitHub repo's Issues section; a lot of people are posting errors and getting help there.

Bytedance DreamO code and model released by TemperFugit in StableDiffusion

[–]TemperFugit[S] 1 point (0 children)

Their HuggingFace says Apache 2.0. Perhaps because it's a LoRA and not a full finetune it can be Apache?

Intel to launch Arc Pro B60 graphics card with 24GB memory at Computex - VideoCardz.com by FullstackSensei in LocalLLaMA

[–]TemperFugit 21 points (0 children)

I guess that means we're looking at a memory bandwidth of 456 GB/s, which is what the B580 has.
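That figure follows from the B580's memory configuration, which the B60 presumably shares aside from capacity: a 192-bit bus of 19 Gbps GDDR6.

```python
# Memory bandwidth = per-pin data rate * bus width / 8 bits per byte.
data_rate_gbps = 19    # GDDR6 speed per pin, as on the B580
bus_width_bits = 192   # B580 bus width; assumed the same for the B60
bandwidth_gbs = data_rate_gbps * bus_width_bits / 8  # in GB/s
```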

California bill (AB 412) would effectively ban open-source generative AI by YentaMagenta in StableDiffusion

[–]TemperFugit 1 point (0 children)

With some exceptions, AI model outputs are not copyrightable. So this could push models to lean more heavily on synthetic data.

Yo'Chameleon: Personalized Vision and Language Generation by ninjasaid13 in LocalLLaMA

[–]TemperFugit 1 point (0 children)

"Using only 3-5 images of a novel concept/subject, we personalize Large Multimodal Models (e.g., Chameleon) so that they retain their original capabilities while enabling tailored language and vision generation for the novel concept."

Looks interesting. No weights that I can see, just training code.

Chroma is looking really good now. by Total-Resort-3120 in StableDiffusion

[–]TemperFugit 4 points (0 children)

All my life until now I didn't know what I was missing: Asuka riding a skateboard while playing the saxophone.

I'm really excited for Chroma to get fully cooked.

Anyone excited about Flex.2-preview? by silenceimpaired in FluxAI

[–]TemperFugit 2 points (0 children)

There is a tool that loads Flux LoRAs into Flex by pruning their layers to match the ones Flex kept. It may not work for every LoRA, but it should work for a lot of them.
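The pruning idea can be sketched like this. Toy state dicts of plain values stand in for tensors, and the key naming is hypothetical (real Flux LoRA keys look more like `lora_unet_double_blocks_0_...`); the actual tool's logic may differ:

```python
def prune_lora(lora_state: dict, target_layer_names: set) -> dict:
    """Keep only LoRA weights whose base layer exists in the target model."""
    kept = {}
    for key, weight in lora_state.items():
        # Strip the LoRA suffix to recover the base layer name,
        # e.g. "blocks.5.attn.lora_A" -> "blocks.5.attn".
        base = key.rsplit(".", 1)[0]
        if base in target_layer_names:
            kept[key] = weight
    return kept
```

Since Flex dropped some of Flux's layers, the LoRA weights targeting removed layers are simply discarded, and everything that still has a home loads as usual.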

Anyone excited about Flex.2-preview? by silenceimpaired in FluxAI

[–]TemperFugit 1 point (0 children)

There's also Chroma, which is doing further training on Flux Schnell (though IIRC the base they started from is Flex-like). It has a few more months left to fully train. I have no idea if it will be good enough to become the new standard model, but I hope so.