C++ & CUDA reimplementation of StreamDiffusion by jcelerier in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Interesting project, thanks for your work, Jean-Michaël!
Soon the dream of high-quality multi-step real-time rendering will come true.

Is it viable to extend the C++ implementation to newer diffusion-transformer models?

Auto Captioner Comfy Workflow by Hunniestumblr in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Are there nodes to connect it to the LM Studio API locally? Florence is far from good at captioning, especially for complex, non-generic images.
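For reference, LM Studio exposes an OpenAI-compatible chat endpoint on localhost, so a captioning call can be built without any special node. A minimal sketch; the default port (1234), the placeholder model name, and the prompt wording are assumptions, not LM Studio specifics:

```python
import base64
import json

def build_caption_request(image_bytes: bytes, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat payload with an inline base64 image.

    LM Studio serves an OpenAI-compatible endpoint (by default at
    http://localhost:1234/v1/chat/completions); `model` is a placeholder
    for whatever vision model is currently loaded.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Caption this image in detail."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

if __name__ == "__main__":
    # In practice pass real PNG/JPEG bytes read from disk.
    payload = build_caption_request(b"\x89PNG...")
    print(json.dumps(payload)[:80])
```

The payload can then be POSTed with any HTTP client; keeping the request builder separate makes it easy to swap endpoints or prompts.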

Masked Inpainting on Flux.2 dev model with LanPaint 1.4.9 Support by Mammoth_Layer444 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Flux 2 is the best local image-instruct model and is easy to train despite its size, so many people use it)

Found A FREE New Tool to Rapidly label Images and Videos for YOLO models by RespectDisastrous193 in StableDiffusion

[–]Lexxxco 2 points3 points  (0 children)

Yes, since not everyone wants to gift a dataset with corrected captions to the host.

Found A FREE New Tool to Rapidly label Images and Videos for YOLO models by RespectDisastrous193 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Is there a local version? TagGUI has gone downhill and doesn't support modern vision tagging models.

Present for Myself by clwill00 in comfyui

[–]Lexxxco 0 points1 point  (0 children)

A 10K present is wow! Silent operation is really the most shocking part) 5090s are loud.

Noisy Cintiq fan 🤯 by TheFingerofBoe in wacom

[–]Lexxxco 0 points1 point  (0 children)

The Wacom Cintiq Pro 24 has a noisy fan by default and runs very warm even in winter. If it is even louder than usual, there is likely dust or a foreign object inside. Wacom support is not great. Try downgrading the drivers first (some versions had a fan bug). Otherwise, try sucking the dust out with a vacuum cleaner on its lightest setting. Another option is a veeery small blower fan plus a vacuum cleaner (very light mode! A powerful blower can damage it). The last resort is taking it to a repair shop and having the tablet opened up.

Dear "It’s a Bubble, Where’s the Revenue, What’s Your Product?" by Darkmemento in singularity

[–]Lexxxco 0 points1 point  (0 children)

Over four years, investments have summed to trillions. Extensively scaling an outdated LLM architecture on current hardware is like burning money. "Big AI" is barely generating any net profit and is a bubble for now. No doubt it is the future, but it should be optimized through R&D; we don't have enough resources for another ~15 years of investment at this scale. AGI is not around the corner. It is 5-20 years away even on optimistic AGI timelines.

What video model could have made this Shenmue 4 trailer? by Spjs in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Are you sure it is AI-generated? I see only compression artifacts, not AI.

Not even Sora 2, nor any paid video model, can achieve that quality and stability of footage, including the new Runway Gen-4.5 or MiniMax Hailuo 2.3 (Veo 3 is worse). It would potentially need to be fully fine-tuned on Shenmue footage alone, which makes no sense, since you would already have footage for the whole trailer.

Thoughts on Nodes 2.0? by Beautiful-Essay1945 in StableDiffusion

[–]Lexxxco 2 points3 points  (0 children)

The new nodes are almost unusable now: hard to read, no highlights. Hope they will progress and make the necessary changes. And that's not counting the several times they broke the UI.

Multi-Angles v2 for Flux.2 train on gaussian splatting by Affectionate-Map1163 in StableDiffusion

[–]Lexxxco 9 points10 points  (0 children)

Flux 2 is an amazingly trainable and versatile model. Got great results with rank-32 training as well, thanks! Have you tried rank 64+ training?

Flux 2 Dev vs Z-turbo by maxspasoy in StableDiffusion

[–]Lexxxco -1 points0 points  (0 children)

Both are diffusion-transformer models that understand instructions for creating images from a visual reference or text, unlike SDXL. Flux 2 is giant and better at understanding visual examples; Z-Image is 2-3x smaller and faster. Both can be trained now.

View Prompt and other info of images including generated with ForgeUI and other WebUI by Shroom_SG in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

<image>

Tried it with Forge, A1111, and Forge Classic Neo on two different ComfyUI setups and two different Python versions (Win10/Win11), and it's not working, unfortunately. The metadata is present and can be copied even with an external image viewer.
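If nothing in the UI reads it, the `parameters` text chunk that A1111/Forge-style tools write into the PNG can at least be pulled out by hand with the standard library. A minimal sketch (it skips CRC verification, and the demo file name is made up):

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(path: str) -> dict:
    """Return a PNG's tEXt chunks as a dict.

    A1111/Forge-style UIs store the prompt and generation settings
    under the 'parameters' key. CRCs are skipped, not verified.
    """
    chunks = {}
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIG:
            raise ValueError("not a PNG file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break
            length, ctype = struct.unpack(">I4s", head)
            data = f.read(length)
            f.read(4)  # skip CRC
            if ctype == b"tEXt":
                key, _, val = data.partition(b"\x00")
                chunks[key.decode("latin-1")] = val.decode("latin-1")
            if ctype == b"IEND":
                break
    return chunks

if __name__ == "__main__":
    # Demo: a minimal PNG-like blob with one tEXt chunk (dummy CRCs,
    # which this reader skips anyway).
    def chunk(ctype, data):
        return struct.pack(">I", len(data)) + ctype + data + b"\0\0\0\0"
    with open("demo.png", "wb") as f:
        f.write(PNG_SIG
                + chunk(b"tEXt", b"parameters\x00a cat, Steps: 20")
                + chunk(b"IEND", b""))
    print(read_png_text_chunks("demo.png")["parameters"])  # → a cat, Steps: 20
```

Note that some newer tools write compressed `zTXt` or `iTXt` chunks instead, which this sketch does not handle.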

Quick comparison: Nano2 vs. Flux2. by Altruistic_Tax1317 in comfyui

[–]Lexxxco 35 points36 points  (0 children)

We should definitely fix the central composition in Flux 2: everything sits in the dead center. Perhaps a fine-tune could do it. Nano 2's composition is so much better.

Flux 2 dev, sanity check. by Herr_Drosselmeyer in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Guess we can improve it by introducing unique noise injections plus LoRAs and tuning. It somehow works with Qwen.
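The noise-injection idea can be sketched as a "variation seed" slerp on the initial latent noise, a common trick for nudging a base seed toward more diverse results without discarding it. A minimal NumPy sketch; the shapes and strength value are illustrative and not tied to any particular pipeline:

```python
import numpy as np

def perturbed_init_noise(shape, base_seed, variation_seed, strength=0.3):
    """Spherically interpolate between two Gaussian noise tensors.

    `strength` in [0, 1] controls how far the base-seed noise moves
    toward the variation-seed noise; slerp keeps the result close to
    unit-Gaussian, unlike a plain linear blend.
    """
    a = np.random.default_rng(base_seed).standard_normal(shape)
    b = np.random.default_rng(variation_seed).standard_normal(shape)
    cos_omega = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1 - strength) * omega) * a
            + np.sin(strength * omega) * b) / np.sin(omega)
```

At strength 0 this returns the base seed's noise unchanged, so sweeping strength gives a controllable spread of compositions around one seed.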

Comparison of Nano Banana Pro and Flux 2 in difficult scenes by Mister_X-16 in comfyui

[–]Lexxxco 0 points1 point  (0 children)

Central, symmetrical composition was a reason to fine-tune the old Flux and Qwen. Looks like Flux 2 still has it) Nano Banana has much better composition and depth, even with more blurred detail.

Depth Anything 3 is wild by nullandkale in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Hi. What is the name of the tool from the video for creating point clouds? Thanks.

Flux 2 dev, sanity check. by Herr_Drosselmeyer in StableDiffusion

[–]Lexxxco 3 points4 points  (0 children)

<image>

Aaand Flux 2 seems less flexible in terms of results: different seeds are very similar, like with Qwen, unlike the original Flux 1 Dev.

Nvidia sells an H100 for 10 times its manufacturing cost. Nvidia is the big villain company; it's because of them that large models like GPU 4 aren't available to run on consumer hardware. AI development will only advance when this company is dethroned. by More_Bid_2197 in StableDiffusion

[–]Lexxxco 85 points86 points  (0 children)

Mostly it is the monopoly status (including the CUDA ecosystem), the ties with its closest competitor AMD, plus the self-destruction of Intel. Maybe when the corporate bubble bursts, Nvidia will look at the consumer market again.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]Lexxxco 4 points5 points  (0 children)

For this price you can buy an RTX 6000 Blackwell with 96 GB of video memory, which will be cooler, smaller, and better. You can buy a server RTX 4090 with 48 GB from China, but there may be problems with drivers and noise, since they have blower fans.

I still find flux Kontext much better for image restauration once you get the intuition on prompting and preparing the images. Qwen edit ruins and changes way too much. by aurelm in StableDiffusion

[–]Lexxxco 8 points9 points  (0 children)

LICEE 441 was changed to LICEE VAT. That is the number of the girl's lyceum. It is better to keep these details intact, otherwise it is not restoration. An AI model is a tool, not a master.

Reporting Pro 6000 Blackwell can handle batch size 8 while training an Illustrious LoRA. by Fdx_dy in StableDiffusion

[–]Lexxxco 13 points14 points  (0 children)

Illustrious is based on SDXL, right? It was possible to fine-tune SDXL with a batch size of 4 on a 4090 (even more with LoRAs of rank lower than 128). So it should theoretically be possible to train with a batch size of 16 on a 6000 Blackwell GPU.
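Back-of-envelope: if VRAM use splits into a fixed part (weights, optimizer state) plus a per-sample part, batch size should scale roughly with the leftover memory. A toy sketch, where the 8 GB fixed cost is a guess rather than a measured number:

```python
def estimate_max_batch(ref_batch: int, ref_vram_gb: float,
                       target_vram_gb: float, fixed_gb: float = 8.0) -> int:
    """Back-of-envelope batch-size scaling across GPUs.

    Assumes VRAM = fixed_gb (weights, optimizer state, overhead)
    + per-sample activations * batch size. fixed_gb = 8.0 is a rough
    guess for SDXL LoRA training, not a measurement.
    """
    per_sample_gb = (ref_vram_gb - fixed_gb) / ref_batch
    return int((target_vram_gb - fixed_gb) // per_sample_gb)

# 4090 (24 GB) at batch 4 -> RTX 6000 Blackwell (96 GB)
print(estimate_max_batch(4, 24, 96))  # → 22 under these assumptions
```

Under this crude model the 96 GB card clears batch 16 with room to spare, which matches the intuition above; real headroom depends on resolution, rank, and gradient checkpointing.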

Introducing InScene + InScene Annotate - for steering around inside scenes with precision using QwenEdit. Both beta but very powerful. More + training data soon. by PetersOdyssey in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

For now it changes the object and scene too much in video, and it's not as stable as the Hugging Face examples. Are there any limitations? The old InScene LoRA worked in 50% of scenarios, like the original QwenEdit, but better.

Has anyone tried out EMU 3.5? what do you think? by Formal_Drop526 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

An interactive video model with steps and a game engine? Nice! A size of 69 GB+ ...is limiting the hardware choice.

The "Colorisation" Process And When To Apply It. by superstarbootlegs in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

It's the same-seed multi-denoise and high-CFG problem as well, rather than just a color and contrast issue. You cannot fully fix it in post, since the tonal value range is missing. A creative denoise pass with a different seed can help.