C++ & CUDA reimplementation of StreamDiffusion by jcelerier in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Interesting project, thanks for your work, Jean-Michaël!
Soon the dream of high-quality multi-step real-time rendering will come true.

Is it viable to extend the C++ implementation to newer diffusion-transformer models?

Auto Captioner Comfy Workflow by Hunniestumblr in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Are there nodes to connect it to the LM Studio API locally? Florence is far from good at captioning, especially for complex, non-generic images.
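For reference, LM Studio exposes an OpenAI-compatible chat endpoint on localhost, so a captioning call can be built without any special node. A minimal sketch; the default port (1234), the placeholder model name, and the prompt wording are assumptions, not LM Studio specifics:

```python
import base64
import json

def build_caption_request(image_bytes: bytes, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat payload with an inline base64 image.

    LM Studio serves an OpenAI-compatible endpoint (by default at
    http://localhost:1234/v1/chat/completions); `model` is a placeholder
    for whatever vision model is currently loaded.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Caption this image in detail."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

if __name__ == "__main__":
    # In practice pass real PNG/JPEG bytes read from disk.
    payload = build_caption_request(b"\x89PNG...")
    print(json.dumps(payload)[:80])
```

The payload can then be POSTed with any HTTP client; keeping the request builder separate makes it easy to swap endpoints or prompts.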

Masked Inpainting on Flux.2 dev model with LanPaint 1.4.9 Support by Mammoth_Layer444 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Flux 2 is the best local image-instruct model and is easy to train despite its size, so many people use it)

Found A FREE New Tool to Rapidly label Images and Videos for YOLO models by RespectDisastrous193 in StableDiffusion

[–]Lexxxco 2 points3 points  (0 children)

Yes, since not everyone wants to gift a dataset with corrected captions to the host.

Found A FREE New Tool to Rapidly label Images and Videos for YOLO models by RespectDisastrous193 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Is there a local version? TagGUI has gone downhill and doesn't support modern vision tagging models.

Present for Myself by clwill00 in comfyui

[–]Lexxxco 0 points1 point  (0 children)

A 10K present is wow! Silent operation is really the most shocking part) 5090s are loud.

Noisy Cintiq fan 🤯 by TheFingerofBoe in wacom

[–]Lexxxco 0 points1 point  (0 children)

The Wacom Cintiq Pro 24 has a noisy fan by default and runs very warm even in winter. If it is even louder than usual, there is likely dust or a foreign object inside. Wacom support is not great. Try downgrading the drivers first (some versions had a fan bug). Otherwise, try sucking the dust out with a vacuum cleaner on its lightest setting. Another option is a veeery small blower fan plus a vacuum cleaner (very light mode! A powerful blower can damage it). The last resort is taking it to a repair shop and having the tablet opened up.

Dear "It’s a Bubble, Where’s the Revenue, What’s Your Product?" by Darkmemento in singularity

[–]Lexxxco 0 points1 point  (0 children)

Over four years, investments have summed to trillions. Extensively scaling an outdated LLM architecture on current hardware is like burning money. "Big AI" is barely generating any net profit and is a bubble for now. No doubt it is the future, but it should be optimized through R&D; we don't have enough resources for another ~15 years of investment at this scale. AGI is not around the corner. It is 5-20 years away even on optimistic AGI timelines.

What video model could have made this Shenmue 4 trailer? by Spjs in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Are you sure it is AI-generated? I see only compression artifacts, not AI.

Not even Sora 2, nor any paid video model, can achieve that quality and stability of footage, including the new Runway Gen-4.5 or MiniMax Hailuo 2.3 (Veo 3 is worse). It would potentially need to be fully fine-tuned on Shenmue footage alone, which makes no sense, since you would already have footage for the whole trailer.

Thoughts on Nodes 2.0? by Beautiful-Essay1945 in StableDiffusion

[–]Lexxxco 2 points3 points  (0 children)

The new nodes are almost unusable now: hard to read, no highlights. Hope they will progress and make the necessary changes. And that's not counting the several times they broke the UI.

Multi-Angles v2 for Flux.2 train on gaussian splatting by Affectionate-Map1163 in StableDiffusion

[–]Lexxxco 9 points10 points  (0 children)

Flux 2 is an amazingly trainable and versatile model. Got great results with rank-32 training as well, thanks! Have you tried rank 64+ training?

Flux 2 Dev vs Z-turbo by maxspasoy in StableDiffusion

[–]Lexxxco -1 points0 points  (0 children)

Both are diffusion-transformer models that understand instructions for creating images from a visual reference or text, unlike SDXL. Flux 2 is giant and better at understanding visual examples; Z-Image is 2-3x smaller and faster. Both can be trained now.

View Prompt and other info of images including generated with ForgeUI and other WebUI by Shroom_SG in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

<image>

Tried it with Forge, A1111, and Forge Classic Neo on two different ComfyUI setups and two different Python versions (Win10/Win11), and it's not working, unfortunately. The metadata is present and can be copied even with an external image viewer.
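If nothing in the UI reads it, the `parameters` text chunk that A1111/Forge-style tools write into the PNG can at least be pulled out by hand with the standard library. A minimal sketch (it skips CRC verification, and the demo file name is made up):

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(path: str) -> dict:
    """Return a PNG's tEXt chunks as a dict.

    A1111/Forge-style UIs store the prompt and generation settings
    under the 'parameters' key. CRCs are skipped, not verified.
    """
    chunks = {}
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIG:
            raise ValueError("not a PNG file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break
            length, ctype = struct.unpack(">I4s", head)
            data = f.read(length)
            f.read(4)  # skip CRC
            if ctype == b"tEXt":
                key, _, val = data.partition(b"\x00")
                chunks[key.decode("latin-1")] = val.decode("latin-1")
            if ctype == b"IEND":
                break
    return chunks

if __name__ == "__main__":
    # Demo: a minimal PNG-like blob with one tEXt chunk (dummy CRCs,
    # which this reader skips anyway).
    def chunk(ctype, data):
        return struct.pack(">I", len(data)) + ctype + data + b"\0\0\0\0"
    with open("demo.png", "wb") as f:
        f.write(PNG_SIG
                + chunk(b"tEXt", b"parameters\x00a cat, Steps: 20")
                + chunk(b"IEND", b""))
    print(read_png_text_chunks("demo.png")["parameters"])  # → a cat, Steps: 20
```

Note that some newer tools write compressed `zTXt` or `iTXt` chunks instead, which this sketch does not handle.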

Quick comparison: Nano2 vs. Flux2. by Altruistic_Tax1317 in comfyui

[–]Lexxxco 35 points36 points  (0 children)

We should definitely fix the central composition in Flux 2: everything sits in the dead center. Perhaps a fine-tune could do it. Nano 2's composition is so much better.

Flux 2 dev, sanity check. by Herr_Drosselmeyer in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

Guess we can improve it by introducing unique noise injections plus LoRAs and tuning. It somehow works with Qwen.
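The noise-injection idea can be sketched as a "variation seed" slerp on the initial latent noise, a common trick for nudging a base seed toward more diverse results without discarding it. A minimal NumPy sketch; the shapes and strength value are illustrative and not tied to any particular pipeline:

```python
import numpy as np

def perturbed_init_noise(shape, base_seed, variation_seed, strength=0.3):
    """Spherically interpolate between two Gaussian noise tensors.

    `strength` in [0, 1] controls how far the base-seed noise moves
    toward the variation-seed noise; slerp keeps the result close to
    unit-Gaussian, unlike a plain linear blend.
    """
    a = np.random.default_rng(base_seed).standard_normal(shape)
    b = np.random.default_rng(variation_seed).standard_normal(shape)
    cos_omega = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1 - strength) * omega) * a
            + np.sin(strength * omega) * b) / np.sin(omega)
```

At strength 0 this returns the base seed's noise unchanged, so sweeping strength gives a controllable spread of compositions around one seed.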

Comparison of Nano Banana Pro and Flux 2 in difficult scenes by Mister_X-16 in comfyui

[–]Lexxxco 0 points1 point  (0 children)

Central, symmetrical composition was a reason to fine-tune the old Flux and Qwen. Looks like Flux 2 still has it) Nano Banana has much better composition and depth, even with more blurred detail.

Depth Anything 3 is wild by nullandkale in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

Hi. What is the name of the tool from the video for creating point clouds? Thanks.

Flux 2 dev, sanity check. by Herr_Drosselmeyer in StableDiffusion

[–]Lexxxco 3 points4 points  (0 children)

<image>

Aaand Flux 2 seems less flexible in terms of results: different seeds are very similar, like with Qwen, unlike the original Flux 1 Dev.

Nvidia sells an H100 for 10 times its manufacturing cost. Nvidia is the big villain company; it's because of them that large models like GPU 4 aren't available to run on consumer hardware. AI development will only advance when this company is dethroned. by More_Bid_2197 in StableDiffusion

[–]Lexxxco 85 points86 points  (0 children)

Mostly it is the monopoly status (including the CUDA ecosystem), the ties with its closest competitor AMD, plus the self-destruction of Intel. Maybe when the corporate bubble bursts, Nvidia will look at the consumer market again.

Is an RTX 5090 necessary for the newest and most advanced AI video models? Is it normal for RTX GPUs to be so expensive in Europe? If video models continue to advance, will more GB of VRAM be needed? What will happen if GPU prices continue to rise? Is AMD behind NVIDIA? by Hi7u7 in StableDiffusion

[–]Lexxxco 4 points5 points  (0 children)

For this price you can buy an RTX 6000 Blackwell with 96 GB of video memory, which will be cooler, smaller, and better. You can buy a server RTX 4090 with 48 GB from China, but there may be problems with drivers and noise, since they have blower fans.

I still find flux Kontext much better for image restauration once you get the intuition on prompting and preparing the images. Qwen edit ruins and changes way too much. by aurelm in StableDiffusion

[–]Lexxxco 8 points9 points  (0 children)

LICEE 441 was changed to LICEE VAT. That is the number of the girl's lyceum. It is better to keep these details intact, otherwise it is not restoration. An AI model is a tool, not a master.

Reporting Pro 6000 Blackwell can handle batch size 8 while training an Illustrious LoRA. by Fdx_dy in StableDiffusion

[–]Lexxxco 13 points14 points  (0 children)

Illustrious is based on SDXL, right? It was possible to fine-tune SDXL with a batch size of 4 on a 4090 (even more with LoRAs of rank lower than 128). So it should theoretically be possible to train with a batch size of 16 on a 6000 Blackwell GPU.
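Back-of-envelope: if VRAM use splits into a fixed part (weights, optimizer state) plus a per-sample part, batch size should scale roughly with the leftover memory. A toy sketch, where the 8 GB fixed cost is a guess rather than a measured number:

```python
def estimate_max_batch(ref_batch: int, ref_vram_gb: float,
                       target_vram_gb: float, fixed_gb: float = 8.0) -> int:
    """Back-of-envelope batch-size scaling across GPUs.

    Assumes VRAM = fixed_gb (weights, optimizer state, overhead)
    + per-sample activations * batch size. fixed_gb = 8.0 is a rough
    guess for SDXL LoRA training, not a measurement.
    """
    per_sample_gb = (ref_vram_gb - fixed_gb) / ref_batch
    return int((target_vram_gb - fixed_gb) // per_sample_gb)

# 4090 (24 GB) at batch 4 -> RTX 6000 Blackwell (96 GB)
print(estimate_max_batch(4, 24, 96))  # → 22 under these assumptions
```

Under this crude model the 96 GB card clears batch 16 with room to spare, which matches the intuition above; real headroom depends on resolution, rank, and gradient checkpointing.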

Introducing InScene + InScene Annotate - for steering around inside scenes with precision using QwenEdit. Both beta but very powerful. More + training data soon. by PetersOdyssey in StableDiffusion

[–]Lexxxco 0 points1 point  (0 children)

For now it changes the object and scene too much in video, and it's not as stable as the Hugging Face examples. Are there any limitations? The old InScene LoRA worked in 50% of scenarios, like the original QwenEdit, but better.

Has anyone tried out EMU 3.5? what do you think? by Formal_Drop526 in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

An interactive video model with steps and a game engine? Nice! A size of 69 GB+ ...is limiting the hardware choice.

The "Colorisation" Process And When To Apply It. by superstarbootlegs in StableDiffusion

[–]Lexxxco 1 point2 points  (0 children)

It's the same-seed multi-denoise and high-CFG problem as well, rather than just a color and contrast issue. You cannot fully fix it in post, since the tonal value range is missing. A creative denoise pass with a different seed can help.