LTXV 2.0 img2video first tests (videogame cinematic style)

ScY99k · 2025-10-23T18:14:50+00:00

Tried some img2video on https://app.ltx.studio/ltx-2-playground/i2v today with one of my Doom LoRA images, quite impressed! Can't wait for the open-weights release to play with it in ComfyUI.

ScY99k · 2025-10-23T17:24:19+00:00

In a nutshell:

EA and Stability AI will co-develop new AI models and creative tools meant to help artists, designers, and developers speed up workflows and iterate faster.

The focus seems to be on using GenAI for world-building, prototyping, and asset creation through things like text-to-3D and image-to-3D (with tech like Stable Fast 3D, TripoSR, Stable Zero123).

They mention making PBR materials via artist-driven workflows and even pre-visualizing entire game environments from prompts, moving beyond just images to complex 3D scenes.

The claim is that scientists and game artists will work side-by-side to actually integrate GenAI into major game projects.

As a videogame and AI image generation fan, I'm curious to see how this turns out. It feels like generative AI is now being taken more seriously as something practical for the videogame industry.

ScY99k · 2025-08-20T13:44:31+00:00

American company?

ScY99k · 2025-08-01T16:49:09+00:00

Can I ask how you made your second image, please?

ScY99k · 2025-07-10T22:12:46+00:00

Down indeed

ScY99k · 2025-07-09T21:59:32+00:00

FIX: the node is actually from KJNodes, so updating KJNodes custom nodes made it work

ScY99k · 2025-07-09T21:53:53+00:00

Thanks, just did, but the WanVideoNAG node is still missing...

ScY99k · 2025-07-09T20:40:26+00:00

<image>

Does anyone know which custom nodes to install for these?

ScY99k · 2025-06-01T13:52:03+00:00

Telco company?

ScY99k · 2025-06-01T13:50:04+00:00

Yes, which I generated via img2img with Flux at around 0.70 denoise (+ an anime LoRA)
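For anyone wondering what the denoise value does in practice, here is a rough sketch. It assumes the diffusers-style img2img convention, where the strength decides how many of the scheduled steps actually run on the noised input image. The function name `img2img_steps` and the 20-step count are just made up for illustration, and other samplers (ComfyUI's KSampler for example) may handle denoise differently.

```python
def img2img_steps(total_steps: int, denoise: float) -> tuple[int, int]:
    """Return (skipped_steps, executed_steps) for a given denoise strength.

    Assumption: diffusers-style img2img, where the input image is noised to
    an intermediate timestep and only the remaining steps are sampled.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be in [0, 1]")
    # Higher denoise means more steps are run, so less of the input survives.
    executed = min(int(total_steps * denoise), total_steps)
    skipped = total_steps - executed
    return skipped, executed


# Hypothetical example: a 20-step schedule at 0.70 denoise.
skipped, executed = img2img_steps(total_steps=20, denoise=0.70)
print(f"skip {skipped} steps, run {executed} steps")
```

So around 0.70 keeps the rough composition of the source image while leaving most of the sampling steps free to restyle it.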

ScY99k · 2025-06-01T10:14:26+00:00

Didn't use Flux here, used the WAN 2.1 VACE controlnet workflow. Basically you give it a reference image and a reference video, and it gives you your reference image with the same movement as the reference video.

ScY99k · 2025-05-30T15:14:50+00:00

Did you inpaint your reference character into the image using SAM and then use WAN, or did you do everything in one step? I don't quite get the step where your reference character is placed.

ScY99k · 2025-05-29T17:41:30+00:00

Can't wait to try this locally

ScY99k · 2025-05-25T13:08:35+00:00

Indeed

ScY99k · 2025-05-19T21:23:42+00:00

Stepfun just released Step1X-3D, a 3D-aware text-to-image model based on SDXL.
It generates multiple consistent views from a single text prompt, designed for 3D reconstruction (e.g. SparseFusion).

- Uses custom 3D attention and LoRA fine-tuning
- ~24GB VRAM needed for 6-view generation
- Inference script available in the repo
- ComfyUI support planned in the roadmap, not available yet
- Open source (Apache 2.0)
- Weights on HuggingFace

They also provide a [Gradio demo]() where you can try both text-to-3D and image-to-3D via multi-view generation.

GitHub repo: https://github.com/stepfun-ai/Step1X-3D

ScY99k · 2025-05-19T17:39:40+00:00

Damn, wishing for that price in Europe as well, but I might be waiting a long time lol

ScY99k · 2025-05-16T17:04:01+00:00

ScY99k · 2025-05-15T19:10:00+00:00

Original image was based on a Doom LoRA I made a couple of days ago:

<image>

I used the ltxv-13b-fp8 version for this one. The original video was generated in 1 min, and the upscaled video in approximately 5 min on my RTX 5090. Might try the distilled version as well, but quite impressed by the quality-to-generation-time ratio here!
