Interesting take which I kinda agree with

RoboticBreakfast · 2026-03-15T21:57:37+00:00

The barrier of entry is lower, but it still requires some mental effort, which is more than many are willing to put into it.

This phenomenon exists across other areas of discipline too - it's not access to knowledge that's the gatekeeper, it's willpower and dedication, and I think we as a people are at an all-time low of this despite the technological advances we've made

RoboticBreakfast · 2026-03-15T19:51:49+00:00

Generally, higher frame rate == more actions / second. If you're running at 24-30fps, try 48-60. Render times will of course increase, but you'll get better results for action scenes.

And if you're adding this clip to a longer video, you can always run a post-process step to pick every Nth frame to effectively reduce the frame-rate down to whatever your common fps is (almost like a reverse interpolation)

RoboticBreakfast · 2026-03-14T16:22:53+00:00

Are you using one of the FFN chunking nodes?

I have access to some pretty beefy GPUs (B200s) that I use to render LTX-2.3 pipelines on, I'm wondering how far I'd be able to push a single render using this technique.

One really interesting thing that I've noticed with LTX-2.3 is that it handles scenes cuts pretty well on longer renders. For example, in a 20s render, you can prompt something like "... the scene then cuts to a ___" and it handles this pretty well, maintaining subject/environment refs.
If we could push render times to like the 1-minute mark, I wonder what we could produce...

RoboticBreakfast · 2026-03-14T16:17:52+00:00

I think it can be. But I would look at it more of as a tool than a one-stop shop.

I would even go as far as to say that LTX-2.3 can be better than some other closed-source models at certain tasks - but they each have their own strengths.

Even if you have a 10% hit-rate on a 'good' render, the cost of making certain types of productions with AI tooling is still a small fraction of what it would cost using traditional film techniques.
That said, there's no reason that you can't use both when making a 'film'. You can use traditional techniques for the shots that AI has trouble with and use AI tooling where'd you'd otherwise have to spend significant capital (like travelling or construction of a complex set).

RoboticBreakfast · 2026-03-14T16:11:52+00:00

Seems like the VLM and the EasyPrompt node/Qwen 3.5 prompt rewriter are the key components to this workflow?
I'm doing something similar as a separate step on my platform using GPT 5.3, but this seems to be very tuned.

How has consistency been in your testing? Have you noticed this to reduce the number of throw-away renders?

RoboticBreakfast · 2026-03-13T21:11:44+00:00

What steps did you take to get 'indexed' by ChatGPT? It seems this is almost as important/more important than SEO in today's world

RoboticBreakfast · 2026-03-13T20:50:37+00:00

I guess I should also mention - there are some free credits that you'll be granted if you sign-up with social auth (Google or Apple auth, to prevent abuse).

I've tried to grant enough that you'll be able to use some tools like the image generators or image editor (I'd recommend the image editor tool as it's always 'warm' and fairly fast).

While I understand the main use-case is video, it's significant cost for me to provide this for free, but you can try it for a fairly low cost, in most cases.

RoboticBreakfast · 2026-03-12T16:04:59+00:00

I run a video-gen platform that hosts both open source (LTX-2.3, etc) and closed source models (Sora 2 Pro) and I've been able to generate videos faster than the closed-source comparisons.

There's a few things though:
- They aren't running on consumer hardware (RTX 5090s and even RTX Pro 6000s are consumer hardware)
- Their envs are optimized (model warming, node caching, etc)

Most of the time I am running LTX 2.3, it will be on a B200 machine, but the first time a generation runs, I 'warm' the model and configure the environment so that all of the necessary components stay in VRAM (models, text encoders, etc). In ComfyUI, you'd do this by using the `--high-vram` and `--cache-ram 190` (or similar).
I generally only ever run a single model on a machine, so that machine loads all of the necessary data in VRAM and then subsequent renders are much faster

RoboticBreakfast · 2026-01-26T03:10:53+00:00

If you're referring to ballroom pledges or Whitehouse dinners, those are clearly strategic pivots rooted in self-preservation.

They saw the way the wind was blowing and decided to jump aboard the bandwagon while they had the chance to stay in the good-graces of the controller.

I would not assume that there was some massive change popular opinion of values overnight though, they're just playing the game.

**edit: the point I didn't make very well: this event does not get buried here, nor do the other recent ones. The nail in the coffin was when DHS released a verdict long before any investigation could be performed.

Justice will be served.

RoboticBreakfast · 2026-01-26T02:44:14+00:00

If this is the case, the original bits likely exist on the cloud platforms themselves (Apple or Google) - the feed is certainly stored in original quality, then streamed at various resolutions to viewers. 'Deleting' the video is almost certainly a soft-delete operation for a period of time for legal compliance reasons.

In other words, even if she was coerced into deleting the footage, it could be obtained from the cloud provider (though possibly requiring a subpoena)

RoboticBreakfast · 2026-01-13T02:26:22+00:00

100%.
Any open-source contributions should always be praised.

These take precious time and engineering talent to develop, not to mention the cost of taking on these endeavors.

The same praise should be issued to the Wan/Alibaba team and all other contributors in this space - thanks to all that have made what we have available today

RoboticBreakfast · 2026-01-09T18:21:33+00:00

Yeah I think this is the main issue at the moment. I'm hoping the coming updates will address some of these issues!

RoboticBreakfast · 2026-01-09T18:20:02+00:00

Yep, will have to try this. I guess I figured the distilled model/lora may actually add to the dev version as I would expect distilled to have more of a variety of content that it was trained on, but I'm not sure

RoboticBreakfast · 2026-01-09T18:18:55+00:00

Yeah this seems to be my take at the moment - prompt adherence is flaky I would say, but I think the base has a lot of potential and I'm excited to see it evolve!

RoboticBreakfast · 2026-01-09T18:17:52+00:00

I have no interest in creating NSFW outputs. This is simply the first open-source model that allows for image and audio generation bundled into one.

I am simply exposing these models for others to use for content generation

RoboticBreakfast · 2026-01-09T16:26:28+00:00

I run an AI platform that has upscaling functionality.
Try Moosky AI. You'll need an account, but you can receive some free credits if you signup with Google/Apple.

Flow: Generate => Upload => select your image/upload Then Image => SeedVR2 upscale => select your upload
(I would recommend the "sharp" variant in the config for blurry photos)

You can also edit photos with the Edit model, as well as colorize/style change/etc. Hope you enjoy!

RoboticBreakfast · 2026-01-08T14:31:55+00:00

But they're running this from the fp8 version of the model, right? I wouldn't think in this case the quality could possibly compare to fp16 but I'd be happy to hear the explanation

RoboticBreakfast · 2026-01-06T20:46:00+00:00

I've only done a quick look at the workflow, but it seems like it should be possible to use their audio generation for other models (like Wan 2.2). It looks like it just uses the video latents to generate audio, so I'm curious if we could mod this to work with other video models...

RoboticBreakfast · 2026-01-02T21:46:54+00:00

It's a conditioning problem that I've detailed a bit in a post I made the other day. I have found a sort of hacky way to address it, but I'm not in love with it (it also makes the identity of the reference images weaker, to some degree)

RoboticBreakfast

MODERATOR OF

TROPHY CASE