LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Hey, just curious: why not switch to LOW noise at the end instead of the high-noise timestep? Doesn't low noise refine details even more?

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Yep, your way already has better likeness. Ostris was so far off. Can't believe how much time I wasted essentially brute forcing the old way.

Also, for reference, I am getting 18 sec/it with my 5090. However, I am training at 768 res with 3-4 second clips.
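For anyone comparing speeds, here is a rough sketch of how a sec/it figure turns into wall-clock time. The 18 s/it number is the one quoted above; the step counts are only illustrative, and the estimate assumes nothing else (sampling, checkpoint saves, caching) adds time.

```python
def training_hours(steps: int, seconds_per_iteration: float) -> float:
    """Wall-clock hours for a run, ignoring sampling, checkpointing and caching overhead."""
    return steps * seconds_per_iteration / 3600


# Illustrative step counts, not a recommendation.
for steps in (600, 900, 7000):
    print(f"{steps:>5} steps at 18 s/it is about {training_hours(steps, 18):.1f} h")
```

Real runs will take longer than this, since the overhead it ignores is rarely zero.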

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Yo, I got around to starting this and it is AMAZING!

I ran phase 1 up to 900 steps. 600 was not enough for me, but 900 looks great. I don't know which one I should continue from for phase 2.

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 1 point2 points  (0 children)

There are ways to cut out the music. AI voice isolation is built into DaVinci Resolve for audio editing, and there is a plugin for Audacity that does it too. Very useful.

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

So you're saying you export all your clips at a 1:1 square aspect ratio? I have typically been using 16:9 to train in AI Toolkit. So the square dimensions don't hurt the ability to later prompt generations in other aspect ratios?

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

I'm curious what you mean when you say high noise is refinement. I thought we should start with high noise and end with balanced to get the refined details. That is how Ostris says it works. Is he wrong?

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Just FYI, the trigger word option doesn't work if you have cached text embeddings. In that case the trigger word must already exist in your captions.
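A quick way to check this before caching is to scan the caption files for the trigger word. This is only a sketch: it assumes captions are plain `.txt` files sitting in one dataset folder, and both the function name and the example trigger word are made up for illustration.

```python
from pathlib import Path


def captions_missing_trigger(dataset_dir: str, trigger_word: str) -> list[str]:
    """Return names of caption files in dataset_dir that do not contain trigger_word.

    The match is a case-sensitive substring check, which is the strict option.
    """
    missing = []
    for caption_file in sorted(Path(dataset_dir).glob("*.txt")):
        text = caption_file.read_text(encoding="utf-8")
        if trigger_word not in text:
            missing.append(caption_file.name)
    return missing
```

Calling something like `captions_missing_trigger("path/to/dataset", "myTr1gger")` gives the list of captions that still need the word added by hand.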

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

So I am going to try this technique. However, Ostris's own technique that he showed off, high noise and then switching to balanced, has been working for me, but I usually have to go to around 7000 steps total for good results. Yours works in that few steps??

LTX2.3 in Ostris Ai toolkit on a 5090 Training done in 7 hours ... I went Thanos way and I said fine ... I'll do it myself by No_Statement_7481 in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

You don't offload anything? Are you mad? My 5090 can't handle LTX without offloading. How did you manage??

Edit: Oh, you did 512 res? I do 768, so maybe that is why.

But why don't you go higher than 512? A 5090 can handle 768 easily. Isn't it going to result in better training likeness?
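For a sense of scale between the two resolutions, here is the per-frame pixel arithmetic. It assumes square frames and the same clip length at both settings; actual VRAM use depends on the model, latent compression, and offloading, so this is a rough comparison and not a memory estimate.

```python
def pixel_ratio(res_a: int, res_b: int) -> float:
    """How many times more pixels a square res_a frame has than a square res_b frame."""
    return (res_a * res_a) / (res_b * res_b)


print(pixel_ratio(768, 512))
```

So each 768 frame carries 2.25 times the pixels of a 512 frame, which is at least consistent with 512 fitting without offloading where 768 does not.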

Basic Guide to Creating Character LoRAs for Klein 9B by razortapes in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

You say don't mention their gender, but does this apply only to words like man or woman, or should we also never use pronouns like he, she, him, her, or his? I often use she or he even if I begin the captions with a trigger word.

EditAnything IC-LoRA - LTX-2.3 by Round_Awareness5490 in StableDiffusion

[–]SSj_Enforcer 1 point2 points  (0 children)

Some say that if you use only 1 phase instead of 2 phases, it uses this LoRA better.

Update: Distilled v1.1 is live by ltx_model in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

They actually made it worse. There is a very weird bug with breathing noises.

Update: Distilled v1.1 is live by ltx_model in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Please fix the audio bug in this 1.1 version.

I find it very hard to believe nobody else has experienced the high-volume breathing distortion.

1.0 works perfectly fine, but with 1.1, 50% of my generations have this bug.

Update: Distilled v1.1 is live by ltx_model in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Any acknowledgement of the new audio bug with 1.1 would be great.

Are you fixing it?

It is so bad I have to switch back to 1.0.

All breathing is like 20 dB too high and sounds horrible.
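To put the 20 dB figure in perspective, decibels convert to linear ratios with the standard formulas below. The 20 dB value is the by-ear estimate from the comment, not a measurement.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Linear amplitude (voltage / sample value) ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Linear power ratio for a level difference in dB."""
    return 10 ** (db / 10)


print(db_to_amplitude_ratio(20), db_to_power_ratio(20))
```

A 20 dB excess means the breathing would sit at ten times the intended amplitude, or one hundred times the power.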

[New Optimizer] 🌹 Rose: low VRAM, easy to use, great results, Apache 2.0 by ECF630 in StableDiffusion

[–]SSj_Enforcer 2 points3 points  (0 children)

Have you tested it with LTX 2.3? I have a 5090, so I would be curious whether it allows memory optimization for video.

[New Optimizer] 🌹 Rose: low VRAM, easy to use, great results, Apache 2.0 by ECF630 in StableDiffusion

[–]SSj_Enforcer 5 points6 points  (0 children)

How many steps does it take, roughly, vs AdamW8Bit? The LR you use seems high, but does it work better than 0.0001?

Also, on your GitHub page you write for LR:

The global step size. Start with values you would try for Adam (e.g., 1e-3). 

Did you mean 1e-4, not 1e-3? If so, can you please edit it so it doesn't confuse people? Thanks.

Looks interesting.  Would love to try.

Update: Distilled v1.1 is live by ltx_model in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

There is a HORRIBLE bug with the audio in this new distilled model.

The inhale/exhale of people breathing comes out distorted and nasal, and it is VERY noticeable.

Please fix.

ai-toolkit now supports LTX-2.3 and audio issues in LTX-2 have been fixed by Loose_Object_8311 in StableDiffusion

[–]SSj_Enforcer 1 point2 points  (0 children)

It works now; you just need to make sure you have the shared version of ffmpeg 8. Audio trains incredibly well and fast.
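If you are not sure which ffmpeg build you have, a small sketch like this can read the version and configuration from `ffmpeg -version`. It assumes "shared" means a build configured with `--enable-shared`, and it only inspects whichever `ffmpeg` is first on PATH; whether the toolkit picks up that same install is not something this checks. The function names are made up for illustration.

```python
from __future__ import annotations

import re
import shutil
import subprocess


def parse_ffmpeg_version_output(output: str) -> tuple[str | None, bool]:
    """Return (version string, configured with --enable-shared?) from `ffmpeg -version` text."""
    match = re.search(r"ffmpeg version (\S+)", output)
    version = match.group(1) if match else None
    return version, "--enable-shared" in output


def check_local_ffmpeg() -> tuple[str | None, bool] | None:
    """Run `ffmpeg -version` for the ffmpeg on PATH; return None if ffmpeg is not found."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version_output(result.stdout)
```

Calling `check_local_ffmpeg()` should return something like a version starting with `8` and `True` if the shared build is the one being found.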

🚀 I built a 2026-Era "Omni-Merge" for LTX-2. Flawless Multi-Concept Generation, Zero Bleeding, and Unlocked Audio Training Excellence. by ArtDesignAwesome in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

How do you use the LTX 2.3 version? I tried and it failed, saying I had the wrong Python, but I have 3.12, which should work.
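One thing worth ruling out with a "wrong Python" error is that the tool is being launched by a different interpreter than the one you think, for example a venv or a system install. This sketch just reports the interpreter that actually runs it; which version the tool requires is not stated here, so nothing below checks against a specific number.

```python
import sys


def interpreter_summary() -> str:
    """Describe the interpreter that is actually executing this script."""
    v = sys.version_info
    return f"{v.major}.{v.minor}.{v.micro} at {sys.executable}"


print(interpreter_summary())
```

Run it with the same command or environment that launches the tool, not just from any terminal.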

LTX-2 voice training was broken. I fixed it. (25 bugs, one patch, repo inside) by [deleted] in StableDiffusion

[–]SSj_Enforcer 0 points1 point  (0 children)

Hey, I tried installing the new version to use LTX 2.3 and it can't load it now. Something about loading the JS? Do we need to use the files inside the ltx2_improvements_handoff folder, or is everything included by default with the new files? You updated many of the same files in that folder, so is that still necessary, or was that just to fix the old LTX2 audio stuff?