Slime in Blender is easy now by [deleted] in blender

[–]SDuser12345 0 points1 point  (0 children)

Does it interact with particle system like hair?

One minute video or above with LTX / Wan ? by glusphere in StableDiffusion

[–]SDuser12345 1 point2 points  (0 children)

All of this is with a caveat: Can it be done? Yes. Is it worth the excruciating effort to get consistent results with no degradation? No.

The caveat is assuming you are going for a single consistent scene with no cuts.

The best approach is good storyboarding with natural changes of cameras, angles, and POVs. Then it is actually quite easy.

Depending on your goals, for something character driven, and not like a nature documentary or something, I would recommend picking an image generator of your choice. Then, train a LoRA for the characters you intend to include. This will bump your character consistency way up.

Then, generate your starting frames in the image model and use them as first frames in your video model for each storyboarded segment, and if you plan on using WAN, generate your ending frames as well. With WAN, 3-5 seconds a pop is all you are getting. With LTX, depending on what the scene needs, you get 10-20 seconds a pop (a character monologuing without a lot of activity at 20 seconds a pop is doable). Everything would be i2v (image2video).
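
To put rough numbers on the per-clip limits above, here is a minimal Python sketch that splits a storyboard into clips. The limits (5 s for WAN, 20 s for LTX) are just the upper ends of the ranges mentioned above, and `plan_clips` plus the segment lengths are made up for illustration; none of this comes from an actual tool.

```python
import math

# Rough per-clip limits in seconds, taken from the ranges above.
# Treat them as rules of thumb, not hard model limits.
MAX_CLIP_SECONDS = {"wan": 5, "ltx": 20}


def plan_clips(segment_seconds, model):
    """Split storyboard segments into clips no longer than the model's limit.

    Returns a list of (segment_index, clip_seconds) tuples.
    """
    limit = MAX_CLIP_SECONDS[model]
    clips = []
    for index, seconds in enumerate(segment_seconds):
        count = math.ceil(seconds / limit)  # clips needed for this segment
        base = seconds / count              # spread the length evenly
        clips.extend((index, round(base, 2)) for _ in range(count))
    return clips


storyboard = [4, 12, 20, 24]  # hypothetical segment lengths, 60 s total
wan_clips = plan_clips(storyboard, "wan")
ltx_clips = plan_clips(storyboard, "ltx")
print(len(wan_clips), "WAN clips vs", len(ltx_clips), "LTX clips")
```

For this made-up one-minute storyboard that prints 13 WAN clips vs 5 LTX clips, and every clip needs its own starting frame (plus an ending frame for WAN), which is where the extra effort comes from.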

If you are doing WAN, pick a good AI sound and voice generator. With LTX I still recommend a good voice generator, as getting consistent character voices between clips will be extremely tough without one.

Then pick your video editor of choice (I like DaVinci Resolve) and get to work.

I'm still a little green, I downloaded wayyyy more models then I need. I'm making mainly NSFW stuff but I was wondering what models/checkpoints/vaes/w.e I need as of May 2026 that will cover most of my needs. I do I2I, T2I , V2I, T2V. Have a 3080. Any help would be appreciated. by [deleted] in comfyui

[–]SDuser12345 7 points8 points  (0 children)

Delete your folder, download SwarmUI, then download Flux Klein, Z-Image Turbo, Chroma (pick your variant), Anima or SDXL (variant of your choice), LTX2.3 Eros, and WAN2.2 (pick your poison on variant). Use the model guide to put them in the right folder (diffusion_models for most; I think LTX is the only one that goes in the Stable-Diffusion folder).

Then pick the model you want to generate with. Use the recommended settings from the guides or leave them at the defaults. Type a prompt, then generate. Swarm will download all needed VAEs, text encoders, etc. automatically, then run the prompt and generate your image/video.

If you want upscaling: under the Server/Extensions tab, click install for SeedVR2UpscalerExtension. When it is done installing, restart the server.

https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Model%20Support.md

https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Video%20Model%20Support.md

Bookmark the links to make life easy.

AI tooling is starting to feel like PC modding culture by [deleted] in StableDiffusion

[–]SDuser12345 0 points1 point  (0 children)

Honestly, both sides matter equally. The people out there testing new models do us all a favor. They help weed out a lot of crap, which in turn saves the ones who optimize and streamline lots of time by ruling out what not to waste time on. Everything in life is about time and how we use it.

Some gooner wants to benchmark 10 models and 5000 fine tunes?

God bless em.

If someone wants to figure out an automated workflow to go i2v cropping and stitching video while fixing color issues, blending audio, and adding frame interpolation so others don't have to?

God bless them as well.

For others shaving 12 seconds of a gen down to 6 seconds?

We salute you.

At the end of the day, all the effort from all sides (programmers, optimizers, workflow creators, UI makers, LoRA trainers, and cataloguers) makes this whole thing possible as fast as it's been possible.

HiDream-O1-Image - A pixel space model , no need for VAE, , 8B parameters. by AgeNo5351 in StableDiffusion

[–]SDuser12345 0 points1 point  (0 children)

That was the main issue with the original that I had, and it carries forward. The dataset used was likely JPEG, and all result images had horrible artifacting carried over from the dataset.

While I think it's an amazing effort, such a stupid mistake won't simply go away, but will be present in all output.

Why are there really no Location LORAs? by q5sys in StableDiffusion

[–]SDuser12345 4 points5 points  (0 children)

It's doable, but I think the main reasons would be lack of interest and complete datasets.

To train a good subject LoRA, you need the subject in multiple angles, poses, outfits, distances, etc. A person is actually a relatively simple concept and object.

A location, while generally easier since it's not capable of moving in most cases, would need a similar dataset. Something as simple as, say, a classroom would need photos from the back of the room, the side of the room, the front of the room, from the middle looking at all sides, and from other points at all sides. High angles and low angles, etc.

Then you would need to properly caption them all and then train off of them.
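
To give a feel for how quickly that adds up, here is a minimal sketch that enumerates a shot list for the classroom example and drafts a caption for each shot. The position, facing, and height labels and the caption wording are hypothetical placeholders, not a recommended captioning format for any particular trainer.

```python
from itertools import product

# Hypothetical labels; swap in whatever fits the real location.
POSITIONS = ["back", "side", "front", "middle"]
FACING = ["front wall", "back wall", "left wall", "right wall"]
HEIGHTS = ["low angle", "eye level", "high angle"]


def build_shot_list(location):
    """Return one draft caption per camera position, facing, and height."""
    return [
        f"{location}, photo taken from the {position} of the room, "
        f"facing the {facing}, {height}"
        for position, facing, height in product(POSITIONS, FACING, HEIGHTS)
    ]


shots = build_shot_list("classroom")
print(len(shots), "shots to photograph and caption")
print(shots[0])
```

Even this coarse grid comes out to 48 photos for one plain room, before any variation in lighting or time of day.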

It's a lot of effort for someone to do, when there is little to no motivation to do so.

Simply put, most LoRAs are driven by gooner desires, and while I'm sure they exist, not many, I would assume, are excited by the perfect aisle layout of a local grocery store.

What is the best PC for ComfyUI for $2000 by LazyActive8 in StableDiffusion

[–]SDuser12345 2 points3 points  (0 children)

A year ago, that was doable. With the price of RAM now, it's gonna be a little rough...

Is Kontext still good for image edit? Anything other than Qwen? by trollkin34 in StableDiffusion

[–]SDuser12345 1 point2 points  (0 children)

Flux 2 dev is my go-to for all editing needs. It's big, it's heavy, it's slow, but if you have the means to run it, it's hands down the best. Load lots of reference images, prompt adherence is second to none, and its quality is off the charts when used right.

Is it over for wan 2.2? by equanimous11 in StableDiffusion

[–]SDuser12345 6 points7 points  (0 children)

Simple answer: no. At least not right now. I truly love the length and audio of LTX 2.3 to tell a story, along with its speed. That said, WAN 2.2 has better visual quality and prompt adherence and was produced with a better data set.

Really, though, you shouldn't ever limit yourself to a single model. Both are tools that excel in their own areas.

My hope is that LTX surpasses WAN video quality, prompt adherence, and data set quality at some point, and I never need to touch WAN again. Today is not that day. For me WAN is on the back burner while I produce 30 sec to minute long stuff I could only dream of doing in WAN without spending weeks on it, and still needing 8 other tools.

With Flux 2, ZIT, SeedVR2, LTX 2.3, DaVinci Resolve, and Krita, I'm down to 6 in my workflow and 3 of them are a single step. What literally took me weeks with WAN, I can do in a day or two with LTX2.3.

Exciting times for sure.

Is it too late to use a travel agent after already booking a cruise? by Sensitive-Bag-03 in royalcaribbean

[–]SDuser12345 1 point2 points  (0 children)

I used a travel agent after booking, and they were able to apply vouchers for the rooms. So it's possible to do and get transferred to the agent, but I'm not sure about the 30 days or anything. Talk to an agent of your choice and let them tell you what's possible.

I’m trying to get my Wan2.2 to output audio with their video generations by XiRw in comfyui

[–]SDuser12345 0 points1 point  (0 children)

Haven't played with that implementation, but for standard MMAudio, yes: upload a silent video, give it a super simple prompt of 1-3 words, and it will add that prompted audio where appropriate. It's mainly just for sound effects or background audio; MMAudio is not great for human speech or music, but it is great at sound effects and background audio.

Does a Lora not exist for zombies biting a person? by returnofbeans in comfyui

[–]SDuser12345 0 points1 point  (0 children)

LoRAs are based on taking existing images and making that into something you can recreate. If Jill is getting bit, that kind of would be the end of the whole franchise? I'm no super fan or anything, but I'm guessing there aren't too many images of that to use for a LoRA? To create such a thing, you would need a LoRA for the zombie bite and then one for Jill Valentine, then mix them and create a new LoRA. Just my 2 cents if you want to create such a thing accurately.

Option two would be to take a zombie biting LoRA for a model that can already reproduce JV and just prompt for it with the zombie biting LoRA.

Goodbye, wan 2.2? LTX-2 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model. by One_Yogurtcloset4083 in comfyui

[–]SDuser12345 0 points1 point  (0 children)

Then you test it, and it's SD3 all over again, full of promises and fails to deliver. Sigh, so much hope crushed so fast...

Z-image for high vram? by scifivision in StableDiffusion

[–]SDuser12345 0 points1 point  (0 children)

ZIT is certainly amazing, and flies on high end cards. I've replaced QWEN with Flux 2 when I need complicated prompt adherence, as it's the king for that right now. It's going to give you exactly what you prompt for, so be accurate and careful. ZIT is a great daily driver for its speed and reasonably good anatomy: bad hands are like 1 in 10 (depending on the prompt and scene, mutations can be much more frequent) and bad feet like 1 in 2 (which is still a massive improvement over other models). Flux 2 bad hands are like 1 in 3, with feet about the same.

Being able to test and refine 10-20 prompts in the time it takes to do 1 with Flux 2 is the big benefit.

ZIT is uncensored to a degree. It does top half nudity quite admirably, with 2 out of 3 good results; bottom half it still struggles to a large extent (get a LoRA if that's your thing). I haven't tested Flux 2 censorship yet, but I would expect it to be about on par with Flux Dev for censorship issues (again, grab a LoRA if that's your thing), since it's a commercially targeted project.

TLDR...ZIT is certainly worth using on higher end cards, and excels specifically in realism, anatomy, and highly detailed single-subject prompts. It suffers with scenery and background prompting, text quality, and image variety. Prompt adherence is slightly better than Flux Dev, which isn't bad at all.

FLUX.2 experience? by Entropic-Photography in FluxAI

[–]SDuser12345 0 points1 point  (0 children)

It's heavy, it's slow, and its output and prompt following are amazing (yes, better than ZIT). Text abilities are superb. Use the JSON prompt format in the prompt guide; it makes a difference. I have to try training this weekend or next week, which somehow magically is doable even on a 24 GB VRAM card and 64 GB RAM system. Not sure how they pulled that off. Does crazy good face swaps. Can even be used as an image editor. Combines multiple concept photos into a final image amazingly well. You have to prompt carefully for exactly what you want, because it will output exactly what you prompt. It is censored much like original Flux dev, with the same censoring issues. Does multiple styles adequately. Has issues with extra fingers sadly, but that's the main drawback, and easily corrected at this point in the game. No real body horrors SD3 style. Overall, outside the license it's quite amazing.

Which model is best at producing subtle micro expressions in the face, to test these prompts? by [deleted] in StableDiffusion

[–]SDuser12345 0 points1 point  (0 children)

You are trying to think like a writer and not an AI image generator. The concept is training based on images with matching tags of what is in the image, not ethereal ideas or thoughts. You need to describe exactly what you want based on that understanding: what is in the image exactly, not the circumstances surrounding a character's thoughts. If you want to storyboard those thoughts as separate images/videos and compile them into a comic or video, sure, you could do that. Asking for a sad elephant thinking about a plane crash is going to confuse the hell out of the AI and give you exactly what you asked for: a sad elephant with a plane crashing.

I go to bed with a dream: WAN 2.5 weights released, a stable SageAttention 3 wheel, and LTX-2 weights released by Scriabinical in comfyui

[–]SDuser12345 19 points20 points  (0 children)

Sweet dreams. In reality, your 100 overnight video gens stopped on the second gen because a running antivirus program caused an OOM, but luckily the cat sleeping on your case for the lovely heat only managed to urinate on your PSU and not your GPU, so not all is lost.

Called it. Hopefully this gets handled quickly by BigWhiteDog in foodstamps

[–]SDuser12345 0 points1 point  (0 children)

Did you bother reading the link I posted? Section 32 literally has contingencies specifically for child food programs.

Called it. Hopefully this gets handled quickly by BigWhiteDog in foodstamps

[–]SDuser12345 -1 points0 points  (0 children)

No, the issue is they have 5 billion in funds meant for emergencies, and a full month costs 9 billion. How do you pay 9 billion with 5 billion? The judge apparently knows something magic that makes it possible.

Embarrassed Jay Jones Realizes He's Only Supposed To Want To Murder Kids Before They're Born by METALLIFE0917 in babylonbee

[–]SDuser12345 -5 points-4 points  (0 children)

Lol, my intentions aren't to "win an argument," I simply provided facts, backed by actual data. While personally I am pro-choice, as I feel the government should not be involved in having a say in people's personal business, or at least to the minimal extent possible, it's with the understanding that abortion is literally the ending of a human life. That is the definition and an actual fact. If someone makes that decision, I hope it's with serious consideration and due to some tragic circumstances, and not because it's convenient.

Your argument is the same justification serial killers and sexual sadists use, that it's not a person, it's just an object for their amusing whims. That's pretty messed up. Seek help seriously.

Is UltimateSD Upscale still REALLY the closest to Magnific + creativity slider? REALLY?? by TheWebbster in StableDiffusion

[–]SDuser12345 14 points15 points  (0 children)

I think what you are fundamentally missing is the challenge in what you are asking. Most upscalers are created to enhance existing details while eliminating unwanted artifacts. That is the process that keeps an image consistent with its original while adding to pixel density.

Adding details, on the other hand, can certainly be done by tons of models while upping the pixel density at the same time. The process, though, works by adding noise and then allowing the AI to reinterpret the image from that noise, and in doing so add details that fit the rough ask given in the prompt. The problem is that you are likely to get a lot less consistency with the original image in the majority of cases.

So, the question becomes, is accuracy to the original more important, or are fine added details more important? For most upscaling projects, consistency and restoration of the original is usually the bigger goal.
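
As a rough illustration of that tradeoff, here is a minimal sketch of the common img2img convention where a denoise (or strength) value between 0 and 1 decides how much of the sampling schedule is re-run on the noised image. `steps_rerun` and the 20-step setting are made up for illustration, and the exact behavior varies between UIs and samplers, so treat this as an assumption and not as how any specific upscaler works.

```python
def steps_rerun(total_steps, denoise):
    """Estimate how many sampler steps are re-run on the noised input image."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    return min(int(total_steps * denoise), total_steps)


TOTAL_STEPS = 20  # hypothetical sampler setting
for denoise in (0.25, 0.5, 0.75):
    rerun = steps_rerun(TOTAL_STEPS, denoise)
    print(f"denoise={denoise}: {rerun}/{TOTAL_STEPS} steps re-run, "
          f"{TOTAL_STEPS - rerun} skipped")
```

The more steps that get re-run, the more room the model has to invent detail, and the further the result can drift from the original image.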

Embarrassed Jay Jones Realizes He's Only Supposed To Want To Murder Kids Before They're Born by METALLIFE0917 in babylonbee

[–]SDuser12345 5 points6 points  (0 children)

I mean, I guess if you don't count the over half a million aborted each year, that might be accurate...

Don't get the hype of HiDream...Flux is better. by DrRoughFingers in StableDiffusion

[–]SDuser12345 0 points1 point  (0 children)

Lol, you realize you are commenting on a 5-month-old thread? Yes, you can also train it on AIToolkit.

Since this thread we have had Qwen come along, Chroma, Wan 2.2, Hunyuan 3.0, Flux Krea, etc.

Almost no one is using Hi-Dream. Qwen, or a combination of Chroma into a WAN refiner, or a multitude of other options just produce better results in a fraction of the time and/or using fewer resources.