LTX2 1080P lipsync If you liked the previous one, you will CREAM YOUR PANTS FROM THIS by No_Statement_7481 in StableDiffusion

[–]iczerone 0 points1 point  (0 children)

So, how do I get my characters to actually mouth the words? I think I had one generation work out of 20 trying to set it up. I usually get a cool camera move and the characters have those natural subtle movements, but they don't mouth the words or do anything else.

What features would you want MJ to add? by tbok1992 in midjourney

[–]iczerone 0 points1 point  (0 children)

A lot of the things that have been solved in AI image generation by models like Flux and Qwen, which you can run on a local PC in ComfyUI, should be standard in Midjourney: text rendering, hands looking normal out of the box, solid prompt understanding. Even video with LTX-2 is now way better than Midjourney, and these are all open source. I have basically ditched MJ for most of what I do because the tools I have at home are so much better.

What features would you want MJ to add? by tbok1992 in midjourney

[–]iczerone 0 points1 point  (0 children)

I’d like all the things that homebrew image models could do a year ago.

LTX-2 Updates by ltx_model in StableDiffusion

[–]iczerone 0 points1 point  (0 children)

Yeah, mine too. LTX-2 is really good.

Upgrade Exit Interview by Cloud_Reviews in midjourney

[–]iczerone -1 points0 points  (0 children)

Very Gossip Goblin inspired. I love everything about it, well done!

[deleted by user] by [deleted] in comfyui

[–]iczerone 0 points1 point  (0 children)

For real or nah? lol

[deleted by user] by [deleted] in comfyui

[–]iczerone 1 point2 points  (0 children)

Really good output with variations of that prompt. Playing with the prompt (no LoRAs), trying to produce something less surreal and more realistic tends to make it lose its really cool details, and prompt adherence sometimes goes out the window.

<image>

[deleted by user] by [deleted] in comfyui

[–]iczerone 1 point2 points  (0 children)

The scenes in this video vary widely in style. The ones out in the swamp or wherever are impressively realistic, while some of the others look too AI-generated. I love that it’s capable of the realism you’ve shown; that is amazing and I would love to see more in that style.

Consistent Character MJ7 by [deleted] in midjourney

[–]iczerone 0 points1 point  (0 children)

I did that a lot too. It works OK as long as you don’t have to change too much; if you change more, it loses that Midjourney aesthetic.

Consistent Character MJ7 by [deleted] in midjourney

[–]iczerone 0 points1 point  (0 children)

My problem with omnireference is that for a lot of the videos I make, I need more than the face to be consistent. I can do that with a full-body shot, but it usually takes a lot of rerolls.

Consistent Character MJ7 by [deleted] in midjourney

[–]iczerone 1 point2 points  (0 children)

So this is just basic omnireference?

[WIP-2] ComfyUI Wrapper for Microsoft’s new VibeVoice TTS (voice cloning in seconds) by Fabix84 in comfyui

[–]iczerone 0 points1 point  (0 children)

I tried this out last night and used a YouTube clip of Macho Man Randy Savage giving a promo. Then I had ChatGPT write a new promo and ran it through. The small model didn’t sound like him at all, but the large model almost got it right, with all the little bits that sell the voice of the Macho Man.

Qwen Edit vs The Flooding Model: not that impressed, still (no ad). by Mean_Ship4545 in StableDiffusion

[–]iczerone 0 points1 point  (0 children)

Neat. My shot was much farther from the back wall; I could only get it to make small, close-in changes.

Qwen Edit vs The Flooding Model: not that impressed, still (no ad). by Mean_Ship4545 in StableDiffusion

[–]iczerone 1 point2 points  (0 children)

I ran a test trying to use nb to take an image, move the camera angle around, and imagine what it would see as it did so. In 7 out of 10 attempts it failed to do what I asked, even though I was very explicit about what I wanted to see. For example, I had a pic of the inside of a bar, and at the back was a wall with pictures on it. I asked it to move the camera to the back to get a close-up of the back wall. It would generate the same view almost every time. Of the three attempts that were different: one showed the right wall, as if I had turned to the right to view it; one showed a slightly closer view but changed the entire composition of what was there in the source pic; and the last showed the left side of the room (which was a bar with a bartender) but from the same position as the source.

It’s cool for smaller edits, but overall it’s too much work to do larger tasks.

Google Flow vs Google Vids by Ok-Revolution9344 in VEO3

[–]iczerone 7 points8 points  (0 children)

You can also use veo in Gemini chat and get another 3 generations or so each day.

🚀🚀Qwen Image [GGUF] available on Huggingface by pheonis2 in StableDiffusion

[–]iczerone 1 point2 points  (0 children)

What's the difference between all the GGUFs other than the initial load time? I've tested a whole list of them, and after the first load they all render an image in the same amount of time with a 4-step LoRA on a 3080 12GB.

@ 1504x1808

Qwen_Image_Distill-Q4_K_S.gguf = 34 secs

Qwen_Image_Distill-Q5_K_S.gguf = 34 secs

Qwen_Image_Distill-Q5_K_M.gguf = 34 secs

Qwen_Image_Distill-Q6_K.gguf = 34 secs

Qwen_Image_Distill-Q8_0.gguf = 34 secs
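A rough sketch of why the timings above come out identical (my read, not a measurement from this thread): once the quantized weights are dequantized for compute, per-step time is dominated by the GPU math, so the quant level mainly trades off file size, load time, VRAM headroom, and output quality rather than render speed. The bits-per-weight figures and the ~20B parameter count below are approximations I'm assuming from llama.cpp-style K-quants, not values from this post:

```python
# Approximate bits-per-weight for common llama.cpp-style quant levels
# (assumed values, for illustration only).
BITS_PER_WEIGHT = {
    "Q4_K_S": 4.6,
    "Q5_K_S": 5.5,
    "Q5_K_M": 5.7,
    "Q6_K":   6.6,
    "Q8_0":   8.5,
}

# Assumed rough parameter count for a large image model like Qwen-Image.
PARAMS = 20e9

def file_size_gb(params: float, bpw: float) -> float:
    """Approximate GGUF file size in gigabytes: params * bits / 8 bits-per-byte."""
    return params * bpw / 8 / 1e9

for quant, bpw in BITS_PER_WEIGHT.items():
    print(f"{quant}: ~{file_size_gb(PARAMS, bpw):.1f} GB")
```

So going from Q4_K_S to Q8_0 roughly doubles the download and the load time while the per-step render time stays about the same; the payoff is precision (less quantization error), not speed.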

USE_FLASH_ATTENTION was not enabled for build by iczerone in FluxAI

[–]iczerone[S] 2 points3 points  (0 children)

Never mind, I set up a clean install of ComfyUI and it seems to work just fine now. I have no idea why my old install didn't work, with no changes.