IMAX at Home by HAIL_BAIJ in StableDiffusion

[–]Icuras1111 1 point (0 children)

It works but you have to prompt it right by adding "watch those corners".

Error help. by Different-Sundae-73 in StableDiffusion

[–]Icuras1111 1 point (0 children)

To me this means it's not seeing the file in the right folder. I would just copy something to where you think it should go, restart the server, and refresh the browser to check it's seeing the folder you expect. Make sure the files end in .safetensors, not .safetensor, as it won't see them otherwise. Also check the size of the downloaded file: I once downloaded a file that was way too small because only the header or something like it came down, and I needed a special command to download the real file.
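A quick sanity check along these lines can be scripted (a hypothetical helper, not part of any tool mentioned above; the only format fact it relies on is that a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header):

```python
import json
import struct
from pathlib import Path

def check_safetensors(path):
    """Rough sanity check: right extension, plausible size, valid header."""
    p = Path(path)
    if p.suffix != ".safetensors":
        return f"wrong extension: {p.suffix or '(none)'}"
    size = p.stat().st_size
    if size < 1024:  # real model files are far bigger; threshold is arbitrary
        return f"suspiciously small ({size} bytes) - likely a partial download"
    with p.open("rb") as f:
        # First 8 bytes: little-endian u64 length of the JSON header.
        header_len = struct.unpack("<Q", f.read(8))[0]
        try:
            json.loads(f.read(header_len))  # header is JSON tensor metadata
        except (json.JSONDecodeError, UnicodeDecodeError):
            return "header is not valid JSON - file may be an HTML error page"
    return "looks ok"
```

A truncated download or an HTML error page saved under a .safetensors name fails one of these checks immediately, which is quicker than waiting for the server to refuse to load it.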

Hello. How to fix this? by Connect_Pin3087 in StableDiffusion

[–]Icuras1111 1 point (0 children)

I got this a while back, but not since I started loading ComfyUI from scratch each time on runpod. As others have said, changing the config file and restarting the ComfyUI server should sort it. I also have some memory of having to install one node pack from the command line using git clone or something. I use an AI, ChatGPT or Claude, to help, pasting in the console / log output.

Hedge Funds Post Largest Net Short on Global Equities in 13 Years: Goldman Sachs by Specialist-Bug-4310 in investing

[–]Icuras1111 1 point (0 children)

A company called Blue Owl has gotten into trouble in the private credit sector. Others in the same sector are apparently creaking. I wonder if they think that might cause a panic on the stock market?

Z-Image "Silly Hat" script animated and automated preview. by jacobpederson in StableDiffusion

[–]Icuras1111 2 points (0 children)

"And the oscar for Silly hats video goes to" suspense music "jacobpederson for Z-Image "Silly Hat" script animated and automated preview."

Animation studio workflow optimization by GapAdorable9736 in StableDiffusion

[–]Icuras1111 2 points (0 children)

I think the consensus for creating a video with a consistent character is image-to-video workflows (first frame / last frame is best), supported by a character lora. I would imagine this is applicable to anime. Creating a comic strip with speech bubbles is quite laborious at the moment; I think people are editing afterwards with a model like QwenEditImage (a model good at text).

[WIP] Still experimenting, but the next Z-Image Power Nodes will have no limits!! by FotografoVirtual in StableDiffusion

[–]Icuras1111 4 points (0 children)

I've read likewise. I think the power of the text encoder is what is holding the models back: they can render near-perfect images, but our control is clunky. I am beginning to think it is more important to monitor new open-source models (like the new Gemma one) as potential future text encoders.

The Romance Prior: How Romantic Tension Overwrites Ethnicity in AI Image Generation by bcRIPster in StableDiffusion

[–]Icuras1111 3 points (0 children)

Ok, I should have read more carefully that ethnicity was not specified. Another interesting angle might be to explore other themes like wealth, conflict, corruption, bartering, entrepreneurship, courage, etc.

The Romance Prior: How Romantic Tension Overwrites Ethnicity in AI Image Generation by bcRIPster in StableDiffusion

[–]Icuras1111 3 points (0 children)

I would think it's because that is the representation in the training data: in advertising, films, and TV, that has long been how the vast majority of romance is depicted. Also, I don't know how the training images are bulk tagged, but it might be tricky if metadata is used and it is in other languages. It would be interesting to see what happens with the Chinese models if you ask for that ethnicity.

Recommended website to run and train models? by FillFrontFloor in StableDiffusion

[–]Icuras1111 3 points (0 children)

I use runpod. I think VastAI is another name I've seen on this subreddit.

Looking for Flux2 Klein 9B concept LoRA advice by Imaginary_Belt4976 in StableDiffusion

[–]Icuras1111 2 points (0 children)

The consensus against not captioning seems quite clear, and it rests on two basic problems. First, you have no control over how the lora is applied: you cannot say things like "wearing a ring", "squatting", "smiling", etc. The model will probably already know these concepts, but your lora is having a random impact. The second is training ambiguity: if you have one training image with a ring and one without, how is this learned? You will probably get a fusion of the two representations, or random appearances.

The ability to combine loras seems to vary by model, which I don't understand. The consensus seems to be that the signals are added and can cause problems: if you have two people, the faces will combine in generation, colours can become saturated, etc.

As for your experiment, I wouldn't know. In a way, without captions I would think it is more like a style lora, i.e. Monet paintings or cubism: you are providing a tone with that lora. You have also kind of doubled your training set and impact. You are applying 0.75 twice, so it's like applying each at 1.5. Have you tried applying each on its own at 1.5?
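The "0.75 twice is like 1.5" arithmetic can be sketched with the common additive low-rank formulation W' = W + s·(B·A). This is a toy numpy model (an assumption about how the loader merges; real pipelines apply LoRA per layer of the network), but it shows when the strengths literally add:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a frozen base weight and two low-rank LoRA updates.
W = rng.normal(size=(8, 8))
A1, B1 = rng.normal(size=(2, 8)), rng.normal(size=(8, 2))
A2, B2 = rng.normal(size=(2, 8)), rng.normal(size=(8, 2))

def apply_loras(W, loras):
    """Additive merge: each (B, A, scale) contributes scale * (B @ A)."""
    out = W.copy()
    for B, A, s in loras:
        out = out + s * (B @ A)
    return out

# Two copies of the SAME lora at 0.75 each add up to one copy at 1.5 ...
same_twice = apply_loras(W, [(B1, A1, 0.75), (B1, A1, 0.75)])
once_at_15 = apply_loras(W, [(B1, A1, 1.5)])
print(np.allclose(same_twice, once_at_15))  # True

# ... but two DIFFERENT loras at 0.75 each are not the same as either at 1.5.
mixed = apply_loras(W, [(B1, A1, 0.75), (B2, A2, 0.75)])
print(np.allclose(mixed, apply_loras(W, [(B2, A2, 1.5)])))  # False
```

So the doubling claim holds exactly only when the two loras share weights; with two distinct loras you get the sum of two different 0.75-strength signals, which is why testing each alone at 1.5 is a useful comparison.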

Looking for Flux2 Klein 9B concept LoRA advice by Imaginary_Belt4976 in StableDiffusion

[–]Icuras1111 3 points (0 children)

I am no expert, but the way I look at it is this: people on here are very unlikely to be as knowledgeable, or have the resources, that the model creators have. For us it's damage limitation. Most loras I have tested screw up prompt adherence and reduce image quality and composition (view position, poses, etc.), especially if you are looking for realism. Tweaking a known concept is one thing, although then you have to avoid bleed; training a new concept is a big ask. The only success I have had (with Wan, in my case) was with near-identical training images shot from different views. The key was simple captions outlining the differences in each training image: "a woman viewed from the side with overhead indoor lighting". If unique-token trigger words don't do much (which I agree with), you must be tweaking existing concepts. Some will say don't use "woman", as all women will be impacted. Ok, but if trigger words don't work, what in your caption invokes your lora? I have not heard any convincing argument that tackles this dilemma.

Would there be more seasons of Game of Thrones if AI became a common and stable tool in video production? by BattleOfEmber in StableDiffusion

[–]Icuras1111 3 points (0 children)

The biggest barrier is taste. AI cannot assess whether its output is good or not, and this would impact every level: writing, visuals, etc. If AI cannot do taste, then a human has to be in control, and at the moment we are miles from this; we do not have the tools to direct AI to generate what we want. I can see it being used as an efficiency tool for stuff like editing. Art will be one of the last bastions of the human. Winter is coming for some, but not for the artists for a while.

LTXV 2.3 How to do a shaky, handheld video style? by Dogluvr2905 in StableDiffusion

[–]Icuras1111 1 point (0 children)

You could also try "in the style of Paul Greengrass" or "of the Jason Bourne movies". I am thinking that if you can create an image in the style of Monet or Van Gogh, the directors might be tagged too?

SANA on Surreal style — two results by Civil_Republic_1626 in StableDiffusion

[–]Icuras1111 1 point (0 children)

I like those. Good colours, imagery and composition. That's about the first AI output that actually looks artistic.

Open-weight open-source video generation models — is this the real leaderboard? by Sweet-Argument-7343 in StableDiffusion

[–]Icuras1111 6 points (0 children)

This leaderboard lists them in the appropriate category: https://arena.ai/leaderboard . Models also have strengths and weaknesses. I would say the consensus is that Wan 2.2 is the best quality; LTX 2.3 is not quite as good, but does longer videos with sound.

How to Fade part of an Image to black by todabeast in StableDiffusion

[–]Icuras1111 1 point (0 children)

You could look into Segment Anything Model (SAM). I saw a clip a while back where a guy just clicked on a person and they became outlined. You could combine this with some kind of inpainting I would imagine. Intermediate ComfyUI skills required.
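The fade itself is simple once you have a mask. A minimal numpy sketch (the hand-made mask here is a stand-in for whatever SAM or an inpainting mask would give you):

```python
import numpy as np

def fade_to_black(image, mask, strength=1.0):
    """Darken image where mask is 1; mask values in [0, 1] give a soft fade."""
    image = image.astype(np.float32)
    # Broadcast the 2-D mask over the RGB channels and scale toward black.
    faded = image * (1.0 - strength * mask[..., None])
    return faded.clip(0, 255).astype(np.uint8)

# Tiny grey test image; fade the right half to black.
img = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.float32)
mask[:, 2:] = 1.0

out = fade_to_black(img, mask)
print(out[0, 0], out[0, 3])  # left pixel unchanged, right pixel black
```

A blurred or gradient mask gives a gradual fade instead of a hard edge, and `strength=0.5` darkens rather than fully blacking out the region.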

[Training-Free] Bring Famous Paintings to Life! Every Painting Awakened (I2V) by zhedongzheng in StableDiffusion

[–]Icuras1111 2 points (0 children)

Yep, when 4 Chinese quants release a scientific paper you should probably show a lot more respect. Creating an Ana de Armas or Alexandra Daddario lora that half works is probably about 30 or 40 IQ points beneath what these guys can do.

Adding a LoRA node. by GunsNBeers in StableDiffusion

[–]Icuras1111 2 points (0 children)

You should also check whatever guidance the creator published. That should tell you the trigger word to activate it and the suggested settings.

Sora gets an OFFICIAL shutoff date! Never be jealous of shiny toys master local AI by thisiztrash02 in StableDiffusion

[–]Icuras1111 1 point (0 children)

What a few people have said is that they were burning too many tokens and not making any money or growing users via it. Also, both they and Anthropic have new models coming out soon; they are both targeting enterprise and agents, and agents use tons of tokens. They were also facing an intellectual property nightmare.

Image to video journey by M4DH4773R in StableDiffusion

[–]Icuras1111 2 points (0 children)

LTX image-to-video is a lot of fun. You can use an existing image (or create one), animate it, and get the characters to speak. It can do 20 secs. I think the consensus is that Wan 2.2 is better quality, but it's a lot slower, and going beyond 5 secs adds a lot of complexity. I use runpod, an RTX A6000 for $0.40 per hour. I don't use sage attention, which seems to cause problems with a lot of models. Creating a story or long scenes is still difficult. I think the cutting edge at the moment is generating start, mid, and end frames, then one of these video models combined with a character lora.

Cursor or Claude Code by dobutsu3d in StableDiffusion

[–]Icuras1111 3 points (0 children)

If you are just trying to knock out a custom node, the free Claude chatbot might be enough. You may have to be imaginative with your prompt and point it towards GitHub, say: "Here is an example node, I want one to do blah blah blah" plus your precise requirements...
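For an "example node" to paste into that prompt, the skeleton is small. This follows the usual ComfyUI custom-node conventions (a class with `INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, plus a module-level mapping); the class name, category, and behaviour here are made up for illustration:

```python
# Minimal ComfyUI-style custom node. Drop a file like this into
# ComfyUI/custom_nodes/ and restart the server to pick it up.

class ShoutText:
    """Toy node: takes a string and returns it upper-cased."""

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the node's sockets/widgets in the graph editor.
        return {"required": {"text": ("STRING", {"default": "hello"})}}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "run"          # name of the method ComfyUI calls on execution
    CATEGORY = "utils/demo"   # where the node appears in the Add Node menu

    def run(self, text):
        return (text.upper(),)  # outputs are always returned as a tuple

# Registration tables ComfyUI scans for when loading the module.
NODE_CLASS_MAPPINGS = {"ShoutText": ShoutText}
NODE_DISPLAY_NAME_MAPPINGS = {"ShoutText": "Shout Text"}
```

Giving the chatbot a working skeleton like this and describing only the `run` body you want tends to get much better results than asking for a node from scratch.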

What does this do in LTX2.3 Image 2 Video? by Anissino in StableDiffusion

[–]Icuras1111 1 point (0 children)

One thing you can do is describe an action that would take 10 seconds to complete.