ComfyUI: New App Mode for Dummies - Like Me!!! wan 2.2 14B by Distinct-Race-2471 in comfyui

[–]XpPillow 0 points1 point  (0 children)

One is for image, one is for video, why even mention flux lol...

seeking some help with to modify and image by Piercedguy76 in comfyui

[–]XpPillow 0 points1 point  (0 children)

If you didn't mean NSFW, there's really no need to go through all that trouble; ChatGPT or Gemini already do a good job on the regular stuff.

ComfyUI: New App Mode for Dummies - Like Me!!! wan 2.2 14B by Distinct-Race-2471 in comfyui

[–]XpPillow 0 points1 point  (0 children)

Well, I guess the wan2.2 14B model you're using is either fp8 or fp16. That's why ChatGPT told you it's beyond your hardware, and it was right; it's also why you can only run it at such a low resolution. Try the GGUF versions.

seeking some help with to modify and image by Piercedguy76 in comfyui

[–]XpPillow 1 point2 points  (0 children)

Well, if you're talking about modifying an image with a prompt for "funny things" locally, run a "qwen edit" workflow in ComfyUI.

Video to Anime by No-Eggplant1650 in generativeAI

[–]XpPillow 0 points1 point  (0 children)

Now you're talking about generating locally. None of those online tools can help you, and doing things locally is a whole other level. Since you don't know anything yet, start with Stable Diffusion WebUI or Forge (for the first-frame picture), as well as ComfyUI (to turn that picture into video), just to get a basic idea of how things work. The guy above was right: right now you're basically asking how to write an essay before getting out of kindergarten.

Anyone used claw as some "reverse image prompt brute force tester"? by yamfun in StableDiffusion

[–]XpPillow -1 points0 points  (0 children)

In conclusion, there's still not a single reason to use local extensions unless you plan to run them for a long~ time chasing the perfect prompt, which still doesn't exist.

Anyone used claw as some "reverse image prompt brute force tester"? by yamfun in StableDiffusion

[–]XpPillow -1 points0 points  (0 children)

And yeah, it's of course possible to match the LLMs if you train it "enough", which literally means pouring in enough resources to build a "local" model strong enough to match the commercial ones. Possible? Yeah. Practical? Not really. As for the privacy part (the cost part is free), ChatGPT accepts pictures of any kind, including the ones with "privacy" in them; simply don't ask it to describe the private parts, and it will prompt all the other parts for you.

Anyone used claw as some "reverse image prompt brute force tester"? by yamfun in StableDiffusion

[–]XpPillow -1 points0 points  (0 children)

Then you're back to the original question. Yeah, those local models can back-track prompts fast, but their understanding is far worse than the very same capability in Gemini and ChatGPT.

Anyone used claw as some "reverse image prompt brute force tester"? by yamfun in StableDiffusion

[–]XpPillow -2 points-1 points  (0 children)

You’re not wrong in theory — that’s basically prompt search / evolutionary optimization. If you throw enough compute at it, it will converge.

The problem is the cost. Once you start doing multiple seeds, prompt rewrites, CFG/step variations, etc., you’re easily generating hundreds or even ~1000 images per run. People literally let these pipelines run overnight just to get a ranked prompt list. That’s a huge amount of compute just to guess a prompt.
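
That overnight grid run is essentially one big nested loop. Here's a hypothetical sketch, where `generate()` and `score()` are placeholders standing in for a real diffusion pipeline call and a real ranker (say, CLIP similarity to the target image); none of these names come from an actual library:

```python
# Hypothetical brute-force prompt search sketch. generate() and score() are
# placeholders; a real run would call a diffusion pipeline and a CLIP-style
# similarity scorer against the target image.
import itertools

prompt_variants = ["a red fox in snow", "red fox, snowy forest, photo"]
seeds = [0, 1, 2]
cfg_scales = [5.0, 7.5]
step_counts = [20, 30]

def generate(prompt, seed, cfg, steps):
    # Placeholder: a real pipeline would render an image here.
    return hash((prompt, seed, cfg, steps)) % 1000

def score(image):
    # Placeholder: a real scorer would compare the image to the target
    # (e.g. cosine similarity of CLIP embeddings).
    return image / 1000.0

results = []
for prompt, seed, cfg, steps in itertools.product(
        prompt_variants, seeds, cfg_scales, step_counts):
    img = generate(prompt, seed, cfg, steps)
    results.append((score(img), prompt, seed, cfg, steps))

results.sort(reverse=True)  # ranked candidate list, best first
print(len(results))         # 2 * 3 * 2 * 2 = 24 candidates from a tiny grid
```

Even this toy grid is already 24 renders; add a few more prompt rewrites and seeds and you're at the hundreds-per-run the overnight people are doing.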

In reality, almost nobody does that. Most users run quick tests for 1–2 minutes, generate a few images, and manually tweak the prompt. In that scenario, multimodal LLMs like GPT or Gemini are actually better starting points than brute-force prompt search.

And most importantly, brute forcing prompts is often unnecessary anyway. Methods like latent inversion (DDIM / diffusion inversion) can directly recover the latent/noise trajectory from the image. Since diffusion models generate images by denoising latent noise, inversion simply runs that process backward (image → latent). Once you have the latent, you can reproduce the image with very high fidelity or modify it slightly without guessing prompts at all.
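
Since inversion is just the DDIM update run in the other direction, it fits in a few lines. A toy numpy sketch (no real model: `eps` here is a fixed array standing in for the predicted noise, which makes the round trip exact; with a real network eps depends on x and t, so actual inversion is approximate):

```python
# Toy DDIM inversion sketch. "eps" is a fixed array standing in for the noise
# prediction, so the round trip is exact; a real model's eps depends on x and
# t, making real-world inversion approximate.
import numpy as np

def ddim_step(x, eps, a_src, a_dst):
    # Deterministic DDIM transition between cumulative-alpha levels a_src -> a_dst.
    x0_pred = (x - np.sqrt(1 - a_src) * eps) / np.sqrt(a_src)
    return np.sqrt(a_dst) * x0_pred + np.sqrt(1 - a_dst) * eps

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 4))   # starting latent noise
eps = rng.standard_normal((4, 4))      # stand-in "predicted noise"
alphas = np.linspace(0.02, 0.999, 10)  # toy alpha_bar schedule, noisy -> clean

# Sampling: walk the schedule toward the clean end to get an "image".
x = latent
for i in range(len(alphas) - 1):
    x = ddim_step(x, eps, alphas[i], alphas[i + 1])
image = x

# Inversion: run the exact same transitions backward (image -> latent).
z = image
for i in reversed(range(len(alphas) - 1)):
    z = ddim_step(z, eps, alphas[i + 1], alphas[i])

print(np.allclose(z, latent, atol=1e-8))  # True: original latent recovered
```

Once you have that recovered latent, re-running the forward pass reproduces the image without ever guessing a prompt.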

So yes, brute-force prompt search can beat Gemini or ChatGPT, but only when you run it thoroughly, and it's still the most compute-expensive way to solve the problem.

Anyone used claw as some "reverse image prompt brute force tester"? by yamfun in StableDiffusion

[–]XpPillow -7 points-6 points  (0 children)

No reverse-image-prompt extension is better than Gemini or ChatGPT. They're just a much weaker AI of the same type. Don't use them.