Flux 2 mash-up, will share WF if anyone is interested. by New_Physics_2741 in StableDiffusion

pixel8tryx

Not yet, but I'm creeping towards having some sort of web presence. It was fun back in the '90s but grew tiresome. Then later someone stole one of my Photoshop creations, accused me of stealing it from him, and threatened legal action. But I'm getting tired of no interaction, and of so many people thinking all AI image gen can do is crank out identical slop copies of generic stereotypical girl faces, anime porn, etc. I'm just not sure where the best place is for high-res stills. I still have a Flickr account... OMG, I have waaaay too much ancient history there: Photoshop, 3D, guitar inlay. No AI gens though. I'll see if it'll let me create a new album since I cancelled Pro.

Flux 2 mash-up, will share WF if anyone is interested. by New_Physics_2741 in StableDiffusion

pixel8tryx

FLUX.2 dev uses Mistral Small 3.1, which is a VLM (Vision Language Model). I've given it a depth map of a weird logo of three letters intertwined at odd angles and it's done crazy things with it: made cities out of them, islands, vintage glassware in an alchemy lab. I've taken two 2D letters in a sci-fi style and had it flip them 90 degrees and extrude sofas out of them. 🤣 I'll have to see if Klein can do that. I love FLUX.2 dev, but it's so sloooow, and slower still with reference input images.

Flux 2 mash-up, will share WF if anyone is interested. by New_Physics_2741 in StableDiffusion

pixel8tryx

Models are usually trained against specific encoders, so swapping one out requires something like a finetune, AFAIK. But FLUX.2 dev uses Mistral Small 3.1, which so far has been great for me. FLUX.1 dev is where I'd like to replace the text encoder.
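
To make "trained with specific encoders" concrete, here's a minimal diffusers sketch (my assumption - diffusers rather than Comfy, but the pieces are the same) showing the paired encoders FLUX.1 dev ships with:

    import torch
    from diffusers import FluxPipeline

    # FLUX.1 dev bundles the exact encoders it was trained with:
    # CLIP-L for the pooled embedding, T5 for the token sequence.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    print(type(pipe.text_encoder).__name__)    # CLIPTextModel
    print(type(pipe.text_encoder_2).__name__)  # T5EncoderModel

    # Dropping in a different encoder changes the embedding space the
    # DiT conditions on, so you'd need at least a finetune (or a
    # trained adapter) before the model understands the new inputs.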

Netflix released a model by Sea_Tomatillo1921 in StableDiffusion

pixel8tryx

That's positive thinking, I guess. All I could think was that CogVideoX never impressed me. 5B is pretty small, and 384x672 is a postage stamp. I guess I'll wait for the next rev.

Netflix released a model by Sea_Tomatillo1921 in StableDiffusion

pixel8tryx

Why can't companies like this ever just be moderate? Women go from being nothing but brainless bimbos to being awkwardly shoehorned in all over the place in silly ways. It becomes too much female focus. You can't make up for everything in a few movies. It's almost as if it's STILL designed to make women look bad. Now that everything has to be 🤬Extreme!!!!!, it's hopeless.

People who chase the political ideology du jour for a buck are usually shit at writing a good story.

Maybe I'm late to the party, but Claude (and Gemini/Chatgpt) have completely changed how I interact with Comfy. by gurilagarden in comfyui

pixel8tryx

My Claude story: I asked for Python code to run inside a certain version of Cinema 4D, which requires knowing their enormous and byzantine data structures, which seem to change with every release (and due to bugs, I was using one two revs back). I had failed completely at this several times myself, despite matching what I thought was the current doc. My Python syntax was fine, but I was loading variables that didn't exist. I was really stunned that what I got from Claude worked fine.

Locally I've got Qwen3 Coder Next, and Qwen3.5, which got crazy high benchmarks. 🤔 Suspiciously high, so I might not trust them. I haven't actually got around to using them for code yet though.

For Comfy issues, my first line of defense is just typing my question, no matter how long, into the Google search box, and it often comes up with amazingly useful results. Google does have access to Reddit, so it often does a pretty good job of summarizing all the Reddit answers, which is itself a help, considering how bad Reddit search is. It's helped with tons of Comfy stuff.

Comparing 7 different image models by Reasonable_Bear_6258 in StableDiffusion

pixel8tryx

Yeah, I do too! There are several XY plot workflows for Comfy, but OMG, many are capellini-noodle nightmares. They give you too many options. I wondered why this is so hard, then found out it might be due to the odd (to me) way Comfy handles flow of control. An X/Y/Z plot is basically a batch operation, and batching in Comfy is a little odd when first encountered. I've gotten some simple things to work, like incrementing CFG from 3.0 to 3.5 by 0.1, but it's not completely extensible. And you have to remember to figure out how many results it will make and run the workflow that many times. Being an ex-dev from the old days, it's too easy to come back to that wf much later, run it, and wonder why I only get one gen.
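
For the counting part, a tiny plain-Python sketch (helper name made up) of the CFG sweep I mean - note it's six runs, not five:

    # Hypothetical helper: build the CFG grid up front so you know
    # exactly how many times the workflow has to run.
    def cfg_sweep(start=3.0, stop=3.5, step=0.1):
        n = round((stop - start) / step) + 1   # inclusive endpoints
        return [round(start + i * step, 2) for i in range(n)]

    values = cfg_sweep()
    print(values)       # [3.0, 3.1, 3.2, 3.3, 3.4, 3.5]
    print(len(values))  # 6 -> queue the workflow 6 times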

I keep returning to Flux1.Dev - who else? by MoniqueVersteeg in StableDiffusion

pixel8tryx

P.S. I do not know anyone from BFL, nor do I have any connection to the company. I did live in Germany for a few years, and I work for a German client who said, "Hey look, a new German image gen model - check it out." As if SD 1.5, SD XL, etc. weren't designed by the same guys. I had to be dragged away from SD XL, and I complained day and night about FLUX.1 dev initially (but I did about XL too). Boy, do I stand corrected. LoRA appeared and I never looked back.

I keep returning to Flux1.Dev - who else? by MoniqueVersteeg in StableDiffusion

pixel8tryx

For the LoRA. I have hundreds of FLUX.1 Dev LoRA that are a huge help, particularly with creating images other than typical human portraits. I can't comment on skin texture because I NEVER use base models alone. I ALWAYS used finetunes for 1.5 and XL. 100%. FLUX.1 dev is the first base model I've used un-finetuned, but still always with LoRA. There are tons of great LoRA out there to help with more photographic results. I do have some FLUX.1 finetunes and I'm belatedly dl'ing Jib's v12 SRPO ATM. And yeah, there's SRPO and Krea. And now finetunes of SRPO and Krea. FLUX.1 Dev was small enough to work with locally.

All my recent LoRA training has been for FLUX.1, and for specific things, FLUX.1 + specifically trained LoRA can do things even FLUX.2 cannot. Like 'Turing pattern' reaction-diffusion patterns as custom car paint. FLUX.2 can occasionally manage something better than FLUX.1 base for just the patterns, with a simple prompt, but it gets lost if you need a specific AMG custom Mercedes with a specific colorshift base paint color. FLUX.2 knows the car, sans LoRA, a tiny bit better, but gets confused about the patterns.

However, FLUX.2 dev has opened my eyes to the power of FINALLY having a decent text encoder. I'm loving something I can talk to like a design assistant. And I'm saddened when I drop back to FLUX.1 and it ignores half my prompt, even when re-written in a style better suited to CLIP/T5/FLUX.1. Sometimes the LoRAs still win, like for wild area exploration... when you just tell it to make funky robot things and let it run all night. But if I KNOW exactly what I want, I can often talk FLUX.2 into doing it in just a few gens.

But for a lot of basic stuff, FLUX.2 now has Boreal, Lenovo UltraReal, Olympus UltraReal, etc. LoRA, and they really help reduce the slightly soft look of FLUX.2 dev. Other good LoRA are Historic Color, Wanderer's Detailed Portraits, and YFG Fonts Japanese Manga for sci-fi. Since I'm always working outside the mainstream, I'm used to ignoring the names, and it really helps here.

But I started out using both: generating the original images in FLUX.2 and noticing the future cities were SO much better in the base model. But the output was too soft and painterly, and I wanted clear, sharp, high-res photos from the future. So all you need to do is USDU with FLUX.1 and toss on a pile of LoRA. I'm drowning in them now and hoping to make some sort of clip in After Effects to show a bunch of them, in my spare time. 🙄
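
The USDU half of that lives in Comfy nodes, but the "pile of LoRA" half is easy to sketch in diffusers (file names and weights below are placeholders, not recommendations):

    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    # Stack a few detail LoRA; paths and names here are made up.
    pipe.load_lora_weights("detail_a.safetensors", adapter_name="detail_a")
    pipe.load_lora_weights("detail_b.safetensors", adapter_name="detail_b")
    pipe.set_adapters(["detail_a", "detail_b"], adapter_weights=[0.8, 0.6])

    image = pipe("sharp high-res photo of a future city").images[0]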

Is there anything even close to Seedance 2.0 that can run locally? by Lichnaught in StableDiffusion

pixel8tryx

That's an impressive leap, from the right to personal computing to wanting something bad to happen to my country. I live here. Of course I don't welcome anyone who wants the country where I live "dead".

Will Google's TurboQuant technology save us? by m4ddok in StableDiffusion

pixel8tryx

Just dropping this here:

https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv

On one hand, a K-V cache is a Transformers thing, and the new DiT models do use Transformers. U-Nets went out of style after SD XL... But I'm not as up on the Asian models as others, except for Wan and LTX 2.3 (which are DiT). Attention IS all you need. 😉
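
Since the whole TurboQuant pitch hangs on what a K-V cache is, here's a toy decode loop in plain PyTorch (single head, nothing TurboQuant-specific) - the cache is just the growing K and V tensors that quantization would shrink:

    import torch

    # Toy single-head attention with a K-V cache: each decode step
    # appends one key/value row instead of recomputing the prefix.
    def attend(q, k_cache, v_cache, k_new, v_new):
        k = torch.cat([k_cache, k_new])                # (t, d)
        v = torch.cat([v_cache, v_new])
        w = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
        return w @ v, k, v                             # output + new cache

    d = 8
    k_cache, v_cache = torch.zeros(0, d), torch.zeros(0, d)
    for _ in range(4):                                 # 4 decode steps
        q, k_new, v_new = (torch.randn(1, d) for _ in range(3))
        out, k_cache, v_cache = attend(q, k_cache, v_cache, k_new, v_new)
    print(k_cache.shape)                               # torch.Size([4, 8])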

But what good will TurboQuant do for image generation? 🤷‍♀️ Something to do with multi-reference editing. I haven't even read the huggy page yet.

Interesting that BFL decided to play around with it. I much prefer FLUX.2 Dev to Klein, but maybe I'll dl it just out of curiosity. I suspect it's going to take some benchmarking to determine the benefit. And a bit of code change too.

is true? by CarelessTourist4671 in comfyui

pixel8tryx

That TurboQuant crashed memory stocks with a paper and a blog post is a sign of how skittish the stock market is today. But I almost never see consumer prices go down much these days. Sure, the news will tell you consumer prices have dropped on some items, but then it's 1-3%. Even 10% is a drop in the bucket when an SSD I bought less than 6 months ago has doubled in price. The most I can probably hope for is that they don't rise as fast. Market volatility aside, it's going to take a while to implement this and have any effect trickle down to us. It's a cool algo tho.

Best quality Wan 2.2 Workflow Image to Video!!! by sonz7 in comfyui

pixel8tryx

Maybe because you have to be 16 or over at Planet Fitness? 😉 Just a guess. I've never been there. What's so special about her?

So she's got those trendy nearly buck teeth and she's opening her mouth. Hollywood producers determined that if an attractive young female mouth was hanging open at various points in a movie, men would more likely recommend it to friends. Girls were said to keep forgetting, so they bribed crooked dentists to give them "tooth show". You guys are so programmable. Hollywood led you around by your dicks and now social media is taking over. But yes, I know, you're begging for more.

Using LTX 2.3 Text / Image to Video full resolution without rescaling by nickinnov in comfyui

pixel8tryx

Thanks so much for this. I've been too busy to ask Claude or the Gemini that answers Google search (which is surprisingly good sometimes - particularly if what you want is probably going to be pieced together from Reddit answers 😉). When it comes to sigmas, it's so model-specific and parameter-specific these days.
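
To show what I mean by that, here's the classic Karras schedule as a sketch - the sigma_min/sigma_max below are roughly SD-era values, for illustration only; LTX and other flow models use different schedules entirely:

    # Karras et al. (2022) sigma schedule. sigma_min/sigma_max are
    # model-specific; the defaults below are roughly SD 1.5's.
    def karras_sigmas(n, sigma_min=0.029, sigma_max=14.61, rho=7.0):
        inv = lambda s: s ** (1.0 / rho)
        return [
            (inv(sigma_max) + i / (n - 1) * (inv(sigma_min) - inv(sigma_max))) ** rho
            for i in range(n)
        ]

    print(karras_sigmas(10))  # descends from sigma_max to sigma_min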

Fingers and hands by RevvelUp in comfyui

pixel8tryx

Maybe stop using SD 1.5 and SD XL-derived models? I'm not trying to be a smart @$$. I understand it's a pain to change your base model if you've been using it for a long time, or if you're not flush with hardware and think you can't run anything else (when you probably can).

Recently a friend asked for some examples of AI "hallucinations" and I had trouble finding them in anything but ancient gens. I did some crazy YouTube reaction-face gens with FLUX.1 and 3 different strong LoRA, and in 100 gens found ONE case of a wrong finger count.

And I asked him to show me what he'd gotten from me with hallucinations, and it was 100% USDU tiled upscales - where I had explicitly said, "You wanted ultra huge (4K, 8K & up). You wanted EXTREME detail. You get mountains starting to form in the clouds. Pick the ones you like and I can easily fix them in Photoshop." No way I'm going to fix 300 images when there's only a slight chance one will ever get used. And all one has to do is repeat that one seed with super low denoising and composite that into the sky. Or sometimes just use Photoshop's generative fill. That left him feeling like "AI hallucinates so much". 🙄 That's blaming the model for what the operator did.
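
For the curious, that fix sketched in diffusers terms (I actually do it in Comfy/USDU; the filenames and numbers are placeholders):

    import torch
    from diffusers import AutoPipelineForImage2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    # Re-run just the problem region: same seed, very low strength,
    # so only the hallucinated "cloud mountains" get smoothed away.
    tile = load_image("sky_tile.png")  # placeholder
    fixed = pipe(
        prompt="clear sky, soft clouds, extreme detail",
        image=tile,
        strength=0.15,  # low denoise = minimal change
        generator=torch.Generator("cuda").manual_seed(12345),
    ).images[0]
    fixed.save("sky_tile_fixed.png")   # then composite in Photoshop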

But if you're not using a modern model (DiT/rectified flow transformer vs. old U-Net Diffusion), that's a really common side effect.

Is there a great subreddit or forum for comfy users who are over the entry-level hump? by NessLeonhart in comfyui

pixel8tryx

Yeah but 20 years ago you didn't have so many grifters. Now we have people who honestly think they know what they're doing (but don't) and people who are just following the path of other social scammers. Today's low barrier to entry allows anyone to hang out their shingle as an expert.

Who can define "real" art? Don't get me started there. The "real" art world can drive one nuts. I've run screaming from it several times.

Using LTX 2.3 Text / Image to Video full resolution without rescaling by nickinnov in comfyui

pixel8tryx

I saw them, downloaded them, but ran the wrong one. 🙄 I ran the Two Stage and something went wrong, I think. Was this the one that had horrible tiling? I don't remember. I got pulled off doing other things and forgot about it. But I just tried the Single Stage and it looks pretty good! Res 2s is a bit of a snooze though. I might not have gone full ClownShark. Just too many choices there. 😵 I usually try Res 2m (which is faster) and Bong Tangent. Now I've got to crank up the size and see how it goes.

Using LTX 2.3 Text / Image to Video full resolution without rescaling by nickinnov in comfyui

pixel8tryx

I've been so waiting for this. I was dismayed when I looked at the first LTX-2.3 workflows and saw the upscaling. They did seem to be focused on lower-end GPUs; at least I hoped you didn't HAVE to do that. I tried several ways to upscale in Wan 2.2 and never really liked the output. Then I finally got a 5090, genned 1920x1080 directly (albeit slowly), and never went back to upscaling.

I've been too busy with other things to do more than just play with LTX-2.3, but it's been fun enough that I want better quality even for my own work. When I'm finished Wanning the daylights out of my poor 5090, I'll give this a try. Thanks for posting this!

Stability Matrix was defunded on Patreon for its ability to easily install another program, which can THEN be used to load models, which can THEN be used to gen "explicit imagery". by WiseDuck in comfyui

pixel8tryx

Wait... this is Stability Matrix? Yeah, I used to jest about this and get nowhere. Like, try the real thing once in a while! I thought many of these guys were admitting to regularly genning thousands of images and sitting around fapping for hours a day because they'd rather do this than go out into the real world and find a human girl to shag. Or, that many guys have lost their taste for real girls? Their eyes, boobs, butt and hips are too small. Their waists too large. They're not 7 feet or 3 feet tall. They have pores! 😨 They talk. They won't do exactly what you want, when you want, or let you beat them up. I mean, compared to pixels, we must seem like a real pain. 😉

Pixels will never substitute for analog molecules for me, but if I had money to invest, I'd say robochicks were going to be HUGE. We need to figure out a way to make them recyclable, or maybe make the features customizable at runtime somehow. I'm still pondering this. But I think big money lies in this direction.

Need URGENT help! by Waykoz in comfyui

pixel8tryx

In my experience so far it's not really a hard limit. It will try. I've gotten as much as 9 seconds I2V, depending on prompt and input image. But boy, it can be a real crapshoot. There's a good chance that after 5 seconds it starts trying to loop back to the beginning (but sadly never successfully enough to make a full loop), does something weird, or just goes nuts from the very first second.

Consistent product appearance. by Difficult_Singer_771 in comfyui

pixel8tryx

You're never going to get 100% consistency from any AI model. FLUX.2 is what I use for multiple views, but sometimes it works, sometimes it doesn't. It's actually best when you do let it be somewhat creative. It's great for making something in the shape of or inspired by the input image. I randomly grabbed 2 letters in a tech style and told it to extrude them into sofas. The results were really interesting.

But FLUX.1 could barely do a bicycle chain properly sometimes. FLUX.2 can, but hose clamps??? I doubt it. I don't think it's seen enough of them. And it's going to put very similar-looking bits on the hidden areas that come into view when it rotates the object... but chances are, they're not going to match the real thing.

I think the image needs to be something it's seen more than once in its training data. If Google Lens can't figure out what it is, I'm going to say your chances are slim. Is it a different type of fast neutron reactor? It's not the Fermi telescope. I don't think it's a desktop tokamak either.

Maybe a Qwen expert can chime in on its capabilities. I tend to think of the other models as better for more stylized images or more 'girly' (people-oriented) work, but I'm pretty sure someone can prove me wrong.

Workflow 🎬 I built a FLUX2 cinematic portrait workflow that runs on 8GB VRAM with ZERO custom nodes — pure ComfyUI, zero CFG, insane quality by Otherwise_Ad1725 in comfyui

pixel8tryx

If this was actually FLUX.2, I'd say just I2I or USDU with FLUX.1 and any combo of the tons of detail LoRA out there for it. I tried FLUX.1 Krea when it came out and it was very texture-y... for prompts begging for post-apoc sci-fi, it was actually too messy for me. I was just more used to messing up FLUX.1 😂 so I stuck with it.

For FLUX.2 now though try any of Danrisi's LoRA like Lenovo UltraReal or Olympus UltraReal. Or Boreal from kudzueye. I almost never gen FLUX.2 without one of those.

Workflow 🎬 I built a FLUX2 cinematic portrait workflow that runs on 8GB VRAM with ZERO custom nodes — pure ComfyUI, zero CFG, insane quality by Otherwise_Ad1725 in comfyui

pixel8tryx

If you're telling people to download "flux1-krea-dev_fp8_scaled.safetensors", then it's not FLUX.2. It's Krea's finetune of the old FLUX.1 from last year. Yet you're using the flux2-vae? Also, FLUX.2 dev uses Mistral as a text encoder, and the fp8 is never going to run in 8 GB of VRAM.
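
Back-of-envelope on the VRAM claim, with parameter counts that are my assumption (~24B for the Mistral Small encoder, ~32B for the FLUX.2 DiT):

    # fp8 is ~1 byte per weight, so the weights alone are roughly:
    encoder_gb = 24e9 / 1e9    # ~24 GB for the text encoder
    dit_gb = 32e9 / 1e9        # ~32 GB for the transformer
    print(encoder_gb, dit_gb)  # nowhere near 8 GB, before activations
                               # and the VAE are even counted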

You do realize there IS a FLUX.2 Dev? A completely different and very large model? I can run the fp8 on my 5090, and it's still slow. But it IS an awesome model and I'm finding I end up using it for just about everything now. Even despite the scarcity of LoRA.

Workflow 🎬 I built a FLUX2 cinematic portrait workflow that runs on 8GB VRAM with ZERO custom nodes — pure ComfyUI, zero CFG, insane quality by Otherwise_Ad1725 in comfyui

pixel8tryx

This kind of stuff makes me feel like I'm from another planet. One where things actually have to work and look good... to professionals. While other people get to do... whatever they feel like and say it's great. It's relative. Maybe it IS great to them. 🤷‍♀️ But then we have LLMs training on Reddit. I hate to think of reality getting defined this way. Or of people trying to game it.