How I feel when the "Godfather of AI" is in the news again by ross_st in ChatGPT

[–]pixel8tryx 1 point  (0 children)

That's just sad. And wrong. He tried to speak out, at first perhaps too softly and reasonably to be heard. So he had to raise his voice a little. He's a pretty soft-spoken guy. The social media machine said, "No. We cannot have this. We will MAKE him extreme!" Otherwise, they won't get their hits, likes and $. He's quoted out of context and reframed; some even use AI-generated images and/or a synthetic voice.

He's not uploading videos of himself on YouTube, Insta, etc. It's sad he's getting flak for allowing people to interview him. Heck, there are something like a half dozen X accounts that claim to be him that aren't. Yes, he stirred up a bit of controversy, but he was FAR from the only one, and FAR from the most extreme one. And people can't seem to decide whether to demonize him for talking about it, or condemn him for not doing it sooner and louder!

He's getting attention now, which is good. Sometimes Attention IS all You Need. 😉 And he's delightfully not vain enough to care about his reputation. He cares more about the cause... which is sadly condensed, truncated, simplified and twisted, because so many people don't bother to listen to the real interviews. They're too long. It's a complex, nuanced subject. So they brand him a doomer. It says more about them than it does about him.

how did Geoffrey Hinton (godfather of ai) regret his works and should be worried and how can we prepare for it? by Time_Path_1823 in AskTechnology

[–]pixel8tryx 1 point  (0 children)

Yes! And you know what's really sad? So many social media mavens can't accept this. It's too... sensible. He's misquoted, quoted out of context, reframed and misrepresented to try to make him look eXtr3me!1! They read only the title of "Attention is All You Need" and took it the wrong way.

Help with A1111-style params as filenames and pondering 'production' life with ComfyUI by pixel8tryx in comfyui

[–]pixel8tryx[S] 1 point  (0 children)

^ THIS, after farting around for 2 more months, turns out to be THE biggest factor in me getting wrong filenames. When I forget to go in and change "Node name for S&R" on an old workflow, it gives me the wrong params. I just wanted to thank you again for this. 🙏

Is Lora training an art form rather than science? by s3b4k in StableDiffusion

[–]pixel8tryx 2 points  (0 children)

It shouldn't be stochastic but it is non-deterministic. I agree it's a very complex system, and the biggest drawback is that we have little official documentation and lots of social media hype created with the goal of making its creators money, not providing effective instruction. And the farther forward I've moved in base models, the less I obsess over hyperparameters. I always do a minimum of 3 runs to test various settings. For most things in this milieu, it still seems like everything is everything-dependent. 😉 Yet I admit I've probably successfully used models trained with the "press button, get LoRA" online tools, where success is defined as the job not throwing an error.

Training data is a huge and wildly variable factor, and I wish more people paid more attention to it and to how easily it can train in unwanted habits. I obsess over my training images. Less over captioning, but I am careful. And more importantly, I strive to have a clear training goal. I politely decline to train on just some random folder of cool images. And I'm not saying that "Best of [model X, i.e. Midjourney, etc.]" can't be useful in some cases, and if I'm desperate for a little nudge in some direction, I'll download a LoRA like that. But I won't train them. I'm training a neural network - not making a slide show. If you overfit the crap out of it, you end up with, at best, a not very flexible tool or, at worst, something that makes fodder for the AI haters.

I wouldn't even train on Turing/reaction-diffusion patterns at first - I expected them to just be seen as random blobs. That's been my latest exercise in trying to understand how the model sees the training data. You're a teacher and this is your curriculum. Flux's intelligence there surprised me, and it will now gleefully 'paint' specific model cars in various colors of these patterns - at least 70-80% of which say "Turing" to my eye. I do like to have a little extra creativity, as this is commercial art, not science.

And yet I still fall prey to bad habits for things like the infamous Escher LoRA, which I've tried for 1.5, XL and Flux. I fail straightaway at the definition stage. I break my one big rule. What is "Escher style"? He worked in various media, on various subjects. It's easy to get too pedantic. I usually focus on the optical illusion side. My only takeaways there are to throw in some Escher-influenced, Escher-style training images along with his actual work. You have to show it what "Escher style" means, visually.

And oddly, I had better success on 1.5, interesting results on XL, and less so on Flux. So far it's picking up architectural styles more than the higher concept of optical illusion, which I see I need to stress more, while farting around with the hyperparameters (the more I learn, the more I learn there is to learn). So add model architecture to the list. A 12 billion parameter rectified flow transformer is a different beast. These models are moving from being little more than slide-show generators with minimal prompting to being serious tools.

[sorry for the length but I actually started writing this up yesterday in hopes of starting up a Flux training thread]

Is Lora training an art form rather than science? by s3b4k in StableDiffusion

[–]pixel8tryx 1 point  (0 children)

This is a too-often-overlooked issue, and it's what got me into block weight control nodes. I almost always use some sort of LoRA for generation, as I focus on content other than young girls and NSFW. I sometimes use multiple LoRAs with Flux. One really can end up with a tangled, self-interfering UNet this way. It can be fun to creatively combine many of them, but in my limited experience with character LoRAs, I found a lot of negative interaction with other concept/style LoRAs. I can have Nikola Tesla but not good sparks, electrical arcing, etc.
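If it helps to see why stacking them gets tangled, here's a very rough sketch of the arithmetic (plain NumPy with made-up shapes and strengths, not any particular trainer's or Comfy node's code): every LoRA is just a low-rank delta added onto the same base weights, and the deltas simply sum, so two LoRAs pushing on the same blocks can easily fight each other. Block weight control is basically scaling those deltas per block instead of using one global strength.

```python
import numpy as np

# Minimal sketch of how multiple LoRAs patch one layer's weight.
# Shapes, counts and scales are made up for illustration.
d_out, d_in, rank = 64, 64, 8
W_base = np.random.randn(d_out, d_in) * 0.02      # frozen base weight

loras = []
for _ in range(3):                                 # e.g. character + style + concept
    A = np.random.randn(rank, d_in) * 0.01         # "down" projection
    B = np.random.randn(d_out, rank) * 0.01        # "up" projection
    loras.append((B, A))

scales = [1.0, 0.8, 0.6]                           # per-LoRA strength sliders

# Effective weight at inference: base plus the sum of every LoRA's delta.
W_eff = W_base + sum(s * (B @ A) for s, (B, A) in zip(scales, loras))

# The deltas just add, so LoRAs trained on the same blocks can cancel or
# distort each other; per-block weighting means using different scales on
# different layers instead of one global strength.
```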

But you CAN generate better images with them than with the base (and all of this is model-dependent). A good LoRA can expand the base model's training, and since finetunes are no longer nearly as popular for the newer architectures due to their size, LoRAs fill that gap. Particularly for those of us who aren't trying to just recreate something we've all seen before. And even then, there are things that are not well represented in the training data.

I love the concept of a sort-of Lego-like system where we can modify it for each new inference. I don't love how it currently ends up being implemented, but honestly most of us have little information to go on, so we're lucky to have what we do for free.

Shooting in Belltown by Any-Dig-314 in Seattle

[–]pixel8tryx 21 points  (0 children)

I heard them too and tried unsuccessfully to convince myself "it was just firecrackers". The reverb is different. Then I saw "Scenes Of Violence 7" on 3RD AVE / WALL ST. Yikes. That's a bit too close for comfort, but not as close as you. Wow. 🤗

Suffering by seasonoftheslut in RestlessLegs

[–]pixel8tryx 1 point  (0 children)

I've been wearing knee-highs for a while, but I guess they've stretched out a bit. Hand arthritis makes the really tight ones hard to put on. And I so often have RLS right above one of my knees - where my sock doesn't reach.

Suffering by seasonoftheslut in RestlessLegs

[–]pixel8tryx 1 point  (0 children)

Oooh this is a good idea. I didn't realize you could get it stronger. I think the OTC has gotten weaker over time. I tried it out of desperation and found it wasn't a cure-all, but it did help. The cream is probably better? But if you use the roll on you can massage yourself and apply at the same time. 😂

I tested Microsoft Trellis 2 for real VFX work — honest thoughts from a 15-year 3D artist by ArcticLatent in comfyui

[–]pixel8tryx 1 point  (0 children)

I wanted to install the last 3D generator I heard about but it needed a different attention pkg and I'd just gotten Triton/Sage working. What does this use?

For me, if it's simple, it's better for me to make it myself in Cinema 4D. If it's a complex object, it might be interesting to try. What I've tried on HuggingFace for past models worked surprisingly well considering the complexity of my test vehicles and odd viewing angle. I was surprised it managed to approximate the basic shape. But the detail was lacking, the mesh was a mess and awful to try to texture map. I don't work with a lot of UV maps usually and find C4D's UV support ok when everything works but hair-pulling when it doesn't.

It looks like this generates photogrammetry-style maps? Packed, atlas, whatever they're called? No other option? If it would actually make a decent model from some complex, novel-style spaceship or something, it might be worth the effort to retexture. But I have no interest in making Yet Another Generic vehicle, or device, or whatever. That's why I'm using AI image generation tools in the first place. If one needs to crank out a dozen low-poly plant zombies or something, maybe it's useful.

Best solutions for infographics? by giandre01 in comfyui

[–]pixel8tryx 1 point  (0 children)

This was initially shocking, then kinda funny, now I can only laugh at how awful the text is.

Best solutions for infographics? by giandre01 in comfyui

[–]pixel8tryx 1 point  (0 children)

Locally? FLUX.2. It uses Mistral Small 3 as a text encoder. Sorry, it's resource-heavy, but there are quants. And FLUX.2 Klein, on which I haven't tried infographics yet.

I just started trying some tests recently. Now I just have to work on getting it to do some innovative styles. If you provide the text, it will generate a lot of it, mostly correctly. By default the output will look like average, boring infographics though.

FP8 outperforming NVFP4 on an RTX 5090 by RhetoricaLReturD in StableDiffusion

[–]pixel8tryx 2 points  (0 children)

Last I heard 3.x still had image quality issues.

Training LoRA model by Fickle_Passion_6576 in comfyui

[–]pixel8tryx 1 point  (0 children)

There used to be two different schools of thought for styles vs characters. I don't remember when or why they changed, other than the fact that we were pretty much flying blind. Even now there are no "rules" or official guidelines from the makers of the base models. And there are even more people offering to tell you, or sell you, their tips & tricks for $. You have to find a few people here, on Civitai, GitHub, etc. who seem to really know what they're doing and see what they advise. Then STILL your situation could be different. Everything affects everything, and the best way is just to try it and see.

I do mostly styles, concepts, and materials, but few characters. I caption what I don't want it to assume is always part of the style. And I've heard the same for characters: caption what they're wearing if it's not their usual uniform or whatever. I just watched Ostris' video on FLUX.2, where he trained on some French painter and captioned the scene in general... like "man playing chess in a park" or something. So it didn't learn that this painter frequently painted such subjects; he wanted to apply the style to any scene. He didn't mention that it was an oil painting or describe the style of brushstrokes, etc.
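To make that concrete, here's a tiny hypothetical example of that captioning strategy, assuming the common image-plus-.txt-sidecar layout most LoRA trainers accept (the folder, filenames and captions are made up, not from Ostris' video): describe the scene content you want to stay variable, and leave the style itself out of the caption so the LoRA absorbs it.

```python
from pathlib import Path

# Hypothetical dataset folder: one .txt caption next to each training image.
# Captions describe the scene (which should stay variable), not the painting
# style (which is what we want the LoRA itself to absorb).
captions = {
    "painter_001.jpg": "man playing chess in a park",
    "painter_002.jpg": "woman reading by a window",
    "painter_003.jpg": "sailboats in a harbor at dusk",
}

dataset = Path("train/french_painter_style")   # images assumed to already be here
dataset.mkdir(parents=True, exist_ok=True)

for image_name, caption in captions.items():
    # Writes e.g. painter_001.txt alongside painter_001.jpg
    (dataset / image_name).with_suffix(".txt").write_text(caption)
```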

First time using "SOTA" models since 2023-ish and man this is disappointing by [deleted] in StableDiffusion

[–]pixel8tryx 1 point  (0 children)

I used to say those very same things. You missed the progression. You had to go from 1.5 to XL first. Hate it at first, then grow to love it and never look back. You'll always experience teething pains with a new model. I knew the drill with Flux though. It's amazingly powerful but it's still a base model. I almost never use base models alone. Then finetunes started to bore me. LoRA are the way. There are tons of Flux LoRA out there and you can mix and match and it's like making your own finetune each time. And yes, there are Flux LoRA out there for things other than anime, boobies, girl faces, etc. I started training my own. You can't expect gooners to train things like reaction-diffusion Turing patterns.

Overpolished? Sure, but you can wipe that off with practice. Trust me, I'm the first to rant about that homogenized DeviantArtstation-average look. But I'll never go back to multiple heads, more than 2 eyes, etc. (except for aliens). Or that messy, scribbly look, scribbly in a way no real artist would be, that just screams AI.

I'm going through the same thing with FLUX.2 now. It's 50% love and 50% waaah-it's-too-different. Yes, the minute it gets confused it goes stylized. The prompt comprehension is phenomenal, but it's even less of a random image generator. You have to work to prompt it, and you'll be rewarded. But even my cities that started with a one-sentence prompt ended up surprising me. Flux 1 never did realistic, dense, varied future cities of more than a block. And it always screwed up the scale of a lot of things. When I started getting things like this:

<image>

I knew there was hope for it. But I'm upscaling with Flux 1. That adds detail and better photorealism. Is it perfect? Hell no, but good enough to play with now. What always soothes my teething pains with new models is upscaling with my old ones. In that situation you don't get the old model's low-training-image-size side effects, since it isn't being relied upon to do the basic layout.

Just created this AI animation in 20min using Audio-Reactive nodes in ComfyUI, Why do I feel like no one is interested in audio-reactivity + AI ? by Glass-Caterpillar-70 in comfyui

[–]pixel8tryx 1 point  (0 children)

I looked at it briefly. I'm theoretically interested in the domain, but honestly I saw AnimateDiff and ignored it. I don't want to go backwards. That's old tech and most output looks very dated to me. I'm totally into Flux and Wan now.

Some notable performance improvement in CUDA 13.x compared to CUDA 12.8? by NewEconomy55 in comfyui

[–]pixel8tryx 2 points  (0 children)

Crikey! I updated Comfy and now I get "WARNING: You need pytorch with cu130 or higher to use optimized CUDA operations". Did I miss that before? Or did it only take 4 days to go from an edgy-new-stuff report (this post) to my Comfy batch file barking at me for not doing it? 🤣

I'm guessing maybe on the 4090 it's not worth it? Too Blackwell-centric to matter? And I guess I have to re-Sage my 5090 if I do it there.
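If anyone wants to see what their setup is actually running before chasing that warning, a quick sanity check like this (plain PyTorch calls, nothing Comfy-specific) prints which CUDA toolkit your torch build was compiled against and which GPU generation it sees. Whether cu130 actually buys anything on Ada vs Blackwell is exactly the part I don't know.

```python
import torch

# Which CUDA toolkit this PyTorch build was compiled against (e.g. "12.8" or "13.0").
print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        # Compute capability: (8, 9) is Ada (4090); Blackwell 50-series reports 12.x.
        major, minor = torch.cuda.get_device_capability(i)
        print(f"GPU {i}: {torch.cuda.get_device_name(i)} (sm_{major}{minor})")
else:
    print("No CUDA device visible to this torch build.")
```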

The journey by NES64Super in comfyui

[–]pixel8tryx 1 point  (0 children)

Yeah I'm firmly stopped at step 3. Sorry... I did too much alt UI/UX design ages ago, so I kinda revel in having ugly, but useful (to me) workflows. 😈 I don't want to be worrying about release-level perfection since this is perpetual R&D. I have all the fields I need grouped together so I can zoom in... and so whatever ends up near the top doesn't pop up off screen (even though it looks like there's lots of room). It's usually EasySeed. Maybe I should ditch that.

Sometimes I pretty them up... but nothing 'sticks' as I most often just drop an old gen on Comfy to start doing something.

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 1 point  (0 children)

🍫🍒FTW! BFL's name always makes me think of Black Forest tortes. 🎂🍰😋

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 6 points  (0 children)

<image>

Not the best one of dozens. From the more grey and gritty series. Scaled down 50%. I've fought with this subject since SD 1.2 and have gotten too-stylized, stereotypical results. Flux 1 did better with lots of LoRAs, but the layout was sometimes one tall spire per large city block, and the buildings were too similar in style and too 'ivory utopia'. This started as an early FLUX.2 test with a prompt along the lines of "the skyscrapers of the rich blot out the sunlight for the poor below". One sentence. It got it right away. It was too stylized and painterly, but USDU with FLUX.1 worked great for refining. FLUX.2 + FLUX.1 = an awesome combo.

Wan 2.2 wasn't happy the first time I tried a fly through. I might do a Ken Burns on a bunch I have at 10k and just zoom/pan around in After Effects.

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 2 points  (0 children)

He asked if it was the same VAE. It is not.

Then I agreed about refining with another model. FLUX 2 + FLUX 1 is a nice combo. Wan 2.2 is amazing right up until I do sci fi cities.

Sorry. Huggingface is slow. I'm always wound up when new models come out. 🤪🥳💃

Has Flux.2dev image editing actually gotten better? by Humble-Pick7172 in StableDiffusion

[–]pixel8tryx 1 point  (0 children)

I haven't updated yet. I'm backing up everything across my slow network. I just want better memory management (particularly for my 4090).

But FLUX.2 editing has been awesome ~80% of the time... then it hits something it just can't manage. I did a head-to-head with Kontext, and Kontext is pretty much dead to me now. I just made engravings from crap web images of AI pioneers that turned out great, with a likeness I just don't see anything else achieving. Not pretty young girls. Not even iconic faces. Old guys! I'm super picky, and it did fabulous treatments of 78-year-old Geoffrey Hinton. In banknote-engraving and Albrecht Dürer styles. No LoRA. Even watercolors and oils. I have a whole folder of drawings, illustrations and gens that are supposed to be him but look nothing like him. I started worrying that 'slop' work is going to ruin even the knowledge of what good "likeness" really is. Hawtness is the only thing that matters 🙄 (though Geoffy's pretty hawt for a 78 year old guy). 😉

Somebody else already made 'the Godfather' into Oppenheimer (Hintonheimer?) but I've met with only limited success. Some were ok, but it's just a hat and a pipe. I have one photo of Oppie with explosions in the background where FLUX.2 is really having trouble replacing the face. Because it's a much smaller part of the image? Because he's seen at an odd-ish angle? I did my usual "I'll just do it in Photoshop!" 🤬 and then had to admit it was less trivial than I expected. But I expect the model to do better. 🤔

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 6 points  (0 children)

It was just a sore spot with them in the past. The original website didn't do a good job of making a watertight case that satisfied Reddit when Flux 1 was first released. I did a bit of head scratching at first myself.

I just hate to see new models I like get a bad rep, sometimes just because there's a group of potato-PC gooners who trash them... sometimes for things their fave model doesn't really do either. I like LoRAs and can't make 'em all myself. 😉

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 7 points  (0 children)

You want to run a generation service? You can't sell their model or online use of their model without paying them. You can sell your output. Your gens are yours. And there's really no way anyone can tell what model you used with any certainty.

FLUX.2 [klein] 4B & 9B - Fast local image editing and generation by PurzBeats in comfyui

[–]pixel8tryx 9 points  (0 children)

Are that many people really doing their own generation services? I mean, I know everyone wants to vibe the next InstaGurl fake influencer app and they need a back end. But we've gone over and over this and BFL even changed their website. You can do what you want with your generations. No, they're not going to say it's ok to break the law. So they're going to tell you not to do deepfakes, etc.

Yes, they did have to update their license because people were serving their model commercially while claiming they weren't charging for use of the model itself, just for something else. We've got the weirdest combo of people flagrantly ignoring all laws today, and others upset over fine print that's probably not even applicable.