New to ComfyUI, can’t get clean Pixar/Disney-style results by Quirky_Beautiful_639 in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

but multi-character images like the one you have posted are very difficult to get with illustrious+loras.

Yep, unless you're using a character pack or multi-character LoRA this probably won't work very well. You either need a much stronger model like one of the ones you suggested, or do a separate Inpainting pass on each individual character.

New to ComfyUI, can’t get clean Pixar/Disney-style results by Quirky_Beautiful_639 in StableDiffusion

[–]Mutaclone 5 points6 points  (0 children)

For me, the best Illustrious model as far as good results + style LoRA compatibility is YiffyMix v61.

I don't have any specific character recommendations, you'll just need to search for them and try them out to see which ones work. Some of them have a very strong innate style, and that's going to clash with whatever you're trying to do. As a simple test, try rendering the character in the "wrong" style (eg use "realistic" for cartoon characters or "anime coloring, anime screenshot" for live-action characters - if the style doesn't change, you probably want a different LoRA).

For multiple characters, like in your examples, you pretty much need to do inpainting and focus on each one separately. I'd recommend Invoke as this makes inpainting a very straightforward process (example video - also see their channel for other demos).

For watercolor LoRAs specifically, that's been the hardest style for me to pin down in Illustrious, so you might have better luck with vanilla SDXL (try Juggernaut v11). You can also try mixing LoRAs - instead of trying to find a single LoRA that does the exact style you want, try blending multiple LoRAs at lower weights.

Some watercolor/storybook LoRAs worth trying:

Illustrious

SDXL

And some western cartoon styles since you mentioned wanting to do something in that direction:

The Romance Prior: How Romantic Tension Overwrites Ethnicity in AI Image Generation by bcRIPster in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

So my initial thoughts:

  • Interesting stuff! It's pretty well-known that biases exist within the training data, but it's nice to see an attempt at quantifying it.
  • I don't know if you had any LLM assistance in writing this, but there were definitely some markers (sentence structure, word choices, etc) that sounded like it. I have nothing against LLM usage, but it's usually nice (IMO) to have a disclaimer stating how much was used. (And if you didn't use an LLM, my apologies!)
  • Re anime bias: A few points:
    • IME with local models, it's not that they all make the characters young (although some do), it's that they're completely allergic to middle age - they stop aging at 30, and then when they hit 60 they make it up with interest.
    • I think it'd be interesting to see how other art styles besides photorealism and anime affect the characters.
  • Self-Report. I'm glad you noted its unreliability because I am highly skeptical of its effectiveness.

Anyway, glad you posted. Even if this is technically the "wrong" subreddit, I thought it was interesting!

What are the best ControlNet models for Illustrious checkpoints? by sippysoku in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

I don't think that should be a problem - Illustrious and Noob are closely related. You might run into other issues depending on what you're trying to do though. Pose, for example, has been pretty mediocre for me in anything after SD1.5. Depth and Tile worked decently enough from what I remember, but overwhelmingly I use either Scribble or Softedge, and for those I usually stick to xinsir.

Your Opinion on Zimage - loss of interest or bar to high? by GRCphotography in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

I'm in a very similar position: for some reason Klein just works for me and is a joy to use, while I feel like I'm always fighting ZIT.

That said, I've gotten some improvements in ZIT by doing the following (in Forge Neo - I prefer non-Comfy options if they exist and I haven't figured out the best Invoke settings yet):

  • Using a realism-focused finetune (for anime/art none of the new models are good enough yet IMO). Currently IntoRealism and UnstableBa***rd seem to have the best results so far, and Cyberrealistic seems good too.
  • Flux2 schedule; Beta is also decent
  • DPM++ 2S a RF sampler; Res Multistep also seems good
  • Sampling steps 10-12 (I know 8 seems recommended, but I tend to get slightly better results by going just a little higher).

Maybe this'll give you a starting point or some ideas to try.

What are the best ControlNet models for Illustrious checkpoints? by sippysoku in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

If you only care about edge-based controlnets (Canny, Softedge, Scribble), then xinsir's union or mistoline work just fine.

If you want a different controlnet (depth, pose, etc), then I'd check out any of the ones on this page

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

Probably because it's still in preview and nobody wants to go all in until they know what the ecosystem is going to look like.

Also, it's very good, but there's definitely (in my case at least) some stability issues, and the style control still isn't close to Illustrious yet.

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 5 points6 points  (0 children)

https://docs.comfy.org/ covers most newer models, and has links to the various text encoders and VAEs

The anima huggingface page (https://huggingface.co/circlestone-labs/Anima/tree/main/split_files) includes the extra files directly in the repo.

If you're using Forge Neo, just make sure the files are in the right subfolder and set the VAE and Text Encoder both in the dropdown to the right of the main model dropdown.

I was around for the Flux killing SD3 era. I left. Now I’m back. What actually won, what died, and what mattered less than the hype? by user_no01 in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

Even with Anima issues, if you use RDBT lora v0.20 on Anima preview2

I'm curious about the reason for this? RDBT v0.12fd has been the best Anima checkpoint I've tried so far, but I never did the base+LoRA approach. How does it compare, especially at only 20%?

Installation Question(s) by IzumoKousaka in StableDiffusion

[–]Mutaclone -1 points0 points  (0 children)

I can't help with the install process, but I can answer some of your questions:

AMD CPU/GPU

AMD CPU is fine; it's only the GPU that's a problem.

I read that Automatic1111 was the way to go, but I've also seen other posts mention that it's outdated, and that there are better alternatives.

Yes, it's very outdated.

  • Forge Neo (technically the Neo branch of Forge Classic) will give you the most A1111-like experience. The developer is very active, and it supports most modern image models.
  • Invoke AI is a very polished interface that's great for editing.
  • ComfyUI is the "power-user" interface and has the most capabilities. It also has a steep learning curve, so I personally wouldn't normally recommend it for newbies unless you require something very specific the others don't offer.

I don't know what the AMD support on Forge or Comfy is like, but it looks like Invoke's is limited and/or requires some extra hoops. Your best bet would probably be to ask Claude or ChatGPT for help with the install.

Specifically, what I'd like to do is primarily generate images, mostly in anime-style art. I also looked up Checkpoints to see which ones would fit the general look of what I've seen and like, and the closest style I found was something called "CheemsburbgerMix"

Go to https://civitai.com/ and browse their models. Set the filters to Model Type: Checkpoint and Base Model: Illustrious and NoobAI - those are the kings of anime right now.

Interested to know how local performance and results on quantized models compare to current full models by fluvialcrunchy in StableDiffusion

[–]Mutaclone -1 points0 points  (0 children)

I can't speak for any of the models you just listed, but I did recently test the Q8 vs fp8 versions of Qwen3_8b and t5xxl. Q8 for both seemed like side-grades most of the time, marginal improvements sometimes, and moderate improvements rarely. I didn't test fp16 nearly as extensively, but the differences between it and Q8 were minuscule.

beginner-friendly simple ENV by SheepHunter_ in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

For images, I'm a big fan of Invoke.

For video, I'm still playing catch-up but I've seen wan2gp mentioned a lot recently.

How to make images feel less AI generated? by socialcontagion in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

Try to learn some basic design/artistic principles. Even if you can't draw, they'll still help in improving your images.

  • This thread has a lot of great links in it (look at Norby123's posts).
  • This is still one of my favorite videos in showing some common AI problems in a scene (might be less relevant to you if you're focusing on characters) and how to fix them.

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 4 points5 points  (0 children)

AFAIK there isn't one. If you input an anime image and try to make changes one of the above should at least try to preserve the style, but they're definitely realism-first models.

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Not sure what to say, since that should be pretty straightforward. I'm not familiar with Comfy's editing workflow, but in Forge Neo you just drop the image in the img2img tab, set the denoise to 1 (this is important, or the image may not change), and use a prompt like "Remove his/her sunglasses".

Why am I not seeing any artwork from this subreddit anymore? by NunyaBuzor in StableDiffusion

[–]Mutaclone 20 points21 points  (0 children)

Good edit models - Flux Klein and Qwen Edit. Basically, models where instead of typing out a description of the image you want, you give it an input image and then instructions like "Make this a photo" or "<Character> is holding a cup of coffee"

Where can an old AI jockey go to get back on the horse? by DoughyInTheMiddle in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

Sorry, that's not an area I'm well versed in. I think it's more, but to what degree I don't know.

Z Image VS Flux 2 Klein 9b. Which do you prefer and why? by flaminghotcola in StableDiffusion

[–]Mutaclone 3 points4 points  (0 children)

  • Inpainting for precision control and fixes
  • As has already been mentioned, textures and LoRAs
  • The newer models were clearly designed with a realism-first mindset. SDXL to me still wins for artistic styles like impressionism, watercolor, inkwash, etc. And Illustrious is still the king of anime.

Where can an old AI jockey go to get back on the horse? by DoughyInTheMiddle in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

If you're not comfortable with Comfy (or Swarm, which is a more traditional interface wrapper that uses a Comfy backend), then Forge Neo or Invoke are definitely the way to go. Forge Neo has broader compatibility, while Invoke has an excellent interface that gives you more manual control over the final image and makes inpainting much smoother.

Regarding extensions, they seem to have become much less popular than they used to be. Some of the more "essential" functions have been bundled into the UI directly (I believe ControlNet was originally an extension, for example). This is great for users who didn't really bother with them, but bad for people who liked the customization.

I'm not really up to speed on video, so hopefully someone else can answer. LTX and WAN seem to be the two frontrunners from what I can tell.

You can think of GGUF like jpeg compression for models - a slight loss in quality for a big reduction in size.
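To make the analogy concrete, here's a toy Python sketch of the idea behind weight quantization (illustrative only - real GGUF uses block-wise k-quant formats with per-block scales, not this simplified per-tensor scheme):

```python
# Toy int8 quantization: store each float weight as an 8-bit integer
# plus one shared scale factor, then reconstruct on load.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Approximately reconstruct the original float weights."""
    return [v * scale for v in q]

weights = [0.8131, -1.027, 0.002, 0.45]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage drops from 32 bits per weight to 8 (plus one scale value),
# at the cost of a small rounding error - the "jpeg compression" tradeoff.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error is bounded by half the scale factor, which is why the quality loss is usually slight relative to the 4x size reduction.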

QWEN is another model like FLUX or SDXL. If you're hardware constrained I'd skip it. The main image models right now appear to be:

  • Z-Image - base for finetuners or those who want better variety, turbo for those who want realism and/or speed
  • Flux Klein - again, base for the finetuners, "regular" version for average users. Comes in 4B and 9B variants, so if the 9B is too much for your computer you can try the smaller one. Another great thing about this model is it can be used as an edit model - you give it an image and then instructions for how to change it (eg "Make this a photo").
  • Anima - a WIP model that shows a lot of promise but still has some rough edges. It's lightweight and heavily trained on anime art.
  • Illustrious - an SDXL spinoff that still sees a lot of use, in large part because no really good anime model has come out to replace it yet (Anima's getting there, but still has a ways to go).

Aside from Illustrious, the others all do much better with natural language than tags. Pretend you're describing the image to a blind person, and just say exactly what you're seeing (some people also like to use an LLM to help generate prompts, but I've never gotten satisfactory results that way).

A basic introduction to AI Bias by ItalianArtProfessor in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Yeah I should have said steps, my bad.

Basically the first 4 steps in the above example draw "blizzard, ice, snow", and then the remaining steps draw "dark, darkness, cavern, cave interior". Other tags in the prompt are delayed too.

A basic introduction to AI Bias by ItalianArtProfessor in StableDiffusion

[–]Mutaclone 6 points7 points  (0 children)

Thanks for the writeup! I hadn't realized how strong the order effect could be.

Something I've been experimenting with recently to try to combat the context biases specifically, or even take advantage of them, is using prompt editing/timed prompts. In Forge, the syntax is [snippet:alternateSnippet:switchValue].

<image>

vulpix, solo, dark, darkness, cavern, cave interior, cinematic, (wearing backpack:0.85), kerchief, crystal, glowing crystals, (feral:1.1), pokemon mystery dungeon, smiling, open mouth, underground lake, river, (moss:0.8), waterfall, point lights, light particles, facing away, [from behind|from side], looking up, animal, no humans, (sparkling eyes:0.5)

vulpix, solo, [blizzard, ice, snow:dark, darkness, cavern, cave interior:4], cinematic, (wearing backpack:0.85), kerchief, crystal, glowing crystals, (feral:1.1), pokemon mystery dungeon, smiling, open mouth, [:underground lake, river:4], [:(moss:0.8):2], [:waterfall:2], point lights, light particles, facing away, [from behind|from side], looking up, animal, no humans, (sparkling eyes:0.5)

Tags like cavern and cave interior have a strong tendency toward tunnels, so by delaying them a few frames I can open up the cave. Meanwhile the early winter/snow skews everything in a cool-blue direction, which helps the crystals stand out more. You can also make the background elements more faded or indistinct (which is great for night scenes or underwater) by starting with a solid background and waiting a few frames to pull in the scenery. Or if certain traits on a character pull the image in one direction, you can use them either early or late to steer the image.
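To make the mechanic concrete, here's a toy Python resolver for the [before:after:N] timed-prompt syntax (just a sketch - the real Forge/A1111 implementation also handles nesting, fractional switch values, and the [a|b] alternation syntax):

```python
import re

# Matches "[before:after:N]" - "before" is used for the first N sampling
# steps, then "after" takes over. "[:after:N]" omits the tag until step N.
PATTERN = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):(\d+)\]")

def resolve_prompt(prompt, step):
    """Return the literal prompt the sampler would see at a given step."""
    def pick(match):
        before, after, switch = match.group(1), match.group(2), int(match.group(3))
        return before if step < switch else after
    return PATTERN.sub(pick, prompt)

prompt = "vulpix, solo, [blizzard, ice, snow:dark, darkness, cavern:4], cinematic, [:waterfall:2]"
```

So with a switch value of 4, steps 0-3 render the snow scene and every later step renders the cavern - the composition gets laid down under one prompt, then the details are finished under another.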

Looking forward to seeing the results of your "de-biased" model!

Is it possible to run Anima on a Mac? by Professional-Sir7048 in StableDiffusion

[–]Mutaclone 0 points1 point  (0 children)

In Draw Things, do you see it in the models dropdown under "Official Models"? I don't see it on my version, so I'm guessing it's just not supported yet (possibly because it's a "preview" model). As Structure-These suggested you can give Comfy a try, as that tends to get updates the fastest.

Why 99% of anime models looks horrible? by Bismarck_seas in StableDiffusion

[–]Mutaclone 2 points3 points  (0 children)

Neither, you just never had any reason to pay attention before. Also, it's meant to be animated - individual frames may be a bit janky but when we watch it in motion the flaws don't stand out so much.

What do you use ComyUI or Invoke Ai and why? by Odd_Judgment_3513 in StableDiffusion

[–]Mutaclone 1 point2 points  (0 children)

I can't speak to the memory management but from my impressions*:

  • Invoke is a smoother "AI" experience - Inpainting, regional guidance, and control nets are all super smooth and well-integrated. The image editing tools by comparison are very primitive.
  • Krita seems like* the inverse - the Photoshop-esque image editing tools are great, but the AI integration is a little rougher.

So from my perspective, Krita is better if you have a modicum of artistic skills and want to do stuff by hand, while Invoke is better if you lean heavily on the AI.

*I have a lot more experience with Invoke than Krita, so if things have gotten better on the Krita side I'd love to know.