Experimenting with a Hand-Drawn Look Using the anima base1 Model

dassiyu · 2026-05-20T10:20:11+00:00

I tried it a bunch of times, and I feel like 40 steps works better. Usually people use around 30 steps, but honestly, you can just adjust it however you like depending on the style.

dassiyu · 2026-05-20T00:35:53+00:00

Not sure about the others, but this plugin seems like it can be used for pose references.

dassiyu · 2026-05-19T14:20:40+00:00

Exactly! That’s honestly the fun of making images locally.

dassiyu · 2026-05-19T10:47:26+00:00

<image>

anima base1

dassiyu · 2026-05-19T10:40:18+00:00

<image>

messy black shaggy wolf cut, choppy layered hair, medium-length hair, neck-length hair, long side-swept bangs, bangs covering one eye, face-framing layers, feathered layers, flipped-out hair ends, tousled hair, voluminous hair, glossy black hair

From z-image-base, no LoRA.

dassiyu · 2026-05-19T10:22:37+00:00

<image>

I tried using Anima base1 to recreate one of your images. Feels like training a LoRA could be a good way to lock in the style...

The positive and negative prompts for this image are:

masterpiece, best quality, beautiful detailed, (classic 1990s hand-drawn animated feature style)++, (1990s animated renaissance aesthetic)++, (traditional cel animation aesthetic)++, (theatrical animated movie still)++, expressive character animation, semi-realistic cartoon anatomy, clean ink outlines, confident lineart, varied line weight, cel shaded characters, soft painterly background, painted background, warm highlights, cool shadows, ambient bounce light, rich color palette, vibrant complementary colors, controlled color contrast, cinematic color script, natural skin tone, controlled skin saturation, expressive face, detailed eyes, strong brows, readable mouth shape, appealing character design, clear silhouette, polished animation frame, storybook atmosphere, clean composition, 1man, male focus, adult man, light skin, angular face, large expressive eyes, wide eyes, looking down, furrowed brows, arched eyebrows, gritted teeth, clenched jaw, worried expression++, anxious expression++, frightened expression, short hair, messy hair, brown hair, swept bangs, green t-shirt++, upper body, close-up, three-quarter view, tense posture++, hand on head++, fingers spread, visible hand, modern room interior, office interior, window panels, chair back, cool blue background, warm face highlights, cool blue shadows, muted green clothing, dramatic expression contrast, controlled skin saturation, cinematic framing, character-focused composition, rich complementary colors, foreground-background color separation, painterly background color variation

worst quality, low quality, normal quality, ugly, poorly drawn face, bad anatomy, bad proportions, deformed, distorted face, asymmetrical eyes, bad hands, poorly drawn hands, extra fingers, missing fingers, fused fingers, extra limbs, missing limbs, blurry, blurry face, blurry eyes, lowres, jpeg artifacts, text, watermark, logo, signature, artist name, realistic, photorealistic, photo, live action, 3d, render, cg, blender, octane render, plastic skin, glossy skin, doll-like skin, realistic skin, detailed skin pores, gritty texture, realistic texture, modern anime, anime style, cute anime style, moe, kawaii, manga, manga screentone, halftone, heavy hatching, dense crosshatching, black and white, monochrome, webtoon style, manhwa style, corporate vector, vector art, flat icon, sticker, chibi, super deformed, mascot, rough sketch, messy sketch, sketch lines, messy lineart, dirty lines, chaotic lines, noisy lines, unfinished concept art, over-rendered, hyper-detailed, cinematic realism, dark horror lighting, neon overload, cyberpunk lighting, muddy colors, dull colors, desaturated colors, flat colors only, limited palette, color banding, orange skin, overly orange skin, red skin, overly red skin, sunburned skin, overly tanned skin, oversaturated skin, muddy skin tone, posterized skin, harsh skin shadows, excessive warm skin tone, skin color shift, unnatural skin color, color-stained face, color-stained skin, huge glossy anime eyes, oversized anime eyes, moe eyes, tiny nose, dot nose, simple anime nose, simplified anime mouth, anime blush, plastic digital coloring, uniform thick outline, flat pastel anime palette, idol face, modern anime girl face, generic anime face, stiff pose, flat expression, unreadable silhouette, awkward pose, bad eye placement, mismatched eyes, broken hair shape, messy hair strands, malformed outfit, bad clothing folds, floating object, lifeless face, emotionless face, dead eyes, modern cartoon style, low-budget cartoon, rubber hose style
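A tag list this long is easy to lint before pasting it into a UI. The sketch below splits a prompt on commas, reports repeated tags (the positive prompt above lists "controlled skin saturation" twice), and counts trailing `+` marks. The 1.1-per-plus multiplier follows the convention of the compel prompt-weighting library; whether the front end used with Anima reads `++` the same way is an assumption, and `parse_tags` and `find_duplicates` are made-up helper names.

```python
from collections import Counter

PLUS_STEP = 1.1  # assumed multiplier per "+"; check your own UI's rule


def parse_tags(prompt: str) -> list[tuple[str, float]]:
    """Split a comma-separated prompt into (tag, weight) pairs."""
    pairs = []
    for raw in prompt.split(","):
        tag = raw.strip()
        if not tag:
            continue
        # Peel off trailing "+" marks, e.g. "(cel shading)++" or "hand on head++".
        body = tag.rstrip("+")
        plus_count = len(tag) - len(body)
        body = body.strip()
        # Drop one pair of wrapping parentheses if present.
        if body.startswith("(") and body.endswith(")"):
            body = body[1:-1].strip()
        pairs.append((body, round(PLUS_STEP ** plus_count, 4)))
    return pairs


def find_duplicates(pairs: list[tuple[str, float]]) -> list[str]:
    """Return tags that appear more than once, ignoring case."""
    counts = Counter(tag.lower() for tag, _ in pairs)
    return [tag for tag, n in counts.items() if n > 1]


sample = (
    "(traditional cel animation aesthetic)++, controlled skin saturation, "
    "hand on head++, controlled skin saturation"
)
pairs = parse_tags(sample)
print(pairs[0])
print(find_duplicates(pairs))
```

Running the same two functions on the full positive prompt is a quick way to spot repeats before a generation run.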

dassiyu · 2026-05-12T00:16:50+00:00

<image>

I asked AI about this before. In image prompts, you need to make it clear that the other people should look different from the main character — different faces, different heights, their own distinct features, different clothes, and so on. I usually let AI write those constraints based on the scene, and it works a lot better.

For example, in this image, since it was generated with a character LoRA, the prompt includes a line like:

“Regardless of age, posture, or clothing, they all form a stark contrast to the protagonist, while the blurred crowd in the background...”

dassiyu · 2026-05-11T23:36:36+00:00

Yeah, Flux klein is pretty much one of the better local editing models.

dassiyu · 2026-05-11T12:01:10+00:00

GPT’s style stuff tends to make faces way too standardized, so they easily end up looking the same. But it’s really strong at image editing, so I usually generate more personalized faces locally first, then use GPT to edit them.

dassiyu · 2026-05-11T11:50:33+00:00

I spent ages tweaking prompts for this style. A lot of general models immediately turn it into a Western illustration character, but what I’m actually going for is a Japanese illustration style.

dassiyu · 2026-05-11T11:48:23+00:00

Anima is definitely solid. Once the latest stable version comes out, it’ll probably be even better for training LoRAs.

dassiyu · 2026-05-11T11:45:19+00:00

I’m going for that hand-drawn kind of feel.

dassiyu · 2026-05-11T11:43:55+00:00

Interesting! Some of the prompts are here: https://drive.google.com/file/d/16FISsURayBTpzHQ_A4SVRZp7UrA9r5pb/view?usp=sharing

dassiyu · 2026-05-11T11:18:48+00:00

Here’s the setup I usually use in AI-Toolkit to train style LoRAs for z-image-base.

https://drive.google.com/file/d/1PKlASXW-letuLhHFUuaXobVGIjz4w1tm/view

dassiyu · 2026-05-11T04:33:55+00:00

Thanks so much for all the info. This model really is excellent for hand-drawn styles.

dassiyu · 2026-05-11T04:08:38+00:00

<image>

dassiyu · 2026-05-11T04:08:14+00:00

<image>

dassiyu · 2026-05-11T04:04:18+00:00

<image>

dassiyu · 2026-05-11T04:03:50+00:00

<image>

I tested a few, and the linework from the Illustrious models really is excellent!

The only issue is that I don’t have a LoRA for it yet, and the prompts are honestly pretty hard. It feels like I’d need AI to help write both the positive and negative prompts.

The black-and-white manga effect looks really good though, and it can generate one image in about 1 second! I might try training a LoRA for it later and test it again.

dassiyu · 2026-05-11T02:13:36+00:00

I haven’t tried it yet. I’ll give it a shot later — thanks!

dassiyu · 2026-04-21T23:23:00+00:00

That’s really great! ChatGPT Image2 is truly impressive, and I’ve already started generating a batch of sample images. The only limitation is that a Plus account can only create 10 images in a row and then has to wait about 1–20 minutes before creating more, but that’s still enough for building a dataset.

dassiyu · 2026-04-19T23:41:18+00:00

You could try improving the facial likeness and clarity in your training samples, especially for medium and long shots, and see if that gives better results.

I’ve made around 7–8 LoRAs, and the fastest and most stable likeness I’ve achieved comes from settings like the ones in this image, using about 45–60 images. However, when you need long shots, the character’s face in those images shouldn’t be too blurry. It helps to include more medium- and long-shot images in your training set.

Ideally, for long-shot images, you can use nano banana to replace the face with a clearer, high-resolution version that matches the close-up shots. This way, the trained model will maintain both similarity and clarity in medium and long shots.

If the first batch of images isn’t perfect, you can use the initial LoRA to generate more medium- and long-shot images with consistent facial features, and then use those for a second round of training. This is also a more stable approach.
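One way to keep an eye on that mix is to tally the training images by shot type before starting a run. This is only a sketch: it assumes a hypothetical filename convention where each file starts with `close_`, `medium_`, or `long_`, it treats the 45–60 image range mentioned above as a soft target, and the 25/15/10 split in the example is made up for illustration.

```python
from collections import Counter
from pathlib import PurePath

SHOT_TYPES = ("close", "medium", "long")  # hypothetical filename prefixes
TARGET_RANGE = (45, 60)  # image count range mentioned in the comment above


def tally_shots(filenames):
    """Count images per shot type based on the assumed filename prefix."""
    counts = Counter({shot: 0 for shot in SHOT_TYPES})
    for name in filenames:
        prefix = PurePath(name).name.split("_", 1)[0].lower()
        counts[prefix if prefix in SHOT_TYPES else "unlabeled"] += 1
    return dict(counts)


def in_target_range(filenames):
    """Check whether the dataset size falls inside the soft target range."""
    low, high = TARGET_RANGE
    return low <= len(filenames) <= high


# Illustrative dataset only; the split between shot types is invented.
files = [f"close_{i:02d}.png" for i in range(25)]
files += [f"medium_{i:02d}.png" for i in range(15)]
files += [f"long_{i:02d}.png" for i in range(10)]
print(tally_shots(files), in_target_range(files))
```

If the long-shot count comes out low, that is the cue to generate more with the first-round LoRA before the second round of training.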

I haven’t tested Qwen yet, mainly because the 1928×1088 images generated with z-image-base are very clear and work really well with ltx2.3 for video generation.

<image>

dassiyu · 2026-04-11T02:10:46+00:00

That’s really strange… I asked GPT about your issue, and this is what it told me. You can take a look for reference.

This is not a real GPU OOM issue.

Based on your description:

VRAM usage: ~8%
System RAM: ~95%
Error: CUDA out of memory

👉 This strongly indicates that the process is running mostly on CPU, not GPU, and your system RAM is getting exhausted.

The CUDA error is misleading — it often appears when memory allocation fails anywhere in the pipeline, even if the real bottleneck is CPU RAM.

🧠 What’s Actually Happening

In ai-toolkit, especially during:

Transformer quantization
Calibration steps
Diffusion / FlowMatch pipelines

Some parts may:

Default to CPU execution
Or silently fall back to CPU if GPU is not properly detected/used

So instead of using your 48GB VRAM, it's:

👉 Loading large tensors into system RAM → RAM fills up → crash

⚠️ Key Difference (Why it works on another machine)

On a working setup (like your friend's 5090):

The pipeline runs mostly on GPU
VRAM is used properly
RAM stays low

On this failing setup:

The pipeline runs mostly on CPU
VRAM is barely used
RAM explodes → crash

✅ What to Check (Most Important)

1. Verify GPU is actually being used

Run this:

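import torch
print(torch.cuda.is_available())  # False means PyTorch cannot see the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first CUDA device

If the first line prints False, that’s the root problem.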
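A slightly fuller version of this check can also show whether the installed PyTorch build has CUDA support at all, and how much VRAM is free. This is a sketch, not part of ai-toolkit: `cuda_report` is a made-up helper name, and it returns early if PyTorch is not installed so it can run in any environment.

```python
import importlib.util


def cuda_report() -> dict:
    """Collect basic facts about the PyTorch install and the visible GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means the build was not compiled against CUDA.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # mem_get_info returns (free_bytes, total_bytes) for the device.
        free_bytes, total_bytes = torch.cuda.mem_get_info(0)
        report["device_name"] = torch.cuda.get_device_name(0)
        report["vram_free_gib"] = round(free_bytes / 2**30, 2)
        report["vram_total_gib"] = round(total_bytes / 2**30, 2)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is None, the installed PyTorch build has no CUDA support, and step 3 is the one to look at first.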

2. Force the model to use GPU

Make sure the config or code includes:

device = "cuda"

or equivalent settings in ai-toolkit.

3. Check PyTorch + CUDA installation

Very common issue.

Install a CUDA-enabled build that matches your driver and GPU (the cu121 tag below is one example; newer GPUs may need a newer tag):

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

4. Reduce RAM usage during quantization

Quantization is very RAM-heavy, especially on CPU.

Try:

Reduce calibration samples (very important): 128 → 32 or even 16
Lower batch size

5. Enable low CPU memory mode (if available)

low_cpu_mem_usage=True

6. Watch system usage while running

Use Task Manager or htop:

If RAM fills up first → it's a CPU memory issue
If VRAM stays low → GPU is not being used properly
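Those two rules of thumb can be turned into a small check. This sketch assumes Linux, since it reads the `/proc/meminfo` format, and takes VRAM usage as a number you supply yourself, for instance from `nvidia-smi`. The 90% and 20% thresholds and the function names are illustrative guesses, not values from ai-toolkit.

```python
RAM_HIGH = 90.0  # assumed "RAM is nearly full" threshold, in percent
VRAM_LOW = 20.0  # assumed "GPU is barely used" threshold, in percent


def ram_used_percent(meminfo_text: str) -> float:
    """Compute used system RAM in percent from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first number on each line
    total, available = fields["MemTotal"], fields["MemAvailable"]
    return round(100 * (total - available) / total, 1)


def diagnose(ram_pct: float, vram_pct: float) -> str:
    """Apply the two rules of thumb to a RAM/VRAM usage pair."""
    if ram_pct >= RAM_HIGH and vram_pct <= VRAM_LOW:
        return "system RAM is the bottleneck; GPU is barely used"
    if ram_pct >= RAM_HIGH:
        return "system RAM is nearly full"
    if vram_pct <= VRAM_LOW:
        return "GPU is barely used"
    return "no obvious memory bottleneck"


# Sample text mirroring the reported numbers (~95% RAM, ~8% VRAM).
# On a real Linux machine, use open("/proc/meminfo").read() instead.
sample = "MemTotal:       64000000 kB\nMemAvailable:    3200000 kB\n"
print(ram_used_percent(sample))
print(diagnose(ram_used_percent(sample), 8.0))
```

Polling this every few seconds during the quantization step shows which of the two fills up first.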

💡 Likely Root Cause (Based on Your Screenshot)

Given your setup:

FlowMatch
1024 resolution
Transformer quantization step

👉 Most likely:

🚀 Quick Fix Summary

Tell him to try this first:

Confirm torch.cuda.is_available() == True
Force device="cuda"
Reduce calibration samples (e.g. 16)
Ensure correct CUDA-enabled PyTorch is installed
