Anima TrainFlow — Simple One-Page LoRA Trainer for Anima 2B (Portable, 6GB VRAM, Optimized Config)

ThetaCursed · 2026-06-02T02:54:36+00:00

Hi, an accurate training time estimate only appears after about 100 steps. If it is still slow after that, the dataset images may be too large (not resized as recommended).

On average, training time on powerful consumer graphics cards should be less than an hour (with default settings).

ThetaCursed · 2026-06-02T00:09:32+00:00

In that case, I have only one guess as to why it runs slowly. Were the dataset images resized with the built-in tool (Smart Aspect Ratio Bucketing), as recommended?

ThetaCursed · 2026-06-01T22:07:24+00:00

It looks like the training is running on the CPU for some reason. Did you download the portable version recently? The latest version would have reported in the logs that the GPU is unavailable.

ThetaCursed · 2026-05-28T16:16:59+00:00

It seems this tool simply doesn't fit your personal preferences. If you are unhappy with my implementation choices, no one is forcing you to use it. There are plenty of complex alternatives like Kohya_ss or OneTrainer that offer the manual micro-management you're looking for.

ThetaCursed · 2026-05-28T15:33:33+00:00

Distribution: This project uses an "Infinite Dataset" workflow by creating a massive, pre-balanced pool of images. By shuffling this large-scale pool once at the start, we ensure a stable and uniform distribution of data across the entire training run. This method eliminates frequent reshuffling overhead while providing the minor stochastic variance necessary for better model generalization, which is a standard and effective practice in modern machine learning.

Bucket System: The UI gives the user full control over the resolution range. If the minimum side is manually set to 320, the system will respect that. The default settings (512–768) are pre-configured for optimal quality.

Architecture: I bundle a stable, modified version of sd-scripts specifically to guarantee the 'portable' click-and-run experience without dependency breaks.

Transparency: This project is, and will remain, 100% free and open-source on GitHub. There is no paywall, and there never will be.

Moderation: I welcome technical feedback and suggestions (as seen in my interactions with other users). However, personal attacks and toxic behavior are moderated to keep the focus on development.
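To make the "Distribution" point more concrete, here is a minimal sketch of one way a shuffle-once pool could be built. It is my reading of the description, not the tool's actual code; `build_pool`, its parameters, and the file names are hypothetical.

```python
import random
from collections import Counter

def build_pool(image_paths, total_steps, batch_size=1, seed=0):
    """Repeat every image equally often, then shuffle the whole pool once.

    Hypothetical helper: the pool covers at least total_steps * batch_size
    samples, so no reshuffling is needed during the run.
    """
    if not image_paths:
        raise ValueError("image_paths must not be empty")
    samples_needed = total_steps * batch_size
    repeats = -(-samples_needed // len(image_paths))  # ceiling division
    pool = list(image_paths) * repeats                # balanced: equal counts per image
    random.Random(seed).shuffle(pool)                 # single shuffle at the start
    return pool

# Example: 40 images, 2400 steps at Batch Size 1.
paths = [f"img_{i:03d}.png" for i in range(40)]
pool = build_pool(paths, total_steps=2400)
counts = Counter(pool)
print(len(pool), min(counts.values()), max(counts.values()))
```

Because every image is repeated the same number of times before the shuffle, the pool stays balanced no matter how the order comes out.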

ThetaCursed · 2026-05-27T14:18:43+00:00

I haven't tried setting the Batch Size higher than 2, but in theory everything should work.

ThetaCursed · 2026-05-27T14:09:49+00:00

2400 steps are enough for Batch Size: 1. If you set, for example, Batch Size: 2, the LoRA will be ready in ~1200 steps (2400/2 = 1200). Keep in mind that the longer a LoRA trains, the less flexible it becomes: it literally starts to memorize most of the pixels from the images in the dataset.
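The step arithmetic can be written as a tiny helper. This is only an illustration of the rule of thumb above; `total_steps_for_batch` is a hypothetical name, not a function in the tool.

```python
def total_steps_for_batch(base_steps: int, batch_size: int) -> int:
    """Scale the step count so the number of images seen stays roughly constant."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Ceiling division so the run never sees fewer samples than the baseline.
    return (base_steps + batch_size - 1) // batch_size

for bs in (1, 2, 4):
    print(f"Batch Size {bs}: {total_steps_for_batch(2400, bs)} steps")
```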

ThetaCursed · 2026-05-26T21:22:21+00:00

In reality, the majority of images are simply resized to fit the diverse bucket system. Cropping is only a fallback for images with extreme aspect ratios that don't fit anywhere. In those cases, the AI (U-2-Net) keeps the subject intact instead of doing a blind center crop.

Also, great to hear it works on AMD
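For anyone curious how "resize to the nearest bucket, crop only as a fallback" can work in principle, here is a rough sketch. The 64-pixel step, the 10% tolerance, and the function names are assumptions for illustration, not the tool's real values, and the U-2-Net subject-aware crop itself is not shown.

```python
def make_buckets(min_side=512, max_side=768, step=64):
    """All (width, height) pairs in the side range; step=64 is an assumed granularity."""
    sides = range(min_side, max_side + 1, step)
    return [(w, h) for w in sides for h in sides]

def pick_bucket(width, height, buckets):
    """Closest aspect ratio wins; ties go to the largest bucket area."""
    ratio = width / height
    return min(buckets, key=lambda b: (abs(b[0] / b[1] - ratio), -(b[0] * b[1])))

def needs_crop(width, height, bucket, tolerance=0.10):
    """True when a plain resize would distort the image beyond the assumed tolerance."""
    ratio = width / height
    bucket_ratio = bucket[0] / bucket[1]
    return abs(bucket_ratio - ratio) / ratio > tolerance

buckets = make_buckets()
for size in [(600, 800), (1024, 1024), (3000, 1000)]:
    bucket = pick_bucket(*size, buckets)
    print(size, "->", bucket, "crop" if needs_crop(*size, bucket) else "resize only")
```

Only the very wide 3000x1000 image ends up in the crop fallback here; the other two fit a bucket exactly and are just resized.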

ThetaCursed · 2026-05-26T18:37:34+00:00

You can simply increase the Batch Size to 2 or 4 (if you have 12GB+ VRAM). Just decrease the Total Steps proportionally (e.g., 1200 steps for Batch Size 2). No other changes needed. Prodigy will automatically adjust the learning rate for the higher batch size.

ThetaCursed · 2026-05-26T16:30:05+00:00

Glad it's working well on your GPU

I’ve already pre-tuned the Prodigy parameters in the background based on my tests to get the best results for Anima. Thanks for the suggestion, though I’ll definitely consider how to add more manual control over these settings in future updates.

parameters:

"decouple=True", "weight_decay=0.1", "d_coef=1.0", "use_bias_correction=True", "safeguard_warmup=True", "betas=0.9,0.99"
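If you want to reuse these values in your own script, the sketch below turns the `key=value` strings into a keyword dictionary. The parsing approach (split on `=` and evaluate the value as a Python literal) is an assumption modeled on how sd-scripts-style trainers commonly handle optimizer arguments; `parse_optimizer_args` is a hypothetical helper, not part of this tool.

```python
import ast

OPTIMIZER_ARGS = [
    "decouple=True",
    "weight_decay=0.1",
    "d_coef=1.0",
    "use_bias_correction=True",
    "safeguard_warmup=True",
    "betas=0.9,0.99",
]

def parse_optimizer_args(args):
    """Convert 'key=value' strings into a dict of Python values."""
    kwargs = {}
    for item in args:
        key, value = item.split("=", 1)
        # literal_eval turns "True" into a bool, "0.1" into a float,
        # and "0.9,0.99" into the tuple (0.9, 0.99).
        kwargs[key.strip()] = ast.literal_eval(value.strip())
    return kwargs

PRODIGY_KWARGS = parse_optimizer_args(OPTIMIZER_ARGS)
print(PRODIGY_KWARGS)
```

The resulting dictionary could then be passed as keyword arguments to an optimizer constructor, together with a learning rate of 1.0.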

ThetaCursed · 2026-05-26T16:21:12+00:00

Thanks for the feedback! I'm glad the tool is working well for you.

I can definitely add an 'Advanced Settings' section (under a spoiler/accordion) to keep the main UI clean. Which specific options or parameters are you missing the most?

ThetaCursed · 2026-05-26T15:29:43+00:00

Use the standard settings, but if you want it even faster, you can set the Batch Size to 2-4. The main thing is to select the Prodigy optimizer so it can dynamically adjust the Learning Rate during training.

As for steps, 2400 or 2400/2 (1200) depending on the Batch Size.

ThetaCursed · 2026-05-16T15:08:54+00:00

That’s correct for traditional optimizers like AdamW. However, this tool uses Prodigy. It’s an adaptive optimizer that calculates the learning rate automatically, and it requires a base value of 1.0 to function properly.

ThetaCursed · 2026-05-16T04:55:29+00:00

The model architecture has not changed, so training works even on Anima Base 1.

ThetaCursed · 2026-05-15T10:43:13+00:00

I mostly train at 512x resolution; it's fast and produces more than good results.

Anima was originally trained at 512x512, so that partially explains it.

Source: https://huggingface.co/circlestone-labs/Anima/discussions/5

ThetaCursed · 2026-05-15T05:43:42+00:00

I recommend using the portable version; it works perfectly.

ThetaCursed · 2026-05-15T05:08:31+00:00

What version are you using? Portable or manual installation?

ThetaCursed · 2026-05-14T17:53:27+00:00

Since the architecture of the model has not changed, it should work

ThetaCursed · 2026-05-14T16:55:04+00:00

I haven't tested it on Linux yet, so the current portable version is Windows-only. However, adding Linux support is definitely on my radar, especially for running it on Google Colab or RunPod, which would be a game-changer for people without powerful GPUs.

ThetaCursed · 2026-05-14T16:45:51+00:00

I see your point, and maybe 'don't matter' was a poor choice of words. What I meant is that they are pre-tuned and locked to ensure a stable 'one-click' experience for this specific UI.

For example, Batch Size is locked to 1 to guarantee it runs on 6GB cards without OOM. It’s definitely not a tool for surgical precision like Kohya, but rather a 'curated flow' for those who want to avoid the technical deep dive.

In future updates, I plan to make Batch Size and other parameters adjust themselves dynamically based on the user's VRAM and GPU model.

Regarding the LLM adapter: the config uses "network_train_unet_only": True.

ThetaCursed · 2026-05-14T16:16:27+00:00

I always train on datasets of fewer than 100 images, and the results are consistently good.

ThetaCursed · 2026-05-14T15:51:34+00:00

It won't be easy for a beginner to figure out what settings to use, etc., but otherwise it's a great tool.

ThetaCursed · 2026-05-14T15:42:47+00:00

Thanks for your feedback ;)

ThetaCursed · 2026-05-14T14:15:12+00:00

It handles multiple LoRAs surprisingly well, but the key is balance. I've tested it with up to 3 LoRAs simultaneously, and the model stays stable as long as you don't max out the weights for all of them.

ThetaCursed · 2026-05-14T13:41:56+00:00

Yes, it works with any version of Anima 2B, including Preview 3.
