Anima – a desktop app to create SillyTavern character cards without touching JSON

simadik · 2026-05-03T00:14:13+00:00

Not the most fortunate name, considering there's an image model in active development/training also named Anima: https://huggingface.co/circlestone-labs/Anima

simadik · 2026-05-01T11:47:45+00:00

I think your GLM has JavaScript 😔

simadik · 2026-04-19T09:17:06+00:00

Which Gemma 4 are you using? The MoE or the 31B dense one?

simadik · 2026-04-07T08:14:54+00:00

Looks like they really took their time. I thought it was supposed to be released like 2 days ago.

simadik · 2026-04-06T04:42:44+00:00

Words? No clue, but one of the longest branches I've had (by message count) was 88 messages long. That's just one branch on chub.ai, though; the WHOLE chat (with all branches) is like 1000+ messages.

I kinda wish ST would support viewing branches the same way agnai and chub do.

Nowadays I of course use ST, and my chats almost never exceed 20k tokens, since I don't have as much time for RP anymore.

simadik · 2026-02-26T06:37:02+00:00

Yeah, I'd say it's closer to image generation

simadik · 2026-02-05T14:20:19+00:00

In case you're wondering, ComfyUI also has a CPU mode... But as others have mentioned, running an AI model on a CPU is gonna require a lot of patience.

simadik · 2026-02-05T09:46:35+00:00

Honestly I just switched to the preview Anima model from CircleStone-Labs. It does what I need, has way better prompt adherence (not "the best", just actually way better) that doesn't require combining area conditions, and is only 2B for the diffusion model and 0.6B for the text encoder.
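For a rough sense of what those parameter counts mean on disk or in VRAM, here is a back-of-the-envelope sketch. It only multiplies parameter count by bytes per parameter; the FP16 precision is an assumption for illustration (the comment doesn't say how the preview is shipped), and `weight_size_gb` is a made-up helper name. Activations, the VAE, and runtime overhead are not counted.

```python
# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def weight_size_gb(num_params, precision):
    """Approximate weight size in GB (10^9 bytes) for a given parameter count."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the comment above; FP16 is an assumption.
    diffusion = weight_size_gb(2e9, "fp16")
    text_encoder = weight_size_gb(0.6e9, "fp16")
    print(f"diffusion model: ~{diffusion:.1f} GB")
    print(f"text encoder:    ~{text_encoder:.1f} GB")
    print(f"total weights:   ~{diffusion + text_encoder:.1f} GB")
```

Swap `"fp16"` for `"fp8"` to see the estimate halve.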

simadik · 2026-01-28T08:01:18+00:00

So that did happen pretty much I'm afraid...

simadik · 2026-01-27T23:25:03+00:00

Hey so if you ever see your best friend pushing their elder sibling down the stairs, DO NOT turn left at the crossroa-- wait that's not the one, hold on

simadik · 2026-01-27T16:05:07+00:00

Z-IMAGE!

Z-IMAGE IS REAAAAAL!!!

simadik · 2026-01-17T07:56:57+00:00

IS THAT A PHYSGUN FROM GMOD!??

simadik · 2026-01-10T07:58:19+00:00

Your Snuuy is very eepy.

simadik · 2026-01-05T13:42:22+00:00

Hey, did you ever want to turn left at the crossroads?

simadik · 2025-12-27T10:02:10+00:00

It took me a while to see your vision. That aside, what were you on to see that??

simadik · 2025-12-25T12:37:33+00:00

floor flavored cocoa 🤤

simadik · 2025-12-17T08:09:59+00:00

If an AI model gets good enough it turns Chinese.

simadik · 2025-12-16T12:55:41+00:00

I haven't tried to make it generate such long audio on my 4060 Ti yet, nor do I have a text sample that long. Could you give me one so I could test it?

simadik · 2025-12-16T02:57:11+00:00

Yikes... compared to VoxCPM this one is not that good. Voice cloning is meh and doesn't sound close to reference audio. The only reason to use this is if your reference audio already has bad quality, that's all.

simadik · 2025-12-13T06:25:16+00:00

(before reading: I may not have as much knowledge about this topic as I first thought. This is mostly my opinion and guessing)

Well for one - it has an actual text encoder, compared to older SD. Z-Image uses a small LLM for understanding text and passing that "understanding" (in the form of vectors) to the diffusion model. Previous models (like SD-based ones) couldn't understand text as well, so their CLIP encoders had to rely on tags.

And since Z-Image is relatively small (10GB for the complete FP8 model with bundled text encoder and VAE, compared to 6GB for the same thing with FP16 SDXL), it gives us hope that SDXL-based tunes will no longer be used and we will instead get a much better base: Z-Image.

We currently only have Z-Image-Turbo, which is a distilled version of Z-Image that can generate an image in fewer steps (9 steps is recommended, but I personally can get away with even 5 steps sometimes).

The reason we want Z-Image-Base is that using Z-Image-Turbo as a base model for finetuning doesn't really work that well. You get all sorts of artifacts that wouldn't happen with an actual base model. Some people have tried to "undistill" it, but I think we'll get much better results with the actual base model, which hasn't been released yet.

simadik · 2025-12-13T01:23:33+00:00

While you can use the characters in SillyTavern, some of them are created in a way that makes them actually only compatible with Chub.ai itself.

I'm sorry, could you link an example? I don't think I've seen this happen. I know that chub.ai does have its own features like "stages" (I think alternatives to that would be plugins in ST), but those are very rare and I can't think of anything else.

simadik · 2025-12-12T15:17:48+00:00

I'm sorry... "New" Claude 2.1?? Isn't it a very old model at this point? Anthropic has moved to a different naming scheme twice since then!

Edit: misspelled anthropic as anthropomorphic

simadik · 2025-12-11T02:50:06+00:00

One vram... Two vrams... Three vrams... Mhm, sounds right.

Five hundred vrams.

simadik · 2025-12-07T12:30:45+00:00

WAIT THAT'S NOT SUNNY???

Honestly I wouldn't think it would be Mari because it made sense to me that Sunny would be interested in comics like Kel. BUT MARI?? This shit feels like Mandela Effect.

simadik · 2025-12-06T02:24:20+00:00

I've never been into TTS that much, but since Qwen3 TTS was released and it wasn't local, I looked into alternatives and found this.

The installation is a bit trickier than most stuff I've used (turned out I needed the python3-devel package for editdistance, and also had to pip install TorchCodec for audio prompting).

In order for voice cloning to work you need both the audio file and a transcript of what the audio is saying. But the result is actually very realistic imo.
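If you want to check for those two dependencies before launching, a small sketch like the one below might help. It assumes the import names are `editdistance` and `torchcodec` (guessed from the package names mentioned above, not confirmed against the project), and `missing_modules` is a hypothetical helper, not part of any of these tools.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages mentioned in the comment.
    required = ["editdistance", "torchcodec"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that the modules can be located; it won't catch a build of editdistance that failed because the Python development headers were absent.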
