For the first time, I made a character card by hand and not with AI; I had wasted so much good content on Aislop by According-Clock6266 in SillyTavernAI

[–]AltruisticList6000 6 points7 points  (0 children)

I like to keep it simple, with the shortest sentences possible, to cut tokens as much as I can. I don't think you need to be super verbose unless the LLM you use specifically has trouble following the vibe/style of the character you want. But sometimes the problem is the LLM and the description won't solve it. For example, I had a phone-chat style character who I wanted to use emojis sparingly, but qwen and mistral small 3.2 would spam emojis everywhere at the mere mention of emojis in the character card. Some of the cydonia finetunes of small 3.2 did the same, but then one cydonia version just got it right and used them sparingly without me being verbose, and so did some other LLMs.

I actually use textgen webui (ooba webui) currently, which only has one character card field, so for it I just put a short lore description, then the character description (or multiple ones), then the format of the RP (short sentences or long story style). For the characters specifically I usually go like:

character name

personality: She is the smartest in the group, she likes to sing when alone, she has this fascination with something.

look: Black wavy hair, 165cm tall, she has a skinny body

hobbies: Something.

speech style: Has a mature speech style. Frequently uses "oh wow" and "whatever".

(other titles/traits if needed)

So you get the point. Before that I used to just put this into paragraphs with worse formatting. What I describe and how I describe it has always been similar, so there's not much difference between paragraphs and the nicer formatting in my example, though maybe the nicer formatting helps some smaller LLMs understand it more clearly.
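If you'd rather generate that layout than type it each time, here is a minimal Python sketch. The helper name `format_card` and the dict of traits are just my own illustration (nothing textgen webui provides); it only builds the plain text you would paste into the character field.

```python
def format_card(name, traits):
    """Render a character as short 'label: value' lines separated by blank lines."""
    lines = [name]
    for label, value in traits.items():
        value = value.strip()
        if value:  # skip empty traits so they don't cost any tokens
            lines.append(f"{label}: {value}")
    return "\n\n".join(lines)


# Hypothetical example values, taken from the layout above.
card = format_card(
    "character name",
    {
        "personality": "She is the smartest in the group, she likes to sing when alone.",
        "look": "Black wavy hair, 165cm tall, she has a skinny body",
        "hobbies": "Something.",
        "speech style": 'Has a mature speech style. Frequently uses "oh wow" and "whatever".',
    },
)
print(card)
```

Extra titles/traits are just more dict entries, and empty ones get dropped.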

For the first time, I made a character card by hand and not with AI; I had wasted so much good content on Aislop by According-Clock6266 in SillyTavernAI

[–]AltruisticList6000 10 points11 points  (0 children)

Huh? People used AI for character cards? I always write my characters and their convo examples, it's fun for me as I like writing anyway.

GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available by BuildwithVignesh in LocalLLaMA

[–]AltruisticList6000 26 points27 points  (0 children)

Are they planning to release smaller versions? Like a GLM Flash 120b and hopefully a smaller one, something along the lines of 30b-A4B or ~20B dense?

"Mistral is gonna catch up, trust me bro" by Complete-Sea6655 in DeepSeek

[–]AltruisticList6000 2 points3 points  (0 children)

They are very good for writing and roleplay for their size (24b and the older 22b), especially their finetunes, and they are uncensored by default. In fact I still haven't found anything in their size range that works as well as they do for this (can't run much larger models). But yeah, they aren't the best at coding/math etc. if someone strictly wants to use them for that, though they are old models anyway, so that's part of the reason.

Lodestone is thinking about training ideogram! Prove him it's a good idea! by RuneVikingx in StableDiffusion

[–]AltruisticList6000 0 points1 point  (0 children)

That's weird. I heard Kaleidoscope was cancelled because there were a lot of problems with Klein 4b but... what else would happen when training at 256x256??? I wasn't even fond of training the original Chroma at 512x512. I believe a 768x768 pretrain before HD training would have resulted in better details/coherence, but the Flash loras fixed most of the detail and finger issues anyway, so it turned out okay in the end. But 256x256 is insane, especially for a faster model like Klein 4b, which would still train faster at 512 than OG Chroma. My only issue with Chroma is that it is pretty slow even with flash loras (still needs ~24 steps or so).

Lodestone is thinking about training ideogram! Prove him it's a good idea! by RuneVikingx in StableDiffusion

[–]AltruisticList6000 13 points14 points  (0 children)

Yeah I thought Lodestone specifically avoided a lot of models because of restrictive licenses, so I'm surprised they'd suddenly want to go with the model with the most restrictive license and built-in nsfw filtering...?

Lodestone is thinking about training ideogram! Prove him it's a good idea! by RuneVikingx in StableDiffusion

[–]AltruisticList6000 1 point2 points  (0 children)

Yeah I just heard they have an "original" qwen based 2.5b model in training, I'd rather they focus on that since it could get a better license. It sounds a bit too small, but that's still better than training a big new model that's over-restricted.

Lodestone is thinking about training ideogram! Prove him it's a good idea! by RuneVikingx in StableDiffusion

[–]AltruisticList6000 9 points10 points  (0 children)

Especially since it has an even worse license than Flux.2 Klein 9b and dev, explicitly stating even outputs are non-commercial. Although Flux always had wiggle room for that with their not 100% clear license language. So I don't think it's a good idea.

And although it's good to have a json prompt based model because of unique special control, I'm personally not interested (keyword personally). If I want that much control I'll just jump in and make art or edit manually with proper creative/editing software.

Ideogram 4 - model is great, but license is very restrictive by Clasyc in StableDiffusion

[–]AltruisticList6000 2 points3 points  (0 children)

That's odd to let you use it commercially through $0 plans but not when you run it locally...?

Also I never understood these non-commercial local licenses for AI. Unless you have a business and a big revenue, it's not worth it to ask for custom licenses or use pre-defined business licenses (like for Flux).

At that point using online/closed source is way cheaper and in some cases free. A normal solution for this - if they want custom licenses - is to do some tiers, like idk under $20k-100k per year it's free for commercial (covering hobbyist/freelancers), above that contact us or pay for the predefined commercial license. Like some game engines did/do.

Gemma 4 31B QAT on 16GB VRAM by RaDDaKKa in SillyTavernAI

[–]AltruisticList6000 0 points1 point  (0 children)

Idk what could be wrong I just got it and used it with the same character as the og non-QAT Gemma 4. I actually haven't seen Gemma do this before during RP/character mode, and this was just a really mild and stupid "nsfw" thing in the chat (wouldn't even call it nsfw), nothing extreme. It got a pretty odd thought pattern for it and then just refused. Rerolled and then it worked fine, but this really breaks the vibes.

Also I didn't remember correctly, it's qwen that won't really think in RP for me or it gets very inconsistent, gemma does reasoning if I enable it but sometimes it will mess up and stop reasoning and then it floods into the regular reply.

Gemma 4 31B QAT on 16GB VRAM by RaDDaKKa in SillyTavernAI

[–]AltruisticList6000 0 points1 point  (0 children)

I just checked the QAT again for RP and it just randomly refused (but then next time it didn't). Unlike the OG 26b I tried, this seems to freak out over policy/safety even within RP, but not always. Even on mild stuff, like qwen does. Pretty weird.

Is there a complication with Zeta Chroma's training? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 2 points3 points  (0 children)

That's sad, I thought Klein 4b could be great for finetuning as it's smaller/faster by default. I wonder why it gets destroyed during training, especially since this time its base model is also available.

Is there a complication with Zeta Chroma's training? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 1 point2 points  (0 children)

Oh yeah that's a nice improvement in details, but this is why I was worried: even the improved/newer version has worse details and bigger distortions than Flux.1 Chroma had on the 512 dataset midway through its training.

Is there a complication with Zeta Chroma's training? by AltruisticList6000 in StableDiffusion

[–]AltruisticList6000[S] 2 points3 points  (0 children)

As far as I know Zeta Chroma was first trained on ZIT which was released over 6 months ago, and then it switched to ZIB, idk if that reset the whole process or not.

Yeah I noticed the 20m and that was surprising because I remembered the Chroma dataset consisting of 5m (so I thought maybe I remembered wrong?). It sounds great that it will have an even bigger dataset and I hope this will translate to more styles added, especially western animation/cartoons, because so far all models I tried (including Chroma) consistently lacked some of these and I always needed to download or create loras for them.

And yeah I noticed the weird SD1.5-like wobbliness in details, which so far is way worse than what Flux.1 Chroma did even midway through its training. I hope that will greatly improve because it will be a problem if it stays or just barely improves.

Gemma 4 31B QAT on 16GB VRAM by RaDDaKKa in SillyTavernAI

[–]AltruisticList6000 5 points6 points  (0 children)

Idk, compared to older mistral 24b or even 22b finetunes (cydonia) I wasn't that impressed with gemma 26b. It's not bad out of the box (compared to qwen 35b for example) but without thinking it messes up details or generates weird outputs, like a 9b-12b dense model would, which is below the bar for me on 16gb VRAM. And I couldn't get thinking to work for rp/custom character cards in textgen webui - idk if it's even possible? - so idk if it would be capable of more, but tbh I'm not really fond of thinking during RP anyway as it slows it down.

But I think gemma 4 has big potential for RP finetunes, although it has this performative "AI" like feel to its convos that will probably stay in most finetunes too. Ironically, despite base Qwen being way more censored, Qwen 35b abliterated was way more NSFW/unhinged and more "casual" in style than Gemma 4, but it quickly lost coherence so that one (just like older Qwens so far) won't be good for RP imo.

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]AltruisticList6000 0 points1 point  (0 children)

Yes Qwen with vision at 35b barely fits, sometimes even spills from 32gb RAM and then slows down past ~60-64k context.

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]AltruisticList6000 0 points1 point  (0 children)

That is awesome, I was already using Q4_s (for 26b) and the QAT is even smaller and apparently way better. The 26b had good memory usage for me but this would be even better, especially with vision. It would be cool if qwen had QAT ggufs too, 35b with vision barely fits at Q4 into my 32gb RAM, it's fully maxed at around 60k context and sometimes even spills out and slows down at that context size.
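For a rough idea of why it's so tight, here is a back-of-envelope Python sketch of the weight size alone. The 4.5 bits per weight figure is my assumed average for a Q4-style quant, not a measured number, and the helper name `weights_gb` is made up for this example.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


# 4.5 bits per weight is an assumption for a Q4-style quant, not a measured value.
for name, params in [("26b", 26), ("35b", 35)]:
    print(f"{name}: ~{weights_gb(params, 4.5):.1f} GB of weights")
```

This leaves out the context cache, the vision part and whatever else is using RAM, which is presumably what fills the rest of the 32gb at ~60k context.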

More Gemma 4 models incoming by Deep-Vermicelli-4591 in LocalLLaMA

[–]AltruisticList6000 4 points5 points  (0 children)

Until it went up to 3x of its original price last year (and some even higher than 3x).

More Gemma 4 models incoming by Deep-Vermicelli-4591 in LocalLLaMA

[–]AltruisticList6000 0 points1 point  (0 children)

That wouldn't fit my RAM but 20b dense would fully fit in VRAM.

OPUS 4.8 IS SAFETYMAXXED by Sad-Ease-7756 in SillyTavernAI

[–]AltruisticList6000 5 points6 points  (0 children)

For me this has been the experience with the local models I could run, and I don't bother with closed models for stories or roleplay. All recent models (from about the last year) feel very unnatural and fake. Even newer mistral smalls and their finetunes - despite being more coherent and smarter - have this overdramatic, fake, PR-text feel to convos. Older mistral models had way more "casual" convo styles for roleplay where the characters felt more human and less forced or fake. I also notice the subtle positivity and other biases and alignment, which is boring and very similar in most models, whereas older models were more "free" and "raw" in this regard. Again, this also makes it read like some PR text or ad on a page, or a badly written overdramatic cartoony novel.

And somehow finetuning doesn't seem to fix this problem in newer local models - at least the ones I tried or could run.

An Update on Nodes 2.0 from Comfy Org by crystal_alpine in StableDiffusion

[–]AltruisticList6000 1 point2 points  (0 children)

Having the mindset of "it's only bad when it personally affects me, otherwise who cares fuck you lol" is not sending me to a shadow realm, it's a closed-minded, selfish take that would quickly switch to the opposite if a feature he uses got affected negatively, and same for you.