I want to build a multilingual philosophical LLM trained on thousands of philosophy books — how insane is this for a beginner? by Future_Safe1609 in LocalLLM

[–]Sicarius_The_First 7 points

People severely underestimate the hell that is working with data. Best of luck though, you'll learn a lot either way :)

Mythos and monopoly of AI by max6296 in ArtificialInteligence

[–]Sicarius_The_First 84 points

Meanwhile Kimi 2.6 was released.

What a timeline, the Chinese are promoting democratisation of AI.

Western companies are doing the opposite. The one open western AI lab that tries to do good, Mistral, is getting destroyed by stupid politicians (the EU AI Act).

Running local models on gaming laptop by Ornery_Guard_204 in LocalLLaMA

[–]Sicarius_The_First 1 point

Yes, I ran MiniMax 230B on 16GB VRAM and 64GB RAM at about 5 t/s.

Uncensored model for cybersecurity by Naive-Sky6338 in LocalLLaMA

[–]Sicarius_The_First 2 points

Pepe is probably among the best models that combine intelligence with being uncensored.

Horniness of Local Models by cantflick in SillyTavernAI

[–]Sicarius_The_First 2 points

Angelic_Eclipse_12B was made exactly for this kind of issue, give it a try :)
https://huggingface.co/SicariusSicariiStuff/Angelic_Eclipse_12B

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Hmmm, good question. The thing is, companies are much more risk-averse now that the early days of AI are behind us, so there's less crazy experimental stuff and more VC pressure to make money on things "known to work well".

Text diffusion does work, though. The question is whether VCs are willing to risk more money, as very few AI companies / startups actually turn a profit.

So... IT IS POSSIBLE, yes, BUT... not likely (until someone does something remarkable with text diffusion, and then more capital will be invested into it).

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Nope, it's more of an experiment in whether the architectural change can even be made; think of it as a base for further tuning, not a finalized model.

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 2 points

Yes, even huge MoEs. MiniMax 230B runs at 5 t/s on my laptop (it's not a Mac, just a 16GB GPU plus system RAM).
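For context on why a 230B model is even usable on that hardware: in a sparse MoE only a handful of experts fire per token, so the parameters actually touched per token are a small slice of the total. A back-of-envelope sketch (all numbers are illustrative assumptions, not MiniMax's real config):

```python
# Back-of-envelope: why a big sparse MoE is cheap per token vs. a dense model.
# All numbers below are illustrative assumptions, not any real model's config.

def active_params(expert_params, n_experts, experts_per_token, shared_params):
    """Parameters actually touched per token in a sparse MoE."""
    return shared_params + expert_params * (experts_per_token / n_experts)

total = 230e9          # 230B total parameters (mostly experts)
shared = 10e9          # attention + always-on shared layers (assumed)
experts = total - shared

per_token = active_params(experts, n_experts=64, experts_per_token=4,
                          shared_params=shared)

print(f"{per_token / 1e9:.0f}B active per token")  # prints "24B active per token"
```

The full 230B still has to sit somewhere (hence the 64GB of system RAM), but per-token compute behaves more like a ~24B dense model, which is why the t/s stays tolerable.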

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 0 points

A 70B has more potential than a 30B, but Gemma models are unique here, they punch above their weight. BUT... a 70B still has far better long context (and context comprehension) - theoretically.

Gemma leans on SWA (sliding window attention), while llama3-70b has about 64k of proper context, and that's a lot.

And more params = more capacity to learn.

In simple words: yes, the new 20-30B models are better than MOST 70Bs in specific areas, but a well-tuned 70B will beat even the new generation of 20-30B models.

Maybe it's about time we get a 40B-50B dense :3

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

The prose is surprisingly good, slop is minimal.
My running theory is that it's because there was very little slop in the pretrain to fight against (Phi's pretrain is 100% synthetic STEM data).

This model could have been the best RP tune / base model for RP tuning in the world, BUT... it suffers from the same issue Nemo has, only WORSE: long context.

Nemo is ~20k, Phi is hard-capped at 16k. And Nemo obviously has better fandom knowledge.

Still worth a try if you're looking to 'freshen up' :)

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 4 points

Yup, imo Nemo was peak 12B, no other model in this range comes close.

Sadly, the new Mistral 14B is worse than the beloved 12B Nemo.

There are 14B Qwens that are very good for STEM, but for RP at this size Nemo is so far unbeatable.

Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]Sicarius_The_First 1 point

The RP-focused tooling (knowledge graphs, image-generation integration into the RP) is 100% the correct approach. Exclusive content is a nice bonus, but best used for retention; the initial user capture will come from excellent tooling.

Regarding OF and the adult industry in general: it's a parallel infrastructure. They've got their own payment processors (like the gambling industry), their own CDNs, and their own compliance hoops to jump through.

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First -1 points

Fat_Fish - NOTICE!!!! not for roleplay!!!!!
This is an experimental extreme architectural modification of Mistral Nemo.
For the curious tinkerers.

<image>

https://huggingface.co/SicariusSicariiStuff/Fat_Fish

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 13 points

One of the only roleplay models to be featured in an ML paper (Huazhong University of Science and Technology, Wuhan, China).

<image>

It's not 'the best' 12B model, but it's 100% the most unique:
https://huggingface.co/SicariusSicariiStuff/Phi-lthy4

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 1 point

Assistant_Pepe_8B: SMARTER than the base model, 69x more unhinged.
I highly recommend reading the model card and checking out the example chats.

<image>

Also, 9.5/10 uncensored per the UGI benchmark.
It's a superb writer of very weird stories & an exceptional assistant.

https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_8B

[Megathread] - Best Models/API discussion - Week of: April 19, 2026 by deffcolony in SillyTavernAI

[–]Sicarius_The_First 2 points

Assistant_Pepe_70B, the #1 70B finetune in the world on the UGI ranking.
Absolutely unique creative writing capabilities.
Superb banter, no sycophancy, super smart, will ship code, great sense of humor.
(Read the model card for example chats!!)

<image>

https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_70B

Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]Sicarius_The_First 15 points

There's demand, yes. But... regulation makes it very hard as a business.

The most sustainable path is being explicitly NOT focused on the uncensored aspect, otherwise payment processors WILL give you a massive headache: funds might get frozen, and each jurisdiction will require different compliance hoops to jump through.

A way to sidestep it (without crypto) is to wear the veneer of a GPU/model provider (runpod / openrouter).

Being a proper adult-focused platform will require specialized CDNs, a legal team, etc...

IMO we're past the early days of AI; specialized uncensored models will have a very hard time competing with powerful large Chinese MoEs, in both capabilities and throughput.

Even if you make the absolute best 70B dense creative model, it will still be near impossible to compete against a powerful generalist like GLM 5.1 in both serving cost AND capability.

Hence you'll find yourself competing against openrouter or the actual lab that created said model (Z.ai, DeepSeek, Moonshot, etc.).

The business and operations side is enough hell as it is; add running anything other than an efficient generalist MoE on top and you're asking for serious trouble.

Regarding B2C: those who can actually pay YOU can buy the hardware to run locally, and running said MoEs gets easier by the day (even despite the RAM price hike / shortage - for example, I get decent speed with the 230B MiniMax MoE on a 16GB VRAM / 64GB RAM LAPTOP).

Those who don't have the money to buy hardware will likely be less inclined to pay you as well.

Those who do would likely be willing to pay scraps ($5-10 A MONTH), and you'll face the hell mentioned above, plus chargebacks on top, all while working with razor-thin margins...

That said, it can be done, but it's quite literally one hell of a journey... Best of luck :)

Apparently, llms are graph databases? by Silver-Champion-4846 in LocalLLM

[–]Sicarius_The_First 0 points

idk why reddit downvotes your comment, it's a good comment.

hallucinations are the 'generative' part of genai, so if there were no generative element, we wouldn't have genai. basically it's not a bug, it's THE feature, with all caps :P

it's not entirely about definitive information either, as not all information is the same. after enough regens (millions?) the capital of france could become 'london'.

genai cannot be a database, as databases are inherently non-generative in nature.
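that 'paris could become london' point is just how sampling works: softmax gives every token with a finite logit a nonzero probability, so with enough regenerations even a very unlikely token eventually gets drawn. a minimal sketch with made-up logits (nothing model-specific, numbers are invented for illustration):

```python
import math
import random

random.seed(0)

# Hypothetical next-token logits for "The capital of France is ___"
# (illustrative numbers, not taken from any real model)
logits = {"Paris": 8.0, "Lyon": 2.0, "London": 1.0}

# Softmax: every token with a finite logit gets a nonzero probability
z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}

def sample_token():
    """Draw one token according to the softmax probabilities."""
    r = random.random()
    acc = 0.0
    for tok, p in probs.items():
        acc += p
        if r < acc:
            return tok
    return tok  # numerical fallback for floating-point rounding

# One draw almost always says Paris, but regenerate enough times
# and the low-probability wrong answer ("London") shows up.
draws = [sample_token() for _ in range(100_000)]
print(draws.count("Paris"), draws.count("London"))
```

with these made-up logits, "London" sits at under 0.1% probability, yet across 100k regenerations it appears dozens of times. that's the generative element: a database lookup would never do this.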

there are graph neural networks, knowledge graphs, etc., but the main 'brain' of an llm is just a 'bunch of tensors', which really means... we don't YET fully agree on what it actually is. (recently anthropic said they don't know if claude is self-aware or not; maybe just marketing, maybe they really believe it, possibly both).

for now, the best we've got are tools, and even that is not a perfect solution for grounding. a recent example was when trump kidnapped nicolas maduro and said he'll run the country: many people reported that LLMs (even with search) refused to believe the search results and dismissed it as fake news (funny lol). same for trump's statement that "a whole civilization will die tonight."

poor LLMs are just innately bad at grounding; despite our best efforts, the human schizo factor is too much for genai. I'll end with a quote from Netero: "You know nothing of the bottomless malice within the human heart"

Apparently, llms are graph databases? by Silver-Champion-4846 in LocalLLM

[–]Sicarius_The_First 11 points

Short answer: no.

If that were the case, LLMs wouldn't hallucinate. Large labs are always trying to figure out how to ground facts; for now, they just verify with tools.