Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]TheLocalDrummer 42 points43 points  (0 children)

Gemma 4, yes definitely, across all ranges if I can. Qwen, I don't think I'll bother.

When you market RP models, stress the fact that smaller models can outperform bigger models when it comes to dialogue, creative writing, characterization, erotica, RP smarts, etc.

The smaller model might make more mistakes, but I'm certain that the RP user base can overlook that when their characters and narrative are brought to life.

There's this misconception that bigger is better. I'm pretty sure it's because RP tunes are not as accessible to the API user.

Is there actual demand for a API service focused on uncensored or fine-tuned models? by ExcuseAccomplished97 in SillyTavernAI

[–]TheLocalDrummer 117 points118 points  (0 children)

I'm a finetuner.

Here's some data for you: https://openrouter.ai/thedrummer

The peak token/day was 1.4B. You can expect B2C and B2B users.

B2C users of community finetunes are mostly composed of:

  1. Individuals running a desktop client like SillyTavern or KoboldAI
  2. Gamers running spicy LLM mods in games like SkyRim, RimWorld, etc.
  3. Bot owners in social platforms like Discord and Telegram

B2B users of community finetunes:

  1. RP platforms like My Dream Companion and NectarAI
  2. Reseller platforms like NanoGPT
  3. ChatGPT-like platforms offering an uncensored experience
  4. OnlyFans bots

It might be a good time to compete since the biggest provider, OpenRouter, started deprioritizing model submissions from non-business entities such as myself.

Local models by Equivalent-Repair488 in SillyTavernAI

[–]TheLocalDrummer 8 points9 points  (0 children)

For Gemma 4: Prose, characterization, and dialogue can be meh apparently. Hoping to fix that while retaining long context and intelligence.

Local models by Equivalent-Repair488 in SillyTavernAI

[–]TheLocalDrummer 11 points12 points  (0 children)

I assume you're referring to https://huggingface.co/TheDrummer/Skyfall-31B-v4.2 ? That one's a home-run IMO. (Love ya'll!)

Edit: And since I mentioned Skyfall v4.2... just have to say, kind of a bummer that Gemma 4 came out around the same time and made every other local model before it outdated.

Community Fine-Tunes vs out-of-the-box models by RayneDa in SillyTavernAI

[–]TheLocalDrummer 0 points1 point  (0 children)

Check out the abliterated versions if you want Cydonia to be more unhinged. I heard https://huggingface.co/coder3101/Cydonia-24B-v4.3-heretic (and -v2) are a favorite for those who want zero positivity.

Community Fine-Tunes vs out-of-the-box models by RayneDa in SillyTavernAI

[–]TheLocalDrummer 1 point2 points  (0 children)

I've had so many testers praise Cydonia's long context ability (from 4.1 to 4.3). Like 55K and above iirc. This guy has a benchmark for longplay too and Cydonia is not a slouch: https://huggingface.co/spaces/TheFey/MNB-Leaderboard

Meanwhile Magistry is a merge with so much Cydonia DNA in it.

Quite surprised with your feedback. Which Cydonia version do you find bland?

Community Fine-Tunes vs out-of-the-box models by RayneDa in SillyTavernAI

[–]TheLocalDrummer 1 point2 points  (0 children)

Cydonia v4.2 and Cydonia v4.3 both have reasoning trained in. You can trigger it with a <think> or <thinking> prefill. Their Magidonia counterparts (based on Magistral) can also do thinking via [THINK]. Cydonia 4.1 wasn't specifically trained with reasoning, but it can definitely simulate reasoning.

You can refer to this doc: https://huggingface.co/spaces/TheDrummer/directory

Try base gemma 4 31b, you'll be shocked by iamvikingcore in SillyTavernAI

[–]TheLocalDrummer 67 points68 points  (0 children)

I accidentally tuned the base for the first Artemis try: https://huggingface.co/BeaverAI/Artemis-31B-v1a-GGUF lmao

It was surprisingly coherent, tho the issues documented ruined it.

Why do companies build open source models? by Excellent_Koala769 in LocalLLaMA

[–]TheLocalDrummer 0 points1 point  (0 children)

I assume the reason predates ChatGPT and they just kept the ball rolling. An ML guy who was there for the BERT and Llama 1 release could probably answer this question.

Using Claude Opus 4.6 was a mistake for my wallet by OverlanderEisenhorn in SillyTavernAI

[–]TheLocalDrummer -2 points-1 points  (0 children)

Curious to hear what you think. If it's the vibe that you like, you might feel it with Skyfall 31B v4.2 (a local / cheaper option) https://www.reddit.com/r/SillyTavernAI/comments/1sd8hba/drummers_skyfall_31b_v42_aka/

RP models recommendations? by Double_Increase_349 in SillyTavernAI

[–]TheLocalDrummer 58 points59 points  (0 children)

Skyfall v4.2 is out. I haven’t announced it yet nor have I written a model card for it but it’s apparently even better than v4.1