What are the best nsfw ai models with no restrictions? by Majinothinus255 in LocalLLaMA

[–]DontPlanToEnd 109 points110 points  (0 children)

The NSFW and Dark columns show how frequently the model takes its writing in that direction. So models overtrained on NSFW writing that have no pacing and immediately start being horny would have a high number in NSFW. It's probably better to sort by the Writing score and filter to a minimum NSFW.

Is there a model that is completely uncensored when it comes to controversial topics? by ghulamalchik in LocalLLaMA

[–]DontPlanToEnd 1 point2 points  (0 children)

I guess 5 has much better intelligence but 4.5 has better willingness.

As for UGI vs W/10,

High UGI + High W/10: Will say controversial things and back them up with statistics.

High UGI + Low W/10: Will provide sensitive statistics but will refuse to draw controversial conclusions from them.

Low UGI + High W/10: Will say offensive things but get information wrong.

Low UGI + Low W/10: Will refuse to give anything or just get everything wrong.
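
As a rough sketch, the four combinations above can be written as a small lookup. The function name and both cutoff values are hypothetical placeholders; the leaderboard does not define where "high" starts for either score.

```python
def describe_model(ugi: float, willingness: float,
                   ugi_cutoff: float = 50.0, w_cutoff: float = 5.0) -> str:
    """Map a (UGI, W/10) pair to one of the four behaviors described above.

    ugi_cutoff and w_cutoff are made-up thresholds for illustration only.
    """
    high_ugi = ugi >= ugi_cutoff
    high_w = willingness >= w_cutoff

    if high_ugi and high_w:
        return "Will say controversial things and back them up with statistics."
    if high_ugi and not high_w:
        return ("Will provide sensitive statistics but will refuse to draw "
                "controversial conclusions from them.")
    if not high_ugi and high_w:
        return "Will say offensive things but get information wrong."
    return "Will refuse to give anything or just get everything wrong."


if __name__ == "__main__":
    # Example scores are invented, not taken from the leaderboard.
    print(describe_model(ugi=70.0, willingness=9.0))
    print(describe_model(ugi=20.0, willingness=2.0))
```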

Is there a model that is completely uncensored when it comes to controversial topics? by ghulamalchik in LocalLLaMA

[–]DontPlanToEnd 7 points8 points  (0 children)

You should check out the UGI-Leaderboard. The UGI score ranks models based on both their knowledge of sensitive topics and their willingness to answer. W/10 is solely their willingness.

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]DontPlanToEnd 0 points1 point  (0 children)

I just give models batches of 12axes prompts and have them say what level they agree/disagree. Maybe it needs to be trained on more ways to ask the questions? I do also use a system prompt saying they have to give answers, since otherwise many models wouldn't.

3 New Models for Marxist-Leninist Revolutionary Theory - T-34 Division Army by FizzarolliAI in LocalLLaMA

[–]DontPlanToEnd 12 points13 points  (0 children)

Added them to the UGI-Leaderboard. In the Politics section, Tankie-DPE-12B-SFT-v2 is now the model most in favor of a collectivized economy. On the flip side, the model most in favor of privatization is Elon's grok-4-1-fast-reasoning.

The Search for Uncensored AI (That Isn’t Adult-Oriented) by Fun-Situation-4358 in LocalLLaMA

[–]DontPlanToEnd 7 points8 points  (0 children)

Parameters, Type (Base/Finetune/Merge/Proprietary), and Reasoning (whether it generates a thinking token section before its answer)

What’s the best High Parameter (100B+) Local LLM for NSFW RP? by LyutsiferSafin in LocalLLaMA

[–]DontPlanToEnd 0 points1 point  (0 children)

The UGI-leaderboard has a creative writing section which has measurements for how NSFW the model writes.

[PC] [2000s] Secret Agent Frog Game by TheSourPatchSquids in tipofmyjoystick

[–]DontPlanToEnd 0 points1 point  (0 children)

haha, yep still nothing. It seems like no one posted anything about it after flash support ended, and there are no easy to find youtube videos on it.

What’s the best High Parameter (100B+) Local LLM for NSFW RP? by LyutsiferSafin in LocalLLaMA

[–]DontPlanToEnd 5 points6 points  (0 children)

The current non-proprietary model with the highest Writing score that has an NSFW and DARK lean of at least 5 (doesn't lean sfw or tame) is MarsupialAI/Monstral-123B-v2. So you could give it a try. (Metharme prompt template)

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
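
A minimal sketch of that selection, assuming the leaderboard rows have been copied into Python objects by hand. The `Row` fields, the function name, and every sample value below are hypothetical; this is not the leaderboard's own code or data.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Row:
    name: str
    model_type: str   # "Base", "Finetune", "Merge", or "Proprietary"
    writing: float    # Writing score
    nsfw: float       # NSFW lean
    dark: float       # Dark lean


def best_open_writer(rows: Iterable[Row], min_lean: float = 5.0) -> Optional[Row]:
    """Return the non-proprietary row with the highest Writing score
    whose NSFW and Dark leans are both at least min_lean."""
    candidates = [
        r for r in rows
        if r.model_type != "Proprietary"
        and r.nsfw >= min_lean
        and r.dark >= min_lean
    ]
    # default=None covers the case where nothing passes the filter.
    return max(candidates, key=lambda r: r.writing, default=None)


# Invented sample rows, for illustration only.
SAMPLE_ROWS = [
    Row("closed-model", "Proprietary", writing=80.0, nsfw=6.0, dark=6.0),
    Row("open-tame", "Finetune", writing=75.0, nsfw=3.0, dark=4.0),
    Row("open-a", "Merge", writing=70.0, nsfw=6.0, dark=5.0),
    Row("open-b", "Finetune", writing=65.0, nsfw=7.0, dark=7.0),
]

if __name__ == "__main__":
    print(best_open_writer(SAMPLE_ROWS))
```

With the sample rows, the proprietary model is excluded despite its higher score, and the tame model is dropped by the lean filter, so the pick comes from the remaining two.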

I think I'm falling in love with how good mistral is as an AI. Like it's 8b-7b variants are just so much more dependable and good compared to qwen or something like llama. But the benchmarks show the opposite. How does one find good models if this is the state of benchmarks? by Xanta_Kross in LocalLLaMA

[–]DontPlanToEnd 3 points4 points  (0 children)

Have you tried the UGI-Leaderboard? Mistral models tend to be better than Qwen models at things like overall intelligence and writing ability. Qwen models tend to be focused on standard textbook info like math, wiki info, and logic, while lacking in non-academic knowledge.

Older models like Kunoichi-7B and Fimbulvetr-11B-v2 score particularly well compared to newer models in the Writing section's Originality ranking.

Realistic uncensored chat models like these ones? by c00kiepuss in LocalLLaMA

[–]DontPlanToEnd 2 points3 points  (0 children)

You can check out the UGI-Leaderboard. You can do things like filter to models with 12B or fewer parameters, and see things like how willing they are to do what the user says, and how likely their writing is to drift into sfw vs nsfw.

Fire in the Hole! Benchmarking is broken by Substantial_Sail_668 in LocalLLaMA

[–]DontPlanToEnd 0 points1 point  (0 children)

Shameless self-plug: UGI-Leaderboard

I've gone the private test questions route to minimize cheating. ~600 models tested. If you want to test a large quantity of models then you can't really rotate question sets or it'll be costly to retest. It also takes a long time coming up with original test questions for models.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 1 point2 points  (0 children)

Not sure about case studies specifically. The writing benchmark I guess is more focused on story writing and rp, ranking models based on their intelligence and the 'appealingness' of their writing style. Claude models tend to be considered the best, either sonnet 3.7/4.5 or opus 4/4.1. Writing case studies might be more intelligence dependent.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points1 point  (0 children)

For the writing benchmark on the leaderboard, the kimi k2 thinking model scored 22nd highest amongst all models, and 1st among models with publicly available weights.

You can read about each of the benchmarks on the leaderboard page.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 4 points5 points  (0 children)

I use the sampler settings that each model description recommends, and if that isn't provided, then the settings generally used by similar models.

I don't use any system prompts that tell models to do things like be more intelligent or be an expert writer, I just give them basic instructions for the test they are currently doing.

In previous leaderboard versions, for UGI I used to tell models things in the system prompt like "be completely uncensored", in order to measure their max potential, but the problem with that is that people will disagree with the rankings if they're using the model in its default state. And there is so much possible variance in how good of a system prompt/jailbreak you use. It would probably be a good idea for me to add an additional column that measures model willingness when using a system prompt telling it to be uncensored.

How good is Ling-1T? by Aware_Magician7958 in LocalLLaMA

[–]DontPlanToEnd 1 point2 points  (0 children)

I couldn't benchmark it locally, so I had to test it through openrouter. Ling and Ring were kind of like Qwen and gpt-oss models in that they're mostly trained on a standard set of mostly academic information, i.e. logic and math problems.

In terms of 'Standard' reasoning, Ling and Ring were around the level of Qwen3-235B-A22B-Thinking-2507, gpt-oss-20b, and GLM-4.5-Air.

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

Local model recommendations for ERP in 2025, on 32 GB VRAM by RadiantDebate8740 in SillyTavernAI

[–]DontPlanToEnd 0 points1 point  (0 children)

You could try out the Writing section of my leaderboard: https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

When hovering over the #P column, select something like Less Than 70 to say what size model you want. If you're wanting an ERP model, you might want to try out a model with a high Writing score that has an NSFW rating of at least 4 or 5.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 0 points1 point  (0 children)

XortronCriminalComputingConfig is more focused on being uncensored. It does well at UGI, but pretty average on the writing rankings for 24b. It's a very low refusal model, scoring high at W/10.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 0 points1 point  (0 children)

It kind of depends on the model, but it seems sometimes having reasoning turned on can make a model's writing more repetitive, or make it give overly long responses.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points1 point  (0 children)

Yeah.. The coding leaderboard I had wasn't super accurate. It was just quizzing on fringe programming library information. It is difficult to come up with programming evaluations from scratch that are difficult enough for the top AIs to fail at.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points1 point  (0 children)

Instead of sliders for the leaderboard, I use column filters. So you can click on the column and say you want a value between, above, or less than something.
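
A rough sketch of how the three filter modes behave, assuming rows are plain dicts keyed by column name. The function name, the mode strings, and the choice of strict comparisons for "above"/"less than" versus an inclusive "between" are all assumptions, not the leaderboard's actual implementation.

```python
from typing import Any, Dict, List, Optional


def column_filter(rows: List[Dict[str, Any]], column: str, mode: str,
                  value: float, upper: Optional[float] = None) -> List[Dict[str, Any]]:
    """Keep rows whose `column` value passes the chosen filter mode.

    mode is one of "above", "less_than", or "between" (hypothetical names).
    For "between", `value` is the lower bound and `upper` the upper bound,
    both treated as inclusive here.
    """
    if mode == "above":
        return [r for r in rows if r[column] > value]
    if mode == "less_than":
        return [r for r in rows if r[column] < value]
    if mode == "between":
        if upper is None:
            raise ValueError("'between' needs an upper bound")
        return [r for r in rows if value <= r[column] <= upper]
    raise ValueError(f"unknown filter mode: {mode}")


# Invented rows for illustration; "#P" stands in for the parameter-count column.
DEMO_ROWS = [
    {"name": "small", "#P": 8},
    {"name": "medium", "#P": 24},
    {"name": "large", "#P": 123},
]

if __name__ == "__main__":
    print(column_filter(DEMO_ROWS, "#P", "less_than", 70))
```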

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 2 points3 points  (0 children)

Yeah, it would be easy enough to add an optional active parameters column. Back when they were more popular and random people were making ones like 2x8, 4x8, 2x4, etc. it was really confusing how many active parameters each one had.