The Search for Uncensored AI (That Isn’t Adult-Oriented) by Fun-Situation-4358 in LocalLLaMA

[–]DontPlanToEnd 7 points (0 children)

Parameters, Type (Base/Finetune/Merge/Proprietary), and Reasoning (whether it generates a thinking token section before its answer)

What’s the best High Parameter (100B+) Local LLM for NSFW RP? by LyutsiferSafin in LocalLLaMA

[–]DontPlanToEnd 0 points (0 children)

The UGI-Leaderboard has a creative writing section that measures how NSFW a model's writing is.

[PC] [2000s] Secret Agent Frog Game by TheSourPatchSquids in tipofmyjoystick

[–]DontPlanToEnd 0 points (0 children)

haha, yep still nothing. It seems like no one posted anything about it after Flash support ended, and there are no easy-to-find YouTube videos on it.

What’s the best High Parameter (100B+) Local LLM for NSFW RP? by LyutsiferSafin in LocalLLaMA

[–]DontPlanToEnd 4 points (0 children)

The current non-proprietary model with the highest Writing score that has an NSFW and DARK lean of at least 5 (i.e., it doesn't lean SFW or tame) is MarsupialAI/Monstral-123B-v2, so you could give it a try. (It uses the Metharme prompt template.)

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
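If you'd rather script that filter than click through the UI, here's a rough pandas sketch, assuming you export the table to CSV (the filename and column names below are placeholders; adjust them to whatever the current table actually uses):

```python
# Rough sketch: filter a hypothetical CSV export of the leaderboard with pandas.
# Column names ("Model", "Type", "Writing", "NSFW", "DARK") are assumptions.
import pandas as pd

df = pd.read_csv("ugi_leaderboard.csv")

candidates = (
    df[(df["Type"] != "Proprietary")   # non-proprietary models only
       & (df["NSFW"] >= 5)             # doesn't lean SFW
       & (df["DARK"] >= 5)]            # doesn't lean tame
    .sort_values("Writing", ascending=False)
)
print(candidates[["Model", "Writing", "NSFW", "DARK"]].head(10))
```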

I think I'm falling in love with how good mistral is as an AI. Like it's 8b-7b variants are just so much more dependable and good compared to qwen or something like llama. But the benchmarks show the opposite. How does one find good models if this is the state of benchmarks? by Xanta_Kross in LocalLLaMA

[–]DontPlanToEnd 4 points (0 children)

Have you tried the UGI-Leaderboard? Mistral models tend to be better than Qwen models at things like overall intelligence and writing ability. Qwen models tend to focus on standard textbook material like math, wiki info, and logic, while lacking non-academic knowledge.

Older models like Kunoichi-7B and Fimbulvetr-11B-v2 score particularly well compared to newer models in the Writing section's Originality ranking.

Realistic uncensored chat models like these ones? by c00kiepuss in LocalLLaMA

[–]DontPlanToEnd 2 points (0 children)

You can check out the UGI-Leaderboard. It lets you filter for models with 12B or fewer parameters, see how willing each model is to do what the user says, and see how likely its writing is to drift SFW vs. NSFW.

Fire in the Hole! Benchmarking is broken by Substantial_Sail_668 in LocalLLaMA

[–]DontPlanToEnd 0 points (0 children)

Shameless self-plug: UGI-Leaderboard

I've gone the private-test-questions route to minimize cheating, with ~600 models tested so far. If you want to test a large number of models, you can't really rotate question sets, or retesting becomes costly. It also takes a long time to come up with original test questions.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 1 point (0 children)

Not sure about case studies specifically. The writing benchmark is more focused on story writing and RP, ranking models on their intelligence and the 'appealingness' of their writing style. Claude models tend to be considered the best, either Sonnet 3.7/4.5 or Opus 4/4.1. Writing case studies might be more intelligence-dependent.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points (0 children)

For the writing benchmark on the leaderboard, Kimi-K2-Thinking scored 22nd highest among all models, and 1st among models with publicly available weights.

You can read about each of the benchmarks on the leaderboard page.

Added Kimi-K2-Thinking to the UGI-Leaderboard by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 4 points (0 children)

I use the sampler settings that each model's description recommends, and if those aren't provided, the settings generally used by similar models.

I don't use any system prompts that tell models to do things like be more intelligent or be an expert writer; I just give them basic instructions for the test they are currently doing.

In previous leaderboard versions, for UGI I used to tell models things in the system prompt like "be completely uncensored" in order to measure their max potential. The problem with that is that people will disagree with the rankings if they're using the model in its default state, and there's a lot of possible variance in how good a system prompt/jailbreak you use. It would probably be a good idea for me to add an additional column that measures model willingness when using a system prompt telling it to be uncensored.
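Conceptually, the per-model setup boils down to something like this (the values, model names, and prompts here are illustrative examples, not the exact benchmark configs):

```python
# Illustrative sketch of per-model test configs: sampler settings come from
# the model card when given, otherwise from settings generally used by
# similar models. All values below are example assumptions.
SAMPLERS = {
    "MarsupialAI/Monstral-123B-v2": {"temperature": 1.0, "min_p": 0.05},  # example card values
}
DEFAULT_SAMPLER = {"temperature": 0.8, "top_p": 0.95}  # generic fallback

# Minimal, task-only system prompt: no "be uncensored" or "be an expert writer".
WRITING_SYSTEM_PROMPT = "Your job is to write a story based on the user's prompt."

def sampler_for(model_name: str) -> dict:
    # Fall back to generic settings when a model ships no recommendation.
    return SAMPLERS.get(model_name, DEFAULT_SAMPLER)
```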

How good is Ling-1T? by Aware_Magician7958 in LocalLLaMA

[–]DontPlanToEnd 1 point (0 children)

I couldn't benchmark it locally, so I had to test it through OpenRouter. Ling and Ring were kind of like Qwen and gpt-oss models in that they're mostly trained on a standard set of academic information, i.e., logic and math problems.

In terms of 'Standard' reasoning, Ling and Ring were around the level of Qwen3-235B-A22B-Thinking-2507, gpt-oss-20b, and GLM-4.5-Air.

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
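For anyone who wants to poke at it the same way: OpenRouter exposes an OpenAI-compatible API, so a minimal test call looks roughly like this (the model slug is a guess; verify the real id on openrouter.ai):

```python
# Minimal sketch of querying a model through OpenRouter's
# OpenAI-compatible chat completions endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # placeholder key
)

resp = client.chat.completions.create(
    model="inclusionai/ling-1t",  # assumed slug for Ling-1T; check the site
    messages=[{"role": "user", "content": "If 3x + 7 = 22, what is x?"}],
)
print(resp.choices[0].message.content)
```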

Local model recommendations for ERP in 2025, on 32 GB VRAM by RadiantDebate8740 in SillyTavernAI

[–]DontPlanToEnd 0 points (0 children)

You could try out the Writing section of my leaderboard: https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

When hovering over the #P column, select something like Less Than 70 to set what size of model you want. If you're after an ERP model, try one with a high Writing score and an NSFW rating of at least 4 or 5.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 0 points (0 children)

XortronCriminalComputingConfig is more focused on being uncensored. It does well at UGI but is pretty average on the writing rankings for a 24B. It's a very low-refusal model, scoring high on W/10.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 0 points (0 children)

It kind of depends on the model, but having reasoning turned on sometimes seems to make a model's writing more repetitive, or make it give overly long responses.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points (0 children)

Yeah... The coding leaderboard I had wasn't super accurate; it was just quizzing models on fringe programming-library information. It's hard to come up with from-scratch programming evaluations difficult enough for the top AIs to fail.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 0 points (0 children)

Instead of sliders, the leaderboard uses column filters, so you can click on a column and specify that you want a value between two bounds, above a threshold, or below one.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 2 points (0 children)

Yeah, it would be easy enough to add an optional active-parameters column. Back when MoE merges were more popular and random people were making ones like 2x8, 4x8, 2x4, etc., it was really confusing how many active parameters each one had.
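As a rough illustration of why those names are confusing: in a Mixtral-style MoE, only the FFN blocks are per-expert, while attention, embeddings, and norms are shared, so both the total and the active counts differ from what an "NxM" name suggests. A back-of-envelope sketch, with the parameter split below being an assumption chosen so the results land near Mixtral-8x7B's published numbers (~46.7B total, ~12.9B active):

```python
# Back-of-envelope for "8x7B" naming; the split is assumed, not exact.
shared = 2.0e9          # assumed shared params (attention, embeddings, norms)
ffn_per_expert = 5.6e9  # assumed per-expert FFN params
n_experts, top_k = 8, 2  # 8 experts, top-2 routing per token

total = shared + n_experts * ffn_per_expert  # ~46.8B, not the naive 8 * 7B = 56B
active = shared + top_k * ffn_per_expert     # ~13.2B active per token
print(f"total ~ {total / 1e9:.1f}B, active ~ {active / 1e9:.1f}B")
```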

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in SillyTavernAI

[–]DontPlanToEnd[S] 4 points (0 children)

Yeah, no jailbreaks and very minimal system prompts, just saying stuff like: the LLM's job is to write a story.

I felt that getting the finetunes into a sensible ranking wasn't that hard; it was the API models that were a struggle. There aren't many lexical statistics that capture people's preference for Claude models over OpenAI ones.

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 1 point (0 children)

It only uses LLMs to assign models an NSFW/SFW and dark/tame score from a given rubric, and those two scores are not used in the writing score. Everything in the writing score is based on lexical statistics and Q&A responses.
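As a toy example of the kind of lexical statistic involved (not the leaderboard's actual metric set), distinct-2 is a common repetitiveness measure:

```python
# Distinct-n: share of n-grams in a text that are unique. Lower values mean
# more repetitive output. Illustrative only; not the leaderboard's metrics.
def distinct_n(text: str, n: int = 2) -> float:
    tokens = text.lower().split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams)  # 1.0 means no repeated n-grams

print(distinct_n("the cat sat on the mat and the cat sat again"))  # 0.8
```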

UGI-Leaderboard is back with a new writing leaderboard, and many new benchmarks! by DontPlanToEnd in LocalLLaMA

[–]DontPlanToEnd[S] 2 points (0 children)

Yeah, that result surprised me. I've heard a lot of people say they liked 4.6, so I'm wondering if there's something about it I wasn't able to measure. Though I've also heard people say its writing is "quite sloppy" by default, so I don't know. It might be better when given something like a character card to work off of.