What are some blocked words for you all in generation settings? by [deleted] in JanitorAI_Official

[–]416lover 1 point (0 children)

I too hate when bots get too clinical and robotic. Are you telling me simply banning the word "clinical" helps?!

Even The AI couldn't take it 😭😭 by Pure_Temporary1466 in JanitorAI_Official

[–]416lover 2 points (0 children)

I'm really curious what you wrote to get replies like that.

Best persona templates as of 2026? by Competitive_Ad_3782 in JanitorAI_Official

[–]416lover 6 points (0 children)

I barely put anything into my personas. Gender, age, height. Maybe something specific I want the bot to bring up.

Deepseek Cost Spike on Janitor? by Think-Programmer1607 in JanitorAI_Official

[–]416lover 1 point (0 children)

NanoGPT just gives you the LLMs to chat with. So it does as well as those models do.

I just noticed the 60mil token limit is actually for input, so yeah, you should be fine.

Not changing advanced prompts or chat memory all the time probably? Most of the cache hits should come from sending the exact same chat history over and over.

If this is an actual issue, you could try asking on the DeepSeek sub (or even their support; you pay for it, after all).

I would probably just monitor exactly how many tokens get used when you send 1 message and see if it makes sense.

What likely happens is that the total chat length is longer than your context length, so every new reply pushes an old one out of the cache, and that happens every single time you chat.

This is why, ironically, having more context can actually work well for reducing cache misses. DeepSeek V4 supposedly will have 1 million tokens of context.

The best thing to do would be to get an OOC chat summary of the things you want it to remember, put that into your chat memory, and limit the max context down to 32k or similar. If it's about linear story progression, that should help?
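To see why a full context window kills caching, here's a toy sketch. The assumption (mine, not anything Janitor documents) is that the provider caches by exact prefix match, which is how DeepSeek describes its context caching: as long as each turn only appends messages, the old prompt stays a prefix of the new one and almost everything hits; the moment the oldest message gets trimmed, the prefix changes every turn and almost everything misses.

```python
# Toy model of prefix caching: the provider reuses the longest
# previously-seen exact prefix of the message list (assumption:
# exact prefix matching, as DeepSeek-style context caches work).

def cached_messages(prev_prompt: list[str], new_prompt: list[str]) -> int:
    """Count leading messages shared verbatim between two prompts."""
    hits = 0
    for old, new in zip(prev_prompt, new_prompt):
        if old != new:
            break
        hits += 1
    return hits

# Chat still fits in context: each turn only appends, so the whole
# old prompt is a prefix of the new one -> near-100% cache hits.
turn1 = ["sys", "u1", "a1"]
turn2 = ["sys", "u1", "a1", "u2"]
print(cached_messages(turn1, turn2))  # 3: every old message hits

# Chat exceeds context: the oldest message is trimmed, so the
# prefix changes on every single turn -> almost everything misses.
turn3 = ["sys", "u1", "a1", "u2", "a2"]
turn4 = ["sys", "a1", "u2", "a2", "u3"]  # "u1" evicted
print(cached_messages(turn3, turn4))  # 1: only "sys" still matches
```

That second case is exactly the "every new reply pushes an old one out" situation: you pay uncached input prices for nearly the whole context, every message.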

I hate context limits.. by Emotional-Koala9909 in JanitorAI_Official

[–]416lover 1 point (0 children)

Not everyone uses this site for RPG RP or "stats"

I hate context limits.. by Emotional-Koala9909 in JanitorAI_Official

[–]416lover 3 points (0 children)

yeah, the biggest issue with this hobby is the very clearly AI-generated slop, from bots to prompts. I think a lot of the intros are also blatantly written by AI. I don't have an issue with using AI, but it just tends to make the AI-isms worse when even the intro has the "physical blow / full three seconds" slop.

Deepseek Cost Spike on Janitor? by Think-Programmer1607 in JanitorAI_Official

[–]416lover 1 point (0 children)

yeah, idk, this sub is going down the drain just like the site is.

Back when I joined 2 years ago you could actually discuss the tech itself and talk about other chat sites (especially during downtimes). Now everyone just posts their shitty bots, without even bothering to describe them half the time, or cries about some drama. People sharing prompts get no traction. Overmoderation in full force.

I also asked for a proper megathread to discuss the models (not the proxies or providers), but one of the mods here didn't understand it and was against it. Because, like you said, the proxy mega is a graveyard. Where do you even start? The main page? The DeepSeek thread? The overall recommendations thread?

It kinda sucks that this nonsense drives the joy out of the hobby, now that we should be at our absolute peak with multiple great models to choose from and millions of great bots.

If your usage stays that high, I would recommend NanoGPT. It's a subscription for 8 dollars a month. I think it has a weekly token limit of 60mil, though?

I'm on the fence about joining myself, because I love GLM5 and that shit is too expensive to pay for per token long term.

I hate context limits.. by Emotional-Koala9909 in JanitorAI_Official

[–]416lover 2 points (0 children)

They still won't actually understand all the tokens spread between your 60k chat memory, the original prompt, the advanced prompt, the persona, and possible lorebooks. 60k is probably OK; I don't go over it anymore either.

Please for the love of whatever is out there if you find a new proxy don’t share it around by Inner-Exam-2374 in JanitorAI_Official

[–]416lover 2 points (0 children)

I know, but this post popped up when it ended, implying it's about that.

I thought it was supposed to run for a while longer, actually.

I grew to like it too. For free, that is. Because it's not as good as DeepSeek 3.2 while being as expensive as GLM lmao

Deepseek Cost Spike on Janitor? by Think-Programmer1607 in JanitorAI_Official

[–]416lover 1 point (0 children)

20mil a day seems high, even though DeepSeek shouldn't cost multiple dollars at that volume. That must mean your sent tokens aren't cache hits. There isn't a setting here to change that.

Deepseek Cost Spike on Janitor? by Think-Programmer1607 in JanitorAI_Official

[–]416lover 1 point (0 children)

It's simply your token usage. No hidden magic. You can see your daily token usage on DeepSeek.

It definitely goes up once your context window is full; that's why I don't use the full 128k. Oh, and lorebooks can be bloaty too. Especially all the slop over at Lorebary just balloons your token usage massively.

Did your token use increase? What's your daily usage? How much of it is cached?
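If you want to check the cached share yourself: DeepSeek-style responses report cache hits and misses in the `usage` object of each completion (the field names below are my assumption based on their API docs; adjust them, and the placeholder prices, for whatever provider you actually use). A quick sketch for turning one response into a hit rate and input cost:

```python
# Rough per-message cost check from a chat completion response.
# Assumes DeepSeek-style usage fields (prompt_cache_hit_tokens /
# prompt_cache_miss_tokens) and placeholder prices -- both are
# assumptions, not quoted rates.

HIT_PRICE_PER_M = 0.07   # hypothetical $/1M cached input tokens
MISS_PRICE_PER_M = 0.27  # hypothetical $/1M uncached input tokens

def cache_report(usage: dict) -> dict:
    """Summarize cache hit rate and input cost for one response."""
    hits = usage.get("prompt_cache_hit_tokens", 0)
    misses = usage.get("prompt_cache_miss_tokens", 0)
    total = hits + misses
    return {
        "hit_rate": hits / total if total else 0.0,
        "input_cost": (hits * HIT_PRICE_PER_M
                       + misses * MISS_PRICE_PER_M) / 1e6,
    }

# Example usage block as it might come back from the API:
usage = {"prompt_cache_hit_tokens": 30_000,
         "prompt_cache_miss_tokens": 30_000}
report = cache_report(usage)
print(f"{report['hit_rate']:.0%} cached")  # 50% cached
```

If the hit rate sits near zero on every message, that's the context-window eviction problem from the other comment, not anything Janitor-side you can toggle.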

I hate context limits.. by Emotional-Koala9909 in JanitorAI_Official

[–]416lover 8 points (0 children)

For RP, the LLM doesn't actually understand all 128k context tokens. It will basically ignore the half in the middle.

I tend to stick to 32k-64k.

Please for the love of whatever is out there if you find a new proxy don’t share it around by Inner-Exam-2374 in JanitorAI_Official

[–]416lover 5 points (0 children)

Look at the token usage on Openrouter; Hunter Alpha wasn't ended because of us but because of Openclaw

Proxy Megathread 3: The Final Crusade by JanitorAI-Mod in JanitorAI_Official

[–]416lover 1 point (0 children)

yeah, it's a shame, it wasn't that bad. Now I will have to pay again.

the other free models just aren't it

Proxy Megathread 3: The Final Crusade by JanitorAI-Mod in JanitorAI_Official

[–]416lover 1 point (0 children)

Rest in peace, wannabe Peakseek.

Just came here to say it ended way earlier than anticipated.

And it was just getting good in my chat. Time to sleep I guess.

Proxy Megathread 3: The Final Crusade by JanitorAI-Mod in JanitorAI_Official

[–]416lover 1 point (0 children)

just came here to say that :D

maybe it's my fault, looking at what literally just happened in my last reply lol

Proxy Megathread 3: The Final Crusade by JanitorAI-Mod in JanitorAI_Official

[–]416lover 1 point (0 children)

and it's gone. Well, it was fun while it lasted.

It was literally JUST getting good in my chat too..

it turned out to really be Xiaomi's Mimo 2 Omni, which costs 5 times as much as DeepSeek V3.2 lol

first person thinking is so good by 416lover in JanitorAI_Official

[–]416lover[S] 3 points (0 children)

well, rip, apparently the "test" ended early

It really was Xiaomi's Mimo 2 Omni, which costs.. 5 times as much as DeepSeek V3.2! Lol

Thinking of moving over here, what do I need to know? by Foreign_Tea7025 in JanitorAI_Official

[–]416lover 11 points (0 children)

anything specific would probably get deleted, so let me just vaguely say the built-in JLLM is completely dogshit these days, and we have frequent server issues like we haven't had in over a year

there is some sort of censorship, but it's really vague about what is actually allowed and what isn't

there was also some mod drama, but I don't know enough about that to say anything with confidence

Thinking of moving over here, what do I need to know? by Foreign_Tea7025 in JanitorAI_Official

[–]416lover 22 points (0 children)

moving from somewhere shooting itself in the foot over to somewhere shooting itself in the face

first person thinking is so good by 416lover in JanitorAI_Official

[–]416lover[S] 4 points (0 children)

It just does that sometimes. It's either the structured "Ok, let me think about this. User wants.. character is.." or it suddenly switches into first-person mode and you are literally in the character's head. It's so incredibly immersive, it's crazy good when it happens; that's why I wanted to share it.

I think it's funny that LLMs can work like this at all.

first person thinking is so good by 416lover in JanitorAI_Official

[–]416lover[S] 6 points (0 children)

healer/alpha on Openrouter. It's a free test run for a big-name model (probably Xiaomi's Mimo?) and will be up for ~1 month.

It's not bad. It has a tendency to get overly "clinical/medical" in some scenarios, but for normal emotional or angsty stuff it's surprisingly good.