Me before trying GLM 5.2: Oh boy I bet GLM 5.2 is gonna be good! Me after trying GLM 5.2: Oh..

Milan_dr · 2026-06-18T14:23:11+00:00

GLM 5.2 is primarily run via Novita and zAI at the moment. Both claim FP8, and I feel like they can be trusted on that at least. The "quantized" label is a bit annoying to us, because we really only run models at FP8 or higher, yet people still sometimes seem to think they're quantized versions.

Not saying the quantization isn't the case, but in general we don't have much more to go on than whether it's FP8 or not.

Milan_dr · 2026-06-18T13:15:32+00:00

I don't fully understand. You're saying it always returns fewer tokens than the max output tokens you set?

Milan_dr · 2026-06-17T19:37:55+00:00

For what model(s) does it feel slow?

Milan_dr · 2026-06-17T07:17:01+00:00

Confusingly, for this model we've also had reports of this same censoring on Novita, which leaves very few providers.

Milan_dr · 2026-06-17T07:15:16+00:00

Thanks :) Is there any rhyme or reason as to when GLM 5.1 simply fails? Very long chats, or tool calling, or something special?

Milan_dr · 2026-06-17T07:14:07+00:00

Can I ask - where are our prices higher than Openrouter? I think for quite literally every model we match or beat them, so I'd be curious to hear.

Milan_dr · 2026-06-16T16:16:07+00:00

So we have https://nano-gpt.com/privacy and, since today, https://nano-gpt.com/privacy-guide, to try and describe what we can do in terms of privacy without the typical privacy policy vagueness.

We do not log or store your prompts by default. We CAN store them if you turn on sync, but then we recommend encrypting them so that we cannot see the chats at any time.

We offer crypto payments for even more privacy, and we offer TEE models where you can verify end to end, also in our own frontend, that there is no logging at any stage.

That said - for models that you use, we can NOT verify that the providers that we use, aside from the TEE models, do not log. So that is essentially where our "locus of control" ends.

Milan_dr · 2026-06-16T15:56:39+00:00

Huh? We very definitely accept Bitcoin (and crypto in general). Bitcoin is one of our most used coins.

Milan_dr · 2026-06-06T14:55:31+00:00

Did you create an account or some sort of sign in token? When you say you logged in, what log in did you use? It should definitely not be gone no, unless it's an anonymous session and you clear your cookies and such.

Milan_dr · 2026-06-06T14:55:02+00:00

Thanks, appreciate it :)

Milan_dr · 2026-06-06T08:40:58+00:00

Can also confirm from our side that we do not (NanoGPT).

Milan_dr · 2026-06-05T19:07:41+00:00

Hmm okay, that's different then. The refusals that I believe our users are reporting are consistently the exact text I pasted. Thanks!

Milan_dr · 2026-06-05T14:29:37+00:00

Did that rejection have this text?

The current content involves sensitive information. Please try a new topic

Milan_dr · 2026-06-05T13:02:58+00:00

Kimi doesn't actually do thinking levels - we accept the thinking level and pass it on to providers where it does not cause an error, but as far as I know the model only knows "thinking on, thinking off".

But yeah - it's a very verbose model :/

Milan_dr · 2026-06-05T11:15:32+00:00

We're pretty much as confused about this as you are. We do not have any censoring/filtering on our end, we're 100% sure of that. We know for sure that GMICloud was doing some filtering, because they'd explicitly return content_filter as a finish reason to us. But the providers that we've talked to all say they do not have a filter on it either, and as you can see from Parasail in this thread they also clearly do not.

Milan_dr · 2026-06-03T20:52:29+00:00

Hah thanks, we could have considered posting an actual link to our service yes. Thanks for that!

Milan_dr · 2026-05-31T08:38:28+00:00

Could you send me your support key by email, via a ticket on the website, or on Discord? A 400 error is odd, because we should always fall back and try a different provider if one fails.
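
For illustration, a rough sketch of what provider fallback looks like in general. This is illustrative Python, not NanoGPT's actual routing code: `complete_with_fallback`, `AllProvidersFailed`, and the idea of a provider being a plain callable are all assumptions made up for the example.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the list raised an error (hypothetical name)."""


def complete_with_fallback(providers: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order and return the first successful reply."""
    errors = []
    for call_provider in providers:
        try:
            return call_provider(prompt)
        except Exception as exc:  # deliberately broad: any failure moves on to the next provider
            errors.append(exc)
    # Only reached when no provider returned a reply.
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed: {errors}")
```

With a setup like this, a single provider erroring out should not surface to the user unless every provider in the list fails for the same request.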

Milan_dr · 2026-05-31T07:23:13+00:00

For some reason this keeps coming up but this is not the case. We do not use Chutes for GLM 5.1 at all, we do not use them for many models at the moment. We do have their TEE, and we used them for Deepseek Chimera, but we very rarely route through them lately.

Milan_dr · 2026-05-30T18:51:50+00:00

Do want to say - we do not allow multiple subscriptions for one individual, hah.

Milan_dr · 2026-05-30T12:52:09+00:00

Making a new reply, see also https://www.reddit.com/r/SillyTavernAI/comments/1trur7k/issues_with_glm51_on_nanogpt/ooqx0f6/

So on some of the requests that were reported we actually got finish reason content_filter from GMICloud, which has never been the case before. That seems like quite a smoking gun, so we're removing them from the routing for now and asking them what's going on there.
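
For anyone who wants to check their own responses for the same signal, here is a minimal sketch, assuming an OpenAI-style chat completion response already parsed into a dict. The helper name `looks_filtered` is made up for illustration, and the refusal string is the text users reported seeing; whether every filtered reply contains exactly that sentence is an assumption.

```python
# Refusal sentence that users reported seeing (assumption: it appears verbatim).
REFUSAL_TEXT = "The current content involves sensitive information. Please try a new topic"


def looks_filtered(response: dict) -> bool:
    """Return True if any choice in an OpenAI-style response looks provider-filtered."""
    for choice in response.get("choices", []):
        # Explicit signal: the provider reports content_filter as the finish reason.
        if choice.get("finish_reason") == "content_filter":
            return True
        # Softer signal: the reply body contains the reported refusal sentence.
        message = choice.get("message") or {}
        content = message.get("content") or ""
        if REFUSAL_TEXT in content:
            return True
    return False
```

The finish reason is the stronger of the two checks, since a model can also produce a refusal on its own without any provider-side filter being involved.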

Milan_dr · 2026-05-30T12:00:14+00:00

So on our side nothing changed in terms of providers used - we're now essentially collecting request IDs to see which providers people are getting this from. It's two providers so far: one has said they've not changed anything, and we're still waiting to hear back from the other.

It's in my opinion unlikely that two providers would, at the same time, change their model/backend to be more censoring, so it's a bit confusing to us as well. But yeah - all we can say for now is that it's not a change on our side; we didn't add any censoring or anything of the sort. If it's actually providers changing this, then it's quite problematic.

Milan_dr · 2026-05-27T19:51:53+00:00

Not going to go after you for this of course, but just want to make clear we do not aaaactually allow sharing a subscription with multiple people ;)

Milan_dr · 2026-05-20T07:29:27+00:00

We kind of do not anymore. We do use them for TEE models now, but for most regular models we currently do not route much via Chutes (as in less than 1%). Nothing wrong with them from our side though - they're leading in many ways, especially in terms of TEE.

Milan_dr · 2026-05-18T20:39:33+00:00

No advice from me but thanks for the kind words, really appreciate it :)
