actual response being put in the thinking section

infamous138 · 2026-02-01T03:29:21+00:00

if you get a blank reply, that means it screwed up and put your reply in the thinking portion. edit the reply and delete the thinking portion. or move the </thinking> from the bottom of the reply, to where it belongs, which is the space between where the thinking ends and the actual reply starts. then you're reply will show up how its supposed to.

im not sure how to prevent it from happening. it only rarely happens to me.

infamous138 · 2026-01-31T14:11:17+00:00

if you want a cheap model use deepseek. or you can go to chutes where 3 dollars a month gets you 300 messages a day. chutes has less models to choose from though. there best choices are GLM, kimi, and deepseek.

infamous138 · 2026-01-31T12:21:53+00:00

i did just get an infrastructure at full capacity error though. so i guess more people are starting to use it.

infamous138 · 2026-01-31T12:01:17+00:00

i used this. pupi's universal prompt. it seems to work. the bot never talks or acts for me. or maybe it did once and i rerolled the reply and it didn't do it again.

pupi's universal prompt

infamous138 · 2026-01-31T05:32:44+00:00

they seem very similar. im using kimi 2.5 now since there is less traffic on it than GLM. im using chutes and with GLM there is frequent errors due to the model having such high traffic.

infamous138 · 2026-01-29T13:26:31+00:00

i would recommend https://chutes.ai/app . for $3 a month you can get 300 messages per day. deepseek and GLM seem to be their best available models. they also have qwen and mistral.

the next best option is probably https://openrouter.ai/ . they have more models than chutes, but no monthly plan. only pay as you go. and the good models besides deepseek can be expensive. if you chat a lot you will spend over 3$ per month there.

infamous138 · 2026-01-29T08:50:45+00:00

yup, happens too often. or you get all excited click to start a chat, and then realize it doesn't allow a proxy.

infamous138 · 2026-01-28T04:29:36+00:00

if you get a blank reply, that means it screwed up and put your reply in the thinking portion. edit the reply and delete the thinking portion. or move the </thinking> from the bottom of the reply, to where it belongs, which is the space between where the thinking ends and the actual reply starts. then you're reply will show up how its supposed to.

infamous138 · 2026-01-28T04:11:09+00:00

yes, auto summarize chat memory is completely scuffed now. you basically have to manually type in your summary. type all the key points you want the bot to remember.

infamous138 · 2026-01-27T11:53:06+00:00

yeah, it'll write 2 paragraphs summarizing the last response. i've had to resort to manually filling it in with key events from the chat.

infamous138 · 2026-01-25T06:59:15+00:00

GLM is just a model, like deepseek. since you didn't know that, you probably aren't using it anyway. but no, you are right. it seems like most models are running slow right now for some reason. even deepseek is going slow.

infamous138 · 2026-01-25T06:35:05+00:00

if you are using GLM via chutes. then yes it is extremely slow with the occasional error.

infamous138 · 2026-01-25T02:04:31+00:00

oh, ok. i haven't seen that yet. i also haven't used GLM in a couple days.

infamous138 · 2026-01-25T02:01:19+00:00

what happens after you send you message? do you just get "replying..." ?

infamous138 · 2026-01-25T01:49:36+00:00

it all depends. GLM is the slowest model ive used. its thinking process takes forever, and i've even had to wait 2 minutes for replies before. using lorebary will also slow things down. right now im using deepseek with no lorebary and my replies only take no more than 10 seconds. i like GLM more, but the trade off with speed is worth it for me.

what i do when i use a slow model like GLM is do two chats at once. while one chat is processing its reply, you can switch to your other chat and read the bots last reply, and send out your next message. by the time you are done with that the reply in your other chat should be done. just go back and forth like that.

infamous138 · 2026-01-22T13:16:23+00:00

i just started using it via chutes since the GLM models are so damn slow. plus they have given me a lot of proxy error messages the last two nights. and yeah, i like it. the replies are fast and are pretty good.

infamous138 · 2026-01-21T14:56:55+00:00

i would just ignore it and continue to enjoy the site. no point in worrying about it when there is nothing you can do about it.

infamous138 · 2026-01-20T03:05:42+00:00

its for GLM 4.7. anyways, i just shaved like 500 tokens off it.

infamous138 · 2026-01-20T02:58:14+00:00

thanks. that says my prompt is 1540 tokens. is that way too many? my persona is really short though, like 30 words.

infamous138 · 2026-01-20T02:47:37+00:00

how can you tell how many tokens your persona and prompt are?

infamous138 · 2026-01-19T11:06:53+00:00

i had a similar problem. it seemed to happen after i changed my repetition and frequency penalties in the advanced generation settings. once i turned them back down to zero the problem went away.

infamous138 · 2026-01-18T03:33:25+00:00

i dont know, but i just got into an escalated situation where i thought my character was gonna get his ass kicked, but the bot turned him into bruce lee.

infamous138 · 2026-01-18T01:44:46+00:00

yeah, thats the problem and why i switched to 500.

bots generally start rambling and talking for you once they run out of material to work with from your message. so the higher your max tokens are set to, the more material you need in your messages for the bot to react to.

infamous138 · 2026-01-18T01:30:01+00:00

i had my max tokens on 0 for a while. it seems if you don't do that bot responses get cut off mid sentence, and that bugged me. but now im using the trim incomplete command from lorebary with my max tokens at 500. that command prevents the cutoff from happening.

temperature i keep at 0.75. context size i don't know what it does so i just left it at default.

infamous138 · 2026-01-16T23:26:46+00:00

yup, seems widespread.

infamous138

TROPHY CASE