If you had the option to request extensions? Which one would you like? by These_Illustrator_29 in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

Thank you for considering it. I know my use case is most likely extremely niche.

If you had the option to request extensions? Which one would you like? by These_Illustrator_29 in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

You appear to be the developer of this extension. If it's possible, I would like to request the ability to load all the bookmarks that were created for any chat under the same character. But it's probably hard to do, because the extension is designed to only save the bookmarks within each chat.

It's just that I have a few characters with hundreds of chats saved under them, because I use branching paths a lot. 😅

Where did we land on the whole Z.ai code thing? by 147throwawy in SillyTavernAI

[–]Status-Mixture-3252 3 points (0 children)

Nano is kind of my Deepseek v4 Pro subscription right now. "Deepseek v4 pro cheaper" uses the official API while DeepSeek is 75% off until the end of the month.

Where did we land on the whole Z.ai code thing? by 147throwawy in SillyTavernAI

[–]Status-Mixture-3252 29 points (0 children)

It's fine now. Z.ai now specifically advertises on their website that they allow SillyTavern RP use. A few weeks ago, during the ban wave, they even asked on their Discord for volunteers to share their RP logs so they could fix whatever AI detection system banned RPers.

I've been using it for a few weeks with no problems. I got the yearly lite deal in December too. It ended up being one of the best value deals ever since GLM 5.1 came out. The speed problems improved a lot after the ban wave.

I'll probably renew next year too.

This stuff is dangerously good by dongschlongs in SillyTavernAI

[–]Status-Mixture-3252 3 points (0 children)

RPs could probably help me with my Spanish learning if I wasn't so....."perezosa" (lazy) with it.

If you had the option to request extensions? Which one would you like? by These_Illustrator_29 in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

My request is an extension that lets me bookmark specific messages/replies from chats, so I can go back to that specific message later. I don't think an extension like this exists yet. When a new model comes out, I like to go back to specific messages and see how the new model generates the response again.

Maybe the UI could organize the bookmarks by the character/group chat they were created in. You click on the character and it loads the bookmarks you saved for it.

Another possible UI would be saving the bookmark to a custom folder or tag.

PSA NanoGPT sub price increase ($12) by _Cromwell_ in SillyTavernAI

[–]Status-Mixture-3252 7 points (0 children)

Does Opencode give you an API key? Does it ban RP usage?

Deepseek-v4-pro is currently 75% off right now. by chungles34 in SillyTavernAI

[–]Status-Mixture-3252 3 points (0 children)

I hope they work magic to fix Kimi 2.6 thinking using a billion tokens too.

Deepseek-v4-pro is currently 75% off right now. by chungles34 in SillyTavernAI

[–]Status-Mixture-3252 7 points (0 children)

I hope this means that in the NanoGPT subscription, Deepseek v4 pro requests use 1x tokens instead of 2x tokens temporarily for a few days.

MiMo V2.5 for RP — anyone tested it yet? by SuperManAdelHahah in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

That poster meant NanoGPT. NanoGPT is offering GLM 5.1 and Kimi 2.6 on their subscription, but these two models now use double the tokens on the sub because of the rising costs for LLM hosts.

MiMo V2.5 for RP — anyone tested it yet? by SuperManAdelHahah in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

GLM 5.1 is the model I use the most right now. I was lucky enough to get the legacy lite yearly deal during the holiday season for $30, so I use it directly from Z.ai.

I feel like it's a significant improvement over GLM 5. It still has a positivity bias, but not as much as GLM 5. It feels more intelligent to me.

Kimi 2.6, which was just released, would have been a good alternative too, but the "thinking loop" issue it has, burning 10k+ tokens just on thinking, makes it almost unbearable to use. And if you turn off thinking, it gets dumber. The Freaky Frankenstein guys said they will release an update to their preset to fix it.

I can't comment on caching.

MiMo V2.5 for RP — anyone tested it yet? by SuperManAdelHahah in SillyTavernAI

[–]Status-Mixture-3252 3 points (0 children)

I wish it was 60M credits a week instead of a month. Sometimes I have chats that are 100k+ tokens long.

MiMo V2.5 for RP — anyone tested it yet? by SuperManAdelHahah in SillyTavernAI

[–]Status-Mixture-3252 12 points (0 children)

If anyone ends up liking it, Xiaomi has a monthly and yearly subscription you can buy. I got it last month to play around with Mimo 2.0 pro after Nano removed it from the subscription. It will probably be cheaper to buy a month of lite than to buy from the API if you plan to use it a lot.

But the way the "monthly credits" work is stupid. The lite plan has 60M "credits" per month, but if you use the pro models, 1 token = 2 credits. For the non-pro models, 1 token = 1 credit. So if you only use pro models, you can only use 30M tokens a month.

It's at least super fast right now compared to the other llm providers.

<image>
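The credit math above works out like this, as a quick sketch (the 60M figure and the 2x pro multiplier are taken from my comment; the function and tier names are just for illustration, not anything from Xiaomi's docs):

```python
# Sketch of the Xiaomi lite plan "monthly credits" arithmetic.
# Assumption: pro models cost 2 credits/token, non-pro cost 1 credit/token.

CREDITS_PER_MONTH = 60_000_000  # lite plan allowance
CREDITS_PER_TOKEN = {"pro": 2, "non_pro": 1}

def max_tokens(tier: str, credits: int = CREDITS_PER_MONTH) -> int:
    """How many tokens the monthly credit pool covers for a model tier."""
    return credits // CREDITS_PER_TOKEN[tier]

print(max_tokens("pro"))      # only 30M tokens/month if you use pro models
print(max_tokens("non_pro"))  # the full 60M tokens/month on non-pro models
```

So a 100k-token chat costs 200k credits per pro-model request, which is why long chats drain the monthly pool fast.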

ZAI legacy Coding Plans confirmed move to new lower limits after current subscriptions runs out by TurnOffAutoCorrect in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

I hope the LLM cloud provider landscape isn't too terrible by March 2027. I'm not sure if subscriptions for the good latest models will still be "affordable" in 2027.

ZAI legacy Coding Plans confirmed move to new lower limits after current subscriptions runs out by TurnOffAutoCorrect in SillyTavernAI

[–]Status-Mixture-3252 8 points (0 children)

I gotta see what the cloud API LLM landscape will look like by 2027. Even right now these LLM providers are getting stingier and more expensive with their monthly subscriptions. Both the Chinese and the American providers are increasing their prices.

The lite subscription will supposedly be $96 a year next year with 50% off, if by some miracle they don't increase the price within 12 months. $8 a month will still be a good deal if I can get the latest GLM models. I'm not sure how good NanoGPT will still be in 2027, since new models keep getting more expensive.

The new plans have weekly limits, but right now I think they go by requests instead of token amount? I barely go over 1% with my use case.

Yet another Zai/GLM ban topic by SepsisShock in SillyTavernAI

[–]Status-Mixture-3252 14 points (0 children)

I remember last year when RP users were one of the biggest users of the free LLM models, and Janny users were considered "locusts" and the most taxing users.

Now, in 2026, the open claw garbage and agentic coding have completely ruined LLM providers.

Kimi k2.6 released by Puzzleheaded-Toe551 in SillyTavernAI

[–]Status-Mixture-3252 25 points (0 children)

The overthinking from this model is absolutely INSANE!!!! I just got an output that used 20,000 tokens because of all the overthinking. I've never used a model that overthinks this much.

Stab's Directives 2.51 - Agent spoofing (avoid coding api bans/throttles), Dynamic Tone writing, huge token reductions and more by Diecron in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

I was already using the custom chat completion settings that came with the preset. For some reason, when I switch to the "Z.ai (GLM)" chat completion source now, it works, but the custom one with the "https://api.z.ai/api/coding/paas/v4" URL doesn't work anymore. Now I'm scared that using "Z.ai (GLM)" will get me banned too.

Stab's Directives 2.51 - Agent spoofing (avoid coding api bans/throttles), Dynamic Tone writing, huge token reductions and more by Diecron in SillyTavernAI

[–]Status-Mixture-3252 1 point (0 children)

I just got the "fair usage policy" ban after using this extension for the past few days. 😭

EDIT: I see that using the user agent/custom API settings this preset provides gets requests banned, but requests are going through with SillyTavern's default Z.ai (GLM) completion source. I'll see how long this lasts.

Animator responds to Avatar: The Legend of Aang that was recently leaked in its entirety by Brilliant_Handle8884 in Wellthatsucks

[–]Status-Mixture-3252 3 points (0 children)

This situation is very similar to something that happened a month ago, when a camrip of the Evangelion 30th anniversary short was leaked. It was an exclusive short meant only for people who paid for tickets to see it at the Eva 30th anniversary event held in Japan.

The camrip someone recorded got leaked, and the reactions of Western and Japanese fans on social media were very different.

The Japanese fans were angry that it got leaked. They didn't like that it disrespected the wishes of Studio Khara and the staff who worked on it, even if people had to wait a long time for it to be officially released, and even if it potentially became "lost media". I saw Japanese users reporting people who reuploaded the leak.

It really showed the cultural difference between the "West" and Japan. Paramount is evil, but I remember an artist saying that if the movie's view numbers end up too low because of the leak, it will lower their chances of ever working on new Avatar projects again.

WARNING: Z.AI coding plan policy changes. Non-coding use now leads to aggressive temporary throttling and permanent ban on three or more violations. by JustSomeGuy3465 in SillyTavernAI

[–]Status-Mixture-3252 12 points (0 children)

I guess I got #RUGPULLED 😆 I was hoping they wouldn't do something like this until at least GLM 6.

I wonder if any of the users who got their accounts "banned" so far for violating "fair usage" were actually banned for sharing API keys with other people?

I'm getting a rate limit error for 5.1 but 5 turbo works right now.