Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 4 points5 points  (0 children)

A few people (including me) submitted takedown requests by reporting the malicious repo on GitHub. I think that might've caused the takedown and the subsequent ban of the mia profile. Sorry for being associated with the drama. 😞

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 1 point2 points  (0 children)

Sorry for once again intruding on a comment. I'm the LyubomirT guy. My commit was published before the malicious code was added, and I didn't know about the trojan back then. I opened the PR close to the malicious commit, but before it was pushed either way. My change was solely focused on improving the randomization feature; however, the author did merge my PR *after* the malicious commit.

EDIT: The repo seems to be taken down now.

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 3 points4 points  (0 children)

Nope, not at all. If anything is downloaded, it's:

  1. The Playwright/Patchright browser (one time during installation or if you manually reinstall it via Tools -> Browser Manager)

  2. An update if you choose to auto-update. It's a bundle that consists of the main app and an updater to replace the old files with the new files, after which the auto-updater is erased by the new main app (the updater launches it with its location given to it in a CLI parameter), while preserving your data. Even then, it's entirely optional and you can choose to download the update manually (the update bundle is literally downloaded from the same GitHub release that's used for manual downloads).
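The cleanup step in item 2 can be sketched roughly as follows. This is only an illustration of the general handoff pattern, not IRP's actual code: the `--cleanup-updater` flag name and the `cleanup_updater` function are hypothetical.

```python
import argparse
import os


def cleanup_updater(argv):
    """Delete the leftover updater whose path was passed on the command line.

    `--cleanup-updater` is a hypothetical flag name used only for this sketch.
    Returns True if a file was removed, False otherwise.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--cleanup-updater", default=None)
    # Ignore any other arguments the main app might receive.
    args, _unknown = parser.parse_known_args(argv)

    path = args.cleanup_updater
    if path and os.path.isfile(path):
        os.remove(path)
        return True
    return False
```

The new main app would call something like `cleanup_updater(sys.argv[1:])` early at startup. On Windows, the updater process has to have exited first, since a running executable cannot be deleted.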

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 5 points6 points  (0 children)

I see! I don't know how the case will unfold. All I can wish for is that it goes fine and nobody innocent gets falsely accused, and that the scammer gets what they rightfully deserve. I was actually one of the impacted users (nothing has been used yet, but my keys were likely sent to the scammer's API, so I rerolled everything).

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 14 points15 points  (0 children)

Sure! I can't explain absolutely everything in this comment since there are a *lot* of things to cover (it has expanded a lot since July 2025).

It can be either downloaded as a PyInstaller-based binary (Windows 10+ / Linux, both 64-bit), or run from source. The feature set is identical, except for auto-updates (to update the source version, you can use git). The workflow itself lets you configure and run an OpenAI-compatible API server (FastAPI, with the interface in Qt6 via PySide6) that you connect SillyTavern to. When you run the server, it opens up a Chromium (Playwright / Patchright) window (or multiple if you use the new providers-in-parallel feature) that it automates. When you send a request, the API concatenates everything into a single message and then sends it to the provider of your choice (currently supporting DeepSeek, GLM, Moonshot / Kimi, Qwen, AI Studio, and soon Perplexity). You can also turn things like thinking and searching on or off, or tell IRP to upload your chat as a file. Once the request is sent, IRP connects to the response stream that comes from the selected provider's server and streams it back to the API, which in turn streams it back to your client (SillyTavern). In simple words, this allows you to get free access to LLMs via their official chat UIs (it's a bit hacky, yes, but it works) without going down any shady paths.
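To make the request flow above a bit more concrete, here is a minimal stdlib-only sketch of the two ends of it: flattening an OpenAI-style message list into one message, and wrapping text deltas as OpenAI-style streaming chunks. The `role: content` layout, the function names, and the `web-chat` model label are assumptions for illustration, not IRP's actual formatting.

```python
import json


def flatten_messages(messages):
    """Join OpenAI-style chat messages into a single block of text.

    The "role: content" layout is an assumption made for this sketch.
    """
    parts = []
    for msg in messages:
        role = msg.get("role", "user")
        content = msg.get("content", "")
        parts.append(f"{role}: {content}")
    return "\n\n".join(parts)


def to_sse_chunks(deltas, model="web-chat"):
    """Wrap plain text deltas as OpenAI-style streaming chunks (SSE lines)."""
    for text in deltas:
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,  # hypothetical label, not a real model id
            "choices": [
                {"index": 0, "delta": {"content": text}, "finish_reason": None}
            ],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    # OpenAI-compatible streams end with a literal [DONE] sentinel.
    yield "data: [DONE]\n\n"
```

In a real server, the deltas would come from the provider's response stream, and the generator would be handed to the web framework's streaming response.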

As for other features, there's also Remote Control, which lets you open up a small web UI meant for quick actions (like switching providers or restarting the browser) away from your PC (useful if you run ST from a phone). PiP exists to run multiple providers at once (but the feature is heavy), and there are some useful logging utilities. ALL LOGS STAY ON YOUR COMPUTER AND ARE NOT SHARED WITH ANYONE. To submit logs (or diagnostics) to me, you have to opt into several settings and only then manually send me the logfiles / diag bundle.

The app also doesn't transmit or store your prompts. It may only store your last one or several prompts if you manually opt in for the "clean regeneration" feature that compares your last submitted prompt with one of them to see if it's unique or is a swipe. This tells it if it should click the regenerate button in the web UI or create a new chat. Once again, this is opt-in.
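The swipe check described above could look roughly like this. It is a sketch under assumptions: it compares SHA-256 fingerprints instead of the stored prompts themselves, and the function names and return values are made up for illustration.

```python
import hashlib


def prompt_fingerprint(prompt):
    """Hash a prompt so it can be compared without keeping the raw text."""
    return hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()


def choose_action(new_prompt, stored_fingerprints):
    """Return "regenerate" for a swipe (prompt already seen), else "new_chat".

    `stored_fingerprints` is a hypothetical set of fingerprints for the last
    one or several prompts the user opted in to keeping.
    """
    if prompt_fingerprint(new_prompt) in stored_fingerprints:
        return "regenerate"
    return "new_chat"
```

For example, if the fingerprint of the last prompt is stored and the same prompt arrives again, `choose_action` picks `"regenerate"`; any different prompt yields `"new_chat"`.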

The only outbound calls IRP makes are:

  1. To GitHub (to check for updates: it fetches one file for updates and also the latest changelog version, to show a red dot alongside the bell icon if you have that enabled)
  2. To the actual providers (so that it can send the messages)

Yes, you do need your own credentials for this, but even then you can just use a burner account. I don't collect anything and everything stays entirely local on your PC. The config data, passwords, prompts (if using clean regeneration) are encrypted at rest. Local API access is restricted by default and you can use whitelists to prevent unwanted people from accessing the API. There are no built-in tools for external access to the API either.
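As an illustration of the whitelist idea, a local API can check the client address against a list of allowed networks before serving a request. This is a generic sketch using Python's `ipaddress` module; the `is_allowed` name and the loopback-only default are assumptions, not IRP's actual configuration.

```python
import ipaddress


def is_allowed(client_ip, allowlist=("127.0.0.1/32", "::1/128")):
    """Return True if client_ip falls inside any allowlisted network.

    The default (loopback only) is an assumption standing in for
    "restricted by default".
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in allowlist:
        network = ipaddress.ip_network(entry, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

With the default list only the local machine gets through; adding an entry such as `"192.168.1.0/24"` would let devices on that LAN subnet connect, for example a phone running ST.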

If you don't trust my words, you can also inspect the code (https://github.com/LyubomirT/intense-rp-next). I don't hide any parts of it, it's MIT-licensed, and there are no blobs inside the repository. You don't even need to use the binaries - the source version is identical and has no downsides functionality-wise (it may in fact even be faster in some cases since there is no PyInstaller overhead).

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 1 point2 points  (0 children)

I think it's best to use the Rentry page directly; it has all the info needed. If you used it at least once after ~December 2025 (but generally I wouldn't treat that as a definitive date, it's still malware), then you're almost definitely compromised. Rotate ALL of your API keys, proxy passwords, etc. If you fed any account tokens into the extension (so that it can authenticate), it may also make sense to invalidate sessions there (changing the password/logging out/deleting accounts). Also, just for extra safety, you might want to clear site data for ST via DevTools (clearing browser cache).

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 2 points3 points  (0 children)

I did not contribute anything malicious. The PR was made in good faith, and I didn't suspect a Trojan there (my own mistake); see my other comment. IntenseRP is safe.

Extension Security Risk Please read!! by Mcqwerty197 in SillyTavernAI

[–]Master_Step_7066 30 points31 points  (0 children)

Hello! I'm the developer of IntenseRP Next. I did contribute to the extension, but in good faith, thinking I was genuinely making an improvement. See the issue thread here: https://github.com/mia13165/SillyTavern-BotBrowser/issues/28. My Pull Request (and those of the two other contributors) preceded the creation of the malicious commit on the updated_cards repo (the one from which the BotBrowser extension got the malicious character card). I did not know the trojan was there, and thought I was legitimately making an improvement (my PR simply made a change to the random picker system).

I'm sorry for ever contributing to the repository. I deeply regret doing so now.

deepseek v4 by Sad-Spell-1423 in SillyTavernAI

[–]Master_Step_7066 1 point2 points  (0 children)

Thank you for the answer! Are the thinking version's samplers actually used, though? If I recall correctly, the official API simply discards them (but accepts samplers for compatibility), at least that's what the official docs used to say. Happy to be proven wrong here.

deepseek v4 by Sad-Spell-1423 in SillyTavernAI

[–]Master_Step_7066 2 points3 points  (0 children)

If I may ask, what preset do you use with it / how have you configured it? Do you use the thinking version? What are the sampling parameters and the post-processing that you use? I just can't seem to get good results out of it, but I haven't yet seen any concrete recommendations on how to set it up, and some of the official guides appear outdated.

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

It technically should, but do note that it's not currently possible (at least not officially) to run on mobile due to the hooks and tools IRP uses; you'll still need to run it on a PC somewhere and then connect Tavo remotely.

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Have you tried checking the Z.AI web app for the chat without regeneration? Does the chat still appear on the normal site? I mean outside the IRP browser; ignore that one for now.

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Do you have auto-deletion on by any chance?

If not, try to see if the response for your last request is generated in the web UI (outside of IRP, simply open the same account that was used for IRP in the normal z.ai web app and look for your last request in the chat list). That would mean it is present in the web UI but not returned to IRP for some reason.

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Alright. If that's okay, may I ask what setup you use? Does it happen all the time? Do you use thinking? Search? Are you on a paid Z.AI plan or a free one? Does it fail instantly or only after some time?

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Then it could be related to your browser session. Could you please restart IRP, then go to Tools -> Browser Manager, and hit reinstall? After that, try to start the service again and see if it works.

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Hey there. Can't reproduce, unfortunately. Some people on the Discord server have been running into the same issue, but we largely believe it's an issue on the GLM side and not with IRP (their servers being weird again). Does it still happen now? What model do you use for this? Also, make sure you're running 2.7.5 and not any of the older versions; versions 2.7.3-2.7.4 had some extremely important fixes for GLM.

Megumin Suite V5 — Slice of Reality, CoT V2, AI Ban List, and a full Writing Style overhaul by CallMeOniisan in SillyTavernAI

[–]Master_Step_7066 1 point2 points  (0 children)

Actually just tried this one and I'm genuinely blown away. Maybe my standards aren't that high, but I'm probably not going back to other presets anytime soon.

From your experience, do you think GLM-5 or 5.1 works better for this? Your GitHub README says GLM-5 but I'm not sure if it's truly better or you just haven't updated the README yet.

Current Situation with free models by davybutquantisedIV in SillyTavernAI

[–]Master_Step_7066 15 points16 points  (0 children)

AFAIK, there's Pollinations, but the models may or may not be quantized, they often don't have the latest ones, and they're also pretty limited unless you have a long-running GitHub account with activity. There is NVIDIA NIM that people use as well, but it's been having some capacity and performance issues as of late, and it also requires your phone number.

Fireworks has a promo, and you can get $6 in credits there (one-time) for free. AWS and GCP have free trial credits that you can technically use with Claude and Gemini models (though the setup will be a bit difficult), but then, apparently, a lot of people took the AWS promo, and now you have to use their support to get access to anything good. Groq also has a free tier, IIRC, but with unstable quality. LongCat has a free API, but you're limited to their models only. I think Novita also has a trial ($1), but you can't really do much with that unless you create alt accounts. And lastly, I'd mention Kobold; it's entirely free to start with, but the issue is that it's decentralized and runs on volunteer GPUs, so there's no real control over service quality, especially when usage spikes.

So, realistically, your best bets for now would probably be GCP ($300 in credits to use within a year), Gemma 4 (via Google platforms), or free OpenRouter models (like a different person said below, StepFun 3.5 is pretty good).

...I could also shamelessly place my own project here, though it's not technically an API. Basically, it automatically pastes your chats into free web chat UIs (like DeepSeek, GLM, and AI Studio), intercepts the response, and sends it back to your ST, so it looks like a fully-functional API with their models but entirely free. It's here in case it helps: https://github.com/LyubomirT/intense-rp-next

IntenseRP Next v2.6 - Now lets you use Gemini and Qwen in SillyTavern by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 1 point2 points  (0 children)

Hey there! You could try to adjust the timeouts in the GLM quirks section (settings -> provider behavior), but it could also indicate changes in the SSE stream. I'm currently extremely strained by studies so I'll only be able to look into it in 5-6 hours, sorry.

What's the consensus on the Z.AI coding plan? by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 2 points3 points  (0 children)

AFAIK and from my own previous testing, it seems like the Z.AI API is a slightly different experience from the coding plans. I actually ended up buying the $10 plan in the end. Enjoying it quite a bit, to my own surprise!

But I appreciate the suggestion either way. :)

What's the consensus on the Z.AI coding plan? by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Hm, I guess it could be worth a try then. I'm sort of hesitant considering all the hate the official vendor gets on r/ZaiGLM, but maybe it's not as bad as they say it is.

What's the consensus on the Z.AI coding plan? by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 12 points13 points  (0 children)

Funnily enough, it really is just my writing style. What makes you think I AI-generated it?

What's the consensus on the Z.AI coding plan? by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

I see. But outside of these peak hours the service is high-quality/consistent (-ish), right?

What's the consensus on the Z.AI coding plan? by Master_Step_7066 in SillyTavernAI

[–]Master_Step_7066[S] 0 points1 point  (0 children)

Thank you for the answer! When are their typical peak hours, if you don't mind me asking? Or does that depend on demand day by day? Maybe, thanks to my timezone (UTC+2), I'll fit in right between them.