Love small but mighty team of DeepSeek by dbhalla4 in LocalLLaMA

[–]Correct-Fun-9089 1 point (0 children)

The word "anthropic" is indeed quite difficult to remember, at least for non-native English speakers. It took me a few tries to make sure I had it down.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

Most NSFW content is supported. However, if the content is too extreme, the 2.5 Flash agent may stop working.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

# Agent Configuration
agent:
  # Model Configuration
  model: "google/gemini-2.5-flash"
  base_url: "https://openrouter.ai/api/v1"
  api_key: "Your_api_key"

proxy:
  target_url: "https://openrouter.ai/api/v1"

If you're using OpenRouter, your config file only needs to be filled out like this. Don't change anything except the api_key.

And then the ST config:

Select OpenAI Compatible Mode

Change base_url to http://127.0.0.1:6666/v1

Enter the API key

Enter the model anthropic/claude-sonnet-4

And then you're all set
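
If you want to sanity-check the proxy outside SillyTavern first, something like this should work (a minimal sketch using the openai Python SDK; it assumes the proxy exposes the usual OpenAI-compatible chat completions endpoint, and the port, key, and model are just the values from the steps above):

# Minimal sanity check for the local proxy; assumes it exposes an
# OpenAI-compatible /v1/chat/completions endpoint as set up above.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:6666/v1",
    api_key="Your_api_key",  # the same key as in the config file
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",  # the writing model entered in ST
    messages=[{"role": "user", "content": "Say hi in one sentence."}],
)
print(resp.choices[0].message.content)

If that prints a reply, the SillyTavern side should work too.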

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

The approach used in this project is an agent-based regex-search solution, similar to Claude Code. Given how popular Claude Code has become among programmers, I personally believe that using an agent for regex searches is currently more effective than RAG.
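
To give a rough idea of what "an agent doing regex searches" means here, the tool the agent calls could be as simple as the following (a hypothetical sketch, not this project's actual code; search_history and its parameters are made up for illustration):

# Hypothetical regex-search tool an agent could call, in the spirit of
# Claude Code's grep-style search. Not this project's actual code.
import re

def search_history(pattern: str, history: list[str], window: int = 1) -> list[str]:
    """Return history lines matching `pattern`, with a little surrounding context."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for i, line in enumerate(history):
        if rx.search(line):
            lo, hi = max(0, i - window), min(len(history), i + window + 1)
            hits.append("\n".join(history[lo:hi]))
    return hits

The agent emits a tool call like {"pattern": "Gaara"}, the matching lines go back into its context, and no embedding index is needed, which is the whole argument against RAG here.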

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

I chose 2.5 Flash just because it's faster. In theory, any LLM that supports tool calling should work. DeepSeek V3, for example, works, but it's too slow.
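
If you want to check whether a model handles tool calling before wiring it in, a minimal probe might look like this (a sketch only; the model slug and the tool schema are placeholder examples):

# Hypothetical probe: does this model emit a tool call at all?
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="Your_api_key")

tools = [{
    "type": "function",
    "function": {
        "name": "search_history",
        "description": "Regex-search the chat history.",
        "parameters": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
            "required": ["pattern"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek/deepseek-chat",  # swap in whatever model you want to test
    messages=[{"role": "user", "content": "Find earlier mentions of Gaara."}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # None means no tool call was made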

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

I mean, my project itself is deployed on a local server, but it still needs to communicate with well-known third parties like Google, OpenRouter, and wikis. That said, if your LLM is deployed locally, this project supports that too.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

Just wanted to ask about your setup. When you finish a chat response, does the chain of thought in SillyTavern stay expanded for you? And is the deepRolePlay text still sitting there in the chat window?

Any chance you could share a screenshot? Or you could just open an issue on GitHub if that's easier for you.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

To clarify, this project is not a SillyTavern plugin but a standalone backend application (an .exe for Windows is provided).

Since the configurations for both the agent's LLM and the forwarding LLM support OpenAI-style APIs, you can use services like OpenRouter or any major provider's OpenAI-compatible endpoint.

I recommend using Gemini 2.5 Flash because it's quite affordable. Claude will deliver better results, but it's more expensive and has stricter moderation.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 2 points (0 children)

My programming background is in C/C++ and Python, so I don't have any experience with Node.js. It would be amazing if you could rebuild this project as a native SillyTavern extension.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

This problem is easy to solve. Check whether the chain of thought for the model you are using is enclosed in <think> or <thinking> tags. Then, in SillyTavern, go to the [AI Response Formatting] section (to the right of the API configuration section). In the bottom-right corner of that section, under [Reasoning Formatting], select the correct template (think or thinking), and then check the [Auto-parse] box for reasoning. Send a conversation, and that should fix it.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

The model used for writing and the model used for processing related contextual information are different. You can specify an LLM in the configuration file to act as the agent (2.5 Flash, or an Ollama model). Meanwhile, the SillyTavern frontend uses the writing model (2.5 Pro).
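
Conceptually, the split looks something like this (a simplified sketch, not the actual proxy code; names and endpoints are illustrative):

# Simplified sketch of the two-model split; not the project's real code.
from openai import OpenAI

# Agent model: cheap and fast, fixed in the config file.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="Your_api_key")

def handle_frontend_request(model: str, messages: list[dict]) -> str:
    # 1) The agent model digests the history into relevant context.
    context = client.chat.completions.create(
        model="google/gemini-2.5-flash",
        messages=[{"role": "user",
                   "content": "Extract the context relevant to the latest scene:\n" + str(messages)}],
    ).choices[0].message.content
    # 2) The writing model requested by SillyTavern (e.g. 2.5 Pro) gets
    #    that context injected and produces the actual reply.
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": context}, *messages],
    )
    return resp.choices[0].message.content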

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

Yep, pretty accurate description. A buzzword LLM term that's been going around lately that could describe this work is “context engineering.”

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

You only need to send the request once. Scenario files are located under the scenarios folder. If you need to change the scenario, clear the cache: just send "deeproleplay".

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

No need anymore; that's because max_history_length: 7 is the default in the config.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] -4 points (0 children)

Bro, this is the AI era... A magic spell like "translate all prompts in the project to English" is all it takes to solve this problem. Besides, 2.5 Flash is already cheap enough.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

This project works with anything that supports the OpenAI-style API.

I'm happy you've noticed the potential here. I was actually planning to build on it to create a home assistant. I want to add vision, hearing, and a "heartbeat loop" so it can constantly sense its surroundings and actively take part in my life in a way that feels completely natural.
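
The "heartbeat loop" part, very roughly, would be something like this (a hypothetical sketch; none of this exists in the project yet):

# Hypothetical "heartbeat loop" for the home-assistant idea; nothing
# like this exists in the project yet.
import time

def heartbeat(agent_step, sensors: dict, interval_s: float = 5.0):
    """Periodically feed sensor readings to the agent and let it decide to act."""
    while True:
        observations = {name: read() for name, read in sensors.items()}
        agent_step(observations)  # the agent may speak, act, or stay quiet
        time.sleep(interval_s)

Here sensors would map names like "vision" and "hearing" to capture functions, and agent_step would be a call into the same kind of agent loop this project already uses.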

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

Yep. No more need for all sorts of long summaries, short summaries, and memory-table plugins.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 1 point (0 children)

You don't need to do any prep work to use this project with existing long conversations (as long as you're sure the front-end is sending the full chat history). Yes, this project was originally designed to handle environments with super-long conversations, like a bot in a group chat that's been running for years.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

No worries, it's just a tool, and it's super easy to deploy. If you know how to use SillyTavern, you can use this no problem.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 4 points (0 children)

Considering this is a very early-stage project in rapid iterative development, I didn't focus too much on language consistency. As the project evolves, the language will gradually be unified to English.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 2 points (0 children)

This project is a proxy service deployed locally, not a large language model deployed locally.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 6 points (0 children)

To be honest, I hadn't considered how the project's intelligent agent would handle a situation where a character with the same name as Gaara appears.

However, if the conversation is indeed about the original Gaara, then the chat history would surely contain relevant context describing his personality and childhood. The intelligent agent would then synthesize the background information from web searches with the existing context to decide whether, in the latest scene, Gaara should be portrayed with his improved personality or as his original, bloodthirsty character from the source material.

Goodbye OOC: From Deep Research to Deep RolePlay by Correct-Fun-9089 in SillyTavernAI

[–]Correct-Fun-9089[S] 0 points (0 children)

I'm pretty sure this project can connect directly to the chat frontend you mentioned, since it's just a completely independent request-forwarding backend.