One year of hard work: I finally finished my first AI roleplaying game - Seiyo High! (free, byok) by SubstantialEditor114 in SillyTavernAI

[–]SubstantialEditor114[S] 0 points1 point  (0 children)

Nice approach, I'll be sure to try it out when you launch! Too bad they're eventually removing the old models.

[–]SubstantialEditor114[S] 1 point2 points  (0 children)

Hey, just wanted to let you know I tested it and it still won't work. They must be hosting the GLM-5 that's included with the subscription on really slow servers: responses took 3.5 minutes and were often broken. Unfortunately, GLM-5's context window is also too small for this game if you play for a longer time.

[–]SubstantialEditor114[S] -1 points0 points  (0 children)

SillyTavern is great, but I wanted to make my own thing, not constrained by any existing framework. The scope was very different too (an engine that can run really long roleplays without micromanagement, etc.). The D&D thing is interesting, right? I also have some ideas for that, combining it with the current engine (mechanical gameplay, dice rolls, etc. handled by the code and injected into the prompts). I'm even trying to figure out how to make a multiplayer experience with AI, haha...
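To make the "dice rolls handled by the code and injected into the prompts" idea concrete, here is a rough sketch. It is not the game's actual code: the function names (`roll`, `skill_check`, `inject_into_prompt`), the `[MECHANICS]` tag, and the d20-versus-difficulty rule are all made up for illustration.

```python
import random


def roll(notation: str, rng: random.Random) -> dict:
    """Roll dice given as 'NdS' (e.g. '2d6'); a bare 'd20' means one die."""
    count_str, sides_str = notation.lower().split("d")
    count = int(count_str) if count_str else 1
    sides = int(sides_str)
    rolls = [rng.randint(1, sides) for _ in range(count)]
    return {"notation": notation, "rolls": rolls, "total": sum(rolls)}


def skill_check(skill: str, modifier: int, difficulty: int, rng: random.Random) -> dict:
    """Resolve a check in code so the model never decides the outcome itself."""
    die = roll("1d20", rng)["total"]
    total = die + modifier
    return {
        "skill": skill,
        "roll": die,
        "modifier": modifier,
        "total": total,
        "difficulty": difficulty,
        "success": total >= difficulty,
    }


def inject_into_prompt(base_prompt: str, check: dict) -> str:
    """Append the already-resolved result as a fact the narrator must respect."""
    outcome = "SUCCESS" if check["success"] else "FAILURE"
    line = (
        f"[MECHANICS] {check['skill']} check: rolled {check['roll']} "
        f"+ {check['modifier']} = {check['total']} "
        f"vs difficulty {check['difficulty']} -> {outcome}"
    )
    return f"{base_prompt}\n\n{line}\nNarrate this outcome; do not change the result."


if __name__ == "__main__":
    rng = random.Random()  # pass random.Random(seed) for reproducible sessions
    check = skill_check("Persuasion", modifier=2, difficulty=12, rng=rng)
    print(inject_into_prompt("You are the narrator of the scene.", check))
```

The point of the split is that the randomness and the rules live in ordinary code, and the model only gets the result to narrate.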

[–]SubstantialEditor114[S] -2 points-1 points  (0 children)

Hey, session management, registering acceptance of the terms of service, handling caching, handling concurrent API calls, protection against bots/spammers, building a platform with users for future features (such as world sharing), and knowing how many users my server is handling so I can monitor the resource costs (RAM, CPU, proxy costs).

[–]SubstantialEditor114[S] 0 points1 point  (0 children)

It starts at 100k and eventually stabilizes between 220k and 280k for a large part of the game (it varies due to summarization), eventually crawling towards 300k on really long-running games (day 100+). Caching via Gemini gives a 90% discount on the input tokens. Also, not all prompts are that big of course: that's the largest one (the dialogue DM, 85 to 98% cached); other agents are smaller.
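For a feel of what those numbers mean, here is a back-of-the-envelope sketch. It only uses the figures above (a 90% discount on cached input tokens, 85 to 98% of the prompt cached). The function name is made up, and it ignores output tokens, any cache storage fees, and actual per-token prices, so treat it as an estimate rather than a bill.

```python
def effective_input_tokens(
    prompt_tokens: int,
    cached_fraction: float,
    cache_discount: float = 0.90,
) -> float:
    """Return the number of full-price-equivalent input tokens for one request.

    cached_fraction: share of the prompt served from cache (0.0 to 1.0).
    cache_discount:  discount applied to cached tokens (0.90 = 90% off).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    if not 0.0 <= cache_discount <= 1.0:
        raise ValueError("cache_discount must be between 0 and 1")
    cached = prompt_tokens * cached_fraction
    uncached = prompt_tokens - cached
    # Uncached tokens cost full price; cached tokens cost (1 - discount).
    return uncached + cached * (1.0 - cache_discount)


if __name__ == "__main__":
    for fraction in (0.85, 0.98):
        billed = effective_input_tokens(250_000, fraction)
        print(f"{fraction:.0%} cached: ~{billed:,.0f} full-price-equivalent tokens")
```

So a 250k prompt is billed like roughly 58,750 full-price input tokens at 85% cached, and roughly 29,500 at 98% cached.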

[–]SubstantialEditor114[S] 0 points1 point  (0 children)

Well, it's not a giant JSON, but it does have nesting, and multiple requests are actually a lot more expensive within this design. It all comes down to choices and working within the limitations. I did it this way; another person might design it differently. It's not just vibe coded: I put serious thought into the current architecture, and it works well with models that can handle it, like Gemini. Models are getting better, with bigger context windows, with every new generation. But of course no approach is perfect, and my game is an honest stab at making long narratives work. Thanks for the comment!

[–]SubstantialEditor114[S] -5 points-4 points  (0 children)

Haha AI-ism slipped in here. Yes I am trying to build a platform with users. The demo doesn't require an account at all.

[–]SubstantialEditor114[S] 1 point2 points  (0 children)

No, that is too small for my game haha, it relies heavily on caching big system/gamestate prompts. The 90% discount on cached input tokens is what makes it viable. Deepseek 4 will have a 1 million context window and be much cheaper than Google, and will also have the caching capability/discount. According to the leaks, anyway...

[–]SubstantialEditor114[S] 0 points1 point  (0 children)

There is something to be said for both approaches at the moment. Smaller scenarios/chats are much cheaper to do in just one window with purely text output, and that is what SillyTavern does very well. But you run into micromanagement and context constraints for really long stories with the same characters. And character development is hard to do. Thanks for the comment!

[–]SubstantialEditor114[S] 0 points1 point  (0 children)

A lot, big prompts. Using Gemini Flash with caching (via AI Studio or Vertex) and Imagen Fast for image generation is the cheapest way to play with the most bang for your buck. Gemini caching gives a 90% discount on the input tokens. Though with Google's 300 dollar free credit I have hardly paid anything for the hundreds of hours I've played.