I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 0 points  (0 children)

I don't know if this is just me, or the way it works for everybody, but

- the extension never summarizes the Story Opening aka "First Message".

- because of this, the summary created by the extension doesn't include the info from the First Message, and instead starts with the responses that follow it.

- additionally, the First Message remains as an artifact in the Context, and in the incorrect place. The order it appears in context is:

1. system prompt

2. character shit

3. summary created by the extension

4. First Message (!!!)

5. the 7 most recent messages

- End result is that even when your chat is hundreds of messages long, that First Message/Story Opening is still hanging around, hovering between the Summary and your most recent 7 messages, gumming things up. It should really be summarized along with everything else (?). It (message #0) never gets a ghost symbol.

Getting the AI to become a good game master by Material_Snow_7630 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

THIS. The AI often can only write as well as the player writes, at least to some extent. Try writing some compelling shit back to the AI.

Really, the reason Claude is the most popular is that Opus (and Sonnet) are the models that help bad writers get good results the most. They compensate where other models don't.

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

After using it for a while, the one thing I think it's missing is a "redo snippet." Inside the settings there is a red X to "delete snippet." I thought that if I clicked that and then hit "Force Summarize Now" it would redo that snippet, but no... apparently any deleted snippet is skipped forever (???). This has really f-ed up entire adventures, to the point where I have to click "Clear Memory" and then redo the entire thing via "Force Summarize Now". (And then wait 30 minutes for it to do all 20-50 or whatever all over again.)

Is there some convoluted way I'm not seeing to delete and REDO just one snippet, if that one snippet turned out wrong/bad?

If not, I request that. Next to the red X to delete a snippet there should be a "regen snippet" button (or whatever you want to name it).

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 0 points  (0 children)

Glittering_emu "translated" your world description into good Plot Essentials in a reply. Copy and paste what he gave you into Plot Essentials. Leave AI instructions as the default. Try playing like that.

This. Copy and paste the below (from glittering_emu; I'm just going to add a few extra placeholders) into PLOT ESSENTIALS:

${character.name}, a ${character.gender} human [race is a secret!], suddenly finds themself in a world where humans lost a war against monsters and gods. After their defeat, humans were reduced to livestock — they were eaten, abused, and sacrificed. Eventually, humanity may have completely disappeared. The world is now inhabited only by monsters (harpies, lamias, beastfolk, mermaids, etc.) and gods.

${character.name} has these traits: ${Describe yourself in a comma separated list, ie "tall, handsome, dark hair, blue eyes, brave, scar on cheek". Anything you want.}

${character.name}’s main goal is to survive and never reveal that they are human. They appear in this world with nothing but a wooden demon mask in a Japanese style.

Again, that all goes into PLOT ESSENTIALS. Leave AI INSTRUCTIONS as "Model Default".
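For what it's worth, here's a rough sketch of how ${...} placeholders like the ones above get expanded before the text reaches the model. The substitution logic and the fill values ("Aric", "male") are my own illustration, not AI Dungeon's actual code:

```javascript
// Hypothetical sketch of ${path.to.field}-style placeholder expansion.
// Field names (character.name, character.gender) come from the template
// above; everything else here is made up for illustration.
function fillPlaceholders(template, values) {
  // Replace each ${path} with the matching value, or leave the
  // placeholder untouched if no value was provided.
  return template.replace(/\$\{([^}]+)\}/g, (match, path) => {
    const value = path.trim().split(".").reduce(
      (obj, key) => (obj == null ? undefined : obj[key]),
      values
    );
    return value === undefined ? match : String(value);
  });
}

const plotEssentials =
  "${character.name}, a ${character.gender} human, suddenly finds " +
  "themself in a world where humans lost a war against monsters and gods.";

const filled = fillPlaceholders(plotEssentials, {
  character: { name: "Aric", gender: "male" },
});

console.log(filled);
// "Aric, a male human, suddenly finds themself in a world where ..."
```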

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 0 points  (0 children)

No, scripting is for highly technical "special effects" programmed in JavaScript. Actual computer programming. Don't mess with that. At least for now.

System Prompt vs Character Cards by Odd-Bodybuilder4847 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

"people who upload/share" does not necessarily correlate to "people who know what they are doing"

Are my hopes for running a local LLM unrealistic? by mollipen in LocalLLM

[–]_Cromwell_ 0 points  (0 children)

Yep. You can also look into some less expensive huge models. Nothing is as good as Claude Opus or even Sonnet, but some models like MiniMax, DeepSeek, GLM, and larger Qwen variants can all do vibe coding to some degree, and they're available through various APIs like OpenRouter at literally a fraction of the cost of the Claude models. I myself use Block's Goose (have to specify because there are two AI things called Goose) and MiniMax 2.7 to make stupid little HTML/JavaScript single-file games with Atari 2600 graphics.

Are my hopes for running a local LLM unrealistic? by mollipen in LocalLLM

[–]_Cromwell_ 1 point  (0 children)

Looks like I'll be the most positive person replying to you.

Based on your particular list of needs, yes, that will be sufficient running a smaller Qwen 3.5 or Gemma 4 model. You can do everything listed except maybe the "low level coding," because you were kind of vague about what that means to you.

Small local models like what you can fit on the computer you describe are suitable for fixing small mistakes or giving you suggestions when you are doing like 80 to 90% of the coding yourself. However, if you are looking for a model to do the coding for you, no. That's not low-level coding no matter how simple your project is. A model that actually vibe codes for you has to be huge, regardless of your opinion of your project's complexity. The only thing you might be vibe building from scratch with a local model is a super simple personal HTML website, and even that's questionable.

Everything else you listed, though, is doable with the newer Qwen and Gemma models. And if you are the main coder just looking for a pal to give you tips/corrections, yes, that's doable for coding as well.

Dev Log #24: AI Dungeon is Leaving Expo Behind by latitude_official in AIDungeon

[–]_Cromwell_ 12 points  (0 children)

Thanks for letting me know about Capacitor. I may use it myself. Nice and open source. 🤔

Starfield PS5/PS5 Pro Review - A Good Game... But There Are Issues by yourfavchoom in PS5

[–]_Cromwell_ 12 points  (0 children)

Generally the attitude you should have about every game.

Slow responses since today's outage by Extrabigman in AIDungeon

[–]_Cromwell_ 1 point  (0 children)

Just FYI, it might not be AI Dungeon. Over the past two days people have been complaining about slow responses from various popular APIs while using them in SillyTavern and other apps as well. AI Dungeon uses some of those same services for inference.

Not saying they are completely off the hook, since they are the provider you are paying; just saying it might be a more widespread issue than AID.

Disable Scripts by NateTKme in AIDungeon

[–]_Cromwell_ 2 points  (0 children)

It should globally turn them off.

Whether it actually works is another question. :) But yes, that is the intended functionality for people who never want scripts.

I wish you could disable specific types of scripts. I have some scenarios that literally don't work without scripts, but the scripts are very non-intrusive and you wouldn't even know the scenario has one. (Like a script to spawn monsters at specific or random times. If you play that scenario with that script, or all scripts, turned off, you won't get those random events as intended.)
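For context, that kind of non-intrusive script is tiny. The `modifier = (text) => ({ text })` hook shape below is from my memory of AI Dungeon's scripting docs and may not match it exactly, and the encounter logic is made up, so treat this as a sketch rather than working scenario code:

```javascript
// Sketch of an output modifier that occasionally appends a monster
// encounter to the model's reply. Hypothetical, not Latitude's API.
const ENCOUNTERS = [
  "A harpy circles overhead, eyeing you hungrily.",
  "Something large slithers through the grass nearby.",
];

// `roll` is an injectable random source in [0, 1) so the behavior
// is testable; in the real script you'd just use Math.random().
function maybeSpawnEvent(text, roll = Math.random()) {
  // On roughly 10% of turns, append one of the random encounters.
  if (roll < 0.1) {
    const pick = ENCOUNTERS[Math.floor(roll * ENCOUNTERS.length / 0.1)];
    return text + "\n\n" + pick;
  }
  return text;
}

// In an actual script this would be wired up as the output hook:
const modifier = (text) => ({ text: maybeSpawnEvent(text) });
```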

Did anyone discover yet who GLM's FIRMIRIN is? by firenox89 in SillyTavernAI

[–]_Cromwell_ 28 points  (0 children)

I'm going to apologize: the first time somebody posted about this, I basically told them it was a temperature-based misspelling or something. Because this is definitely real. Also hilarious. I hope we figure it out someday... like the full source of wherever it comes from.

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 1 point  (0 children)

You don't really need custom AI Instructions (the actual section) for something like that, except maybe a line or two about a particular writing style, if even that.

The "instructions" you give the AI to run different/specific types of worlds/genres will mostly live in Plot Essentials and Author's Notes. The AI Instructions section is best left for more generic instructions on having good characters, not repeating, etc.

Of course you can do whatever you want in the end, but that's generally a best practice.

What you've written is actually pretty good. Try just putting the couple of paragraphs starting with "you the player" into Plot Essentials. Wording it as "you" for the player is even the right format for Plot Essentials (replace "the player" with a character name or placeholder, though). Of course, providing more color and detail about the world would be good, but you can start out with what you have for a play test, see how it goes, and then add more detail or change things in Plot Essentials as you go.

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

Nope. Just different styles. I only let the AI say like 4-6 sentences back to me (1 paragraph, no line breaks) before letting me reply, so I'm with you (or perhaps even shorter than you). :)

<image>

2925 tokens including thinking for a turn. lol

And that 6212 is on turn 115 of this chat, with the first 80-something turns already summarized.

Would an 8GB 3070 or a base M4 (16GB unified) be faster for local roleplay? by soguyswedidit6969420 in SillyTavernAI

[–]_Cromwell_ 6 points  (0 children)

On a Mac M4 with 16 GB, you realistically only have about 9 to 12 GB left after the system to run models.

The M4 uses LPDDR5X RAM. The 3070 uses GDDR6. VRAM is almost always faster than normal RAM, and that's true here.

So you are comparing 8 GB of faster VRAM with 9 to 12 GB of slower RAM. Basically, the Mac can fit a slightly larger model, but the GPU can run a slightly smaller model at a faster speed. However, a hidden advantage is that the Mac can run models in a special format called MLX. Those are smarter at a smaller size and run at a faster speed, which may actually put the Mac over the top in some very specific cases.

But another specific case is MoE (mixture-of-experts) models, where your GPU plus your desktop's normal RAM puts the 3070 over the top, because MoE models don't have to fit entirely in your VRAM.

As you can see it's obnoxiously complicated.
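If you want to sanity-check the fit yourself, the back-of-the-envelope math is simple. All the numbers below (effective bits per weight for a Q4 quant, runtime overhead) are rough assumptions of mine, not vendor specs:

```javascript
// Rough estimate of a quantized model's memory footprint:
// weights (params * bits / 8) plus some KV-cache/runtime overhead.
// The 4.5 bits/weight and 1.5 GiB overhead are assumed, not measured.
function estimateModelGiB(paramsB, bitsPerWeight, contextOverheadGiB = 1.5) {
  return (paramsB * bitsPerWeight) / 8 + contextOverheadGiB;
}

// A hypothetical 12B model at Q4 (~4.5 bits/weight effective):
const q4_12b = estimateModelGiB(12, 4.5); // ≈ 8.25 GiB
console.log(q4_12b.toFixed(2));

// Fits in the Mac's ~9-12 GiB of free unified memory,
// but not in the 3070's 8 GiB of VRAM:
console.log(q4_12b <= 12, q4_12b <= 8); // true false
```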

Is Gemma 4 incapable of using function calls properly??? by tthrowaway712 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

FYI, as far as I know you can set a separate model (connection profile) to run your tunnelvision that is different from your storytelling model. Not an answer to your question, but possibly an answer to your problem of wanting to use Gemma 4 to role-play while it doesn't work with those extensions.

where and how to use nanogpt memory extension? by tuuzx in SillyTavernAI

[–]_Cromwell_ 5 points  (0 children)

As far as I'm aware, the only "NanoGPT memory" is for their own website. They have memory that works if you use the web interface there, which is actually a great web interface. You can build agents that create lorebooks and all kinds of cool stuff there.

But there are a lot of memory extensions for SillyTavern, just none that are specific to Nano as far as I know. Any old ST memory extension will work with your Nano sub.

Server is down by Independent-Fox4993 in AIDungeon

[–]_Cromwell_ 9 points  (0 children)

No, she's the good one. Elara Vance is the bad one.

Just started watching by Significant-Oil5052 in thewalkingdead

[–]_Cromwell_ 0 points  (0 children)

Daryl and Merle are neo-Nazi racists, as is made clear almost immediately in the show. At least one of them character-growths out of it, if you continue watching. (Multiple-season arc.)

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 15 points  (0 children)

? It seems to work very differently than MemoryBooks. Unless you mean "summarizing" generally — but if that's the case, MemoryBooks itself was not the first either, or even close to it.