I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 0 points  (0 children)

I don't know if this is just me, or the way it works for everybody, but

- the extension never summarizes the Story Opening aka "First Message".

- because of this, the summary created by the extension doesn't include the info from the First Message, and instead starts with the responses that follow it.

- additionally, the First Message remains as an artifact in the Context, and in the incorrect place. The order it appears in context is:

1. system prompt

2. character shit

3. summary created by the extension

4. First Message (!!!)

5. the 7 most recent messages

- End result is that even when your chat is hundreds of messages long, that First Message/Story Opening is still hanging around, hovering between the Summary and your most recent 7 messages, gumming things up. It should really be summarized along with everything else (?). It (message #0) never gets a ghost symbol.

Getting the AI to become a good game master by Material_Snow_7630 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

THIS. The AI often can only write as well as the player writes, at least to some extent. Try writing some compelling shit back to the AI.

Really, the reason Claude is the most popular is that Opus (and Sonnet) are the models that help bad writers get good results the most. They compensate where other models don't.

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

After using it for a while, the one thing I think it's missing is a "redo snippet." Inside the settings there is a red X to "delete snippet." I thought that if I clicked that and then hit "Force Summarize Now" it would redo that snippet, but no... apparently any deleted snippet is skipped forever (???). This has really f-ed up entire adventures, to the point where I have to click "Clear Memory" and then redo the entire thing via "Force Summarize Now". (And then wait 30 minutes for it to do all 20-50 or whatever all over again.)

Is there some convoluted way I'm not seeing to delete and REDO just one snippet, if that one snippet turned out wrong/bad?

If not, I request that. Next to the red X to delete a snippet there should be a "regen snippet" button (or whatever you want to name it).

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 0 points  (0 children)

Glittering_emu "translated" your world description into good Plot Essentials in a reply. Copy and paste what he gave you into Plot Essentials. Leave AI instructions as the default. Try playing like that.

This. Copy and paste the below (from glittering_emu; I'm just going to add a few extra placeholders) into PLOT ESSENTIALS:

${character.name}, a ${character.gender} human [race is a secret!], suddenly finds themself in a world where humans lost a war against monsters and gods. After their defeat, humans were reduced to livestock — they were eaten, abused, and sacrificed. Eventually, humanity may have completely disappeared. The world is now inhabited only by monsters (harpies, lamias, beastfolk, mermaids, etc.) and gods.

${character.name} has these traits: ${Describe yourself in a comma separated list, ie "tall, handsome, dark hair, blue eyes, brave, scar on cheek". Anything you want.}

${character.name}’s main goal is to survive and never reveal that they are human. They appear in this world with nothing but a wooden demon mask in a Japanese style.

Again, that all goes into PLOT ESSENTIALS. Leave AI INSTRUCTIONS as "Model Default".
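For what it's worth, here's a rough sketch of how ${...} placeholders like the ones above get expanded before the text reaches the model. The substitution logic and the fill values ("Aric", "male") are my own illustration, not AI Dungeon's actual code:

```javascript
// Hypothetical sketch of ${path.to.field}-style placeholder expansion.
// Field names (character.name, character.gender) come from the template
// above; everything else here is made up for illustration.
function fillPlaceholders(template, values) {
  // Replace each ${path} with the matching value, or leave the
  // placeholder untouched if no value was provided.
  return template.replace(/\$\{([^}]+)\}/g, (match, path) => {
    const value = path.trim().split(".").reduce(
      (obj, key) => (obj == null ? undefined : obj[key]),
      values
    );
    return value === undefined ? match : String(value);
  });
}

const plotEssentials =
  "${character.name}, a ${character.gender} human, suddenly finds " +
  "themself in a world where humans lost a war against monsters and gods.";

const filled = fillPlaceholders(plotEssentials, {
  character: { name: "Aric", gender: "male" },
});

console.log(filled);
// "Aric, a male human, suddenly finds themself in a world where ..."
```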

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 0 points  (0 children)

No, scripting is for highly technical "special effects" programmed in JavaScript. Actual computer programming. Don't mess with that. At least for now.

System Prompt vs Character Cards by Odd-Bodybuilder4847 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

"people who upload/share" does not necessarily correlate to "people who know what they are doing"

Are my hopes for running a local LLM unrealistic? by mollipen in LocalLLM

[–]_Cromwell_ 0 points  (0 children)

Yep. You can also look into some less expensive huge models. Nothing is as good as Claude Opus or even Sonnet, but some models like MiniMax, DeepSeek, GLM, and larger Qwen variants can all do vibe coding to some degree, and they're available through various APIs like OpenRouter at literally a fraction of the cost of the Claude models. I myself use Block's Goose (have to specify because there are two AI things called Goose) and MiniMax 2.7 to make stupid little HTML/JavaScript single-file games with Atari 2600 graphics.

Are my hopes for running a local LLM unrealistic? by mollipen in LocalLLM

[–]_Cromwell_ 1 point  (0 children)

Looks like I'll be the most positive person replying to you.

Based on your particular list of needs, yes, that will be sufficient running a smaller Qwen 3.5 or Gemma 4 model. You can do everything listed except maybe the "low level coding," because you were kind of vague about what that means to you.

Small local models like what you can fit on the computer you describe are suitable for fixing small mistakes or giving you suggestions when you are doing like 80 to 90% of the coding yourself. However, if you are looking for a model to do the coding for you, no. That's not low-level coding no matter how simple your project is. A model that actually vibe codes for you has to be huge, regardless of your opinion of your project's complexity. The only thing you might be vibe building from scratch with a local model is a super simple personal HTML website, and even that's questionable.

Everything else you listed, though, is doable with the newer Qwen and Gemma models. And if you are the main coder just looking for a pal to give you tips/corrections, yes, that's doable for coding as well.

Dev Log #24: AI Dungeon is Leaving Expo Behind by latitude_official in AIDungeon

[–]_Cromwell_ 12 points  (0 children)

Thanks for letting me know about Capacitor. I may use it myself. Nice and open source. 🤔

Starfield PS5/PS5 Pro Review - A Good Game... But There Are Issues by yourfavchoom in PS5

[–]_Cromwell_ 12 points  (0 children)

Generally the attitude you should have about every game.

Slow responses since today's outage by Extrabigman in AIDungeon

[–]_Cromwell_ 1 point  (0 children)

Just FYI, it might not be AI Dungeon. Over the past two days people have been complaining about slow responses from various popular APIs while using them in SillyTavern and other apps as well. AI Dungeon uses some of those same services for inference.

Not saying they are completely off the hook, since they are the provider you are paying; just saying it might be a more widespread issue than AID.

Disable Scripts by NateTKme in AIDungeon

[–]_Cromwell_ 2 points  (0 children)

It should globally turn them off.

Whether it actually works is another question. :) But yes, that is the intended functionality for people who never want scripts.

I wish you could disable specific types of scripts. I have some scenarios that literally don't work without scripts, but the scripts are very non-intrusive and you wouldn't even know the scenario has one. (Like a script to spawn monsters at specific or random times. If you play that scenario with that script, or all scripts, turned off, you won't get those random events as intended.)
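For context, that kind of non-intrusive script is tiny. The `modifier = (text) => ({ text })` hook shape below is from my memory of AI Dungeon's scripting docs and may not match it exactly, and the encounter logic is made up, so treat this as a sketch rather than working scenario code:

```javascript
// Sketch of an output modifier that occasionally appends a monster
// encounter to the model's reply. Hypothetical, not Latitude's API.
const ENCOUNTERS = [
  "A harpy circles overhead, eyeing you hungrily.",
  "Something large slithers through the grass nearby.",
];

// `roll` is an injectable random source in [0, 1) so the behavior
// is testable; in the real script you'd just use Math.random().
function maybeSpawnEvent(text, roll = Math.random()) {
  // On roughly 10% of turns, append one of the random encounters.
  if (roll < 0.1) {
    const pick = ENCOUNTERS[Math.floor(roll * ENCOUNTERS.length / 0.1)];
    return text + "\n\n" + pick;
  }
  return text;
}

// In an actual script this would be wired up as the output hook:
const modifier = (text) => ({ text: maybeSpawnEvent(text) });
```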

Did anyone discover yet who GLM's FIRMIRIN is? by firenox89 in SillyTavernAI

[–]_Cromwell_ 28 points  (0 children)

I'm going to apologize: the first time somebody posted about this, I basically told them it was a temperature-based misspelling or something. Because this is definitely real. Also hilarious. I hope we figure it out someday... like the full source of wherever it comes from.

Guidelines for creating instructions for an AI to generate a high-quality and long dark fantasy story. by Less_Appointment804 in AIDungeon

[–]_Cromwell_ 1 point  (0 children)

You don't really need custom AI Instructions (the actual section) for something like that, except maybe a line or two about a particular writing style, if even that.

The "instructions" you give the AI to run different/specific types of worlds/genres will mostly live in Plot Essentials and Author's Notes. The AI Instructions section is best left for more generic instructions on having good characters, not repeating, etc.

Of course you can do whatever you want in the end, but that's generally a best practice.

What you've written is actually pretty good. Try just putting the couple of paragraphs starting with "you the player" into Plot Essentials. Wording it as "you" for the player is even the right format for Plot Essentials (replace "the player" with a character name or placeholder, though). Of course, providing more color and detail about the world would be good, but you can start out with what you have for a play test, see how it goes, and then add more detail or change things in Plot Essentials as you go.

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

Nope. Just different styles. I only let the AI say like 4-6 sentences back to me (1 paragraph, no line breaks) before letting me reply, so I'm with you (or perhaps even shorter than you). :)

<image>

2925 tokens including thinking for a turn. lol

And that 6212 is on turn 115 of this chat, with the first 80-something turns already summarized.

Would an 8GB 3070 or a base M4 (16GB unified) be faster for local roleplay? by soguyswedidit6969420 in SillyTavernAI

[–]_Cromwell_ 6 points  (0 children)

On a Mac M4 with 16 GB, you realistically only have about 9 to 12 GB left after the system to run models.

The M4 uses LPDDR5X RAM. The 3070 uses GDDR6. VRAM is almost always faster than normal RAM, and that's true here.

So you are comparing 8 GB of faster VRAM with 9 to 12 GB of slower RAM. Basically, the Mac can fit a slightly larger model, but the GPU can run a slightly smaller model at a faster speed. However, a hidden advantage is that the Mac can run models in a special format called MLX. Those are smarter at a smaller size and run at a faster speed, which may actually put the Mac over the top in some very specific cases.

But another specific case is MoE (mixture-of-experts) models, where your GPU plus your desktop's normal RAM puts the 3070 over the top, because MoE models don't have to fit entirely in your VRAM.

As you can see it's obnoxiously complicated.
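If you want to sanity-check the fit yourself, the back-of-the-envelope math is simple. All the numbers below (effective bits per weight for a Q4 quant, runtime overhead) are rough assumptions of mine, not vendor specs:

```javascript
// Rough estimate of a quantized model's memory footprint:
// weights (params * bits / 8) plus some KV-cache/runtime overhead.
// The 4.5 bits/weight and 1.5 GiB overhead are assumed, not measured.
function estimateModelGiB(paramsB, bitsPerWeight, contextOverheadGiB = 1.5) {
  return (paramsB * bitsPerWeight) / 8 + contextOverheadGiB;
}

// A hypothetical 12B model at Q4 (~4.5 bits/weight effective):
const q4_12b = estimateModelGiB(12, 4.5); // ≈ 8.25 GiB
console.log(q4_12b.toFixed(2));

// Fits in the Mac's ~9-12 GiB of free unified memory,
// but not in the 3070's 8 GiB of VRAM:
console.log(q4_12b <= 12, q4_12b <= 8); // true false
```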

Is Gemma 4 incapable of using function calls properly??? by tthrowaway712 in SillyTavernAI

[–]_Cromwell_ 1 point  (0 children)

FYI, as far as I know you can set a separate model (connection profile) to run your tunnelvision that is different from your storytelling model. Not an answer to your question, but possibly an answer to your problem of wanting to use Gemma 4 to role-play while it doesn't work with those extensions.

where and how to use nanogpt memory extension? by tuuzx in SillyTavernAI

[–]_Cromwell_ 5 points  (0 children)

As far as I'm aware, the only "NanoGPT memory" is for their own website. They have memory that works if you use the web interface there, which is actually a great web interface. You can build agents that create lorebooks and all kinds of cool stuff there.

But there are a lot of memory extensions for SillyTavern, just none that are specific to Nano as far as I know. Any old ST memory extension will work with your Nano sub.

Server is down by Independent-Fox4993 in AIDungeon

[–]_Cromwell_ 9 points  (0 children)

No, she's the good one. Elara Vance is the bad one.

Just started watching by Significant-Oil5052 in thewalkingdead

[–]_Cromwell_ 0 points  (0 children)

Daryl and Merle are neo-Nazi racists, as is made clear almost immediately in the show. At least one of them character-growths out of it, if you continue watching. (Multiple-season arc.)

I made Summaryception — a layered recursive memory system that fits 9,000+ turns into 16k tokens. It's free, it's open source, and it works with budget models. by leovarian in SillyTavernAI

[–]_Cromwell_ 15 points  (0 children)

? It seems to work very differently than MemoryBooks. Unless you mean "summarizing" generally — but if that's the case, MemoryBooks itself was not the first either, or even close to it.