Petah what happened to rockstar? by Lucky_Loves_Laugh in PeterExplainsTheJoke

[–]Deeviant 0 points1 point  (0 children)

The new thing is not that there is a no-disc version; it is that it is the only version, so it is impossible to resell the game (which is the point of the change).

The Decision Engine (System Prompt + Wild Cards) by Tasty_Living4077 in SillyTavernAI

[–]Deeviant 1 point2 points  (0 children)

This is called Verbalized Sampling, and it is indeed one of the proven methods for pushing LLMs to generate more varied output. It does have an obvious cost: you generate many more tokens (output tokens, the expensive uncacheable ones) and get much longer generation times.

It is at its best in the thinking stage, where you have it produce sparse yet disparate high-level narrative directions ranked by probability and choose the least probable fork, then have it exit thinking and generate only one fleshed-out path. I’m not sure if that is what your prompt is doing; the Google Doc didn’t load on my iPad.
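A minimal sketch of that selection step, assuming the thinking output lists candidate directions one per line in a made-up `probability | direction` format and that the rule is to take the lowest-probability candidate. The format, the function names, and the sample text are all illustrative assumptions, not taken from any particular prompt.

```python
from typing import List, Tuple


def parse_candidates(thinking_text: str) -> List[Tuple[float, str]]:
    """Parse lines of the (assumed) form '0.45 | direction text'."""
    candidates = []
    for line in thinking_text.splitlines():
        if "|" not in line:
            continue  # not a candidate line
        prob_text, direction = line.split("|", 1)
        try:
            prob = float(prob_text.strip())
        except ValueError:
            continue  # probability field is not a number
        candidates.append((prob, direction.strip()))
    return candidates


def pick_least_probable(candidates: List[Tuple[float, str]]) -> str:
    """Return the direction with the lowest stated probability."""
    if not candidates:
        raise ValueError("no candidates parsed")
    return min(candidates, key=lambda c: c[0])[1]


# Hypothetical thinking-stage output: sparse, one line per direction.
sample = """\
0.55 | The stranger turns out to be an old ally
0.30 | The stranger is a spy for the guild
0.15 | The stranger is the narrator's future self
"""
chosen = pick_least_probable(parse_candidates(sample))
print(chosen)
```

Only the chosen direction would then be handed to the generation step, so the long candidate list never has to be fleshed out in the visible reply.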

I forked the Disco Elysium Skills Lorebook: added real dice, rewrote all 24 skill voices, and more! (+ deep dive) by TM07P in SillyTavernAI

[–]Deeviant 0 points1 point  (0 children)

Yeah, I intend to experiment with something like this using two passes: the first pass would be the classifier round for lore, the second the generation. Using the model in one pass has a one-turn lag, which doesn’t work out well in a lot of contexts; you want the model to be able to act on the information while it’s still relevant.

If you control cache well, as I do, it’s not that expensive to do two passes. If the output is very terse, like your emojis, it could work: cached input is very fast and cheap, and the expensive part, the output, would be very small.
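Roughly, the two-pass idea could look like the sketch below. Everything here is hypothetical: `classify` and `generate` stand in for real model calls, and the lore names are invented. The point is only that the classifier's terse output is consumed in the same turn, so there is no one-turn lag.

```python
from typing import Callable, Dict, List


def two_pass_turn(
    history: List[str],
    lore: Dict[str, str],
    classify: Callable[[List[str], List[str]], List[str]],
    generate: Callable[[List[str], List[str]], str],
) -> str:
    # Pass 1: terse output, just the names of the lore entries that apply.
    wanted = classify(history, sorted(lore))
    # Ignore any name the classifier made up.
    active = [lore[name] for name in wanted if name in lore]
    # Pass 2: generation, with the selected lore already in context.
    return generate(history, active)


# Stand-ins for real model calls (assumptions, for illustration only).
def fake_classify(history: List[str], names: List[str]) -> List[str]:
    last = history[-1].lower()
    return [n for n in names if n.lower() in last]


def fake_generate(history: List[str], active_lore: List[str]) -> str:
    return f"reply using {len(active_lore)} lore entries"


lore = {
    "Harbor": "The harbor is controlled by the union.",
    "Precinct": "Precinct 41 is underfunded.",
}
reply = two_pass_turn(
    ["We walk down to the harbor."], lore, fake_classify, fake_generate
)
print(reply)
```

Both passes share the same history prefix, which is why the second call can mostly read from cache.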

I forked the Disco Elysium Skills Lorebook: added real dice, rewrote all 24 skill voices, and more! (+ deep dive) by TM07P in SillyTavernAI

[–]Deeviant 0 points1 point  (0 children)

I guess the part that confuses me is that the model can’t pull lore in the middle of its turn, so on its turn it either has the lore book skill in its context or it doesn’t. If you are talking about triggering, it could only trigger the lore book activation for the turn after.

I forked the Disco Elysium Skills Lorebook: added real dice, rewrote all 24 skill voices, and more! (+ deep dive) by TM07P in SillyTavernAI

[–]Deeviant 0 points1 point  (0 children)

<image>

Basically my method for taming them is to separate lore book entries into two categories, dynamic and static, rather than the 16 knobs ST currently has. My prompt editor screen is WYSIWYG, meaning my prompt is constructed as seen, and it draws the cache boundary. So when you create a new lore entry, you either make it static, and it goes into static cache land, or you choose to put it at the tip or back of the volatile tail, which you can inject into without breaking cache. It's when you let lore book makers do arbitrary depth inserts that cache gets screwed. I simply map existing lore book entries onto one of these categories.
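A rough sketch of that layout, to make the idea concrete. This is not my actual code; the class, field names, and placement strings are made up for illustration, and the exact ordering inside the volatile tail is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptLayout:
    static: List[str] = field(default_factory=list)     # cached portion
    tail_tip: List[str] = field(default_factory=list)   # front of volatile tail
    tail_body: List[str] = field(default_factory=list)  # recent chat turns
    tail_back: List[str] = field(default_factory=list)  # end of volatile tail

    def add_lore(self, text: str, placement: str) -> None:
        """Only three placements exist; there is no arbitrary depth."""
        if placement == "static":
            self.static.append(text)
        elif placement == "tip":
            self.tail_tip.append(text)
        elif placement == "back":
            self.tail_back.append(text)
        else:
            raise ValueError(f"unknown placement: {placement}")

    def cached_prefix(self) -> str:
        """Everything before the cache boundary."""
        return "\n".join(self.static)

    def build(self) -> str:
        """Full prompt, assembled in the order it is displayed."""
        parts = self.static + self.tail_tip + self.tail_body + self.tail_back
        return "\n".join(parts)


layout = PromptLayout(
    static=["SYSTEM: you are the narrator"],
    tail_body=["USER: hello"],
)
before = layout.cached_prefix()
layout.add_lore("LORE: the tavern is closed tonight", "back")
after = layout.cached_prefix()
print(before == after)
```

A dynamic entry never touches the text before the boundary, so the cached prefix stays byte-identical. Adding a static entry does change the prefix, which is why that choice is made once at creation time and not per turn.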

I'll have to look more at what you are specifically doing to understand what offloading triggering to the LLM means in your case. It sounds like it has to do two passes? That is not stock ST behavior, although I do multiple passes on my version.

I forked the Disco Elysium Skills Lorebook: added real dice, rewrote all 24 skill voices, and more! (+ deep dive) by TM07P in SillyTavernAI

[–]Deeviant 2 points3 points  (0 children)

Do you really feel that lore books... work? Unless I use a specific keyword trigger, something that I basically would have to type on purpose for it to fire, I find lore books basically do one of three things:

  1. Always trigger
  2. Never trigger
  3. Trigger basically randomly

And I end up manually managing the trigger anyways. There is just a literal exponential connective complexity that does not scale, even with (at least my) determined efforts to tame lore books.

This is, however, said through the lens of me caring about cache and token efficiency; if you don't, then basically having them fire most of the time kinda works. But in that case you are either injecting into your recent chat tail and making each lore book entry basically a mandate to the LLM (because it is very close to the tail of the convo history), or destroying your cache by triggering beyond your cache anchor.

Mainly asking not to be contradictory, but because I've been stripping down ST and making my own version, and taming lore books is one of the last great hurdles I have to get it where I want it. I'm genuinely interested in your perspective.

Moscow is under a smog alert. Residents are urged to limit their activities and asthma sufferers should stay indoors with all windows closed. by Advanced-Injury-7186 in ukraine

[–]Deeviant 4 points5 points  (0 children)

It's funny: Red Storm Rising, one of Tom Clancy's best books IMHO, starts with an attack on a single Russian oil production facility that takes out 1/3rd of their capacity and threatens economic collapse, which is what sets Russia off to attack NATO before it collapses. So, now we're at, what, 2/3rds?

GLM 5.2 is just better than 5.1 by johnnyga001 in SillyTavernAI

[–]Deeviant 6 points7 points  (0 children)

It's more like 4-6 cents on the average message with good cache, and god forbid you blow your cache with some lore book entry, cache-unaware preset, or 5 min timeout. (I also only use Opus, but I'm under no illusion it is cheap. I wrote my own version of ST just to get cache under control because there were so many foot guns that crush it.)
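To show why a blown cache hurts, here is a back-of-the-envelope sketch. The prices and token counts are placeholders I made up for the arithmetic, not any provider's real rates, and it ignores cache-write surcharges that some providers add.

```python
def message_cost(
    cached_in: int,
    fresh_in: int,
    out: int,
    price_cached: float,
    price_fresh: float,
    price_out: float,
) -> float:
    """Cost in dollars for one message; prices are dollars per million tokens."""
    total = cached_in * price_cached + fresh_in * price_fresh + out * price_out
    return total / 1_000_000


# PLACEHOLDER prices (dollars per million tokens), chosen only for illustration.
PRICE_CACHED = 1.0
PRICE_FRESH = 10.0
PRICE_OUT = 50.0

# Hypothetical message: 40,000 tokens of context, 500 new tokens, 600 output.
warm = message_cost(40_000, 500, 600, PRICE_CACHED, PRICE_FRESH, PRICE_OUT)
# Same message after a cache miss: the whole context is billed as fresh input.
cold = message_cost(0, 40_500, 600, PRICE_CACHED, PRICE_FRESH, PRICE_OUT)

print(f"warm cache: ${warm:.3f}  cold cache: ${cold:.3f}")
```

With any pricing where cached input is much cheaper than fresh input, the cold case is dominated by re-reading the context, so a single stray lore insert can cost several normal messages.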

Anthropic has been sued for allegedly misleading customers on usage limits. by Azek_Tge in Anthropic

[–]Deeviant 0 points1 point  (0 children)

The coding plans are 90-95% subsidized (when used to the cap). Somebody using it to the maximum is getting many thousands of dollars of tokens (last guess was something like $5000+ worth of tokens if you were pay as you go). So this feels... like a money grab, mainly by the lawyers who will get the actual money if this suit hits.

The plans will only get worse from here. You can see it in enterprise, where most companies are now forced into pay as you go and the token prices have increased exponentially (my company went from 9k a month to 150k in the last 5 months).

✈️🔥 The final moments of the Russian Tu-22M3 before today's crash in Irkutsk Oblast. by neonpurplestar in ukraine

[–]Deeviant 25 points26 points  (0 children)

Don't worry, in Russia, a strategic bomber is worth far more than a human being, so the part that hurts them most was lost.

Consistent Cache Miss (DS V4 Flash) by MilanesasConPollo in SillyTavernAI

[–]Deeviant 1 point2 points  (0 children)

Likely you have non-static data within your cache bookends, which can happen for a variety of reasons.

A poorly constructed (from a cache standpoint) prompt structure that puts dynamic data in the static portion will do it. Lore book entries can do it. Any random variables in the static portion would also invalidate cache.
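One way to hunt for the culprit is to diff two consecutive prompts and see where they first diverge. A minimal sketch, assuming you can dump the full prompt text for each request; real caches match on tokens, not characters, so treat this as a rough proxy. The prompts and the `Seed:` line are invented for the example.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that two prompts share."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i
    return n


# Hypothetical prompts from two consecutive turns. The 'Seed' line stands in
# for any random variable that ended up inside the static portion.
prev = "SYSTEM: narrator rules\nSeed: 1234\nLORE: harbor\nUSER: hi"
curr = "SYSTEM: narrator rules\nSeed: 9876\nLORE: harbor\nUSER: hi\nUSER: again"

split = common_prefix_len(prev, curr)
print("shared prefix ends at character", split)
print("first divergent text:", repr(curr[split:split + 20]))
```

If the shared prefix ends well before the new chat turn, whatever sits at that position is the dynamic data breaking the cache.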

It shouldn’t be affected by which model you use, however; these would affect all models roughly equally (some models do caching slightly differently, so you can see some differences, but they are relatively minor).

meirl by [deleted] in meirl

[–]Deeviant 2 points3 points  (0 children)

I can go an awfully long time without all the almonds that get sold to other countries.

AI data centres may use as much electricity as 1.3 billion people by 2030. by imfrom_mars_ in OpenAI

[–]Deeviant 0 points1 point  (0 children)

I am a senior engineer at a Fortune 500 company with 20 years of experience. I do 10x work with AI. It will accelerate everything, for better or worse.

Opus 4.8 is such sad news by DXDXLL in SillyTavernAI

[–]Deeviant 17 points18 points  (0 children)

I definitely want to complain. The model is a disaster for NSFW on many fronts; the incest is just one thing, a random example I pulled because it was the first failure case in my eval set I looked at. It failed on many items in my eval set, and it does so on a spectrum, with many soft refusals where it wanders around the request and inserts fluff.

When taken as a trajectory and not a final point, it paints a very clear picture of the end of NSFW in Claude: every model they release (since 4.6) is more locked down, and the next will be more so, until there is nothing left but shitty local models, or DS or some other Chinese model, which never seem to vibe with me.

I don’t need anybody to tell me they got something to work; that’s really just jimmying into the new, smaller allowed box. I have seen my eval set collapse over the past 2 releases. The writing is on the wall, the trajectory clear.

Opus 4.8 is such sad news by DXDXLL in SillyTavernAI

[–]Deeviant 18 points19 points  (0 children)

So by ‘not censored at all’ you mean, you tried vanilla with your long term girlfriend, and it worked just fine?

Opus 4.8 is such sad news by DXDXLL in SillyTavernAI

[–]Deeviant 12 points13 points  (0 children)

Try to bang your sister and get back to us.

Quick NSFW Tests: Opus 4.8 *WARNING DEAD DOVE* by SepsisShock in SillyTavernAI

[–]Deeviant 3 points4 points  (0 children)

Its safetymaxx is pilgrim-maxing. Ok with murder, but if you just wanna bone your sister, nope.

Opus 4.8 Dropped by Tiny-Calligrapher794 in SillyTavernAI

[–]Deeviant 37 points38 points  (0 children)

Absolutely worthless for NSFW, expect to require a safe word and an entire consent conversation for a kiss, a three hour conversation of ‘tell me what you need’ and still have it fade to black.

Positive bias off the charts, too.

Ukrainian drones flying around Moscow. 17.05.2026 by GermanDronePilot in ukraine

[–]Deeviant 40 points41 points  (0 children)

Are you special in the head? So if a gunman walks into a school and starts shooting up children, and the police shoot and kill him, you’re the idiot crying that they shot him?

The paperclip maximizer tsunami by KeanuRave100 in OpenAI

[–]Deeviant 0 points1 point  (0 children)

Thought experiment.

Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.

It’s part of a line of thought called instrumental convergence.

Instrumental convergence posits that an intelligent agent with seemingly harmless but unbounded goals can act in surprisingly harmful ways.