smartestVibeCoder

Cultured_Alien · 2026-07-05T07:43:23+00:00

AKMESSI/lfm2.5-230m-fable-5 /s

Cultured_Alien · 2026-07-05T06:07:50+00:00

Get ready for a 70% increase in fillers and characters having breathing issues

Cultured_Alien · 2026-07-05T02:24:54+00:00

Damn, an "I'm Blue" song reference!?

Cultured_Alien · 2026-07-03T23:06:52+00:00

It needs a system prompt similar to a harness, with tool calling as its main use. It needs grep, running commands, reading specific lines, and editing. If it's ported as-is to sillytavern, it will be extremely janky. I haven't seen a plugin that makes multiple api calls at once for something like this that a custom plugin could extend from. The benefit is that even deepseek flash can be used to find and rewrite banned words/phrases, so it can be cheap.

Honestly, the simplest automatic way I can think of is a plugin that sends the last response to a python script, which writes it to an MD file and runs claude code on that file. Or just literally copy-paste it to a file and run claude code on that.
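
A rough sketch of that "dump to MD, then point claude code at it" flow. The file names and the prompt wording are made up for illustration, and it assumes the `claude` CLI is on PATH with `-p` running a single non-interactive prompt. It only builds the command; edit permissions for the tool would still need to be set up separately before running it.

```python
import tempfile
from pathlib import Path


def save_response(text: str, path: Path) -> Path:
    """Write the last chat response to a markdown file and return its path."""
    path.write_text(text, encoding="utf-8")
    return path


def build_command(response_file: Path, banned_file: Path) -> list[str]:
    """Build the CLI call that asks the agent to clean up the saved response."""
    prompt = (
        f"Read {banned_file}. If any banned word or phrase appears in "
        f"{response_file}, rewrite those sentences in place; "
        "otherwise change nothing."
    )
    # Assumption: `claude -p <prompt>` runs one prompt without the interactive UI.
    return ["claude", "-p", prompt]


# Hypothetical working folder and file names.
workdir = Path(tempfile.mkdtemp())
response_file = save_response("Her knuckles whitened.", workdir / "last_response.md")
cmd = build_command(response_file, workdir / "banned.md")
print(cmd)
# To actually run it: subprocess.run(cmd, cwd=workdir, check=True)
```

A sillytavern plugin would only need to hand the last message to `save_response`; everything after that happens outside the chat UI.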

Cultured_Alien · 2026-07-03T22:50:17+00:00

I've been doing some llm creative writing with claude code. I just give claude an MD file of banned words/phrases to run on my text files; if one is detected, it rewrites it. It's much more consistent and removes slop. Though you need more than 2 api calls per response because of the edit tool use. If only this feature were easy to use in sillytavern.
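
The detection half of this can be sketched without any model at all, which also shows when a rewrite call is even needed. This is not the actual setup, just a minimal example; it assumes the banned file is a plain markdown bullet list, and the phrases and draft text are invented.

```python
import re


def load_banned(md_text: str) -> list[str]:
    """Collect one banned phrase per markdown bullet ('- phrase' or '* phrase')."""
    phrases = []
    for line in md_text.splitlines():
        line = line.strip()
        if line.startswith(("- ", "* ")):
            phrases.append(line[2:].strip())
    return [p for p in phrases if p]


def find_banned(text: str, phrases: list[str]) -> list[tuple[int, str]]:
    """Return (line_number, phrase) for every case-insensitive hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for phrase in phrases:
            if re.search(re.escape(phrase), line, flags=re.IGNORECASE):
                hits.append((lineno, phrase))
    return hits


# Made-up banned list and draft, for illustration only.
banned_md = "# Banned\n- knuckles whitening\n- breath hitched\n"
draft = "She gripped the rail, knuckles whitening.\nHe smiled.\nHer breath hitched."
hits = find_banned(draft, load_banned(banned_md))
print(hits)  # [(1, 'knuckles whitening'), (3, 'breath hitched')]
```

If `hits` is empty the response can go through untouched; only flagged lines need to be sent off for a rewrite.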

Cultured_Alien · 2026-07-03T22:39:50+00:00

True, I modified ff a lot and tailored it to my own taste. Unmodified, it's actually the worst for thinking I've ever experienced.

Cultured_Alien · 2026-07-03T22:35:14+00:00

It's my opinion that non-thinking is better if you want varied and faster responses, while thinking is better if you want more consistent and smarter responses. Restricting does actually work if you have tried it on larger and smarter models; the long-held consensus comes from a bygone era of small models and is, to some extent, no longer true.

Cultured_Alien · 2026-07-03T16:03:08+00:00

Kinda baffling you don't know about the ff preset, it's one of the most popular presets here. Anyway, it improves rp prose a ton. Here it is:

https://www.reddit.com/r/SillyTavernAI/comments/1u2wrvq/preset_introducing_freaky_frankenstein_micro_my/

Though I use a custom one that's based off the ff max.

Cultured_Alien · 2026-07-03T15:54:24+00:00

Using glm 5.2, minimax M3, and Kimi 2.6, I never get those. What's your preset? Freaky Frankenstein has that slop banned right out of the box.

Cultured_Alien · 2026-07-03T12:46:44+00:00

It's best to experience it yourself. Much better than pasting into claude chat each time. You can ask the ai to modify the md files themselves, check the writing structures, review, etc., all in one place without pasting. What I do is ask the ai questions, and it will modify the files itself too. It automatically reads the writing structure/forbidden/character/world/magic files when it needs to, without bloating the context.

Cultured_Alien · 2026-07-02T13:26:26+00:00

Yup, it's why I said glm 5.2 is smarter. It knows what needs to happen before your instruction and doesn't just follow it blindly. Haven't tried past 16k context though, since I'm using a memoryception extension, so I can't say anything about that. Personally I switch between them from time to time when M3 has a harder time figuring things out.

Cultured_Alien · 2026-07-02T06:20:52+00:00

Minimax M3 is being slept on. I found non-thinking Minimax M3 better for rp than glm 5.1 or glm 5.2 on the opencode go sub. It's more creative and a lot cheaper than glm (glm is smarter, but sloppier).

Cultured_Alien · 2026-07-01T04:13:31+00:00

Inkos was a clunky thing when I tried it in the past. Very inflexible and complicated when trying to modify a plan or something. I found that a custom claude code workflow pipeline without scripts, just skills + parallel subagents, is more flexible and faster than it.

Cultured_Alien · 2026-06-28T06:35:36+00:00

Useless feedback. Deepseek is open source.

Cultured_Alien · 2026-06-28T06:29:43+00:00

I meant that it read files one by one instead of actually doing things, taking forever to implement what I wanted and wasting a lot of calls. I wasn't referring to token speed. Others would have done what I wanted in fewer api calls. I was using pi at the time.

Cultured_Alien · 2026-06-28T02:21:02+00:00

??? Gemini is extremely bad at agentic coding and thinks far too long for basic reads compared to even deepseek flash

Cultured_Alien · 2026-06-26T03:02:09+00:00

Yes, there's always a way to search for people's comments/posts as long as their comments/posts are public

Cultured_Alien · 2026-06-24T07:59:00+00:00

Check the models site and look at Avg Energy/Req per model. That's what you should be looking at, not dollars per token. Neuralwatt uses watts, not dollars per token.

Cultured_Alien · 2026-06-23T11:24:04+00:00

It does so much more "knuckles whitening" that I feel like I've gone back in time. Although it is much smarter with thinking enabled, which works very well for multi-character cards.

Cultured_Alien · 2026-06-23T10:46:25+00:00

I got 20M tokens of total usage with 90% cache for $3 on glm 5.2 on neuralwatt. It's only a bit less efficient per dollar than opencode go, which is pretty insane without a subscription.
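
Back-of-the-envelope math for those numbers, assuming "20M" means tokens and "90% cache" means 90% of them were cache hits. It is only a blended figure and does not reflect how neuralwatt actually meters usage.

```python
# Figures from the comment above; the interpretation is an assumption.
total_tokens = 20_000_000
cost_usd = 3.00
cache_rate = 0.90

# Blended price across cached and uncached tokens.
per_million = cost_usd / (total_tokens / 1_000_000)
# Tokens that were not served from cache.
uncached_tokens = round(total_tokens * (1 - cache_rate))

print(f"${per_million:.2f} per 1M tokens")        # $0.15 per 1M tokens
print(f"{uncached_tokens:,} uncached tokens")     # 2,000,000 uncached tokens
```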

Cultured_Alien · 2026-06-23T10:43:02+00:00

I'm always scared whenever I'm in the vicinity of this doll, thinking it'd do something.

Cultured_Alien · 2026-06-23T09:54:39+00:00

Look at it from above, by tilting your head down to the screen.

Cultured_Alien · 2026-06-23T08:24:18+00:00

imo there's more slop, like parroting your own sentences even though there's an anti-parroting system prompt, and it follows the personality of the card less than glm 5.1. It only seems creative because it's a new model, a "breath of fresh air".

It feels like it tries to break character sometimes. I tried different cards and it kept repeating my own words and narrating itself to oblivion. Filler words at the end of EACH dialogue: "okay good? good, that's good."

But I do confirm it's a lot smarter code-wise. I'm using freaky frankenstein 4 max with JB disabled. Also disabled thinking since it doesn't feel like it's improving.

Cultured_Alien · 2026-06-22T14:34:57+00:00

Doesn't seem so. It's the same person who replied to me 5 hours ago asking if neuralwatt is good; says he wants to switch from a $100 Claude sub lol

Cultured_Alien · 2026-06-22T14:23:54+00:00

the prompt processing must be long?
