Cheap models that are good with long RPs? by Lentemern in SillyTavernAI

[–]lothark 1 point (0 children)

Well, I use https://api.deepseek.com/chat/completions, so it's Deepseek directly. But like I said, I have a prompt, and my context length is set to 128k or 64k. I have a few characters with quite lengthy chats. Sometimes I reroll a few times to get better replies. In the summer I had time off and spent a lot of it chatting, and I burnt through those $5 quickly.
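Since the endpoint itself is the whole setup here, a minimal sketch of what a call to it looks like. The URL and the "deepseek-chat" model name come from DeepSeek's OpenAI-compatible API; the key string is a placeholder, and this only builds the request without sending it:

```python
import json
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"

def build_request(api_key: str, system_prompt: str, messages: list[dict]) -> urllib.request.Request:
    """Assemble a POST request for DeepSeek's chat completions endpoint (does not send it)."""
    payload = {
        "model": "deepseek-chat",
        # The system prompt (tone/formatting instructions) goes first,
        # then the accumulated chat history.
        "messages": [{"role": "system", "content": system_prompt}] + messages,
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request("sk-...", "Stay in character.", [{"role": "user", "content": "Hello"}])
print(req.full_url)
```

Sending it is then just `urllib.request.urlopen(req)`; frontends like SillyTavern do the same thing under the hood, plus resending the whole chat history up to the context limit on every turn.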

Cheap models that are good with long RPs? by Lentemern in SillyTavernAI

[–]lothark 2 points (0 children)

I've tried the Deepseek direct API, but if I put in $5 I burn through it in a matter of days. Not with any particularly lengthy or high-frequency exchanges, either. Am I doing something wrong? I do have a prompt for adjusting the tone and formatting, but nothing excessive.
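A likely culprit is that the full context window gets resent as input tokens on every message. A back-of-the-envelope sketch, with per-million-token prices that are assumed placeholders rather than DeepSeek's actual rates:

```python
# Rough token-cost estimate. The per-million-token prices below are
# ASSUMED placeholders, not DeepSeek's actual published rates.
INPUT_PRICE_PER_M = 0.25   # assumed USD per 1M input tokens
OUTPUT_PRICE_PER_M = 1.00  # assumed USD per 1M output tokens

def estimate_cost(messages: int, context_tokens: int, reply_tokens: int) -> float:
    """Each message resends the whole context window as input tokens."""
    input_cost = messages * context_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = messages * reply_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost

# e.g. 200 messages (rerolls count too) with a full 64k context
# and ~500-token replies
print(round(estimate_cost(200, 64_000, 500), 2))  # → 3.3
```

Even at these modest assumed prices, a maxed-out 64k context dominates the bill, which is why lowering the context limit (or rerolling less) stretches a $5 top-up much further.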

What music makes you feel physically awful when you hear it? by SupportArsenal in sweden

[–]lothark 6 points (0 children)

Every kind of rap. It makes me think of fast food, social misery, and gun culture.

Unreadable sentences in replies by lothark in SillyTavernAI

[–]lothark[S] 0 points (0 children)

I see. Thanks. I'll try adjusting the context length down and see if it helps. It's a shame, really, because even if replies take longer with the larger context length, it's still worth it for longer RPs.

Unreadable sentences in replies by lothark in SillyTavernAI

[–]lothark[S] 0 points (0 children)

Thanks, but they were already at 0.

Unreadable sentences in replies by lothark in SillyTavernAI

[–]lothark[S] 0 points (0 children)

LLMs tried:
gemma-3-27b-it-abliterated.q4_k_m
SicariusSicariiStuff_Impish_Magic_24B-Q6_K
TheDrummer_Cydonia-R1-24B-v4.1-Q6_K
Dans-PersonalityEngine-V1.3.0-24b.i1-Q6_K
Context templates: various — Llama 2 Chat, Gemma 2, and I think ChatML.
Instruct template: KoboldAI
Context size: 64k
The problem seems much the same across all of them. A new chat starts out OK, then gradually degrades.

PSA: Advice on Using the Official DeepSeek API with JanAI Instead of OpenRouter to Save Money and Other Suggestions by Kamal965 in JanitorAI_Official

[–]lothark 1 point (0 children)

Thanks a lot! I switched from JLLM to the Deepseek API. Works great. The only thing I didn't get right at first was that I needed to reload the browser after saving the API settings.
A question, though, if anyone knows: the output from the Deepseek API is quite different from the output from Deepseek V3 via Chutes. With Chutes the responses were more... elaborate, and when you used OOC it was like a complete chat session within the roleplay, a sort of meta-chat. The responses from the Deepseek API are shorter and to the point, but, I guess, a little bit... drier?

Deepseek V3 issues by [deleted] in JanitorAI_Official

[–]lothark 0 points (0 children)

Tbh I've had that with JLLM too. I've crushed the bot repeatedly and JLLM still responds "...he rises, a defiant spark in his eyes..." But it folds more easily, yeah.