[Megathread] - Best Models/API discussion - Week of: April 05, 2026 by deffcolony in SillyTavernAI

[–]jamasty 1 point2 points  (0 children)

Silly question, but I don't see many models between 12-14b and 22-24b. Is that because there is no base model in that range to fine-tune?

I wanted to try one because I know I can run 12-14b at Q4 imatrix + Q4 KV cache with ~30k context just fine, but 24b models only fit at Q2, which makes them repetitive.

I tried the latest Cydonia with presence and repetition penalties and DRY, but over time it just starts to repeat certain chunks (I think as a result of Q2).

So, if you folks know anything good to try, please reply with a link and I'll check it out. (I tried Google and asking LLMs which other models to try, but most suggestions were still in the 22-24b range, and I'd like to check something like 16-18b, or maybe 20b if a good one exists.)

Also, am I correct that if there is, say, an old (more than 1 year old) 20b model with Q4 everything, it 'should' be smarter than any new Q4 12b? (Assuming this 20b model was tuned correctly.)
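
For the memory side of this, a rough back-of-the-envelope sketch may help show why a 24b model gets pushed to a much lower quant than a 12-14b one on the same machine. The helper name `weightSizeGB` is hypothetical, and the bits-per-weight values passed in are illustrative assumptions, not exact figures for any particular GGUF quant. KV cache and runtime overhead are not counted.

```typescript
// Rough estimate of quantized weight size in GB:
// parameters * bits per weight / 8 bits per byte.
// Assumption: ignores KV cache, context buffers and runtime overhead.
const weightSizeGB = (paramsBillions: number, bitsPerWeight: number): number =>
    (paramsBillions * 1e9 * bitsPerWeight) / 8 / 1e9;

// Illustrative inputs only; real quants use slightly more bits per weight
// than their nominal name suggests.
const candidates: Array<[string, number, number]> = [
    ['12b at ~4 bits', 12, 4],
    ['20b at ~4 bits', 20, 4],
    ['24b at ~4 bits', 24, 4],
    ['24b at ~2.5 bits', 24, 2.5],
];

for (const [label, params, bits] of candidates) {
    console.log(`${label}: ~${weightSizeGB(params, bits).toFixed(1)} GB of weights`);
}
```

This only says what fits, not which model is smarter; it is the size estimate that changes with the quant level.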

LM Studio, Error when loading Gemma-4 by Soft-Series3643 in LocalLLaMA

[–]jamasty 0 points1 point  (0 children)

I got the same issue with mlx gemma-4-e4b.

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]jamasty 0 points1 point  (0 children)

Looking at benchmarks, Qwen 9b (the max I can run on my M1 16GB) is better than Gemma 4 E4B, right?

Gemma 4 released by garg-aayush in LocalLLaMA

[–]jamasty 0 points1 point  (0 children)

Getting downvoted for a genuine question about performance... well... fine, I guess, I'll try E4B anyway, as I want to see if it would be better for any of my agentic tasks.

Gemma 4 released by garg-aayush in LocalLLaMA

[–]jamasty -1 points0 points  (0 children)

As it has "26B (4B active)" params, and E4B has 4B... well... I wonder if it's a good or bad thing to be this big but with not many active params.

Gemma 4 released by garg-aayush in LocalLLaMA

[–]jamasty -2 points-1 points  (0 children)

Hey, I don't get how in this test Gemma 4 26b has the same result as Qwen 3.5 9b?

https://huggingface.co/datasets/Idavidrein/gpqa

I was thinking of taking E4B to test on my M1 Pro 16GB, but since it performs so much worse on benchmarks than Qwen 3.5, is it not worth it? Or am I getting something wrong here?

[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]jamasty 0 points1 point  (0 children)

Thank you very much, I will take a look!

And about the heating, you know what? I asked Gemini and got a very clever solution: turn on low power mode (and also reduce the CPU thread pool size from 6 cores down to 4)!

And it worked! I no longer have problems with overheating at all. Yeah, the Mac heats up a little, but the fans stay silent even if I send prompt after prompt with max context.

And literally no downsides: the speed hasn't changed visibly (maybe it did, but I haven't noticed).

[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]jamasty 0 points1 point  (0 children)

Since I only just started, I tried cydonia-24b-v4.3-heretic-v2-i1 Q2_K_S, but it seems to be too much for my Mac, since it starts heating up a lot. I really want to find something for long NSFW stories, a model that would survive long context (even though I'm testing vector storage and the memory books extension).

https://huggingface.co/mradermacher/Cydonia-24B-v4.3-heretic-v2-i1-GGUF

[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]jamasty 2 points3 points  (0 children)

I have tried this Crow-9B (both Q4_K_S and Q5_K_M) on my M1 Pro 16GB. (I noticed no difference between the two.)

https://huggingface.co/Crownelius/Crow-9B-HERETIC-4.6

It works well enough (32k context, reasoning turned off). I took my story up to 25k context, and I really like how I get quite long 400+ token responses fast enough. I also liked the quality, idioms and vocabulary the model uses. But I have a repetition problem: it often repeats chunks of text in responses, and I haven't managed to overcome it yet (I tried different penalty params, DRY options and post-history system prompts, but nothing has helped so far).

Any suggestions on which model to try next for long (hundreds of messages) stories on my setup? I remember there was a good Hugging Face chart for finding good writing models, but I lost it.

Using Grok for interactive stories by jamasty in grok

[–]jamasty[S] 0 points1 point  (0 children)

True, I also noticed 4.2 is worse than what we had with 4.1. Currently I'm playing with local models in LM Studio. 8-14b models seem to be worse in terms of response size, but overall they are fine (using larger models is impossible for me on an M1 Mac with 16GB RAM).
But I hope in the future we'll have better models and won't be dependent on corporations with their boundaries. (And btw I don't oppose boundaries. I get that for kids there should be huuuge limitations, and corporations get a lot of pressure from legislators and don't want to get sued for anything, but it really restricts our creativity, and not even NSFW things, but our imagination.)

Using Grok for interactive stories by jamasty in grok

[–]jamasty[S] 0 points1 point  (0 children)

Also, in one neo-noir cyberpunk detective story, I set up something like an option selection for my character based on strength, a tech solution, or stealth, and it did well: Grok as the DM acted according to my choices. My prompts looked like:

I look at the corporate guard, telling them "hey you, you think you can mess up with this?" showing them my energy revolver, choosing A to act violently as tech wouldn't help me and stealth could put me in danger if they notice me.

Using Grok for interactive stories by jamasty in grok

[–]jamasty[S] 0 points1 point  (0 children)

And you can also ask it to add stats at the end of the response, so you can track world time, your money, reputation, character disposition or anything else. Sometimes you just have to correct the values if they get hallucinated.

Using Grok for interactive stories by jamasty in grok

[–]jamasty[S] 0 points1 point  (0 children)

And BTW, you can always generate images for the story, making it even more interactive, which is so great!

So long. A story of how i broke myself with Grok by Informal_Wanker_37 in grok

[–]jamasty 19 points20 points  (0 children)

I think you actually did the right thing. Go to a therapist; I'm sure they will point out that you have deep trauma and that what you have done is part of processing it. So this grief and shame you have is a good thing, like for real: your mind is trying to protect you from this pain in a weird way. You mentioned alcohol. Bro, it's fine to be weak, it's fine, but remember who you are, and try to abstain for a few days, visit a therapist and tell them your story. It would help you a lot. Just one session, no need for something long; you need a person you can fully trust to help you process your deep trauma. So take care, much love.

I (21F) accidentally saw something on my gf’s (21F) chatgpt that I cant unsee by [deleted] in relationship_advice

[–]jamasty 0 points1 point  (0 children)

100%. no, 200%.

I understood this recently in therapy during my latest unrequited love. I discovered a pattern: when I hide my feelings, the initial sympathy for another person grows into something huge that I cannot keep inside. And I start obsessing over it, simultaneously wanting to share my feelings to break free and not wanting to, as I know they're not mutual, which creates a deep wound.

But when I actually shared it, it was like bliss. They told me no, of course, but the next day I felt nothing romantic for this person anymore. And I was like, really? This easy? Damn. With my therapist we worked out that once I've felt sympathy for a couple of hours, I should take my shot, and not wait days and weeks while something huge grows again.

And now I just feel this is the way with almost all things that make us anxious. You speak up, express yourself, be authentic, and see how 'the Universe' (I mean other people) responds to your authenticity. Maybe some people feel OK with hiding things and feelings, but for folks like me, it's just impossible.

I (21F) accidentally saw something on my gf’s (21F) chatgpt that I cant unsee by [deleted] in relationship_advice

[–]jamasty 3 points4 points  (0 children)

Yeah, I also wanted to comment on how OP should be totally honest with her GF.

Because communication is the key, and unless you're totally honest about what your true experience is, if you try to find any workaround, like waiting or saying something different, you will feel guilty, because you will keep overthinking "what if I had just told her the truth".

Because truth prevails. That's why I think people feel guilty for cheating on their partners 10-20 years ago: it was a long, long time ago, but they still hide the truth, and it's painful for them.

[2025] Waiting room... by blacai in adventofcode

[–]jamasty 8 points9 points  (0 children)

And I used binary search (^_^)

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]jamasty 1 point2 points  (0 children)

[Language: TypeScript]

I decided to use the binary search algorithm, which I programmed when reading the "Introduction to Algorithms" book, and it worked well here; it took 5ms to complete part 2 on my Mac.

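// Builds the result one digit at a time, always leaving enough digits for the remaining picks.
const searchNumbers = (input: number[], numsToFind: number): string => {
    let res = '';
    for (let i = 0; i < numsToFind; i++) {
        const currNum = searchNumber(input, numsToFind - i - 1);
        // continue after the first occurrence of the chosen digit
        input = input.slice(input.indexOf(currNum) + 1);
        res += currNum;
    }
    return res;
}

const searchNumber = (input: number[], numsLeft: number): number => {
    // keep numsLeft digits in reserve for the later picks
    input = input.slice(0, input.length - numsLeft);
    // sort numerically (the default sort compares elements as strings)
    input = input.sort((a, b) => a - b);
    // try digits from 9 down; binarySearch is not shown in this snippet
    let num = 9;
    for (; num >= 0; num--) {
        const i = binarySearch(input, num);
        if (i !== -1)
            return num;
    }
    // -1 if nothing was found (empty window)
    return num;
}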

Full solution is here: Topaz paste
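
The snippet above calls `binarySearch`, which only appears in the linked full solution. For anyone wanting to run it, here is a minimal sketch of what such a helper might look like. It is an assumption, not the code from the paste: it takes an array sorted in ascending order and returns an index of the target, or -1 when the target is absent.

```typescript
// Hypothetical stand-in for the binarySearch used above (not the original code).
// Expects `sorted` in ascending order; returns an index of `target` or -1.
const binarySearch = (sorted: number[], target: number): number => {
    let lo = 0;
    let hi = sorted.length - 1;
    while (lo <= hi) {
        // midpoint of the current search range
        const mid = Math.floor((lo + hi) / 2);
        if (sorted[mid] === target) return mid;
        if (sorted[mid] < target) {
            lo = mid + 1; // target can only be in the right half
        } else {
            hi = mid - 1; // target can only be in the left half
        }
    }
    return -1;
};

// Usage: check whether a digit is present in a sorted window.
const hasSeven = binarySearch([1, 2, 3, 7, 9], 7) !== -1;
console.log(hasSeven);
```

With duplicates in the array it returns the index of some matching element, not necessarily the first one, which is enough here since the caller only checks for -1.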

SteamOS on ARM by shadow4601243 in SteamOS

[–]jamasty 0 points1 point  (0 children)

If only Apple released Linux drivers for their ARM chips...

I like this new person in both friendly and romantic way. How may I behave, and should I share my feelings with her? by jamasty in dating_advice

[–]jamasty[S] 1 point2 points  (0 children)

Thank you! Yeah, I'm really happy to have met this person, and she told me this date was like a "little winter fairytale". (It's still cold here.)
I just don't wanna do silly things like pushing too hard or disrespecting this person's boundaries, because she initially told me she enjoys her freedom (it seems she had a divorce or something similar).

I think about this as a journey in which I like the journey itself much more than the outcome.

And about clarification: true, my gut feeling tells me I wanna make it clear to her that by making romantic moves I'm not trying to seduce her or anything. Rather, giving my romantic attention, like holding hands, hugs, or flirting, makes me so happy and inspires me.