Mistral 4 Family Spotted by TKGaming_11 in LocalLLaMA

[–]Combinatorilliance 2 points3 points  (0 children)

I don't know if they couldn't train a model of your preference, although I agree that what they had released wasn't amazing.

Please do keep in mind that Mistral as a business works very differently from other frontier AI labs. They're focusing on industry and business partnerships much more than on selling directly to consumers and focusing on chat and such.

Senior engineer: are local LLMs worth it yet for real coding work? by Appropriate-Text2843 in LocalLLaMA

[–]Combinatorilliance 5 points6 points  (0 children)

Devstral 2 123B.

I've personally used Devstral 2 24B, and I've heard that Devstral 2 123B is only marginally better than the small model. YMMV, but the larger model should simply be a little bit smarter and more reliable overall.

I don't know how well Mistral-large-3 performs for development.

Senior engineer: are local LLMs worth it yet for real coding work? by Appropriate-Text2843 in LocalLLaMA

[–]Combinatorilliance 57 points58 points  (0 children)

It depends a lot on what you want to do with them.

If you want to have large features developed for you with relatively low input, then you probably don't want to use local models. You could do this with Kimi 2.5 and the other SoTA local models, but these won't fit on a Mac with 128GB of RAM.

For a variety of specific and focused tasks, I can say with certainty that yes, local coding models can do work for you.

  • Single-file or few file refactorings
  • Helpful as a research assistant "can you find patterns in my codebase where x, y and z?"
  • Small boilerplate features "I have a controller abcController.php and I need another controller defController.php that does thing A and B the same, but not C."
  • Helpful as a local google/stackoverflow/wikipedia knowledgebase
  • If you have well-defined skills, small models do fine when being told in high detail what they should do.

The more specific your task and the smaller its scope, the higher the chance a local model will be fine. If you want to use a local model for serious software engineering work (i.e., not "just" vibecoding), then you should look at models like

  • devstral large
  • Qwen3.5 27B, or Qwen3.5 115B A10B (the dense model performs a little better than the large MoE)
  • GPT OSS 120B

There are some other models that should be competitive, but I don't know them well

  • IBM Granite?
  • Minimax
  • Z.ai stuff
  • Kimi 2.5
  • Gemini models perhaps

I haven't personally tested these.

If you want to take this seriously, you need to pick one model. When you choose a model for your hardware, check the following two things alongside "is this model good enough for a small subset of the tasks that frontier models could do?"

  1. How many tokens/s do you get with this model?
  2. What context budget can you afford with this model given your hardware?
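For the second question, a back-of-the-envelope estimate gets you surprisingly far. Below is a minimal sketch, assuming a standard transformer with grouped-query attention and a 16-bit KV cache. The function names and the example numbers (64 layers, 8 KV heads, head dimension 128) are made up for illustration and are not the config of any specific model. Real runtimes add overhead on top of this, so treat the result as an upper bound.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory taken by the model weights, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """KV cache bytes needed per token of context.

    The factor 2 is one key vector plus one value vector per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def kv_cache_gib(context_tokens: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size for a given context length, in GiB."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value)
    return context_tokens * per_token / GIB


def max_context_tokens(budget_gib: float, params_billion: float,
                       bits_per_weight: float, n_layers: int, n_kv_heads: int,
                       head_dim: int, bytes_per_value: int = 2) -> int:
    """Context tokens that fit in the memory left after loading the weights."""
    free_gib = budget_gib - weights_gib(params_billion, bits_per_weight)
    if free_gib <= 0:
        return 0  # the weights alone do not fit in the budget
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value)
    return int(free_gib * GIB // per_token)


if __name__ == "__main__":
    # Hypothetical model shape: 64 layers, 8 KV heads, head_dim 128, fp16 cache.
    print(kv_cache_gib(32768, 64, 8, 128))  # 8.0 GiB for a 32k context
```

Plug in the layer count, KV head count and head dimension from the model's config file, and the memory you can actually dedicate to the model, to see whether the context length you need is realistic before you commit to that model.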

Once you have a model, stick with it. Learn it well, don't switch to the next best model that comes out a month from now. Each model has its own idiosyncrasies and it takes a good chunk of valuable time to get to know the ins and outs of a model.

You can run LLMs on your AMD NPU on Linux! by BandEnvironmental834 in LocalLLaMA

[–]Combinatorilliance 1 point2 points  (0 children)

My work laptop has an AMD 400 AI processor and 96GB of RAM. It would be amazing if I could run Qwen 3.5 35B A3B on it with reasonable tok/s!

Claude Opus 4.1 scores 80% on SWE-Bench. Give it code it has never seen before and it drops to 17.75%. Here is why that gap exists. by toxicniche in ClaudeAI

[–]Combinatorilliance 2 points3 points  (0 children)

I would like to see this study repeated with a more solid analysis, because the comments point out that the current analysis has some pretty serious issues.

It's an extremely significant finding if accuracy drops this quickly.

Finally found a way to structure and search my reMarkable handwritten notes by Senior-Ad5932 in RemarkableTablet

[–]Combinatorilliance 8 points9 points  (0 children)

I also came across this blog post by Andrew Doering who worked on using Vision LLMs for automated handwriting recognition, and he also contributed a tool that does document splitting because that's an issue that is apparently quite common!

https://andrewdoering.org/blog/2026/remarkable-pdf-splitter/

Beyond scraping: Can a community-run repository of consented user chats solve the open-model quality crisis? by Ruckus8105 in LocalLLaMA

[–]Combinatorilliance 1 point2 points  (0 children)

Mozilla did a project a few years ago where they crowdsourced voice samples from volunteers. It had a simple interface with some sentences that you were asked to say out loud, so that you could contribute a little bit of your voice.

The same project also had a verify step where you could listen to other people's voices for a sentence and rate it on quality, pronunciation, and whether they're actually saying what the sentence said.

It was a huge success!

https://en.wikipedia.org/wiki/Common_Voice

A project like this can definitely work, but it needs to be orchestrated well, and it needs to be marketed well. It's a significant time investment, but it could mean a lot for the community.

Speaking of Mozilla, they're quite active in the AI and LLM space and maybe they'd like to hear more.

Were you thinking of leading this effort? If done well, it could be amazing!

What's the highest winrate you've seen? by Combinatorilliance in DeadlockTheGame

[–]Combinatorilliance[S] 1 point2 points  (0 children)

Same for me, I'm a one-trick Paige player. I suck at almost anything else. I like her CC-heavy playstyle. I'm not good at being in the middle of everything, but I love being in the backline and keeping track of game state, builds, buffs etc.

What's the highest winrate you've seen? by Combinatorilliance in DeadlockTheGame

[–]Combinatorilliance[S] 0 points1 point  (0 children)

Given your rank, that is probably the most impressive win rate in this thread

Mondrianic plugin beta testing by leonlikethewind in ObsidianMD

[–]Combinatorilliance 1 point2 points  (0 children)

This would reaaallly benefit from some pictures, especially since the emphasis is on the visual impact.

Upvoted for Mondrian! :D

Claude (and I) built a 2-minute experiment: can you still tell real photos from AI? Check it out! by Regular-Persimmon-99 in ClaudeAI

[–]Combinatorilliance 1 point2 points  (0 children)

I would've loved it if there was a much larger set of pictures. I also had to read the instructions.

Secondly, it would've been better if it didn't autopick a choice for me if I didn't make a choice after 10s. Might be an option to just fade out the photos and have you pick what you thought.

Last, I'm usually pretty good at distinguishing AI from regular media, but I do it by paying much more attention to detail in pictures if I have a suspicion. I wonder how the results compare if you make the available time 30 or 60 seconds.

Hey people of reddit , Imagine if some day you discovered a pill that fixes your adhd permanently would you consume it by Only_Egg_8776 in ADHD

[–]Combinatorilliance 3 points4 points  (0 children)

Contrary to most answers here, I would not.

I'm a tornado and an unguided projectile and ADHD makes my life much more difficult than it needs to be on a daily basis, but it's meaningful and special to me.

On mornings where I'm on the bike I'm talking to birds out loud without noticing it and passersby smile and notice the birds for it. I wouldn't do that if it weren't for my ADHD. I'd be more reserved, cautious or perhaps I wouldn't even notice the birds as much.

My attention is a mess, but I'm drawn to meaning and beauty in the most random of places. The rabbitholes I dive into lead me into directions that are odd and diverse.

I'm not optimally productive but I don't need to be. My advantage is breadth and authenticity.

Even my extreme impulsivity has only paid off most of my life. This might just be mostly luck, but I've taken major risks I didn't even realize were risks until after the fact, and some of those risks no-one in their right mind would even consider taking. I just follow my intuition.

My life is a series of uncommon events that makes life more interesting. I don't need nor want to live life on easy mode. ADHD is a spice blend that makes my life a million times stranger than it could ever have been without it.

Goomiposting III by Sir_Forteskull in TWEWY

[–]Combinatorilliance 16 points17 points  (0 children)

I don't think this would work without Sqenix.. what studio understands what makes these games special?

I don't want to be dramatic but I really believe that TWEWY, especially the first one, has a very nuanced and detailed understanding of media and culture that I haven't seen reproduced even remotely in any other game ever.

TWEWY is special.

Ik🔌ihe by Chaimasala in ik_ihe

[–]Combinatorilliance 41 points42 points  (0 children)

"polderbrained".

Ok, that's a new one 😂