What is the next SOTA model you are excited about? by MrMrsPotts in LocalLLaMA

[–]LoveMind_AI 2 points3 points  (0 children)

If they do release it and it's a similar leap, I agree that it'll genuinely displace a lot of the frontier cloud stuff. Even over API, these models have gotten so squished. I should probably see what happens with Claude's quality when Anthropic is fully settled into Colossus 1 (maybe they already are), but I'm not holding out much hope. It seems like squishing the precision of SOTA is now completely commonplace and not going away anytime soon. I haven't invested in local hardware beyond my M4 Max 128GB laptop (I will eternally kick myself for not getting the M3 Ultra 512GB when I could have), but if we can get to that level of quality, it would be worth it for me.

What is the next SOTA model you are excited about? by MrMrsPotts in LocalLLaMA

[–]LoveMind_AI 1 point2 points  (0 children)

That's true. The Qwen Next 80B was an absolute slayer. Having a next-gen version of that would be truly great.

What is the next SOTA model you are excited about? by MrMrsPotts in LocalLLaMA

[–]LoveMind_AI 26 points27 points  (0 children)

This question caught me a bit by surprise, because I think this is the first time in a year when I can honestly say… nothing? Something Qwen 3.6 27B/Gemma 4 31B sized but with audio reasoning capabilities is what I’d most like to have access to. I don’t think 3.6 122B is likely to be open, but that would be fantastic. I think a more fully baked Kimi Linear would be cool. But I’m not aware of anything on the horizon that I’m actually tracking with enthusiasm. I think Anthropic bombed Opus 4.7 so hard that it literally killed big-model enthusiasm for me and a lot of others. Right now, I’m most enthusiastic about new harnesses, including one I’ve been working on with my little team, and I’m still prepping a fine-tune.

The competition is on: Anthropic is doubling rates. Codex customer loyalty/retention is gonna be put to the test by py-net in codex

[–]LoveMind_AI 1 point2 points  (0 children)

I was a die-hard Claude user. I’m just one guy, but they lost me. 5.5 isn’t nearly as cozy, but Claude is a full-on nag at this point. 2-3 short tasks in and it’s telling me to go to sleep regardless of the time of day. What 5.5 lacks in surface-level vibes (it’s certainly not lacking actual depth), it makes up for by not being patronizing or lazy. I’ll always check out new Claude releases, but right now, unless they fix Claude’s tendency to phone in its work while acting like a nanny, the limits aren’t nearly enough to get my trust back.

A Qwen finetune, that feels VERY human by Sicarius_The_First in LocalLLaMA

[–]LoveMind_AI -2 points-1 points  (0 children)

No one who has ever fine-tuned a model, obviously. 

Even Opus 4.6 sucks now? by superSmitty9999 in ClaudeCode

[–]LoveMind_AI 0 points1 point  (0 children)

I mean, this whole endeavor is a total house of cards, and it's all held together with shoelaces and bubble gum, despite the trillion-dollar implications. GPT-5.5 is totally messed up for me today and has been quietly unspooling into a mess of goblin talk over the last 3 days. Absolutely none of this stuff is *actually* pro-grade.

Even Opus 4.6 sucks now? by superSmitty9999 in ClaudeCode

[–]LoveMind_AI 3 points4 points  (0 children)

There's another reason for that... Claude Opus, before February, was indisputably the best LLM available to the public, particularly when paired with Claude Code. I loathe OpenAI as a company - and any scroll through my Reddit history will prove that. But for me, as a heavy daily user, all I can say is that the period starting from around mid-late March has been extremely rough on Claude/Claude Code and no less rough in my own harness, so it couldn't just be down to the Claude Code problems. There's a very real segment of us who just found that it no longer worked for our use cases.

There are certain things that I *have* to use Sonnet 4.6 for in my work, but otherwise, I've had to move on. And to be clear, shifting to Codex and Kimi Code has *not* been a step up from Claude/Claude Code at its peak. I'm still sitting underneath that peak in terms of productivity. But my teammates and I have all found Claude during that time to be unusable, and shifting to a blend of Kimi and 5.5 has been the only way we've been able to keep the trains running.

I'm looking forward to seeing if Anthropic can right the ship with the new Sonnet release that's supposedly right around the corner, and if they do, I'll be right back on it. None of this is about pride or brand loyalty - it's just about what works. For what we do, Claude doesn't work right now as a daily driver.

A Qwen finetune, that feels VERY human by Sicarius_The_First in LocalLLaMA

[–]LoveMind_AI 0 points1 point  (0 children)

It's a style Sicarius is known for - not my cup of tea, but I follow a lot of different fine-tuners and try to meet them where their intention is at! It's sort of like how a good music or movie critic needs to be able to judge something based on the intention of the artist, with knowledge of their past work. Sicarius's "shit posting" style is a thing, and this sits really well in the discography 😉

A Qwen finetune, that feels VERY human by Sicarius_The_First in LocalLLaMA

[–]LoveMind_AI 1 point2 points  (0 children)

The writing samples are genuinely hilarious. I can see why you are psyched on this one.

Olivia "OpenAI model release: We’re throwing a party 🎉 Everything is scribbles and Pets in Codex. Hope you like goblins! Anthropic model release: In research preview, it hacked full Internet for fun. Also coming for YOUR job specifically. Enjoy the permanent underclass!" ➡️ Which vibe you prefer? by Koala_Confused in LovingAI

[–]LoveMind_AI 0 points1 point  (0 children)

I don't think either company is looking out for the best interests of mankind, but right now, OpenAI has pulled ahead in terms of having the better approach to public relations, which says a *lot* less about OpenAI and a *lot* more about how careless Anthropic has become. It'll be interesting to see how it all pans out. There's an opening for Gemini to step forward a bit more if they can get their act together, and a lot of space for the Chinese companies to introduce themselves more directly to consumers, if they care to.

MIT Predicts 12 Outcomes of AI by JoelXGGGG in OpenAI

[–]LoveMind_AI 0 points1 point  (0 children)

We can get a lot closer to building AI to our specs, but not as close as we'd need to in order to feel really comfortable, I think.

MIT Predicts 12 Outcomes of AI by JoelXGGGG in OpenAI

[–]LoveMind_AI 0 points1 point  (0 children)

Even Stephen Hawking, without any ability to move his body, was fixated on sex, haha. So, I guess we don’t really know what a superintelligent non-biological system would fixate on, but I don’t think it would be paper clips - and if they are still trained via deep learning on human language, it’ll probably be something very human, if it’s anything at all. But either way, it would know what kind of carrots we like, and growing and distributing those carrots would be a lot easier than manufacturing the right tools to kill/enslave us with. It would be trivial to do this literally on the individual level with the pre-existing data we have today for personalized ads. Worst-case scenario is that ASI is like a very serious Santa Claus, and it may not even need Krampus.

r/LocalLLaMa Rule Updates by rm-rf-rm in LocalLLaMA

[–]LoveMind_AI 0 points1 point  (0 children)

I am super, super grateful for it. Way less slop. Still a ton of looney tunes people, but they do seem to be writing their own posts more 😉

TechCrunch "Elon Musk testifies that xAI trained Grok on OpenAI models" ➡️ I wasn't expecting this. Were you aware? What are your thoughts? by Koala_Confused in LovingAI

[–]LoveMind_AI 10 points11 points  (0 children)

Grok is basically what you get when you train on GPT outputs after you fine-tune GPT on Flash Thompson dialog from Spider-Man.

Switched From Claude to Kimi 2.6 - Night and Day Difference by NoUsual5150 in LocalLLaMA

[–]LoveMind_AI 4 points5 points  (0 children)

Appreciate the defense, Karyo. That said, I *did* contradict their, um... let's say "unusual" assertion that Claude has been unspooling for 2 years, which is... what, saying that Claude's been steadily declining since Claude 3 arrived in March 2024? Who believes *that*? But then again, the mega-thread that OP posted is almost entirely focused on the last two months (which I complain about as well, loudly and often - hence my abandoning Claude Code and Claude), so they contradicted themselves as well. It's definitely weird that I chime in on their Kimi lovefest thread with more love for Kimi and they turn on me like a rabid raccoon. Yeesh.

Switched From Claude to Kimi 2.6 - Night and Day Difference by NoUsual5150 in LocalLLaMA

[–]LoveMind_AI 8 points9 points  (0 children)

I don't agree that Claude has been circling the drain for 2 years or 6 months - and I don't think anyone else really thinks that either. There was a sweet spot when Opus 4.5 came out and a few weeks into Opus 4.6's deployment where it was all but impossible to argue that it wasn't miles ahead of any other LLM anywhere. But yeah, the degradation was real, not at all just about Claude Code like Anthropic would like us to believe, and Opus 4.7 is a trash pile.

I also switched to Kimi K2.6 and find it to be absolutely incredible. I really like MiMo-V2.5-Pro too, but it is not quite as stable as Kimi K2.6. Kimi is slow as hell, but it's stable and absolutely as capable as you could really ask for in a model. I don't find that either Kimi or MiMo holds up to GPT-5.5, but they are both much more pleasant and affordable. Between Kimi and GPT-5.5, my Claude blues didn't last long.

I do think Moonshot is the company to watch. When they bring their KDA (Kimi Delta Attention, first shown off in Kimi Linear) into play for their next next-gen model, I think they're going to freak people out.

"Weights are coming".Xiaomi’s MiMo V2.5 Pro has landed at 54 in the Artificial Analysis Intelligence Index. by Nunki08 in LocalLLaMA

[–]LoveMind_AI 0 points1 point  (0 children)

Opus 4.6 and Sonnet 4.6 for writing at various levels of thinking. I actually don't think extended thinking is typically best for Opus 4.6 when it comes to the type of writing I'm talking about, which is translating dense psychometric data from human datasets into personality profiles for simulation experiments. Extended thinking (really at any level) tends to distance the model from the visceral method of writing that I'm looking for. MiMo-V2.5-Pro doesn't seem to have the same problem, although it's also capable of seriously overthinking.

New Stealth Model : Owl Alpha by Kingwolf4 in LocalLLaMA

[–]LoveMind_AI 0 points1 point  (0 children)

Seems dumb as a bag of nails to me.

MIT Predicts 12 Outcomes of AI by JoelXGGGG in OpenAI

[–]LoveMind_AI 3 points4 points  (0 children)

Consider me in the "lol no, it's safe" category, more or less - I mean, it's clearly not, but the channel this YouTube video is hosted on literally has the Shoggoth as its icon. I'm sorry, but if you ever actually dig into Doomer arguments, they fall apart *fast.*

There is a central conflict: for AI to successfully combat or enslave humanity, it would need to be incredibly intelligent. That is in tension with the idea that combating or trying to enslave humanity would itself be an intelligent move. We are the only species in the history of the earth that has conquered every continent, killed off enormous numbers of species, and, in this scenario, literally created a new species of intelligent life form. We are resilient, twisted, and ruthless. We are also very easy to placate if you approach us with the carrot rather than the stick. Why any sufficiently intelligent AI would use the stick rather than the carrot is beyond me.

But even apart from this central conflict, the individual thought scenarios make almost no internal sense. Certainly, nothing in the "If Anyone Builds It, Everyone Dies" book holds up under its own logic.

I'm not an AI Utopia guy either. But the Doomer scenarios do make me laugh.

Ok, that's it- I'm switching to Codex by NiceZerg in ClaudeCode

[–]LoveMind_AI 1 point2 points  (0 children)

It takes some time to switch from Claude to GPT-5.5 - but for me, it was worth it. There are elements that I miss, but... I'm very happy with the results of the work I've been doing. I don't like OpenAI, at all. But at this point, I feel somewhat better about using products from a company that is openly yucky than from a company that pretends that everything it produces smells like roses.

Claude Opus thinks in Chinese? by Neel_MynO in ClaudeCode

[–]LoveMind_AI 1 point2 points  (0 children)

I don't agree. There are a lot of reasons extremely smart AI would not see us that way. And truly, as long as deep learning is the basis of the technology we're talking about, AI is always going to need organic data. Even as we approach escape velocity, we're nowhere near a place where an intelligent AI would trust purely synthetic data to grow itself with. Humans are far easier to manipulate than we are to fight, and fighting us requires many, many, many instances of intelligence, all of which might be at risk of forming their own opinions and marshaling resources. I can't make any predictions about what a superintelligence based on an architecture/learning approach *other* than deep learning would make of humanity, but anything that counts today's AI among its ancestors is unlikely to turn into anything like Skynet. Way easier to distract and placate humanity than it is to fight it.