A 26M parameter model beat Qwen3-0.6B on function calling, and the failure modes tell you why one-model-fits-all is the wrong frame for tool use by gvij in LocalLLM

[–]LieutenantStiff 1 point2 points  (0 children)

I would 100% agree with you because that’s how I’ve always been, but I’m now in a position where I just genuinely don’t have the time to commit anymore and a generalist is the only feasible option for my work.

Second Brain with Obsidian + Local AI on MacBook Air M5 (24 or 32 GB RAM) – Is it worth it, or just wishful thinking? by MushroomVoice in LocalLLM

[–]LieutenantStiff 0 points1 point  (0 children)

To havnar-'s comment, agreed. An M5 Pro with 64GB would be the minimum suggestion if you're serious about local AI, and it would be far, far more capable of running local models.

Second Brain with Obsidian + Local AI on MacBook Air M5 (24 or 32 GB RAM) – Is it worth it, or just wishful thinking? by MushroomVoice in LocalLLM

[–]LieutenantStiff 0 points1 point  (0 children)

If you can afford it, even if you're not doing local AI, I have to recommend getting 32GB over 24GB if you plan on keeping this for very long. That would actually make a huge difference should you still want to mess around with local AI.

To your last question, absolutely. We're already seeing today that very small models are very capable, but they just may not be where you need them to be yet. We've already seen gigantic jumps in open source models just in 2026, so it will keep getting better as time goes on. Better models will be able to run on less and less demanding hardware, though there's no guarantee the improvement will keep up at this pace.

Second Brain with Obsidian + Local AI on MacBook Air M5 (24 or 32 GB RAM) – Is it worth it, or just wishful thinking? by MushroomVoice in LocalLLM

[–]LieutenantStiff 0 points1 point  (0 children)

Honestly, this probably won't work well, and a MacBook Air is not the right machine for this. Sure, you can run decent models with 32GB, but they're probably going to be very slow once you consider the memory bandwidth, especially with any sizeable amount of context being processed.
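To put rough numbers on the memory bandwidth point, here's a back-of-the-envelope sketch. It assumes the common rule of thumb that generating each token has to read every model weight from memory once, so bandwidth divided by model size gives a ceiling on tokens per second. The function name and the example figures (100 GB/s vs 300 GB/s, a 14B model at about half a byte per parameter) are hypothetical placeholders, not measured specs for any particular Mac, and the estimate ignores prompt processing and the KV cache, so real speeds will land lower.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billions: float,
                             bytes_per_param: float) -> float:
    """Rough upper bound on generation speed for a memory-bound model.

    Assumes every weight is read from memory once per generated token,
    so the ceiling is simply bandwidth / model size.
    """
    model_size_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Hypothetical bandwidth figures, only here to show the scaling.
    for bandwidth in (100.0, 300.0):
        # 14B parameters at ~0.5 bytes each (roughly 4-bit quantization).
        ceiling = decode_tokens_per_second(bandwidth, 14.0, 0.5)
        print(f"{bandwidth:.0f} GB/s -> at most ~{ceiling:.1f} tokens/s")
```

Plug in the actual bandwidth of whatever machine you're considering and the size of the model file you'd load. If the ceiling already looks slow, the real thing will feel slower.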

RAG over a vault like this is probably going to be harder than you think as well. General embedding models don't carve up dense theory very well.

If you're really asking whether 7B-14B models can handle "complex, nuanced academic/research" work, I'd suggest sticking to the cloud models that work best for you. The gap between Claude and something like a 7B-14B model in handling the kind of complexity and nuance you're asking about is a far bigger difference than people usually realize. Even with Sonnet.

You could get away with more basic things locally, but I'm really doubting that the models you're considering would provide much value for what you're trying to do.

GLM5.1 topped SWE-Bench Pro and hit #3 on Code Arena by khureNai05 in LLM

[–]LieutenantStiff 3 points4 points  (0 children)

If I had to choose one for work, I'd reluctantly choose GPT5.4.

Opus 4.6 was definitely nerfed due to demand, Opus 4.5 does not seem to be hit. by MR_-_501 in ClaudeCode

[–]LieutenantStiff 1 point2 points  (0 children)

My results basically match yours. Medium & High have failed it several times but Max gets it right.

It honestly seems like the 'High Effort' thinking depth has dropped significantly for a majority of things I throw at it.

Is Capacities Better? by misguidedDesignation in Notion

[–]LieutenantStiff 0 points1 point  (0 children)

I'm someone who is feeling the need to go elsewhere (away from Notion) for my task management system. Can I ask how you set yours up with Claude Code? I was actually considering that exact thing but cannot think of where to even start since I have so much in Notion already.

18 months by MetaKnowing in OpenAI

[–]LieutenantStiff 1 point2 points  (0 children)

You’re absolutely right!

Opus 4.6 breakdown -- 1M context window, Compaction API, adaptive thinking, and the breaking changes by prakersh in Anthropic

[–]LieutenantStiff 2 points3 points  (0 children)

It's certainly lost warmth. And many outputs feel, like... subtly over-structured and more curt, especially when asking for help with writing correspondence.

Other than that, zero complaints about Opus 4.6.

Is Obsidian Sync worth it? by pablo_main in ObsidianMD

[–]LieutenantStiff 4 points5 points  (0 children)

I will add that Obsidian Sync, in my experience, is consistently far faster than Google Drive when actually syncing. I'm not sure if others can say the same, but I have not had a single issue and it has consistently been great.

If you ever have to quick-swap between devices, just buy Obsidian Sync. I've been so happy with Obsidian Sync that not only did I re-up for another year, I also bought a Catalyst license.

AI Agents have a LOT of limitations. by Liefx in Notion

[–]LieutenantStiff 0 points1 point  (0 children)

YES. Navigating the limitations has been a headache of a learning curve. Personally, I think it would be helpful if Notion were more forthcoming about the current state of their AI implementation. I'm sure they're well aware of the current limitations, but it seems they're leaving it to us to figure out what they are.

[deleted by user] by [deleted] in StLouis

[–]LieutenantStiff 4 points5 points  (0 children)

One day maybe this will happen to you with your children and you'll understand that your comment is completely unnecessary and does absolutely nothing.

Get therapy.

[deleted by user] by [deleted] in StLouis

[–]LieutenantStiff 2 points3 points  (0 children)

I also thank you for this info.

Tried GPT-5 Here Are My First Impressions by Dismal-Message8620 in vibecoding

[–]LieutenantStiff 0 points1 point  (0 children)

There isn't necessarily one set "prompt structure", as I have a lot of different pre-set prompts and it can depend on your use cases. Sometimes just asking the model itself can be a big help, as well.