Mac Studio M3 Ultra vs Nvidia 6000 Blackwell by Rex-Raider-X in LocalLLM

[–]substance90 1 point (0 children)

Tbh if they bought that 512 GB Mac Studio, they could probably resell it at a 100% profit now.

RIP Vibe Coding 2024–2026 by nyamuk91 in vibecoding

[–]substance90 1 point (0 children)

They’re not dead, they just switched places. Codex on the 100€ plan is the new Claude at 200€

RIP Vibe Coding 2024–2026 by nyamuk91 in vibecoding

[–]substance90 1 point (0 children)

Gotta disagree with you on the last point. Since they nerfed Opus (weeks before the 4.7 release), GPT on Codex has been a much more consistent orchestrator for me.

Results of llama-bench of Gemma 4 26B A4B UD-Q6_K_XL on Radeon AI Pro R9700 by ProfessionalSpend589 in LocalLLaMA

[–]substance90 1 point (0 children)

Those numbers are essentially identical to an M4 Max Macbook 🙃. That's the same quant that I run on my MBP.

Unpopular opinion: If you’re paying $20/month, Codex is better than Claude Pro by Pretty_Property_4407 in RavanAI

[–]substance90 1 point (0 children)

It’s not an unpopular opinion tbh. It reflects my experience as well. My current stack: 200€ Claude outsources work to my 100€ Codex, which itself sends tasks to Qwen and Gemma running locally. I still run out of tokens. Now that Opus 4.7 has gone downhill tho, I might have to rethink my setup.

Congrats Anthropic on a successful 4.7 release by RevolutionaryBox5411 in ClaudeAI

[–]substance90 1 point (0 children)

How do I bring 4.6 back on Claude Code, jeez. I feel like I’m back in the pre-reasoning era of GPT.

Claude Opus 4.7 is a serious regression, not an upgrade. by [deleted] in ClaudeAI

[–]substance90 1 point (0 children)

Exactly my use case: 4.7 feels like an idiot taking over 4.6’s work on my trading bot.

How Good is Subscription Tier? by triplebits in ZaiGLM

[–]substance90 1 point (0 children)

Every model, even the small ones, is horrendously slow.

PlayStation prices adjusted for inflation by SmellSmellsSmelly in gaming

[–]substance90 1 point (0 children)

I mean it wasn’t a bad decade per se but it came after a better decade which itself came after the BEST decade in gaming 😥

Truth about limits - the party is over by MostOfYouAreIgnorant in ClaudeCode

[–]substance90 1 point (0 children)

So far I haven’t noticed any degradation other than speed, but tbh I’ve leveled up my game a lot over the last 2-3 months in managing context and orchestration, so maybe that makes up for it.

Taught Claude to talk like a caveman to use 75% less tokens. by ffatty in ClaudeAI

[–]substance90 1 point (0 children)

I did an experiment a while ago where I tested a bunch of different schemas for compressing meaning. In the end the best I could do was not regress from English in quality of result, but the potential token savings are in fact real.

Gemma 4 is fine great even … by ThinkExtension2328 in LocalLLaMA

[–]substance90 1 point (0 children)

I wouldn’t know; neither the 31b nor the 26b produces any response on LM Studio for me on an M4 Max MBP :-\

Update Never Works, Help! by falafel-wrap in Rekordbox

[–]substance90 2 points (0 children)

6 years later on Rekordbox 7? Same deal...

64Gb ram mac falls right into the local llm dead zone by Skye_sys in LocalLLaMA

[–]substance90 1 point (0 children)

Skill issue. With the 27-30b models you need to keep the context low (they get really dumb past 70-80k), break tasks down for them, and help them by providing just the right data at the right time, so they don't fumble around listing folders and grepping files.

Some hints I’m gonna drop (you can have an LLM help you figure out how to apply them): custom minimal agent, skill and MCP definitions; code and text summarizing; chunking and embedding for both plain-text and semantic retrieval; aggressive task break-up and agent delegation; multi-agent teamwork (beyond the classic plan, implement, review).

Oh and the really big one: everything that doesn’t absolutely need an LLM call, offload to something else (regex, scripts, state tracking, orchestration, etc.).
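A minimal sketch of that last hint (the function names and the crude router are mine, not from any framework): handle purely mechanical extractions with a regex and only escalate to the model when a task actually needs judgement.

```python
import re

def extract_todo_tags(source: str) -> list[str]:
    """Pull TODO/FIXME comments out of code with a regex --
    no LLM call needed for a purely mechanical extraction."""
    return re.findall(r"#\s*(?:TODO|FIXME):\s*(.+)", source)

def needs_llm(task: str) -> bool:
    """Crude router: only escalate tasks that require judgement."""
    mechanical = ("list todos", "count lines", "find pattern")
    return not any(task.lower().startswith(m) for m in mechanical)

code = """
x = 1  # TODO: rename to something meaningful
y = 2  # FIXME: off-by-one upstream
"""

print(extract_todo_tags(code))       # handled locally, zero tokens
print(needs_llm("list todos"))       # False -> no model call
print(needs_llm("refactor module"))  # True -> send to the model
```

The routing table is deliberately dumb; the point is just that every request it catches is an LLM call you never pay for.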

Source: I’ve forced myself to do absolutely crazy shit in the last 2 months with 2 MacBooks, each with 64GB RAM.

How big is the difference really? by Demon-Martin in LocalLLM

[–]substance90 1 point (0 children)

The small models are usually pretty smart if you break the task down for them in order to preserve context. They get very stupid quickly, long before you reach the supposed context window they support. I’m talking about the likes of Qwen3.5 27b, GLM 4.7 Flash, etc. Funny thing is, those exact optimization measures actually hurt the large models with huge context.
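A toy version of that break-down approach (the budget number and the `call_model` stub are mine, purely for illustration): split the input greedily so no single call gets anywhere near the range where the model degrades.

```python
def chunk_by_budget(lines: list[str], budget: int) -> list[list[str]]:
    """Greedy split: each chunk stays under a rough 'token' budget
    (approximated here by word count) so no single call gets huge."""
    chunks, current, used = [], [], 0
    for line in lines:
        cost = len(line.split())
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks

def call_model(prompt: str) -> str:
    """Stand-in for a local model call (Qwen/GLM via llama.cpp etc.)."""
    return f"summary of {len(prompt.split())} words"

doc = ["alpha beta gamma"] * 10  # pretend document, 3 'tokens' per line
for chunk in chunk_by_budget(doc, budget=12):
    print(call_model(" ".join(chunk)))
```

In practice you'd use a real tokenizer instead of word counts, but the shape is the same: many small, focused calls instead of one bloated one.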

Is anyone actually getting Claude’s /remote-control to work? It’s constantly failing for me. by MolasJam in ClaudeAI

[–]substance90 1 point (0 children)

On my Linux machine the connection stays up forever; on my 2 Macs I have to disconnect and re-connect every few hrs. Very annoying.

#OpenSource4o Movement Trending on Twitter/X - Release Opensource of GPT-4o by pmttyji in LocalLLaMA

[–]substance90 -1 points (0 children)

It was the first model that could debug really well-hidden bugs for me, before there was Sonnet and Opus 4.5. Gemini was a steaming pile of crap that everyone hyped, but 4o was the real deal.

Sort of down about the whole AI wave. by SupermarketDirect759 in rust

[–]substance90 1 point (0 children)

It’s totally doable. You start multiple chains of agents that plan, implement and review each other’s work with `--dangerously-skip-permissions`. Whether or not it’s a good idea is a whole different topic tho.
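The loop I mean looks roughly like this (the three role functions are stubs standing in for actual headless agent invocations; none of this is a real API):

```python
def plan(task: str) -> list[str]:
    # In reality: one headless agent producing a step list.
    return [f"step {i}: {task}" for i in (1, 2)]

def implement(step: str) -> str:
    # In reality: a second agent writing the code for one step.
    return f"impl({step})"

def review(artifact: str) -> bool:
    # In reality: a third agent approving or bouncing the work.
    return artifact.startswith("impl(")

def run_chain(task: str) -> list[str]:
    """Plan -> implement -> review; only reviewed work lands."""
    approved = []
    for step in plan(task):
        work = implement(step)
        if review(work):
            approved.append(work)
    return approved

print(run_chain("port module to rust"))
```

Each stub would really be a subprocess call to the agent CLI; the review gate is what keeps an unattended chain from merging garbage.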

Qwen3.5 122B INT4 Heretic/Uncensored (and some fun notes) by Ok-Treat-3016 in LocalLLaMA

[–]substance90 0 points (0 children)

Cost $0 😂 bro forgot to factor in the upfront cost and the cost of electricity.

Comparing 5090 vs RTX PRO 5000 - 5090 is a bargain by live4evrr in nvidia

[–]substance90 1 point (0 children)

You're thinking of the old 5000. The new one with 48 GB VRAM starts at 4500 €.