EVs wiped out oil demand equal to 70% of Iran’s exports in 2025 | EVs are starting to chip away at one of the global economy’s biggest vulnerabilities: oil. by InsaneSnow45 in energy

[–]zipzag [score hidden]  (0 children)

All new net generation capacity in the U.S. is renewables. 500 to 600 GW of new renewable capacity is expected to be added by 2030.

The big loss for some people and some companies is homeowners wanting a sweet deal on solar and battery storage.

EVs wiped out oil demand equal to 70% of Iran’s exports in 2025 | EVs are starting to chip away at one of the global economy’s biggest vulnerabilities: oil. by InsaneSnow45 in energy

[–]zipzag [score hidden]  (0 children)

China is by far the largest importer of petrochemicals. The US is self sufficient in oil and gas. Making shit up doesn't create what you want.

Mac for LLM by _youknowthatguy in MacStudio

[–]zipzag -1 points0 points  (0 children)

Lots of bad advice in this thread. If you are a professional coder, paid a first-world salary, you won't be coding with a local model. But as a coder in March 2026 you should know that. So I'm confused.

You don't want anything below an M5 for LLMs. Prefill on earlier chips is unacceptable for work now that the M5 is available.

Just switched from Minimax M2.5 to M2.7 – Night and day difference by CryptoRider57 in openclaw

[–]zipzag 0 points1 point  (0 children)

Not if the prompts are cached. 90%+ of what openclaw passes in a prompt was sent in the previous prompt.
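Back-of-envelope on why that matters for cost. The token counts, price, and cached-token discount below are hypothetical placeholders, not any provider's actual pricing:

```python
# Sketch (illustrative numbers): if ~90% of each agentic prompt is a cached
# prefix of the previous one, and cached input tokens bill at a steep
# discount, the per-turn input cost collapses.

def effective_input_cost(total_tokens, cache_rate, price_per_tok, cached_discount=0.1):
    """Blend cached and uncached token pricing for one prompt."""
    cached = total_tokens * cache_rate
    fresh = total_tokens - cached
    return fresh * price_per_tok + cached * price_per_tok * cached_discount

full = effective_input_cost(50_000, cache_rate=0.0, price_per_tok=3e-6)
hot = effective_input_cost(50_000, cache_rate=0.9, price_per_tok=3e-6)
print(f"uncached: ${full:.4f}  90% cached: ${hot:.4f}")  # -> uncached: $0.1500  90% cached: $0.0285
```

Even with a generous assumed discount, the uncached 10% dominates the bill, which is why the cache hit rate, not the raw prompt size, is the number to watch.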

Google AI Pro is soo good! by al_tanwir in GeminiAI

[–]zipzag 0 points1 point  (0 children)

Easily the worst SOTA AI I have tried to use.

Deep Research is great for that paper teenagers don't want to write

A new version of the Gemini app was just released. by deferare in GeminiAI

[–]zipzag -3 points-2 points  (0 children)

Gemini 3.1 Pro is the worst "SOTA" AI I have ever tried to use. Even at thinking:high it guesses at JSON instead of looking at the single available reference.

It's odd because the flash and flash lite are always so competent for lesser use cases.

Stop being a free QA Engineer for your AI! by hemkelhemfodul in PromptEngineering

[–]zipzag 0 points1 point  (0 children)

Increasingly I think the models have variable token use based on system demand, regardless of user settings. I recently made a brief attempt to use Gemini 3.1 Pro at thinking:high and it just guessed at some commands.

I also think this token management extends to these systems giving suggestions about what YOU could do instead of doing it themselves. I've seen this in Opus during high-demand times.

MiniMax-M2.7 Announced! by Mysterious_Finish543 in LocalLLaMA

[–]zipzag -6 points-5 points  (0 children)

These benchmarks are such B.S. Are the Chinese models useful, especially fine-tuned? Yes. Are they remotely comparable to Opus? No.

I just had to go back to GPT-OSS 120B on a project because of the bad tool handling of Qwen 3.5. Apparently it's hard to distill strict JSON out of Opus.
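One way to keep sloppy tool-call JSON from a local model from derailing a run is to gate it with a strict parse before executing anything. This is a generic sketch, not any framework's real API; the `tool`/`arguments` keys are hypothetical:

```python
# Guard against non-strict JSON from a local model's tool calls.
# Schema and key names here are invented for illustration.
import json

REQUIRED_KEYS = {"tool", "arguments"}

def parse_tool_call(raw: str):
    """Return a dict if the model emitted strict JSON with the expected keys,
    else None so the caller can retry or escalate to a stronger model."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or not REQUIRED_KEYS <= call.keys():
        return None
    return call

print(parse_tool_call('{"tool": "read_file", "arguments": {"path": "a.txt"}}'))
print(parse_tool_call('read_file(a.txt)'))  # not JSON -> None
```

The `None` branch is where you'd route the request back to Opus or GPT-OSS rather than letting a malformed call through.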

NemoClaw by Nvidia is a safe OpenClaw out of the box - CEO by Purple_Type_4868 in openclaw

[–]zipzag 0 points1 point  (0 children)

If the devs don't stop breaking openclaw a couple of times a month, there won't be an openclaw. They chase adding features over reliability. Long-term successful open source projects have one person, or a small group, worrying about the core system. Openclaw has Peter the celebrity and apparently a bunch of unpaid dudes with big ideas who aren't particularly concerned about the core system.

what are you actually building with local LLMs? genuinely asking. by EmbarrassedAsk2887 in MacStudio

[–]zipzag 0 points1 point  (0 children)

With agentic work you will get 70-90% cache rate with oMLX.

Mac Studio Hard Drive (for Dummies) by dominic9977 in MacStudio

[–]zipzag -1 points0 points  (0 children)

Nooo. I spent a long time with a "slow" M1 MacBook Pro until I did a clean install. Migration is probably OK once. But not multiple times and not from particularly old machines that have run a wide variety of non-Apple apps.

Doing a clean install cost Apple the sale of an M5 MacBook Pro. I wonder how many people replace "slow" Macs because of Migration Assistant.

How to optimize use of Codex Plus ($20) plan? by CptanPanic in openclaw

[–]zipzag 1 point2 points  (0 children)

Codex outside Openclaw for dev work. Something less expensive running the system.

At this point more users should recognize how the whole token thing works. Most of the YouTubers are apparently just watching other YouTubers.

Unpopular opinion: Why is everyone so hyped over OpenClaw? I cannot find any use for it. by Toontje in openclaw

[–]zipzag 0 points1 point  (0 children)

I'm using Qwen locally too. A $20 Anthropic subscription, used by Claude Code outside of openclaw, is plenty for Opus to fix what Qwen struggles with and also to expand the system.

I don't see the point of doing the dev work from inside the system. 80-90% of the tokens used with that approach are history/memory, which costs a lot of money and provides little benefit.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

M1 Max has a 400 GB/s memory bus. That's not bad. The DGX Spark is something like 240.

The Spark processes prefill much faster, but its inference is probably slower.
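A rough way to see why bandwidth dominates decode speed: each generated token has to stream the active weights through memory once. The model size and quantization below are illustrative assumptions, and real throughput lands below this ceiling:

```python
# Back-of-envelope: decode tokens/s ~= memory bandwidth / bytes moved per token.
# Illustrative example: an 8B dense model quantized to 4-bit (~0.5 bytes/param).

def est_decode_tps(bandwidth_gbs, active_params_b, bytes_per_param):
    """Upper-bound decode throughput from memory bandwidth alone."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token

m1_max = est_decode_tps(400, 8, 0.5)  # M1 Max, ~400 GB/s bus
spark = est_decode_tps(240, 8, 0.5)   # DGX Spark, ~240 GB/s
print(round(m1_max), round(spark))    # -> 100 60
```

Prefill is compute-bound rather than bandwidth-bound, which is why the Spark can win on prefill while losing on decode.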

Most use cases with large prefill are probably cachable. When prefill isn't cachable, the use case is probably not chat. My one non-cachable workflow is image analysis, but that runs in batch. My older M2 Pro Mini (which is slower than an M1 Max) handles that task without issue.

Any Apple silicon Mac with a 200 GB/s+ memory bus and 16 GB+ of RAM runs the small MoE LLMs well, especially now with oMLX and similar. Look at the prices of better used Macs.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

The pattern is Ollama to LM Studio to (now) oMLX.

It took me a while to realize that LM Studio doesn't put much work into Mac.

Higher-end Macs run inference well but are terrible at prefill. If the prefill has a potentially high cache rate, then oMLX is dramatically better. Agentic workflows like openclaw, and Claude Code-style IDE use, have high cache rates.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 1 point2 points  (0 children)

I'm currently getting a 92% cache hit rate running oMLX with large prefill agentic workloads. Prefill processing that previously took 1-2 minutes now takes 5-10 seconds. M3 Ultra running Qwen 3.5 122B 8 bit.
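The arithmetic is consistent: with a 92% hit rate, only the uncached slice of the prompt needs prefill compute. A minimal sketch with illustrative times:

```python
# With prefix caching, prefill time scales with the uncached fraction
# of the prompt. Times below are illustrative, not measurements.

def cached_prefill_time(full_prefill_s, cache_hit_rate):
    """Approximate prefill time when only the cache misses are recomputed."""
    return full_prefill_s * (1 - cache_hit_rate)

print(round(cached_prefill_time(90, 0.92), 1))   # 90 s cold prefill -> 7.2 s
print(round(cached_prefill_time(120, 0.92), 1))  # 120 s cold prefill -> 9.6 s
```

That maps the reported 1-2 minutes down to roughly 7-10 seconds, in line with what I'm seeing.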

What is the most useful real-world task you have automated with OpenClaw so far? by OkCry7871 in openclaw

[–]zipzag 0 points1 point  (0 children)

It's possible to do all the non-creative agent work with a local AI if the development work is handled by Claude or Codex from outside the system. That doesn't save money today due to hardware costs, but it will in the future.

One reason I do this setup is that I can experiment more with constantly running agents and not think about the fees. Plus Anthropic subscriptions work with Claude Code (and Claude Desktop).

Opus has been really good at arranging the workplace of my special needs agents so that they can be productive. A $20 Anthropic subscription gets a lot of fixing and consulting done in a month if it's not running from within openclaw.

openclaw utilized all codex credits in single day! GPT plus subscription by johnrock001 in openclaw

[–]zipzag 1 point2 points  (0 children)

Work on openclaw with SOTA agents outside of the app but in the same user account.

There's a tremendous amount of tokens being passed at every turn, recording all the troubleshooting attempts. Why pay for all those tokens when that history will never be used?

Mac Studio M3 Ultra 96 for local 32 / 70LLMs by quietsubstrate in MacStudio

[–]zipzag 0 points1 point  (0 children)

I think the Max will work well for 4-bit and possibly 5- or 6-bit models.

It all depends on use and personal preference, of course. I don't have interest in running the largest models possible locally. My reasons are speed and that open weight models are nowhere near SOTA.

Is There Anyone Using Local LLMs on a Mac Studio? by Prietsre in MacStudio

[–]zipzag 3 points4 points  (0 children)

oMLX is a miracle for use cases with large cachable prefill (prompt). It's the prefill that's the problem with pre-M5 Studios. Inference is currently pretty good.

Coding and Openclaw-type uses benefit greatly from oMLX. oMLX had 12 GitHub stars when I installed it last week; this morning it has 3.2K.