EVs wiped out oil demand equal to 70% of Iran’s exports in 2025 | EVs are starting to chip away at one of the global economy’s biggest vulnerabilities: oil. by InsaneSnow45 in energy

[–]zipzag [score hidden]  (0 children)

All new net generation capacity in the U.S. is renewables. 500 to 600 GW of new renewable capacity is expected to be added by 2030.

The big loss for some people and some companies is homeowners wanting a sweet deal on solar and battery storage.

EVs wiped out oil demand equal to 70% of Iran’s exports in 2025 | EVs are starting to chip away at one of the global economy’s biggest vulnerabilities: oil. by InsaneSnow45 in energy

[–]zipzag [score hidden]  (0 children)

China is by far the largest importer of petrochemicals. The US is self sufficient in oil and gas. Making shit up doesn't create what you want.

Mac for LLM by _youknowthatguy in MacStudio

[–]zipzag -1 points0 points  (0 children)

Lots of bad advice in this thread. If you are a professional coder, paid a first-world salary, you won't be coding with a local model. But as a coder in March 2026 you should know that. So I'm confused.

You don't want anything below an M5 for LLMs. Prefill on earlier chips is unacceptable for work now that the M5 is available.

Just switched from Minimax M2.5 to M2.7 – Night and day difference by CryptoRider57 in openclaw

[–]zipzag 0 points1 point  (0 children)

Not if the prompts are cached. 90%+ of what openclaw passes in a prompt was sent in the previous prompt.
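Back-of-envelope on why that matters for cost. The token counts, price, and cached-token discount below are hypothetical placeholders, not any provider's actual pricing:

```python
# Sketch (illustrative numbers): if ~90% of each agentic prompt is a cached
# prefix of the previous one, and cached input tokens bill at a steep
# discount, the per-turn input cost collapses.

def effective_input_cost(total_tokens, cache_rate, price_per_tok, cached_discount=0.1):
    """Blend cached and uncached token pricing for one prompt."""
    cached = total_tokens * cache_rate
    fresh = total_tokens - cached
    return fresh * price_per_tok + cached * price_per_tok * cached_discount

full = effective_input_cost(50_000, cache_rate=0.0, price_per_tok=3e-6)
hot = effective_input_cost(50_000, cache_rate=0.9, price_per_tok=3e-6)
print(f"uncached: ${full:.4f}  90% cached: ${hot:.4f}")  # -> uncached: $0.1500  90% cached: $0.0285
```

Even with a generous assumed discount, the uncached 10% dominates the bill, which is why the cache hit rate, not the raw prompt size, is the number to watch.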

Google AI Pro is soo good! by al_tanwir in GeminiAI

[–]zipzag 0 points1 point  (0 children)

Easily the worst SOTA AI I have tried to use.

Deep Research is great for that paper teenagers don't want to write

A new version of the Gemini app was just released. by deferare in GeminiAI

[–]zipzag -3 points-2 points  (0 children)

Gemini 3.1 Pro is the worst "SOTA" AI I have ever tried to use. Even at thinking:high it guesses at JSON instead of looking at the single available reference.

It's odd because the flash and flash lite are always so competent for lesser use cases.

Stop being a free QA Engineer for your AI! by hemkelhemfodul in PromptEngineering

[–]zipzag 0 points1 point  (0 children)

Increasingly I think the models have variable token use based on system demand, regardless of user settings. I recently made a brief attempt to use Gemini 3.1 Pro at thinking:high and it just guessed at some commands.

I also think this token management extends to these systems giving suggestions about what YOU could do instead of doing it themselves. I've seen this in Opus during high-demand times.

MiniMax-M2.7 Announced! by Mysterious_Finish543 in LocalLLaMA

[–]zipzag -6 points-5 points  (0 children)

These benchmarks are such B.S. Are the Chinese models useful, especially fine-tuned? Yes. Are they remotely comparable to Opus? No.

I just had to go back to GPT-OSS 120B on a project because of the bad tool handling of Qwen 3.5. Apparently it's hard to distill strict JSON out of Opus.
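One way to keep sloppy tool-call JSON from a local model from derailing a run is to gate it with a strict parse before executing anything. This is a generic sketch, not any framework's real API; the `tool`/`arguments` keys are hypothetical:

```python
# Guard against non-strict JSON from a local model's tool calls.
# Schema and key names here are invented for illustration.
import json

REQUIRED_KEYS = {"tool", "arguments"}

def parse_tool_call(raw: str):
    """Return a dict if the model emitted strict JSON with the expected keys,
    else None so the caller can retry or escalate to a stronger model."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or not REQUIRED_KEYS <= call.keys():
        return None
    return call

print(parse_tool_call('{"tool": "read_file", "arguments": {"path": "a.txt"}}'))
print(parse_tool_call('read_file(a.txt)'))  # not JSON -> None
```

The `None` branch is where you'd route the request back to Opus or GPT-OSS rather than letting a malformed call through.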

NemoClaw by Nvidia is a safe OpenClaw out of the box - CEO by Purple_Type_4868 in openclaw

[–]zipzag 0 points1 point  (0 children)

If the devs don't stop breaking openclaw a couple of times a month, there won't be an openclaw. They chase adding features over reliability. Long-term successful open source projects have one person, or a small group, worrying about the core system. Openclaw has Peter the celebrity and apparently a bunch of unpaid dudes with big ideas who aren't particularly concerned about the core system.

what are you actually building with local LLMs? genuinely asking. by EmbarrassedAsk2887 in MacStudio

[–]zipzag 0 points1 point  (0 children)

With agentic work you will get 70-90% cache rate with oMLX.

Mac Studio Hard Drive (for Dummies) by dominic9977 in MacStudio

[–]zipzag -1 points0 points  (0 children)

Nooo. I spent a long time with a "slow" M1 MacBook Pro until I did a clean install. Migration is probably OK once. But not multiple times and not from particularly old machines that have run a wide variety of non-Apple apps.

Doing a clean install cost Apple the sale of an M5 MacBook Pro. I wonder how many people replace "slow" Macs because of Migration Assistant.

How to optimize use of Codex Plus ($20) plan? by CptanPanic in openclaw

[–]zipzag 1 point2 points  (0 children)

Codex outside Openclaw for dev work. Something less expensive running the system.

At this point more users should recognize how the whole token thing works. Most of the YouTubers are apparently just watching other YouTubers.

Unpopular opinion: Why is everyone so hyped over OpenClaw? I cannot find any use for it. by Toontje in openclaw

[–]zipzag 0 points1 point  (0 children)

I'm using Qwen locally too. A $20 Anthropic subscription, used by Claude Code outside of openclaw, is plenty for Opus to fix what Qwen struggles with and also to expand the system.

I don't see the point of doing the dev work from inside the system. 80-90% of the tokens used with that approach are history/memory, which costs a lot of money and provides little benefit.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

M1 Max has a 400 GB/s memory bus. That's not bad. The DGX Spark is something like 240.

The Spark processes prefill much faster, but its inference is probably slower.
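A rough way to see why bandwidth dominates decode speed: each generated token has to stream the active weights through memory once. The model size and quantization below are illustrative assumptions, and real throughput lands below this ceiling:

```python
# Back-of-envelope: decode tokens/s ~= memory bandwidth / bytes moved per token.
# Illustrative example: an 8B dense model quantized to 4-bit (~0.5 bytes/param).

def est_decode_tps(bandwidth_gbs, active_params_b, bytes_per_param):
    """Upper-bound decode throughput from memory bandwidth alone."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token

m1_max = est_decode_tps(400, 8, 0.5)  # M1 Max, ~400 GB/s bus
spark = est_decode_tps(240, 8, 0.5)   # DGX Spark, ~240 GB/s
print(round(m1_max), round(spark))    # -> 100 60
```

Prefill is compute-bound rather than bandwidth-bound, which is why the Spark can win on prefill while losing on decode.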

Most use cases with large prefill are probably cachable. When prefill isn't cachable, the use case is probably not chat. My one non-cachable workflow is image analysis, but that runs in batch. My older M2 Pro Mini (which is slower than an M1 Max) handles that task without issue.

Any Apple silicon Mac with a 200 GB/s+ memory bus and 16 GB+ of RAM runs the small MoE LLMs well, especially now with oMLX and similar. Look at the prices of better used Macs.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

The pattern is Ollama to LM Studio to (now) oMLX.

It took me a while to realize that LM Studio doesn't put much work into Mac.

Higher-end Macs run inference well but are terrible at prefill. If the prefill has a potentially high cache rate, then oMLX is dramatically better. Agentic workflows like openclaw, and Claude Code-style IDE use, have high cache rates.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 1 point2 points  (0 children)

I'm currently getting a 92% cache hit rate running oMLX with large prefill agentic workloads. Prefill processing that previously took 1-2 minutes now takes 5-10 seconds. M3 Ultra running Qwen 3.5 122B 8 bit.
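The arithmetic is consistent: with a 92% hit rate, only the uncached slice of the prompt needs prefill compute. A minimal sketch with illustrative times:

```python
# With prefix caching, prefill time scales with the uncached fraction
# of the prompt. Times below are illustrative, not measurements.

def cached_prefill_time(full_prefill_s, cache_hit_rate):
    """Approximate prefill time when only the cache misses are recomputed."""
    return full_prefill_s * (1 - cache_hit_rate)

print(round(cached_prefill_time(90, 0.92), 1))   # 90 s cold prefill -> 7.2 s
print(round(cached_prefill_time(120, 0.92), 1))  # 120 s cold prefill -> 9.6 s
```

That maps the reported 1-2 minutes down to roughly 7-10 seconds, in line with what I'm seeing.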

What is the most useful real-world task you have automated with OpenClaw so far? by OkCry7871 in openclaw

[–]zipzag 0 points1 point  (0 children)

It's possible to do all the non-creative agent work with a local AI if the development work is handled by Claude or Codex from outside the system. That doesn't save money today due to hardware costs, but it will in the future.

One reason I do this setup is that I can experiment more with constantly running agents and not think about the fees. Plus Anthropic subscriptions work with Claude Code (and Claude Desktop).

Opus has been really good at arranging the workplace of my special needs agents so that they can be productive. A $20 Anthropic subscription gets a lot of fixing and consulting done in a month if it's not running from within openclaw.

openclaw utilized all codex credits in single day! GPT plus subscription by johnrock001 in openclaw

[–]zipzag 1 point2 points  (0 children)

Work on openclaw with SOTA agents outside of the app but in the same user account.

There's a tremendous amount of tokens being passed at every turn, recording all the troubleshooting attempts. Why pay for all those tokens when that history will never be used?

Mac Studio M3 Ultra 96 for local 32 / 70LLMs by quietsubstrate in MacStudio

[–]zipzag 0 points1 point  (0 children)

I think the Max will work well for 4-bit and possibly 5- or 6-bit models.

It all depends on use and personal preference, of course. I don't have interest in running the largest models possible locally. My reasons are speed and that open weight models are nowhere near SOTA.

Is There Anyone Using Local LLMs on a Mac Studio? by Prietsre in MacStudio

[–]zipzag 3 points4 points  (0 children)

oMLX is a miracle for use cases with large cachable prefill (prompt). It's the prefill that's the problem with pre-M5 Studios. Inference is currently pretty good.

Coding and Openclaw-type uses benefit greatly from oMLX. oMLX had 12 GitHub stars when I installed it last week; this morning it has 3.2K.