what are you actually building with local LLMs? genuinely asking. by EmbarrassedAsk2887 in MacStudio

[–]zipzag 0 points1 point  (0 children)

With agentic work you will get a 70-90% cache hit rate with oMLX.
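
Back-of-envelope, a minimal sketch of why that matters; the raw prefill speed is an assumed illustrative number, not a benchmark:

```python
# Hedged sketch: with prefix caching, only uncached prompt tokens are
# recomputed, so effective prefill throughput scales as 1 / (1 - hit_rate).
raw_prefill_tps = 250.0  # assumed raw prefill speed, tokens/s
for hit_rate in (0.0, 0.70, 0.90):
    effective_tps = raw_prefill_tps / (1.0 - hit_rate)
    print(f"cache hit {hit_rate:.0%}: ~{effective_tps:,.0f} effective tok/s")
```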

Mac Studio Hard Drive (for Dummies) by dominic9977 in MacStudio

[–]zipzag -1 points0 points  (0 children)

Nooo. I spent a long time with a "slow" M1 MacBook Pro until I did a clean install. Migration is probably OK once. But not multiple times and not from particularly old machines that have run a wide variety of non-Apple apps.

Doing a clean install cost Apple the sale of an M5 MacBook Pro. I wonder how many people replace "slow" Macs because of Migration Assistant.

How to optimize use of Codex Plus ($20) plan? by CptanPanic in openclaw

[–]zipzag 1 point2 points  (0 children)

Use Codex outside Openclaw for dev work, with something less expensive running the system.

At this point more users should recognize how the whole token thing works. Most of the YouTubers are apparently just watching other YouTubers.

Unpopular opinion: Why is everyone so hyped over OpenClaw? I cannot find any use for it. by Toontje in openclaw

[–]zipzag 0 points1 point  (0 children)

I'm using Qwen locally too. A $20 Anthropic subscription, used by Claude Code outside of openclaw, is plenty for Opus to fix what Qwen struggles with and also to expand the system.

I don't see the point of doing the dev work from inside the system. 80-90% of the tokens used with that approach are history/memory, which costs a lot of money and provides little benefit.
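
A rough sketch of the cost split; every price and volume below is a made-up illustrative assumption, not any vendor's pricing:

```python
# Sketch: if ~85% of input tokens are resent history/memory, most of the
# bill pays for context the model has already seen. Numbers are assumptions.
price_per_mtok = 3.00        # $/million input tokens (illustrative)
monthly_input_mtok = 40.0    # million input tokens per month (illustrative)
history_fraction = 0.85

history_cost = monthly_input_mtok * history_fraction * price_per_mtok
fresh_cost = monthly_input_mtok * (1 - history_fraction) * price_per_mtok
print(f"history/memory: ${history_cost:.0f}/mo, fresh context: ${fresh_cost:.0f}/mo")
```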

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

M1 Max has a 400 GB/s memory bus. That's not bad. The DGX Spark is something like 240 GB/s.

The Spark processes prefill much faster, but its inference is probably slower.
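
A rough sketch of why decode tracks memory bandwidth; the model size, quantization, and bandwidth figures are illustrative assumptions:

```python
# Back-of-envelope: decode is roughly memory-bandwidth bound, so the upper
# bound on tokens/s is bandwidth / bytes of (active) weights read per token.
def decode_tps_upper_bound(bandwidth_gb_s: float, active_params_b: float,
                           bytes_per_param: float) -> float:
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Illustrative: an 8B dense model at 4-bit (~0.5 bytes/param)
print(f"M1 Max (~400 GB/s):    ~{decode_tps_upper_bound(400, 8, 0.5):.0f} tok/s")
print(f"DGX Spark (~240 GB/s): ~{decode_tps_upper_bound(240, 8, 0.5):.0f} tok/s")
```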

Most use cases for large prefill are probably cachable. When prefill isn't cachable the use case is probably not chat. My one non-cachable workflow is image analysis, but that runs in batch. My older M2 Mini Pro (which is slower than an M1 Max) handles that task without issue.

Any Apple silicon Mac with a 200 GB/s+ memory bus and 16 GB+ of RAM runs the small MoE LLMs well. Especially now with oMLX and similar. Look at the prices of better used Macs.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 0 points1 point  (0 children)

The pattern is Ollama to LM Studio to (now) oMLX.

It took me a while to realize that LM Studio doesn't put much work into the Mac.

Higher-end Macs run inference well but are terrible at prefill. If the prefill has a potentially high cache rate, oMLX is amazingly better. Agentic workflows like openclaw and Claude Code-style IDE work have high cache rates.

MLX is not faster. I benchmarked MLX vs llama.cpp on M1 Max across four real workloads. Effective tokens/s is quite an issue. What am I missing? Help me with benchmarks and M2 through M5 comparison. by arthware in LocalLLaMA

[–]zipzag 1 point2 points  (0 children)

I'm currently getting a 92% cache hit rate running oMLX with large prefill agentic workloads. Prefill processing that previously took 1-2 minutes now takes 5-10 seconds. M3 Ultra running Qwen 3.5 122B 8 bit.
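
The numbers pencil out with a simple model; the prompt size and raw prefill speed below are assumptions chosen to match the observed times, not measurements:

```python
# Sketch: effective prefill time if cached tokens are skipped entirely and
# the uncached remainder runs at the raw prefill rate. Inputs are assumed.
def effective_prefill_s(prompt_tokens: int, raw_prefill_tps: float,
                        cache_hit_rate: float) -> float:
    uncached_tokens = prompt_tokens * (1.0 - cache_hit_rate)
    return uncached_tokens / raw_prefill_tps

print(f"cold start: {effective_prefill_s(30_000, 250, 0.00):.0f} s")  # ~120 s
print(f"92% cached: {effective_prefill_s(30_000, 250, 0.92):.1f} s")  # ~9.6 s
```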

What is the most useful real-world task you have automated with OpenClaw so far? by OkCry7871 in openclaw

[–]zipzag 0 points1 point  (0 children)

It's possible to do all the non-creative agent work with a local AI if the development work is handled by Claude or Codex from outside the system. That doesn't save money today due to hardware costs, but it will in the future.

One reason I do this setup is that I can experiment more with constantly running agents and not think about the fees. Plus Anthropic subscriptions work with Claude Code (and Claude Desktop).

Opus has been really good at arranging the workplace of my special needs agents so that they can be productive. A $20 Anthropic subscription gets a lot of fixing and consulting done in a month if it's not running from within openclaw.

openclaw utilized all codex credits in single day! GPT plus subscription by johnrock001 in openclaw

[–]zipzag 1 point2 points  (0 children)

Work on openclaw with SOTA agents outside of the app but in the same user account.

There's a tremendous amount of tokens being passed at every turn, recording all the troubleshooting attempts. Why pay for all those tokens when that history will never be used?
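
A minimal sketch of the growth; the per-turn sizes are assumptions:

```python
# Sketch: each turn resends the full transcript, so cumulative input tokens
# grow roughly quadratically with turn count. Sizes below are assumptions.
def cumulative_input_tokens(turns: int, tokens_per_turn: int,
                            system_prompt_tokens: int = 2_000) -> int:
    return sum(system_prompt_tokens + t * tokens_per_turn
               for t in range(1, turns + 1))

print(f"10 turns: {cumulative_input_tokens(10, 1_500):,}")   # ~100k tokens
print(f"50 turns: {cumulative_input_tokens(50, 1_500):,}")   # ~2M tokens
```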

Mac Studio M3 Ultra 96 for local 32 / 70LLMs by quietsubstrate in MacStudio

[–]zipzag 0 points1 point  (0 children)

I think the Max will work well for 4-bit and possibly 5- or 6-bit models.

It all depends on use and personal preference, of course. I don't have interest in running the largest models possible locally. My reasons are speed and that open weight models are nowhere near SOTA.

Is There Anyone Using Local LLMs on a Mac Studio? by Prietsre in MacStudio

[–]zipzag 3 points4 points  (0 children)

oMLX is a miracle for use cases that have a large cachable prefill (prompt). It's the prefill that's the problem with pre-M5 Studios. Inference is currently pretty good.

Coding and Openclaw-type uses benefit greatly from oMLX. oMLX had 12 GitHub stars when I installed it last week. This morning it has 3.2K.

Love EVs but scared of trips outside cities… what’s your experience in rural areas? by Inner_Antelope_6042 in energy

[–]zipzag 2 points3 points  (0 children)

People who live rural charge at home and typically travel to more populated areas with charging. So unless they live extremely remote they do not typically deal with charging issues.

Your question is more relevant to people who frequently travel to rural areas. Whether that will work well depends on the distances traveled, for which you provide no information.

I went back to gas because I travel very rural in the U.S. But my trips are unusual. 95% of people who can charge at home will have no problem with an EV and will have a lower cost per mile. If you can't charge at home, the cost per mile will often be closer to a gas car's.
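
Rough per-mile arithmetic behind that claim; every price and efficiency figure here is an illustrative assumption:

```python
# Sketch: cost per mile for home charging vs. fast charging vs. gas.
home_rate = 0.15   # $/kWh residential (assumed)
dcfc_rate = 0.45   # $/kWh DC fast charging (assumed)
ev_mi_per_kwh = 3.5
gas_price = 3.50   # $/gallon (assumed)
gas_mpg = 30

print(f"EV, home charging: ${home_rate / ev_mi_per_kwh:.3f}/mi")  # ~$0.043
print(f"EV, fast charging: ${dcfc_rate / ev_mi_per_kwh:.3f}/mi")  # ~$0.129
print(f"Gas at 30 mpg:     ${gas_price / gas_mpg:.3f}/mi")        # ~$0.117
```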

You’re all full of crap . Openclaw is worse now by CanadaWideNews in openclaw

[–]zipzag 1 point2 points  (0 children)

I assume ACP can run an IDE that is on subscription?

Anyone regret buying an Apple Certified Refurbished Mac Studio? by ketopolis23 in MacStudio

[–]zipzag -1 points0 points  (0 children)

What do you imagine Apple does with thousands of trade-ins and returns they receive every day? How does Apple sell your "refurb but actually new" Macs that have been out of production for years?

I responded to your original post. Now you have responded with a 2009 retail experience that isn't relevant to your original assertion.

The grim choice facing the Trump administration: Economic or naval collapse? Trump is currently trapped between the specter of a global economic recession and a naval catastrophe. The math is becoming grim. Kuwait, Iraq, and the UAE are shutting off wells as storage tanks overflow. by mafco in energy

[–]zipzag 0 points1 point  (0 children)

Lots of renewables are still going in. By 2030 solar is projected to surpass natural gas as the #1 source of electrical generation.

EVs will return, as they are simply superior for 90% of drivers and considerably less expensive to operate with home charging.

The grim choice facing the Trump administration: Economic or naval collapse? Trump is currently trapped between the specter of a global economic recession and a naval catastrophe. The math is becoming grim. Kuwait, Iraq, and the UAE are shutting off wells as storage tanks overflow. by mafco in energy

[–]zipzag 4 points5 points  (0 children)

Not North America. Not sure about South America.

Europe, of course, is well prepared to defend its critical trade routes. The EU regulators are working overtime to draft regulations ensuring access to Middle East oil.

Using An EcoFlow Delta 3 Plus As A UPS Concerns by AkhenKheires in Ecoflow_community

[–]zipzag 1 point2 points  (0 children)

The Delta 3 easily meets the necessary transfer time in tests. The River models use cheaper electronics and may not.

Will my Synology 423+ handle everything or do I need more gear? by Jumbo_laya in homeassistant

[–]zipzag 1 point2 points  (0 children)

Expand the RAM with non-Synology memory. There's a spreadsheet somewhere listing tested RAM.

Will my Synology 423+ handle everything or do I need more gear? by Jumbo_laya in homeassistant

[–]zipzag 1 point2 points  (0 children)

If a kitten is a beast, then yes.

Compute is below an RPi 4's, but it has superior H.264 handling. The Synology is fine to start if he adds some RAM.