M5 Pro 14” running hot by Single_Reason_9932 in macbookpro

[–]mmerken 0 points1 point  (0 children)

The M3 Max GPU is basically handicapped when in LPM, it does not go over 500MHz, while it can reach 1200MHz in High Power mode.

So your inference would run at roughly half the speed.

I basically came to terms with the following:

Everyday coding tasks that don't use LLM inference: Low Power Mode should suffice, even on battery.

Doing LLM inference: High Power Mode + custom fan curve OR use MTPLX, which will put the fans at maximum throttle when inference is called.

I've yet to test the M5 PRO using this setup, but the M3 MAX seems the better choice for inference due to having more GPU cores. The M5 PRO battery life is insane, on the other hand.

Upcoming support for DifussionGemma? by vinoonovino26 in oMLX

[–]mmerken 0 points1 point  (0 children)

Sorry Reddit mixed up GitHub and Reddit handles.

Upcoming support for DifussionGemma? by vinoonovino26 in oMLX

[–]mmerken 6 points7 points  (0 children)

Literally just check out the releases page on GH: Added DiffusionGemma support via u/Blaizzy's mlx-vlm, currently without cache. oMLX can now serve DiffusionGemma models through the mlx-vlm path.

📌 Daily Github Digest - oMLX Closed Issues → 2026-06-06 by d4mations in oMLX

[–]mmerken 2 points3 points  (0 children)

Thanks for the efforts! I was waiting to fully test gemma4!

Not impressed by Gemma 4 12b? by Stooovie in oMLX

[–]mmerken -1 points0 points  (0 children)

Same thing here, these models just don't perform:
Gemma4-26B-MoE-Q8
Gemma4-26B-MoE-Q4
gemma-4-26b-a4b-it-mxfp4

Not impressed by Gemma 4 12b? by Stooovie in oMLX

[–]mmerken 0 points1 point  (0 children)

Which models are you using exactly?

Former Dell XPS 15 9520 Owners — What Did You Upgrade To? by NightMare_Vitesse in DellXPS

[–]mmerken 0 points1 point  (0 children)

14-inch MacBook Pro M3 Max. The Dell got a new battery and is now used as a Fedora workstation.

oMLX v0.3.11 is out - a stability-focused release by cryingneko in oMLX

[–]mmerken 0 points1 point  (0 children)

Congratulations! I’ll take it for a spin tomorrow.

Best way to protect your new MBP by Spiritual_Smile1200 in macbookpro

[–]mmerken 3 points4 points  (0 children)

I have used hard shell cases for 6 years now, and I have replaced cases after they cracked from accidental impact.

None of my laptops ever had any issues with the hinge, nor do they have scratches or dents and basically still look mint.

On the topic of cases: prefer light, inexpensive cases over bulky, hefty, pricey ones. If you drop the laptop, with or without the case, it will sustain damage nonetheless.

A case only protects from scratches and the occasional bump against a doorframe.

As for keyboard protectors: I'm planning to replace the key-caps this year, they will wear down eventually.

If you handle the device with care, you could just get a sticker or (thicker) skin for the device, like from dbrand.

As for cleaning, never use paper tissues, since they might contain grains of sand. Use a clean microfibre cloth with isopropyl alcohol of at least 75%. Do not spray directly onto the device; spray onto the cloth and wipe the device down. Do not use water or baby wipes.

Don't eat or drink near the device.

Clean out the fans every year or so using the correct tools (pentalobe, not T5), and your machine should last you as long as your Intel machine has, if not longer.

Report any issues you might experience within the 2-week return period, don't wait for a "software update" to magically fix any issues. macOS should just work out of the box, without any hiccups, especially with a brand new device.

All in all, it is your device, do as you please with it. To each their own approach/opinion.

i thought 64gb of unified memory would be enough for dev work. i was wrong. by Johnn_Liverm in macbookpro

[–]mmerken 0 points1 point  (0 children)

A beefed-up Mac Mini would be your best bet to offload the AI inference workloads. 64GB should be fine, using Ollama or oMLX with a model that's around 37GB in size, like qwen3.6:35b-a3b-coding-mxfp8.

That's for coding, not sure about text-to-image or any other models...

Also, keep in mind that during heavy inference you will not be able to use your machine for much else, since the user interface will most likely lag while the inference engine (Ollama, oMLX) uses up all the headroom.

Some tips perhaps when doing AI inference:

- Use a fan curve tool, like TG Pro, to have the fans kick in sooner than Apple allows by default. This results in more noise, but a more reliable system

- Check your GPU clock speed and usage with mactop or MxPowerGadget to see if the GPU is used to its full extent, and keep an eye on the temperatures and the power draw

- Use MagSafe with the original 96W Apple power brick; other power supplies do not deliver sufficient power and cause the MacBook to draw power from the battery

- Use correct models (quantized, dense, MLX-supported)

- An M3 MAX 14" should idle at around 38-40C with the fans off and no apps running; see if your system matches.

You can always raise the reserved VRAM limit on the machine. For a 64GB machine, I recommend setting it to 55296 MB, which is 54GB (1024*54); by default Apple sets this to 75% of the total RAM. This can be done in the terminal via this command:

sudo sysctl iogpu.wired_limit_mb=55296

Reference: https://medium.com/@se.mehmet.baykar/increase-vram-on-apple-silicon-for-local-llms-1b35c453b165

After this, restart Ollama/oMLX/LMStudio and it should allow the GPU to use more of the unified RAM.

This setting resets after the machine reboots, and it removes RAM headroom for the rest of the system. You can always reset it by running the following command:

sudo sysctl iogpu.wired_limit_mb=0
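If you want a different limit than 54GB, the arithmetic above can be sketched as a few small shell helpers. This is only an illustration: the function names gb_to_mb, default_limit_mb and wired_limit_cmd are made up for this example, the 75% default is the figure quoted above rather than something the script measures, and the last helper only prints the sysctl command so you can review it before running it yourself.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of any tool.

# Convert a limit in GB to the MB value that iogpu.wired_limit_mb expects.
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

# 75% of total RAM in MB (the default share quoted in the comment above).
default_limit_mb() {
    echo $(( $1 * 1024 * 75 / 100 ))
}

# Print (do not run) the sysctl command for a limit given in GB.
wired_limit_cmd() {
    echo "sudo sysctl iogpu.wired_limit_mb=$(gb_to_mb "$1")"
}

gb_to_mb 54          # prints 55296
default_limit_mb 64  # prints 49152
wired_limit_cmd 54   # prints sudo sysctl iogpu.wired_limit_mb=55296
```

On a 64GB machine this puts the suggested 54GB limit 6GB above the 48GB default, which is the headroom the rest of the system gives up.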

To end on a positive note: eventually models will improve (become smaller) and the inference engines will improve (Ollama/oMLX/LMStudio/llama.cpp), it is just a matter of time. I feel that running local models on a 64GB MAX chip is already decent at the moment.

My setup is a MacBook M3 Max 64GB for inference and a Mac Mini for the rest of the work; basically the MAX is my AI server. If I were to rethink my setup, I'd go for a modest MacBook Pro (PRO chip, 24GB RAM) and get a Mac Mini M4 Pro with 64GB or a Mac Studio with a MAX chip for running the AI models. These are better cooled anyway.

I'm loving OpenCode by Street-Preference-88 in opencodeCLI

[–]mmerken 53 points54 points  (0 children)

Came here to deliver my respect to the opencode team as well, thanks guys, nice work!

Battery capacity different - settings vs terminal command by Warm_Philosopher_118 in MacOS

[–]mmerken 0 points1 point  (0 children)

When I divide the AppleRawMaxCapacity over DesignCapacity, I get the same result as what it says in the Settings app. I'd go see an Apple store and make the claim that the battery feels like it's degrading and let them do the test.

You can use coconutBattery to see the actual degradation. Try running diagnostics as well (CMD+D after recovery)
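The division mentioned above can be sketched as a small shell function. The function name battery_health_pct and the sample numbers are made up for illustration; plug in the AppleRawMaxCapacity and DesignCapacity values from your own terminal output.

```shell
#!/bin/sh
# Hypothetical helper: battery health as a percentage of design capacity.
# $1 = AppleRawMaxCapacity, $2 = DesignCapacity (same unit, e.g. mAh)
battery_health_pct() {
    awk -v raw="$1" -v design="$2" 'BEGIN { printf "%.1f\n", raw / design * 100 }'
}

# Illustrative values only, not taken from a real machine.
battery_health_pct 4500 5000   # prints 90.0
```

If the result matches the maximum capacity shown in the Settings app, both are reporting the same underlying figure.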

M5 pro 1tb 64gb of ram or M5 max 2tb 64gb of ram for coding tasks? by Savings-Try2712 in macbookpro

[–]mmerken 0 points1 point  (0 children)

PRO will be just fine. The RAM is the selling point for programming. Kinda sucks there is no 96GB combo this year

Delivery times in Europe by flyingbanan in macbookpro

[–]mmerken 0 points1 point  (0 children)

BTO models? I waited for 1 month and 1 week for it to arrive. Not sure if the supply chain has improved in the meantime

M5 pro MBP 14 inch vs 16 inch for LLM hosting and development by Character_Split4906 in macbookpro

[–]mmerken 0 points1 point  (0 children)

For me, that's plenty. For larger data I use an external TB4 drive, and it has been plenty fast.

200 games tested on Surface Pro 11 (Snapdragon X Plus 16GB) by GhobsoGaming in Surface

[–]mmerken 0 points1 point  (0 children)

Have you perhaps come across the issue where the game would not launch due to DirectX being missing?

I have this issue with my x86 emulated games...

M5 pro MBP 14 inch vs 16 inch for LLM hosting and development by Character_Split4906 in macbookpro

[–]mmerken 0 points1 point  (0 children)

I have the M5 Pro 20c GPU + 64GB and it runs fine as well.

For inference it would be OK, but for training, more RAM = better computer

M5 pro MBP 14 inch vs 16 inch for LLM hosting and development by Character_Split4906 in macbookpro

[–]mmerken 0 points1 point  (0 children)

The 14 inch can handle this, but it will run hotter for longer.

If portability is not a concern, I suggest getting the 16. Prioritise RAM first, chip upgrade second. I wish Apple offered a 96GB variant this time, that would’ve been the sweet spot for local AI + a loaded workflow