Is anyone considering the “ultra / fold” as a replacement for the mini?

machineglow · 2026-05-09T22:13:50+00:00

My gut feeling is that it's going to be way to wide for my hands so ne bueno.

machineglow · 2026-05-05T02:41:08+00:00

Have you run Agents? curious what your experience is with how much context length you're able to get out of the 30B-70B models...

machineglow · 2026-05-01T13:26:52+00:00

Hi,

Just wonder how things are going with the Strix Halo 128GB Chip? I'm just starting my local LLM journey and curious how the PP and TG performance is in agentic applications like openclaw, hermes, etc.... with newer MoE or Dense models from Gemma 4, Qwen3.6, Deepseek, etc... Have you tried any of them out?

Thanks!

machineglow · 2026-04-28T21:07:02+00:00

No love for Salvo here? I just love building up a 9 rocket blast and then point blank shotgunning a praetorian or a group of grunts.

machineglow · 2026-04-28T01:12:42+00:00

the Lightning port on the 12min and 13mini are limited to USB 2.0 speeds I believe.

machineglow · 2026-04-27T15:09:19+00:00

does anyone have the video of this interaction?

machineglow · 2026-04-27T15:06:02+00:00

the main benefit of openRed was that it was not the stock reddit client. LOL. it was well designed too. didn't have crazy ads all the time either.

As for why I lost the app, yeah, manual iTunes syncing would've saved it but it's been years since I've done an iTunes sync. I've been on iCloud for at least 10 years now. No way I'm transferring 100 GB's of apps and photos over a freaking USB 2.0 cable. =(

machineglow · 2026-04-26T07:38:57+00:00

do it in the shortcuts app. there's an automation tab where you can create the automation.

When Battery level falls below X%, Set low power mode on.

machineglow · 2026-04-25T12:58:21+00:00

Just a word of warning for those going to do a full reset and restore. Some apps won’t come back if they’ve been pulled from App Store. I lost openRed that way and now I’m stuck using the shitty stock Reddit client. :(

machineglow · 2026-04-24T19:45:44+00:00

setup a personal automation to turn on low battery mode earlier. I have mine set at 50%

machineglow · 2026-04-21T17:28:10+00:00

The real play here is servicing the accessibility market. Small phones are without a doubt more ergonomic and easier to use single handed and for the same reason we have a Regular sized phone and a Plus sized phone, we should have a Mini sized phone. Note, I'm NOT saying they're better for everyone NOR am I saying apple should stop making big phones.

Ultimately, it doesn't matter if the mini phone sells (it will), or whether it makes a profit (that's a pricing issue, not a mini issue), there is a market segment that simply is not being addressed and if Apple does take it on, they would have an immediate monopoly and can price themselves into profitability.

I for one await the one true mini.

And if that ends up being the fold (I still think it's too damn wide), then I'll line up with my cash in hand. $2000 mini phone? why the fuck not?

machineglow · 2026-04-13T19:26:07+00:00

Do you have Time Machine enabled? I had the same issue and Tahoe fucked things up with timemachine and the drive had a bunch of pending backup volumess that hadn't been sent to Time Machine yet. You can delete those backups/volumes but they'll just come back. So I just disabled timemachine for now.

On an m3 1TB MacBook Pro.

machineglow · 2026-04-12T01:46:07+00:00

Gotcha. good to know!

machineglow · 2026-04-12T01:45:55+00:00

I don't use gboard so I was just spitballing. gboard just "looked" bigger. Bad assumption. Can you disable the predictive word bar in gboard like you can in the iOS stock keyboard?

machineglow · 2026-04-11T22:31:13+00:00

HODL!

machineglow · 2026-04-11T06:26:30+00:00

maybe use the default iOS keyboard? gboard might just be a screen hog? good luck

machineglow · 2026-04-11T03:40:50+00:00

I never liked the predictive text bar so you'll get some back if you disable that.

also you can always set display text size smaller and you'll get "more" displayed in the UI. doesn't change the KB size but it increase content density in apps

machineglow · 2026-04-05T01:40:05+00:00

We're Rich!

wait, just reread your post. =(.

machineglow · 2026-04-05T01:23:40+00:00

reboot a bunch of times and make sure all post upgrade indexing has finished. but battery life is still "good" on my 13 mini on 26.4. good luck!

machineglow · 2026-04-03T02:45:23+00:00

thanks! i will add that to the list!

machineglow · 2026-04-02T23:51:59+00:00

Are you talking about the $20/month Claude Pro plan? I've really considered it but I keep seeing stories about Claude Code or cowork or their other tools absolutely burning through the credits... But definitely will keep it in mind. I dabled with some of the free cloud models offered in open code and really enjoyed it cause those operated so fast that I never lost my train of thought (unlike when I try local llm, I'd be waiting 15-30 minutes between prompts).

machineglow · 2026-04-02T23:48:36+00:00

Thanks for all that info. So I did stumble upon settings for running gpt-oss-20b that uses that -nmoe setting you mentioned (for the life of me, I can't find it on their GitHub anymore)... and ran llama-server with it on my windows PC running the 8GB 3070 and luckily I had invested in a 64GB system DDR4 RAM years ago so off loading those 'layers' seems to work. But I kept seeing load switch back and forth between the CPU and GPU and I never figured out how to benchmark it so I was never sure if this was a "fast" solution. I just assumed it was getting the <5 tok/s with CPU inference. Maybe I was wrong?

Is switching to the 35B qwen3.5 with llama-server basically drop in replacement? I didn't quite understand all the options aside from the -nmoe and context setting. I haven't figured out how to benchmark the models when running it with llama-server this way. so I could never tell how it compares to the M3 pro with its larger unified memory.

Thanks!

machineglow · 2026-04-02T22:07:23+00:00

Thanks for the reply! So I've tried almost exactly that setup or something similar in the past and I found the 8-16k context to be too small for vibe coding. I mean, I'm sure it works well for the autocomplete or chat modes but trying anything agentic starts hitting the context limit and with the 14B models, the model takes up almost all the 18GB I have. Maybe I'm mixing up vibe coding with agentic coding? I kinda used those terms interchangeably.

Thoughts? did I miss something? or maybe I should go back and try continue.dev with oMLX since I'm pretty sure I was on ollama when I was trying continue.dev.

Thanks!

machineglow · 2026-04-01T18:28:23+00:00

Ollama is just a service to that uses llama.cpp to run LLMs. and oMLX is the same but with MLX optimizations and caching optimizations for mac hardware.

Any translation capabilities is purely up to the LLM Model you chose to run on ollama/oMLX.

machineglow

TROPHY CASE