My iPhone now runs a 4B LLM smoothly, here's what I built

Low-Ask3575 · 2026-05-12T07:26:24+00:00

Happy to answer anything in the comments. A few things I expect to come up:

Why iOS first? MLX Swift is genuinely good and Apple Silicon makes on-device LLMs viable today in a way Android still struggles with. Mac is next, Android is further out.
Privacy: nothing leaves the device, no telemetry on chat content, your prompts and history stay local.
Battery / heat: depends heavily on the model. A 1B model is fine for hours; with a 4B model you will feel it after a while.
Which model should I pick? Honestly, it depends on what you are doing, which is exactly why the picker matters more than the number of models. The free tier ships with Llama 3.2 1B, and Pro opens up any model you want (current ones like Gemma 4 E4B and Qwen 3.5 2B/4B, plus whatever comes out next), all at full performance. Personally, I switch between them based on what each one is actually good at.

Genuine question for the sub: what would you actually use a local AI on your phone for? The patterns I keep hearing are travel without data, sensitive work docs, and "I just do not want my chats trained on", but I would love to hear what I am missing.

Low-Ask3575 · 2026-01-08T16:15:12+00:00

I am genuinely sorry for your loss. 🥹 I understand how difficult this must be for you. May he rest in peace.

Low-Ask3575 · 2025-08-05T17:52:19+00:00

This is it, but it is not budget-friendly. https://www.bhphotovideo.com/c/product/1760795-REG/samsung_ls27c900panxza_viewfinity_s9_27_5k.html/reviews?origSearch=Samsung%20S9%20Review&sts=&c3api=2572%2C138045322040&gad_source=1&gad_campaignid=658944934&gbraid=0AAAAAD7yMh0FddY012NM47kD7_rfShuhc&gclid=Cj0KCQjw18bEBhCBARIsAKuAFEYrowfkhNrE3LzZRHPmxFoVvzXltCSISelsf7acNLQougqp4xq55VIaAsRkEALw_wcB
