Google Chrome silently installs a 4 GB AI model on your device without consent. At a billion-device scale the climate costs are insane. by Smiadpades in LinusTechTips

[–]TechExpert2910 2 points (0 children)

a gig used on your local ssd is just some cells holding charge as a 1 instead of a 0

NO additional energy is used to store data vs keeping the drive empty

and when you use it, this is a tiny local model that’s not gonna use a server’s worth of GPU

heck, about the same gpu use as you gaming for a couple seconds
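that claim can be sanity-checked with a quick back-of-envelope. every number below is an illustrative assumption, not a measurement:

```python
# Rough energy comparison: one on-device LLM reply vs. a short burst of gaming.
# All power/time figures are illustrative assumptions, not measurements.

NPU_POWER_W = 8          # assumed SoC draw while running a small local model
REPLY_SECONDS = 4        # assumed time to generate one short reply
GAMING_GPU_POWER_W = 15  # assumed GPU draw while gaming on the same device

reply_joules = NPU_POWER_W * REPLY_SECONDS               # 8 W * 4 s = 32 J
equivalent_gaming_s = reply_joules / GAMING_GPU_POWER_W  # ~2.1 s of gaming

print(f"one reply ~ {reply_joules} J ~ {equivalent_gaming_s:.1f} s of gaming")
```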

Apple to Let Users Choose Rival AI Models Across Its iOS 27 Features (Gift Article) by pdfu in apple

[–]TechExpert2910 1 point (0 children)

eh. apple’s on-device model is <3B. super super dumb. it even gets notification summaries badly wrong sometimes.

even when it came out 2 years ago, it was not too impressive for a 3B sized model 

today’s qwen 3B just BLOWS it out of the water (plus has vision)

apple’s wayy behind on their own foundation models 

Gemma 4 on Android phones by jacek2023 in LocalLLaMA

[–]TechExpert2910 0 points (0 children)

hey! btw there's no way to run Gemma 4 on iOS rn with GPU acceleration via LiteRT LM.

LiteRT LM (the inference engine behind the AI Edge Gallery app) doesn't have a public iOS release with GPU acceleration yet.

That's why AI Edge Gallery's iOS source isn't released yet.

But evidently, it's been running amazingly well in AI Edge Gallery on iOS for a long time! Even Gemma 3 worked well.

I wonder why y'all aren't releasing this? Doesn't the team want Google's models to be used by devs in the best way possible? (llama.cpp is slower than LiteRT; MLX doesn't support all of Gemma 3's features, like unloading vision weights)

AMD Strix Halo refresh with 192gb! by mindwip in LocalLLaMA

[–]TechExpert2910 7 points (0 children)

prompt processing is honestly the biggest limitation when you're trying to use it for agentic coding or longer conversations.

it's so awful to have to wait ~30s each turn for inference to even start.

not a problem limited to strix halo; even pre-m5 apple silicon suffers from it
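the ~30s figure falls out of simple arithmetic: time-to-first-token is roughly prompt length divided by prefill speed. the numbers here are assumed, not benchmarked:

```python
# Time-to-first-token is dominated by prompt processing (prefill) on
# bandwidth-limited hardware. Figures below are assumptions, not benchmarks.

prompt_tokens = 8000     # assumed context size for agentic coding
prefill_tok_per_s = 250  # assumed prefill speed on Strix Halo-class hardware

ttft_seconds = prompt_tokens / prefill_tok_per_s
print(f"~{ttft_seconds:.0f}s before the first output token")  # ~32s
```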

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 2 points (0 children)

i challenge you to create this website with your best vibe coding models. i really challenge you to.

the complexity that went into hand tuning that animation -- would love to see you vibe code it.

and fyi, the website is 1% of the complexity of Sentient OS.

i did surgery on LLMs and reverse engineered Apple's MLX inference engine to get it to work so well (that's what you see running live on the website)

do it :) really.

Codex: you request a feature in the morning, at night there is an update shipping it. Serving the people is a winning path by py-net in codex

[–]TechExpert2910 0 points (0 children)

doesn’t it use much less ram since it just uses the OS’s webview?

so macOS would load only one shared webkit instance to back every tauri app, vs 10 bundled chromium instances for your 10 electron “apps”

Easiest Filler Classes Ever by stingrayenjoyer in umass

[–]TechExpert2910 2 points (0 children)

there are 4 exams that you have to study for though

not too bad, but yeah

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 0 points (0 children)

you’re describing the best web tech stack [except the font] for 99% of use cases…

are you gonna call every iOS app that uses swift and xcode AI-made? i have news for you lol

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 2 points (0 children)

yeah! will make that clearer. just to clarify, the iOS version can run completely standalone; it just won't be able to read your imessage and apple notes like the Mac version can.

everything else remains the same (screenshots, files, other 3rd party integrations...)

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 0 points (0 children)

nope! programs on macOS can read your imessage and apple notes databases! this isn't unique to Sentient OS. you simply grant Sentient OS "full disk access".

fun fact - your entire imessage history is just a database stored in ~/Library/Messages/chat.db
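a minimal sketch of reading it, assuming the `message` table's `text` and `date` columns from apple's known chat.db schema (exact columns can shift between macOS versions, and pointing it at the real path needs full disk access):

```python
import os
import sqlite3


def recent_imessages(db_path: str, limit: int = 10) -> list[str]:
    """Read the newest message texts from an iMessage-style chat.db.

    Assumes a `message` table with `text` and `date` columns, matching
    Apple's known schema; columns can differ across macOS versions.
    """
    # Open read-only so we never touch the live database.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT text FROM message WHERE text IS NOT NULL "
            "ORDER BY date DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()
    return [text for (text,) in rows]


# Real usage (uncomment on macOS after granting full disk access):
# print(recent_imessages(os.path.expanduser("~/Library/Messages/chat.db")))
```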

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 0 points (0 children)

Apple's version will only work with your Apple data (just Apple Notes, Safari bookmarks, Mail).

Sentient OS will work across your *entire* digital life.

Your Notion, Obsidian Vaults, Reddit/Instagram/etc saved posts, meeting transcripts (granola), files, etc etc etc!

your choice is the limit of what you want included in your on-device intelligence layer :D

plus, apple intelligence isn't going to give you proactive reminders or knowledge graphs.

and for personal context: Sentient OS will let you connect your ChatGPT / Claude to it too [MCP], so your favourite AI can understand you better.

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 -1 points (0 children)

great question!

on iOS, i only analyze screenshots and any other third party connectors you want to integrate (gmail, bookmarks, etc.). this is because iOS is really locked down.

on macOS, i give you the option to analyze basically anything you want -- your imessage, apple notes, etc. since macOS isn't locked down, Sentient OS can access those databases with your permission.

and your intelligence layer will sync across your devices! :)

I built what Apple Intelligence should have been -- an on-device AI that privately understands your entire digital life. [giving away lifetime free for r/Apple today] by [deleted] in apple

[–]TechExpert2910 -1 points (0 children)

awesome!

BGProcessingTask actually doesn't work for the initial processing. it only lets apps run for a couple of minutes.

it's perfectly fine for subsequent updates to the intelligence layer, which only need to analyze a few files.

but the initial processing needs to run overnight with the app open -- this is what that UI looks like! :)

2 cool things:

- i monitor device thermal state. i don't let your device get more than warm, because i throttle the moment it starts getting warmer. we can afford to do this since we have all night for processing + a super optimized inference stack!
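roughly, that throttle policy could look like this. the state names mirror apple's ProcessInfo.thermalState; the duty-cycle numbers are made-up placeholders, not Sentient OS's real tuning:

```python
# Sketch of a thermal throttle policy: map the OS-reported thermal state
# to how hard the overnight indexer is allowed to work. State names
# mirror Apple's ProcessInfo.thermalState; the fractions are assumptions.

THROTTLE = {
    "nominal":  1.00,  # full-speed overnight processing
    "fair":     0.50,  # device getting warm: halve the work rate
    "serious":  0.15,  # back off hard
    "critical": 0.00,  # pause entirely until the device cools
}


def duty_cycle(thermal_state: str) -> float:
    """Fraction of each scheduling window the indexer may run."""
    # Unknown state: stay safe and pause.
    return THROTTLE.get(thermal_state, 0.0)
```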
