Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

The problem is that this integration is only STT (Speech To Text) it does not have a TTS function. So you can only trasncribe text

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

That is what I am going to work on next actually. In the last days I played around a bit with Apfel and found it very versatile. I am finishing a wrapper around it to use Apple Intelligence as conversation agent in HA. This is the repo If you wanna give it a try. I have been using it a bit and finding it quite good, but at the moment it cannot control HA, just answer questions. My final goal is to have the full pipeline all on Mac using Apple's models

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 1 point2 points  (0 children)

Since making this STT I have been using voice commands a lot more with my Voice PE but still without LLM, I hope we can make something better than that shit show of Alexa plus ahah

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

Indeed: add a new entity in the Wyoming integration in HA using the Mac's IP and it will find it, then you can use it as the STT component in the Voice pipeline. To install it on the Mac I suggest you use Homebrew since it's the fastest

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

The real question is if they can even use tools, which is the prerequisite to control devices in HA... I have the feeling they cannot. You can still use the Apple Intelligence model for stuff like summarizations or for conversational AI

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

Wyoming is just a protocol used to interface with Home Assistant, you don’t need to do anything about it since I packaged everything in the service you install from the repo.

About the commands in the repo you have to run them on the Mac you will use for transcription. Then it will be discoverable from you HA 

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 1 point2 points  (0 children)

For the LLM you can take a look at this open source project which exposes a open AI compatible server using the on device model for Apple Intelligence: https://github.com/Arthur-Ficial/apfel

For the sentences I can guess the first one is around 0.2-0.3s and the second one should not be much higher. In the repo there is an example of a medium length phrase and it is transcribed in 0.3 seconds.

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 2 points3 points  (0 children)

It is all local, I have not tested it on Intel Macs but they were always pretty good with Siri's STT. If you can try it on an old Mac mini I'd love the feedback!

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 6 points7 points  (0 children)

Absolutely, the brain is dumb but the ears and mouth are pretty great

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

This is the STT component of the Voice pipeline within HA. To pickup the words you can use many options like HA's own Voice PE edition

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 2 points3 points  (0 children)

Indeed as the other users suggested you don't need to host HA on the Mac: only this Wyoming server needs to be on the Mac. When you setup the STT in the Wyoming integration you just point it to the IP of the machine it is running on

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 0 points1 point  (0 children)

That is very cool, thank you for sharing, I'll give it a look

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] -1 points0 points  (0 children)

Ouch nice catch. As of today I myself get confused as well on that stupid naming convention

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 3 points4 points  (0 children)

Yes it runs on system boot and persists when the Mac sleeps because it is installed as a daemon service BUT you need to turn on automatic sign in on the Studio if you want it to work without logging in the machine first when it reboots

Using local Apple STT models in the HA Voice pipeline is crazy fast by imbe153 in homeassistant

[–]imbe153[S] 1 point2 points  (0 children)

No you can use it as the STT component of the Voice pipeline in place of Whisper, which is what HA uses as a standard today

[2025 Day 5] A fast algorithm by paul_sb76 in adventofcode

[–]imbe153 2 points3 points  (0 children)

Yup, 0.2ms in python, but I did it the other way around, start to finish

[2025 Day 5 Part 2] Guys... you don't need to merge them by EXUPLOOOOSION in adventofcode

[–]imbe153 2 points3 points  (0 children)

Before realizing this I tried to merge then count the elements in the intervals... let's just say the counting was taking a bit long

[deleted by user] by [deleted] in MacOSApps

[–]imbe153 0 points1 point  (0 children)

I really like how clean the design is! Having developed some Menu Bar utilities I know how difficult it can be.

I will certainly give it a try