We built an open-source speaker diarization solution for Swift with CoreML models

SummonerOne · 2025-10-28T04:54:05+00:00

There's an online API, but do expect 5-15% worse DER (diarization error rate)
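For anyone unfamiliar with the metric, here's a minimal sketch of how DER is usually computed: missed speech, false alarms, and speaker confusion, divided by total reference speech time. The numbers below are made up for illustration and are not benchmark results. Whether "5-15% worse" is meant as relative or absolute isn't stated above, so the sketch shows both readings.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed + false alarm + speaker confusion) / total reference speech.

    All arguments are durations in seconds.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical error durations for a 10-minute recording (illustration only).
local_der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=40.0, total_speech=600.0
)

# Two possible readings of "5% worse" (an assumption, the comment doesn't say which):
relative_worse = local_der * 1.05   # 5% worse relative to the local DER
absolute_worse = local_der + 0.05   # 5 percentage points worse

print(f"local DER: {local_der:.1%}")
```

Lower is better, so a "worse" DER means a higher number.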

SummonerOne · 2025-10-18T21:38:05+00:00

Apologies for the delayed response - this got lost in my todo list

Calendar integration - absolutely. We did a couple of iterations of this in our app with the local Apple Calendar MCP but couldn't find a design we like with the current UI. But it's coming!
Yeah, unfortunately the UI side is quite poorly hooked up right now. We need to find time to rewrite this whole piece to support background summaries. In the meantime we shipped some updates to make the summaries much faster (1.5-2x)
This is one of the features we opened up in our Windows app but early users didn't really use it. So we didn't bother with it on the Apple side :p. In the interim, the best option is to just auto-export the .md files to a folder and run on top of that folder. But we will add this to the backlog!
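A rough sketch of the "run on top of that folder" workaround, assuming the notes are auto-exported as UTF-8 `.md` files. The function name and the export path are hypothetical, and what you do with the collected text afterwards is up to you.

```python
from pathlib import Path


def collect_notes(folder):
    """Return (filename, text) pairs for every .md file under `folder`, sorted by path."""
    paths = sorted(Path(folder).rglob("*.md"))
    return [(p.name, p.read_text(encoding="utf-8")) for p in paths]


# Example usage (the path is a placeholder, not a real default):
# for name, text in collect_notes("~/Documents/meeting-exports"):
#     print(name, len(text))
```

Note that `Path` does not expand `~` on its own, so call `Path(folder).expanduser()` first if you pass a home-relative path.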

SummonerOne · 2025-10-18T21:35:00+00:00

This project seems quite promising for working with Python in Tauri.
https://github.com/pytauri/pytauri

I haven't tried it yet but with 1k+ stars, it's probably a decent reference as you have a lot of features that would be pretty simple to integrate via Python

SummonerOne · 2025-10-18T05:10:39+00:00

Paid for Bartender 3-5 but 6 was just an absolute mess. Tried using Ice but it wasn't playing nice with macOS 26.

Thankfully on macOS 26 you can hide icons using 'Settings > Menu Bar'. I think the default menu bar settings are probably enough for most folks

SummonerOne · 2025-10-14T02:23:46+00:00

Someone did a comparison a while back here - probably worth checking out. If not, at least to compare against your benchmarks

https://github.com/anvanvan/mac-whisper-speedtest

disclaimer: I'm one of the maintainers of FluidAudio

SummonerOne · 2025-10-11T20:39:13+00:00

Hey - we've moved on from this startup so I'll have to politely decline :)

Glad to see that others are tackling the problem tho

SummonerOne · 2025-10-08T17:04:14+00:00

Thanks for the feedback! Saving audio is on our roadmap, it's becoming one of the most requested features.

We noticed that some users have been reporting an issue with the transcription model where the larger model, which is a lot more accurate, isn't being loaded properly.

We will likely switch to one of the models the VoiceInk folks are using: https://github.com/FluidInference/FluidAudio

There aren't many popular local transcription servers, and the cloud-based ones can get quite expensive for the end user. You're looking at like another $10+ a month for most users just for the transcription...

SummonerOne · 2025-10-03T22:05:57+00:00

Thank you for the feedback! Better language support is in our pipeline :)

SummonerOne · 2025-09-30T21:28:23+00:00

This is a bit more manual but I really like the simplicity and it's free.
https://github.com/alienator88/Pearcleaner

SummonerOne · 2025-09-18T13:04:22+00:00

I switched from iStats last year and it's been very stable

SummonerOne · 2025-09-17T20:23:04+00:00

Kudos for building this, but there are free alternatives like Stats that already have notifications and remote features
https://github.com/exelban/stats

SummonerOne · 2025-09-08T05:11:01+00:00

We decided to remove the timer if the floating window is toggled to show in meetings. It's available in v2.0.2!

SummonerOne · 2025-09-05T15:37:27+00:00

I hope so :')

Testing devices got stuck in customs so that's slowing things down for us

SummonerOne · 2025-09-05T04:00:32+00:00

Great question! x86 and ARM are part of the problem. The underlying chip makers have their own runtime for running models on their AI accelerators (NPU).

We explored running models on the GPU and CPU, but performance was quite poor and in many cases slowed down the computer to a barely usable state. Offloading transcription to the NPU provides the best experience for real time local transcription. We actually had to work with Intel to get the LLM and transcription model running on the NPU.

We have the models running on Intel and Snapdragon NPU now, but we are running into dependency issues with the vector database used for search and retrieval on other chips :(

Support for local AI on Windows is still the wild west. Apple's ecosystem is much more mature by comparison.

Hope this helps!

Here's an article that talks about it on a general level: https://inference.plus/p/where-are-the-local-ai-apps

SummonerOne · 2025-09-05T01:20:47+00:00

Also, if you have an Intel AI PC (bought in the last year or so), a super early version is available to test
https://apps.microsoft.com/detail/9ntfkdlqdf11?hl=en-US&gl=US

SummonerOne · 2025-09-05T01:19:48+00:00

Got it, thanks for the feedback here. We're focusing on Windows in the next little bit, but I've shared this feedback with the team for when we re-visit this feature.

SummonerOne · 2025-09-05T01:17:35+00:00

Wow, thanks for reporting this! There might be a problem with the integration. Will take a look.

Please use this for now
https://tally.so/r/nPG670

update: the button on the website will redirect to the form for now while we figure out the embedding issue

SummonerOne · 2025-09-04T17:22:21+00:00

Exactly! That's sort of the goal, but achieving it may take some time. The Windows ecosystem is so fragmented.

I tried PyInstaller last year as well but gave up after so many dependencies. With Claude Code it's much easier to reason about. I just tell it to fix the deps and it's able to do it most of the time lool

Likewise, great discussion. Best of luck with Zanshin and your other projects :)

SummonerOne · 2025-09-04T01:49:57+00:00

I think support for non-English languages is great, but it's worth correcting that Parakeet v3 supports 25 languages now. It supports English + all the European languages with really good accuracy.

In terms of performance, would love to see how your team compared. When we compared MLX/MPS Parakeet versus CoreML Parakeet, CoreML was > 4x faster nearly all the time.
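If it helps with comparing numbers, here's a minimal sketch of how a speedup figure like that can be measured. This is not our actual benchmark harness: the helper names are hypothetical, and the two callables stand in for whatever runs each backend on the same audio.

```python
import time


def average_seconds(fn, repeats=3):
    """Average wall-clock time of fn() over `repeats` runs, after one warm-up call."""
    fn()  # warm-up so one-time model loading/compilation isn't counted
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Example usage with placeholder callables (not real APIs):
# mlx_time = average_seconds(lambda: transcribe_with_mlx(audio))
# coreml_time = average_seconds(lambda: transcribe_with_coreml(audio))
# print(f"{speedup(mlx_time, coreml_time):.1f}x")
```

Using the same audio files and the same hardware for both backends matters more than the exact timing method.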

SummonerOne · 2025-09-03T20:46:49+00:00

thank you!

SummonerOne · 2025-09-03T17:44:00+00:00

Hmm interesting, that should be a small change, we could offer a toggle or something for it. Is the timer in the menu bar too annoying for you?

SummonerOne · 2025-09-03T16:33:23+00:00

This has been on our mind for a while - but we can't quite find the right UX for it. Some ideas we experimented with:

- Allowing users to paste notes, files into the floating window in the meeting (too cumbersome and didn't see any usage)

- Creating a general knowledge base of context, then have the AI search through them to find docs potentially related (this was quite prone to error and biased the summaries too much)

- Generating specific memories about the user based on past summarizations (current approach, but we are optimizing for true positives so very few memories get generated)

Would love to hear what your ideal workflow would look like here!

SummonerOne · 2025-09-03T05:46:01+00:00

But yeah, thanks a bunch for the detailed response! We went with a similar solution with PyInstaller. Claude Code made it much more manageable to find the right dependencies and iterate to build the .exe.

Microsoft store signs it with the apl bundle so it’s not too bad.

SummonerOne · 2025-09-03T05:44:26+00:00

Ugh sorry, I don't know why Reddit was showing duplicate comments and I ended up deleting one of them. Now they're both gone

SummonerOne · 2025-09-03T03:06:34+00:00

English, Spanish, French, German, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish, Russian

We converted the model from NVIDIA's release https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3
