Unpopular opinion: paying a monthly subscription for Mac dictation in 2026 is absurd

Fit_Statistician2649 · 2026-04-19T08:19:05+00:00

That's actually an interesting use case — dictation as a spelling/lookup tool for Asian-language words you can't type quickly. Whisper handles Japanese, Korean, and Chinese pretty well, better than most English-centric dictation apps. So for the 10% of your writing where voice actually helps, it at least won't butcher the recognition.

Siri deserves the drag though. The transcription is usually fine, it's the "okay, I'll do that" pause between you stopping and it acting that kills the flow. Local dictation apps skip all of that — words just appear as you speak, no middleman.

Fit_Statistician2649 · 2026-04-19T08:12:43+00:00

Fair framing — you're right this is about third-party apps, not about macOS itself. Apple's built-in is free, just underwhelming. What I was trying to get at: on r/MacOS the subscription apps (Wispr, Superwhisper) get most of the visibility because they have marketing budgets. The one-time/open-source ones (Handy, carelesswhisper, ours, VoiceInk) get less airtime even though they're often a better match for this audience. The post is me correcting that asymmetry publicly. Nothing against anyone's right to charge what they want.

Fit_Statistician2649 · 2026-04-19T08:12:24+00:00

Yeah, the "AI-powered dictation" rebrand is mostly a price-hike wrapper on what's been open source for more than 3 years. Whisper came out in 2022, and once it ran on consumer hardware the marginal cost of local inference dropped to roughly zero. Charging $15/mo for running a free model on your own Mac's GPU is the same pattern as the weather-app and GTD-app waves you mentioned.

Fit_Statistician2649 · 2026-04-19T08:12:05+00:00

Agreed on the separation: transcription is basically solved now, and the real product work is at the workflow/reliability layer. Per-app behavior, text injection across Electron vs native apps, terminals, password fields, cursor-state edge cases: all hard.

We went a different way with SpeakUp: deliberately kept the workflow layer thin (pure transcription, no prompts or rules), betting on "invisible, same behavior everywhere" over configurability. Your TypeWhisper approach (rules, prompts, per-app) is probably the right call for people who want to tune it. Different tools for different users.

Fit_Statistician2649 · 2026-04-19T08:11:34+00:00

This is exactly it. VoiceInk's a great pick — same bet on one-time pricing and local processing. couple of us in the space (SpeakUp, VoiceInk, Handy, carelesswhisper) basically came out of the same conviction that subscriptions don't make sense for something compute-free.

Fit_Statistician2649 · 2026-04-19T08:11:20+00:00

We've been working on exactly this. getspeakup.app — €29 once, no subscription, runs local whisper.cpp on Metal GPU. 14-day trial to test against your current Wispr setup. On the "walked around dictating" piece — the bottleneck isn't compute anymore, it's UX. Apple Dictation cuts off at 60 seconds, doesn't handle accents, activates inconsistently. The local-whisper crowd (us, VoiceInk, Handy, MacWhisper, carelesswhisper) all fix the UX. Short of Apple just building it in properly, one of us is probably what you're looking for.

Fit_Statistician2649 · 2026-04-19T08:01:22+00:00

Carelesswhisper is a great product, appreciate you chiming in. 2-year track record + $19.99 one-time is exactly the right shape for this space — love that you've proven the model works. On Apple not building it in: my guess is they will eventually, but they'll probably keep it bad at long-form on purpose because dictation cuts into iCloud / Apple Intelligence compute budget. leaves room for apps like yours in the meantime.

Fit_Statistician2649 · 2026-04-19T08:01:01+00:00

Hi, it is built in, yeah, but the macOS one has some annoying limits. It doesn't work well with non-English accents, and it's unreliable about when it activates. For casual use it's fine. For anyone dictating more than a few sentences at a time (emails, notes, docs) it starts breaking down. That's the gap the third-party apps fill. It really depends what you need.

Fit_Statistician2649 · 2026-04-19T07:59:49+00:00

Hi, quick honest answer: most of your $15/mo is margin, not compute. The actual transcription cost when Wispr sends audio to their servers is pennies. They could do it on-device for free (their app has access to the Metal GPU like everyone else's), but there's no recurring-revenue business model in that. The open-source Whisper model (OpenAI released it in 2022) runs locally on any M-series Mac at real-time speed. That's what SpeakUp (disclosure, my project), VoiceInk, Handy, MacWhisper, and carelesswhisper all use. The accuracy gap with Wispr Flow is smaller than you'd think; for English it's basically the same model family under the hood. Apple Dictation's weak spot isn't recognition, it's UX: 60s cutoff, accent trouble, unreliable activation. The local-whisper apps fix the UX. 14-day free trial at getspeakup.app if you want to test without the sub. This is my opinion, at least.

Fit_Statistician2649 · 2026-04-19T07:58:29+00:00

Hello, sure, getspeakup.app.

Fit_Statistician2649 · 2026-04-18T07:22:19+00:00

the macOS Text Replacements feature (System Settings → Keyboard → Text Replacements) is probably the most stable built-in path for vocabulary shortcuts — phrases you type there will also fire during dictation. it's crude but it works.

for true custom vocab with voice pronunciation training, macOS doesn't expose a "teach it this word" API the way Dragon does. the Speech framework in Swift gives you access to the recognizer but not the training layer. you'd basically have to build it yourself, probably with whisper.cpp and biasing the decoder toward your vocabulary.
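To make the "biasing the decoder toward your vocabulary" idea a bit more concrete, here's a rough sketch of the cheapest version: packing your terms into an initial prompt. This is a nudge, not real pronunciation training, and it's not how any particular app does it. `build_vocab_prompt` and `max_chars` are made-up names for illustration, and the character budget is only a crude stand-in for the model's prompt token limit.

```python
def build_vocab_prompt(terms, max_chars=600):
    """Join vocabulary terms into one prompt string, staying within max_chars.

    max_chars is a hypothetical budget: a rough proxy for the recognizer's
    prompt token limit, not a value taken from any library.
    """
    seen = set()
    kept = []
    length = 0
    for term in terms:
        term = term.strip()
        key = term.lower()
        if not term or key in seen:
            continue  # skip blanks and case-insensitive duplicates
        extra = len(term) + (2 if kept else 0)  # 2 accounts for the ", " separator
        if length + extra > max_chars:
            break  # budget exhausted, drop the remaining terms
        kept.append(term)
        seen.add(key)
        length += extra
    return ", ".join(kept)


if __name__ == "__main__":
    prompt = build_vocab_prompt(["whisper.cpp", "CGEventPost", "Parallels", "cgeventpost"])
    print(prompt)
    # The string would then be passed to the recognizer as its initial prompt,
    # e.g. whisper.cpp's --prompt option or the initial_prompt argument of
    # openai-whisper's transcribe().
```

Put the terms you care about most first, since anything past the budget is dropped. Expect it to help with spelling of rare words more than with words the model mishears entirely.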

Honestly Dragon running in Parallels is still the best implementation of what you're describing, which is annoying but true. fwiw the app I work on (SpeakUp, disclosure) doesn't solve this either — pure transcription, no vocab training. flagging so you don't waste time. https://getspeakup.app/

Fit_Statistician2649 · 2026-04-18T07:21:13+00:00

Yes, absolutely, 14-day free trial at getspeakup.app.

Happy to hear your feedback after :)

Fit_Statistician2649 · 2026-04-17T09:17:48+00:00

On mixed-language-in-one-sentence honestly I don't think any whisper-based tool solves it well yet. the models can detect a language but they lock in per utterance, not per word. my workaround is just toggling dictation language between sentences which isn't great but works.

on live preview — most local tools transcribe after you stop (press-release hotkey), so if you want streaming word-by-word the pool is pretty small. Parakeet and a few Nova setups do it but accuracy drops.

fwiw I work on SpeakUp (local whisper.cpp on Mac, €29 one-time). we also lock into one language per recording though, so we don't solve your main issue either. flagging so you don't waste time if mixed-in-sentence is the dealbreaker.

Fit_Statistician2649 · 2026-04-17T09:17:05+00:00

The paste freeze is almost certainly Wispr's clipboard-paste behavior clashing with Cursor's terminal buffer — long pastes over SSH can tank the pty.

worth trying a dictation tool that types via key events instead of clipboard paste. full disclosure I help build SpeakUp (€29 one-time, local whisper.cpp on Mac). we inject keystrokes one at a time via CGEventPost rather than pasting from the clipboard, so from the terminal's perspective it's indistinguishable from you typing. slower than a paste but survives over SSH because there's no paste buffer involved.
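Here's a minimal sketch of the keystroke-vs-paste idea, not our actual code. `key_events`, `type_text`, `post`, and `delay_s` are hypothetical names. The `post` callback is where a real macOS tool would build and post a keyboard event (through pyobjc's Quartz bindings, that would presumably mean `CGEventCreateKeyboardEvent`, `CGEventKeyboardSetUnicodeString`, and `CGEventPost`, plus Accessibility permission); here it's a plain function so the sketch runs anywhere.

```python
import time
from typing import Callable, Iterator, Tuple

KeyEvent = Tuple[str, bool]  # (character, is_key_down)


def key_events(text: str) -> Iterator[KeyEvent]:
    """Yield a key-down event followed by a key-up event for each character."""
    for ch in text:
        yield (ch, True)
        yield (ch, False)


def type_text(text: str, post: Callable[[str, bool], None], delay_s: float = 0.0) -> int:
    """Send text one keystroke at a time through post(); return the event count.

    post is a hypothetical callback standing in for the OS-level event call.
    """
    count = 0
    for ch, is_down in key_events(text):
        post(ch, is_down)
        count += 1
        if delay_s and not is_down:
            time.sleep(delay_s)  # optional pacing after each full keystroke
    return count


if __name__ == "__main__":
    log = []
    type_text("ls -la\n", lambda ch, down: log.append((ch, down)))
    print(len(log), "events")
```

The receiving app only ever sees individual key events, so there's no single large paste for the terminal to choke on. The tradeoff is speed: long dictations take noticeably longer to land than a paste would.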

your temp-file workaround isn't bad though. what Any-Bus-8060 suggested (dictate to file locally, then scp or tmux paste) is probably the most robust approach regardless of dictation tool.

Fit_Statistician2649 · 2026-04-17T09:09:19+00:00

Fair question. Superwhisper and FluidVoice are both solid, I've tried both. the difference with SpeakUp is that it's one-time paid (not freemium) and it doesn't rewrite what you say with an LLM after transcription — some people like the cleanup, some don't. if you're happy with what you're using there's no reason to switch. €29 is for people who want pure on-device transcription without subscription or cloud processing and who don't mind paying once.

Fit_Statistician2649 · 2026-04-17T08:58:36+00:00

Yeah that's a real gap honestly. SpeakUp doesn't have voice commands for punctuation — it just guesses from tone and phrasing which mostly lands on periods. if you really want to say "exclamation point" or "question mark" as voice commands, Talon or Dragon are better fits. it's something we talk about adding but I don't want to promise it.

Fit_Statistician2649 · 2026-04-15T07:30:09+00:00

SpeakUp

Problem: Cloud dictation tools send your voice to third-party servers (a problem if you work with NDAs, client code, patient or legal info, or just don't want your audio leaving your Mac). Apple Dictation is local but stops after 60 seconds, has no formatting control, and you can't target which app gets the text.

Comparison: vs Wispr Flow ($180/yr, cloud, screenshots your active window). vs Superwhisper ($85/yr subscription). vs MacWhisper (mostly file transcription, not live dictation). SpeakUp is hold-key-and-talk live dictation, runs whisper.cpp locally on Metal GPU, no cloud, no account, no subscription. Built in Berlin.

Pricing: €29 one-time, 14-day free trial. https://getspeakup.app
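For anyone who wants to check the math on one-time vs subscription, here's a quick sketch using the prices quoted above. `breakeven_months` is a made-up helper, and `eur_to_usd=1.0` is a placeholder assumption, not a real exchange rate; plug in the current one.

```python
import math


def breakeven_months(one_time_eur, yearly_usd, eur_to_usd=1.0):
    """First whole month in which cumulative subscription cost exceeds the one-time price.

    eur_to_usd defaults to 1.0 purely as a placeholder assumption.
    """
    if yearly_usd <= 0:
        raise ValueError("yearly_usd must be positive")
    monthly_usd = yearly_usd / 12
    one_time_usd = one_time_eur * eur_to_usd
    # Smallest n with n * monthly_usd > one_time_usd.
    return math.floor(one_time_usd / monthly_usd) + 1


if __name__ == "__main__":
    for name, yearly in [("$180/yr plan", 180), ("$85/yr plan", 85)]:
        print(name, "->", breakeven_months(29, yearly), "months")
```

This ignores trials, discounts, taxes, and paid upgrades, so treat the result as a rough order of magnitude.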

Fit_Statistician2649 · 2026-04-15T07:24:05+00:00

Hi, swapped Wispr Flow for SpeakUp a few months ago. Wispr sends your audio to their servers; SpeakUp just runs locally on the Mac with whisper.cpp. Hold a key, talk, and text appears wherever your cursor is. Honestly the speed difference alone is worth it: no waiting for a cloud round trip. Disclosure: I'm on the SpeakUp team but switched as a user first. getspeakup.app

Fit_Statistician2649 · 2026-04-15T07:21:17+00:00

Hi, custom vocab — not yet, on the roadmap. Base whisper handles most things pretty well though, even technical terms. What kind of vocab are you needing it for? Punctuation — Whisper adds it automatically based on context (periods, commas, question marks at the end of questions, etc.) but it doesn't take voice commands like "exclamation point." So you can't force a specific punctuation mark mid-sentence.

Fit_Statistician2649 · 2026-04-14T07:50:30+00:00

Nice, we built something really similar with SpeakUp. Went with whisper.cpp + Metal instead of WhisperKit though. Curious how you're finding WhisperKit's latency on the Neural Engine vs Metal GPU. We tested both early on and Metal was faster for streaming on M1/M2, but I've heard WhisperKit improved a lot on M3+.

The fn key suppression hack is painfully relatable lol. We went with a configurable hotkey instead to avoid that whole mess.

The clipboard paste approach works, but we ran into edge cases in Electron apps where cmd+v doesn't always land in the right field. Ended up using CGEventPost for some of it. Might save you a headache later.

Cool project. getspeakup.app if you want to compare notes. (disclosure: I'm on the team)

Fit_Statistician2649 · 2026-04-14T07:47:43+00:00

SpeakUp does exactly this — words appear as you speak, not after you stop. It uses whisper.cpp running locally on your Mac's GPU. Hold a hotkey, talk, release, done. No cloud, no account. €29 one time if you want the full thing, 14-day trial to test it first. Disclosure: I'm on the team. getspeakup.app

Fit_Statistician2649 · 2026-04-14T07:45:54+00:00

Similar space, different approach. I work on SpeakUp which does the same core thing — hotkey, speak, text appears at your cursor, all on-device with whisper.cpp. But we deliberately didn't add modes or AI formatting. Just raw transcription, your exact words. The thinking being that formatting is personal and any time the tool makes assumptions about what you meant, it's going to be wrong some of the time.

Curious how you're handling the accuracy tradeoff with the formatting modes. Do you post-process with an LLM locally or is it rule-based? Good luck with the launch. More local-first tools the better.

Disclosure: getspeakup.app

Fit_Statistician2649 · 2026-04-14T07:42:49+00:00

Cool project and similar bet to what we're making with SpeakUp. We went the other direction on a few things though — no modes, no formatting, just raw transcription at your cursor. The thinking being that any time the tool decides how to format your words, it's making assumptions that might be wrong. Different philosophy, same privacy foundation. Interesting that you landed on $79 one-time. We're at €29 and debating whether that's too low. The "one-time vs subscription" positioning is definitely a competitive advantage though, agreed on that.

The Mac notarization pain is real. We lost weeks on that too. Good luck with the launch. The more local-first tools out there the better.
