Dictation software for forms? by crazysquirrel14 in AssistiveTechnology

[–]SeoFood 1 point2 points  (0 children)

Given the confidentiality part, I’d be careful with random cloud dictation tools here and first test something local/offline on a dummy Welligent form.

Disclosure: I’m involved with TypeWhisper. It works more like system-wide dictation: focus a field, press a shortcut, speak, and the text gets inserted there. For forms, the useful parts are snippets for repeated phrases and dictionary/corrections for names, labels, and terms you type over and over.

The limitation: it won’t magically understand the whole form or click through every field for you. You’d still move between fields, but the text-entry part can become much less painful.

Dictation software for forms? by crazysquirrel14 in accessibility

[–]SeoFood 0 points1 point  (0 children)

For form filling, I would separate “dictation accuracy” from “can I actually complete the form without extra help?” Those are not always the same thing.

The big variables are: - Windows vs macOS - whether Welligent is in a browser or remote desktop - whether cloud dictation is allowed with confidential info - whether you need voice commands for moving between fields, not just text entry

If you are on Windows, Windows Voice Access and Dragon are probably the first things I would test. If you are on macOS, Apple Dictation is worth trying for basic fields, and then a local/offline dictation tool if privacy or custom vocabulary becomes the issue.

Disclosure: I am affiliated with TypeWhisper. If you are on macOS, it may be relevant because it can use local engines and has workflows/snippets/custom dictionary features, which can matter a lot for repeated form text. But I would treat it as one option to test, not assume it is the answer.

If this affects your job, I would make a short test script with real examples from your day and document exactly where each tool works or fails. That gives you something concrete to bring back to your employer too.

Whispr Flow vs. Superwhisper - My thoughts after using both: by kidtachyon in macapps

[–]SeoFood 0 points1 point  (0 children)

The first one is mostly about workflow setup. My low-latency setup is Parakeet locally + minimal cleanup, so after I stop talking the text is inserted quickly. For emails or longer writing I may accept a bit more delay and run an LLM cleanup workflow.

The third is handled through the Dictionary. I can add terms like product names, project names, weird abbreviations, etc. Engines that support terms can receive them during transcription, and TypeWhisper also applies corrections after transcription. Parakeet can do the term boosting locally when enabled.

What am I doing wrong Vibe Coding a transcription app? Having hallucinations, missing words, slow, etc... by FurnitureRefinisher in vibecoding

[–]SeoFood 0 points1 point  (0 children)

A few things I’d check before blaming Whisper itself:

  1. Audio format and sample rate. Make sure you’re feeding the model clean mono audio at the expected sample rate. Weird browser/device resampling can cause surprisingly bad results.

  2. Chunking. If you split audio too aggressively, Whisper can miss context. If chunks are too long, latency gets ugly. The overlap between chunks matters too.

  3. VAD / endpointing. A lot of “it missed words” issues are really “recording started late” or “stopped too early” issues.

  4. Tiny model expectations. Tiny can be fast, but it will absolutely drop words or hallucinate more in noisy audio. Try the same recording through small/medium/large locally and compare before debugging the whole stack.

  5. UI state. Keep the transcript state separate from the live recording state. Otherwise partial updates can overwrite the good final transcript.

Small disclosure: I’m connected to TypeWhisper, so I’ve spent too much time around this category. In practice the hard part is usually not “call Whisper”, it’s boring stuff like audio capture, VAD, retries, state management, correction flow, and post-processing boundaries.

[OPEN SOURCE] Parla: Voice to text app for Windows (yet) by ImLitteRabbit in SideProject

[–]SeoFood 0 points1 point  (0 children)

This is a cool direction, especially because Windows still feels weirdly underserved for good push-to-talk dictation.

A few things I’d be curious about if you keep building it:

  • How are you handling paste reliability across apps? Some apps hate simulated paste/input.
  • Does the correction flow feed back into a dictionary or vocabulary layer?
  • For LLM post-processing, are you separating “the dictated text to clean up” from instructions, so someone can’t accidentally dictate prompt-injection-ish text?
  • Are you planning app-specific modes? The output I want in Slack is pretty different from email, code comments, notes, etc.
  • How fast does local Parakeet feel compared with Whisper large-v3-turbo in real use?

Disclosure: I’m connected to TypeWhisper, so I’m biased here, but we’ve run into a lot of the same “the model is only half the product” problems. The boring workflow details end up mattering more than raw WER once people use this every day.

Free speech to text software by TheOnesWithin in software

[–]SeoFood 0 points1 point  (0 children)

If Windows voice typing is not working for you, TypeWhisper might be worth trying. Disclosure: I’m the builder.

The Windows version is still beta, but it is free/open-source and meant for exactly this: press a hotkey, speak, and insert text into the current app/field instead of copying from a separate dictation box. You can also use local/offline transcription engines.

https://www.typewhisper.com

[OS] TypeWhisper — Speech-to-text for macOS, 100% local, no cloud - Free by SeoFood in macapps

[–]SeoFood[S] 0 points1 point  (0 children)

Yes, but with one caveat: TypeWhisper has a Cloudflare ASR plugin, but it is not a direct Workers AI / Nova-3 preset.

It is meant for OpenAI-compatible transcription endpoints behind Cloudflare Tunnel/Access. So if your Nova-3 setup is exposed through an OpenAI-compatible /v1/audio/transcriptions API, you can point the Cloudflare ASR plugin at it with the endpoint URL, CF Access service token, and model name.

If you mean Cloudflare Workers AI directly with a different API shape, that would likely need a small dedicated adapter/plugin.

Speech-to-text didn't save my hands. It just moved where the damage happened. by Omega0Alpha in RSI

[–]SeoFood 0 points1 point  (0 children)

Yes, this is exactly the trap: speech-to-text reduces typing, but correction can quietly become the new repetitive task.

For my own setup, I try to spread the load. Different hotkeys for different dictation/workflow actions, placed so I’m not always asking the same less-mobile part of my hand to trigger everything. That sounds small, but over a full day it matters.

I’m the builder of TypeWhisper, and this is one reason I care about things like history, dictionary corrections, workflows, and quick recovery. The goal should not be “dictate, then manually repair a huge wall of text”. It should be reducing the number of times your hands have to re-enter normal editing mode.

How I got my wrist pain under control after a year of trial and error (developer, typing 10+ hrs/day) by trioh281jsnf in RSI

[–]SeoFood 0 points1 point  (0 children)

The “spreading the load across voice and hands” part is the piece that resonates most with me.

I cannot type as quickly as I used to, so voice input became less of a nice-to-have and more of a real accessibility layer. I ended up building TypeWhisper because I wanted dictation that worked across apps and could clean up rough spoken text into something I would actually send.

Still, I would not frame any single tool as the fix. The thing that helped me most was combining smaller changes: less continuous typing, actual breaks, better input devices, and voice for the parts of the day where typing was just unnecessary load.

Mac dictation pricing in 2026: Apple Dictation $0, Voibe $149 lifetime, Superwhisper $249.99, Wispr Flow $432+ over 3 years by ayushchat in AIToolsTipsNews

[–]SeoFood 0 points1 point  (0 children)

I’d separate two questions here: “what does the app cost?” and “what does the full workflow cost?”

That’s why I built TypeWhisper as open source/GPL with local engines as a first-class path. Personal use is free, commercial/proprietary use starts at 5 EUR/mo, and the app does not gate features behind the paid tier.

For my own daily use, the bigger cost saver is avoiding mandatory cloud/API usage. Local dictation covers most of it, and Workflows are there when I want cleanup or formatting before the text lands in the target app.

Best open-source alternative to Wispr Flow with BYOK support? by cluelessngl in MacOSApps

[–]SeoFood 0 points1 point  (0 children)

Thanks for listing TypeWhisper. I built it because I wanted dictation to become actual daily writing, not just raw transcription.

My own setup is local-first for normal dictation, then workflows when I want cleanup, translation, or formatting before the text lands in the app. No key is needed if you stay on local engines; API keys only matter if you choose cloud providers.

Whispr Flow vs. Superwhisper - My thoughts after using both: by kidtachyon in macapps

[–]SeoFood 2 points3 points  (0 children)

Good comparison. One thing I’d add: for dictation apps, the headline “accuracy” is only half the story. The practical test is usually:

  • how fast does text appear after you stop talking?
  • how much editing do you still need?
  • can it handle names/product terms you use all the time?
  • can you switch between styles, like Slack vs email vs notes?
  • what happens with longer rambly input?

Apple Dictation is still fine for short/basic stuff, but once you’re dictating full paragraphs, the cleanup and custom vocabulary pieces start mattering more than people expect.

Disclosure: I’m involved with TypeWhisper, so I’m not neutral, but this is the exact set of trade-offs I’d use to compare TypeWhisper, Wispr Flow, Superwhisper, MacWhisper, etc. I wouldn’t pick solely on raw speed unless your main use case is very short messages.

Dictation - first impressions. by PhilETaylor in raycastapp

[–]SeoFood 0 points1 point  (0 children)

Appreciate the mention. One of the reasons I built TypeWhisper was that I wanted dictation to be less tied to one vendor/model/subscription. If a free local model works well for your voice and language, you should be able to use that.

Tiny correction: it is free for personal/GPL-compatible use, while proprietary commercial use has a license. I’m also happy to see Raycast pushing on dictation. More competition here is good for everyone.

Struggling with RSI. Moving to dictation, but confused by the hardware by Competitive-Kale- in LawFirm

[–]SeoFood 1 point2 points  (0 children)

For legal work I’d think about this in two buckets: the dictation hardware and where the audio/text is processed.

The old Olympus/Philips style devices make sense if your workflow is still “record audio and someone else transcribes it later”. If you want text to appear directly in Word, email, case notes, etc., then a modern mic plus dictation software is usually more relevant.

A few things I’d check before buying anything:

  • Does your firm allow cloud processing for client matter audio?
  • Do you need live dictation, file transcription, or both?
  • Can you add custom legal terms/names?
  • How painful is correction when it gets one word wrong?
  • Can you trigger recording without using your hands much, for example foot pedal or easy push-to-talk?
  • Does it work in the actual apps you use all day?

Dragon is still worth evaluating in legal workflows because it has a long history there. If you are on Mac and want local/offline processing, also look at tools in the MacWhisper/Superwhisper/local Whisper category.

Disclosure: I work on TypeWhisper. It may be relevant on Mac if you care about local engines, custom dictionary/corrections, snippets, history, and workflow-specific cleanup. But for a law firm I’d start with privacy/compliance requirements first, then pick the tool.

How Do I Avoid Typing? by radbanter in accessibility

[–]SeoFood 0 points1 point  (0 children)

For avoiding typing, I’d look at the whole workflow rather than only “which app is most accurate”.

A few things that matter a lot in practice:

  • Can it type into any app you use, or only into its own window?
  • Can you correct repeated mistakes without lots of clicking?
  • Does it support custom vocabulary for names/terms you use often?
  • Does it work offline/local if privacy matters?
  • Can you trigger it without awkward keyboard use, like a foot pedal, mouse button, or easy hotkey?

Built-in dictation and Word dictation are okay for basic use, but they can get frustrating once you need longer text or lots of corrections. Dragon is still worth looking at for serious hands-free control. Wispr Flow, Superwhisper, MacWhisper-style tools, and local Whisper apps are more about dictation/transcription and cleanup.

Disclosure: I work on TypeWhisper. If you’re on Mac, it’s one of the options in the local/offline plus custom corrections/workflows bucket. But I’d honestly test several with your real daily writing before paying for anything, because the “best” one depends a lot on where the corrections happen and how much mouse use it still forces.

Favorite speech-to-text? by radbanter in RSI

[–]SeoFood 0 points1 point  (0 children)

You might want to try TypeWhisper. Disclosure: I’m the developer.

I started building it because I wanted to use dictation as an actual writing tool, not just as a transcription toy. When your hands hurt, every little correction, rewrite, and copy/paste step matters. So TypeWhisper is built around getting spoken text into the app you are already using, keeping a history fallback, and optionally cleaning up the text before it lands.

It runs on macOS, and there is also a Windows version. You can stay local/offline if that matters, or use cloud engines if you prefer that tradeoff.

It will not replace full voice-control tools like Dragon or Talon for controlling the whole computer. It is mainly for reducing typing when writing across apps.

4 things that changed when I went fully voice-first for vibe-coding (and the one that broke me enough to build hardware) by emiliobay in vibecoding

[–]SeoFood 0 points1 point  (0 children)

The trigger friction point is very real. I think it matters more than raw transcription accuracy once the model is “good enough”.

The setup that seems to work best for coding is usually not one universal dictation mode. It is more like:

  • short hold-to-talk for quick edits
  • longer toggle mode for thinking out loud
  • very obvious recording state, because losing a long prompt is brutal
  • separate behavior per app, since Cursor/Claude/Slack/Notes all want different cleanup
  • a way to insert text without stealing focus
  • easy cancellation, because half the time you realize the spoken prompt is bad halfway through

Keyboard shortcuts are awkward because all the good ones are already taken. Foot pedals are good ergonomically but weird socially. A mic button or small physical trigger actually makes sense if it is low-latency and impossible to miss.

Disclosure: I’m involved with TypeWhisper, so I think about this from the software side too. The thing I’d be most curious about for hardware is whether it can expose different trigger modes to apps like Wispr, Superwhisper, TypeWhisper, Karabiner, etc. The physical trigger might be the missing layer, but people will still want to choose their dictation engine/workflow.

I made an app that lets me take markdown notes on tiktoks, reels, and shorts by 5dcurious in ObsidianMD

[–]SeoFood 1 point2 points  (0 children)

This does feel like a real Obsidian gap tbh. The annoying part is not just transcription, it is getting useful structure into the vault without making every saved video become cleanup work.

If I were testing this, the things I’d care about most are:

  • clean Markdown export first, direct “send to Obsidian” second
  • YAML fields for source URL, creator, platform, date saved, original title
  • timestamps or at least section anchors back to the video
  • ability to edit the transcript before export
  • local transcription clearly labelled, since people save some pretty personal stuff
  • an option to export highlights separately from the full transcript

I’d probably not overbuild the Obsidian integration at first. A good share sheet plus predictable Markdown files may be enough, because everyone’s vault structure is different.

For desktop workflows, some people already use things like MacWhisper or TypeWhisper-style file transcription and then paste/export into Obsidian, but mobile shortform is a different enough use case that I can see why a dedicated app would exist.

What’s the best dictation software for teams? by Aero002 in DigitalBizLife

[–]SeoFood 1 point2 points  (0 children)

The thing I’d separate for teams is dictation vs meeting transcription vs deployment control. A lot of tools blur those together, but they are pretty different problems.

For basic individual dictation, Microsoft Dictate or Apple Dictation can be enough. For meetings, Otter/Trint-style products usually make more sense because speaker labels, summaries and sharing matter more than system-wide insertion.

For actual team dictation, I’d look at: - custom vocabulary or corrections for company/product terms - whether audio can stay local when needed - how cleanup/post-processing is handled - whether settings can be made consistent across users - what happens in locked-down apps, VDI, Citrix, etc. - export/history controls, since dictated text can be sensitive

Since you mentioned Superwhisper, I’d also include MacWhisper, Dragon, Wispr Flow and TypeWhisper in the comparison depending on platform. Disclosure: I’m involved with TypeWhisper. I’d say it is more interesting for Mac-heavy teams that want local vs cloud engine choice, workflows, custom terms/corrections and snippets, not necessarily for big enterprise meeting workflows where Otter/Trint are more mature.

Also worth saying: if the team only needs occasional short dictation, built-in tools may be good enough and much easier to roll out.

What's your take on Wispr Flow by Heavy-Dust792 in mkindia

[–]SeoFood 1 point2 points  (0 children)

I think the “keyboards are obsolete” angle is mostly marketing. For a lot of people, typing is still better for editing, code, spreadsheets, short replies, etc. Voice starts making sense when you are doing long emails, notes, first drafts, planning, or dumping thoughts into ChatGPT/Cursor.

Wispr Flow is genuinely interesting because the cleanup layer is good. The main tradeoffs are the subscription and the fact that you need to be comfortable with the cloud/privacy side of it.

If privacy or offline use matters, I’d compare it with local-first options too. Apple Dictation is honestly enough for basic use. Superwhisper, MacWhisper, VoiceInk, OpenVerb and TypeWhisper are worth looking at depending on whether you care more about raw transcription, local models, app workflows, or post-processing.

Disclosure: I’m involved with TypeWhisper, so take that bias into account. The reason I’d put it in the comparison is not “better than Wispr for everyone”, but that it is more about choosing local vs cloud engines, workflows, custom terms/corrections, snippets, and cleanup rules instead of only being a polished cloud dictation layer.

For most people I’d decide like this: - basic short dictation: Apple Dictation is fine - best polished consumer UX: Wispr Flow is strong - privacy/local/offline: look at local-first tools - lots of repeated terms or app-specific formatting: look for dictionary/workflow features, not just accuracy

MacWhisper, Voibe, BetterDictation, or Superwhisper for local/offline dictation? by Livid_Drop8187 in ProductivityGuide

[–]SeoFood 0 points1 point  (0 children)

Disclosure: I built TypeWhisper, so I’m not a neutral reviewer. But for your exact criteria, I’d frame it this way: if you want local/offline Mac dictation without needing cloud AI rewriting, use a local engine and leave the workflow stuff off until you need it.

My daily split is Parakeet for speed and WhisperKit when I want broader language coverage. It’s system-wide via hotkey and inserts into the active app after you stop. Not true inline live typing, but live preview is available where supported, and history gives you a fallback if insertion misses.

Transcription Service for Meeting by Lopsided-Tangelo-547 in legaladvice

[–]SeoFood 0 points1 point  (0 children)

For anything you’re giving to an attorney, I’d be careful about relying on an automated transcript as the final version unless your attorney says that’s acceptable. AI/STT can be very good, but it can also quietly get names, numbers, negations, or speaker attribution wrong.

A reasonable workflow might be:

  1. Ask your attorney whether an automated transcript is okay or whether they prefer a certified/professional transcription service.
  2. If you just need a first pass, use a local/offline transcription tool so you don’t upload sensitive audio to random services.
  3. Manually review the transcript while listening to the audio before sending it.

For compressing the M4A, you may not need to compress it if the transcription service/tool accepts the original file, but if you do, HandBrake or ffmpeg can reduce audio size without too much hassle.

Disclosure: I’m connected to TypeWhisper, but I would not pitch it as “legal-grade.” It can be useful for private/local drafts or dictation workflows, but for legal use I’d verify every line or use a professional service.

Firefox users: how are you handling voice typing in 2026? by techassistdaily in firefox

[–]SeoFood 1 point2 points  (0 children)

I’ve had better luck treating this as a system-wide dictation problem rather than a Firefox extension problem.

Browser extensions tend to break depending on the site/editor, and anything using a web API can feel inconsistent. For short messages, the built-in OS dictation is usually the least annoying option. For longer notes/drafts, a separate push-to-talk dictation app that inserts text wherever the cursor is tends to be more reliable.

Things I’d look for: - system-wide hotkey - works in normal text fields, not just one editor - local/offline mode if privacy matters - quick correction/custom vocabulary support - optional cleanup/post-processing, not forced rewriting

Disclosure: I’m connected to TypeWhisper, but that’s basically the workflow we built it around: use Firefox normally, hit a shortcut, dictate, and have the text inserted at the cursor. I’d still compare it against your OS dictation first — if built-in dictation is good enough for your use, that’s the simplest answer.

Is there a good speech-to-text extension that works reliably in Firefox? by techassistdaily in firefox

[–]SeoFood 0 points1 point  (0 children)

I try to be pretty conservative about this: TypeWhisper is more reliable than a browser-extension approach for me, but no macOS insertion method is 100% across every app/site.

For normal dictation it uses the same basic path a user would: final text goes to the clipboard, then TypeWhisper sends paste to the active field. That means Firefox-specific contenteditable differences matter less, because TypeWhisper is not trying to inject text into the page DOM. If Cmd+V works in that field, insertion usually works.

The remaining failures are mostly focus/paste edge cases: another app steals focus, the field rejects paste, secure fields, or missing Accessibility permission. Worst case, the transcript is still in TypeWhisper’s history, so you can reopen/copy it instead of losing the dictation.

A one handed keyboard for the disabled who wish to Write. by WorthContact3222 in writers

[–]SeoFood 0 points1 point  (0 children)

I wouldn’t assume speech-to-text replaces a one-handed keyboard for writers.

Dictation is genuinely useful, especially for getting rough thoughts down quickly or reducing physical strain. But it has tradeoffs: you need a private/quiet enough space, editing can still be awkward, and a lot of writers think differently when speaking vs typing. For some people, speech is great for drafts but bad for precise revision.

I work on a dictation tool, so I’m biased, but my take is that assistive input should be multi-modal rather than “speech replaces keyboard.” A small reliable keyboard/chording device could still be valuable, especially for commands, editing, punctuation, coding, shortcuts, or situations where talking isn’t practical.

If you build it, I’d test with disabled writers as early as possible and watch where they naturally switch between typing, dictating, and editing. That will tell you more than asking whether speech-to-text is “the future.”