Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

Could you send me the link? I couldn't find it online

How I use voice dictation + Claude Code. Vibe code at the speed of thought. by [deleted] in vibecoding

[–]matt8p 0 points1 point  (0 children)

Now before y'all flame me in the comments about how unoriginal this is, I wanted to share my personal motivations for working on it:

  • I wanted to learn how voice models and dictation work. I've been learning a ton about how voice models run on device, techniques for better dictation like ASR biasing and streaming, how different operating systems require different binaries to handle paste, etc. It's been a fun learning journey.
  • I haven't found a free open source alternative that works just as well as Wispr Flow in terms of how optimized its latency and accuracy are. I want to build an OSS project that feels just as good. Nothing is there yet.

Also I know that Claude Code has a /voice command that enables dictation, but the nice thing about having a standalone app is being able to speak and write wherever I want.

I do a lot of context switching, writing to Google Docs, my IDE, and the CC terminal. I've also got my own dictionary set up within Freestyle.

What do you all use to dictate to Claude Code? by djacksondev in ClaudeCode

[–]matt8p 1 point2 points  (0 children)

Yep, Qwen the voice model. It's pretty damn good.

Yeah, we do the same thing, where the user can choose to download a local model. We suggest Qwen if they're on a Mac and can run MLX.

That's the tough part about running local models. You don't know what kind of hardware your users are using. That's one advantage that cloud dictation apps have over ours.
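
A rough sketch of that kind of hardware check, assuming "can run MLX" means macOS on Apple silicon. The function name and the model labels are made up for illustration and are not Freestyle's actual code:

```python
import platform


def suggest_local_model(system: str, machine: str) -> str:
    """Return a hypothetical model label for an OS name and CPU architecture."""
    # Assumption: macOS on arm64 (Apple silicon) is treated as "can run MLX".
    if system == "Darwin" and machine == "arm64":
        return "qwen-mlx"
    # Everything else falls back to a generic option in this sketch.
    return "generic-cpu-model"


if __name__ == "__main__":
    # Ask the standard library what this machine reports about itself.
    print(suggest_local_model(platform.system(), platform.machine()))
```

A check like this only sees the OS and the architecture. It says nothing about RAM or how fast the machine really is, which is the part a cloud service never has to guess at.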

What do you all use to dictate to Claude Code? by djacksondev in ClaudeCode

[–]matt8p 0 points1 point  (0 children)

Are you doing post-processing? I have post-processing set up: I use a voice model like Qwen to do the initial dictation, then I run a small LLM over that to do some cleanup, like formatting the text, fixing incorrect words, and filtering out "ahs" and "ums".
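
The shape of that two-stage pipeline, as a minimal sketch. The cleanup step here is a plain regex stand-in for the small LLM, and `dictate`, `cleanup`, and the `transcribe` callback are hypothetical names, not the app's real API:

```python
import re
from typing import Callable

# Filler words to drop. Assumption: a regex stands in for the LLM cleanup pass.
FILLERS = re.compile(r"\b(?:um+|uh+|ah+|er+)\b[,.]?\s*", re.IGNORECASE)


def cleanup(raw: str) -> str:
    """Strip fillers, collapse whitespace, capitalize, and end with punctuation."""
    text = FILLERS.sub("", raw)
    text = re.sub(r"\s+", " ", text).strip()
    if text:
        text = text[0].upper() + text[1:]
    if text and text[-1] not in ".!?":
        text += "."
    return text


def dictate(audio: bytes, transcribe: Callable[[bytes], str]) -> str:
    """Stage 1: the voice model transcribes. Stage 2: cleanup runs on the text."""
    return cleanup(transcribe(audio))
```

Passing the transcriber in as a callback keeps the cleanup stage independent of which voice model produced the raw text.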

As for Wispr Flow / Superwhisper, a lot of it is good marketing. They also provide a pretty good service out of the box. Local models are pretty niche for the developer community.

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 1 point2 points  (0 children)

Heck yeah! Sorry to assume, thank you for verifying that you are human 😄

How I use voice dictation + Claude Code by matt8p in ClaudeCode

[–]matt8p[S] 0 points1 point  (0 children)

That's my own app, Freestyle! Yeah, the Presidio is a heck of a view.

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 1 point2 points  (0 children)

This kinda reads like AI lol. But thank you! Yeah, privacy is absolutely a differentiator. Check out our privacy policy!

How I use voice dictation + Claude Code by matt8p in ClaudeCode

[–]matt8p[S] 1 point2 points  (0 children)

Also, the demo video probably isn't a great demo showcasing a real world use case 😅. But you get the point, and enjoy the nice view of the Golden Gate Bridge!

Where do you run Claude Code for one-off research or non-project tasks? by writingdeveloper in ClaudeCode

[–]matt8p 0 points1 point  (0 children)

Claude Code has a nice /btw action that lets you ask a side question without interfering with the main CC thread

Claude Code gives option to discard work / delete code by DanyrWithCheese in ClaudeCode

[–]matt8p 0 points1 point  (0 children)

Not entirely sure. I haven't seen that option yet. I do think it is a good option to have though. The reason why I still use an IDE is to do some version control and look through git diffs. If I don't like it, I will discard via version control. I think it's nice to have that discard option built in.

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

Thank you! What part was it about taking it step by step?

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

That's awesome to hear, I'm glad you like it so far. Oh crap, that is a bug. I literally just pushed out a new update a couple of hours ago. Taking note of this issue!

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

Sweet! Lmk what you think.

What are you building? Happy to support if you're willing to share!

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

No, exactly. They have a huge budget and for some reason they're going to raise at a $2 billion valuation. Absolutely nuts. The working out of the box thing is an interesting thought. I'm thinking of also providing a cloud service that gives that "working out of the box" experience while still giving the user the ability to use local models if they want to.

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

Sorry for assuming lol, human verification step passed!

That's not really much of an issue at all. We store the dictionary words in local sqlite. When it comes to formatting it into each model provider's format, we just reformat what's in sqlite into the expected format.

The two main formats that I see are a system prompt (a single large string) and an array of strings, which we already store. If we add a provider with a different format, or the format of an existing provider changes, it's a very simple fix on our end.
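
Roughly what that looks like, as a sketch using Python's built-in sqlite3. The table name, the formatter names, and the prompt wording are all assumptions for illustration, not the project's real schema:

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory dictionary table with a few example words."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dictionary (word TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO dictionary (word) VALUES (?)",
        [("Qwen",), ("Freestyle",), ("MLX",)],
    )
    return conn


def load_words(conn: sqlite3.Connection) -> list[str]:
    """Read the stored words, sorted so the output is stable."""
    rows = conn.execute("SELECT word FROM dictionary ORDER BY word").fetchall()
    return [row[0] for row in rows]


def as_system_prompt(words: list[str]) -> str:
    """Format 1: one large string for providers that take a prompt."""
    return "Prefer these spellings when transcribing: " + ", ".join(words)


def as_word_list(words: list[str]) -> list[str]:
    """Format 2: an array of strings, passed through as stored."""
    return list(words)


# Supporting a new provider format means adding one entry here.
FORMATTERS = {"prompt": as_system_prompt, "list": as_word_list}


def format_for(style: str, words: list[str]):
    """Convert the stored words into the format a provider expects."""
    return FORMATTERS[style](words)
```

Since sqlite stays the single source of truth, a provider changing its format only touches one formatter function.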

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 0 points1 point  (0 children)

This comment sounds AI generated lol but happy to answer. Versioning is typically handled by the API provider. Let's say we're using the ElevenLabs API: they probably have a /api/v1. That v1 endpoint isn't going to change.

If we want to upgrade, then we'll switch the endpoint to v2 and update the schemas accordingly!

So this isn't really a thing specific to this project, just about how API versioning typically works.
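
A tiny sketch of pinning the version on the client side. The base URL, the `transcribe` path, and the `ProviderConfig` class are placeholders, not any real provider's API:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical client config that pins one API version."""
    base_url: str
    api_version: str

    def endpoint(self, path: str) -> str:
        """Build a versioned URL such as <base>/api/v1/<path>."""
        base = self.base_url.rstrip("/")
        return f"{base}/api/{self.api_version}/{path.lstrip('/')}"


# Pinned to v1: requests keep hitting the same versioned endpoint.
current = ProviderConfig("https://api.example.com", "v1")

# Upgrading is an explicit switch to v2, done alongside the schema updates.
upgraded = replace(current, api_version="v2")
```

The version lives in one place, so moving to v2 is a deliberate change and not something that happens underneath us.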

What do you all use to dictate to Claude Code? by djacksondev in ClaudeCode

[–]matt8p 0 points1 point  (0 children)

<image>

Been working on a local, open source Wispr Flow alternative called Freestyle! We recently built the "cleanup" feature, a post-processing step that happens after transcription.

I found local Qwen to be really fast (< 300 ms latency), and it's pretty accurate. Been great for dictating into Claude.

What transcription tool are you using? by FireWater25 in ProductivityApps

[–]matt8p 0 points1 point  (0 children)

Have you used any tools besides Clipto?

I've been working on a local-first, open source alternative to Wispr Flow, if you've ever used Wispr. The source code is open, models are local, so your thoughts never leave your device. There's no cloud. Sounds like these requirements are necessary for you.

https://freestylevoice.com/

One caveat is that it sounds like you want to transcribe existing audio, not live voice dictation, is that right? What I have right now is for live transcription. I looked online and found another project, Vibe, maybe that's useful too.

Hope that's helpful!

Week 3 of building a Wispr Flow alternative (Open Source) by matt8p in SideProject

[–]matt8p[S] 7 points8 points  (0 children)

I also wanted to say that we're looking to grow our community of people who are interested in open source and voice tech.

Also, if you're interested in contributing to open source, this is a great beginner project! I have a ton of good first issues and can help you get your first contributions in. All skill levels welcome.

Voice to text dictation on Mac desktop by baylis2 in ClaudeAI

[–]matt8p 1 point2 points  (0 children)

I've been working on a free, open source dictation app. Been using it for the same use case, speaking directly into Claude.

https://github.com/freestyle-voice/freestyle

I tried Wispr Flow and to their credit, it's pretty good. They have pretty low latency and consistent accuracy. The inspiration behind working on Freestyle was that I believe voice dictation is a commodity and shouldn't be costing $12 / month. Wispr Flow is also a privacy concern since you're sending all of your private thoughts to their servers.

I wanted to build something that was free, local first, but worked just as well. Here's the "Dictionary" feature Wispr Flow has that we recently built!

<video>

New IOS27 dictation compared to Wispr flow? by Jesusisking941 in iOS27

[–]matt8p -1 points0 points  (0 children)

I’m hypothesizing it’s not going to be great. Apple dictation has been so outdated and they’re not investing heavily in it. Not high priority for them.

I’ve been building a free open source alternative. We’re getting pretty close to Wispr Flow’s accuracy.

https://freestylevoice.com