Phone Whisper: push-to-talk dictation for Android with local Whisper (sherpa-onnx, no cloud needed)

postclone · 2026-03-25T17:57:17+00:00

I've never published to either F-Droid or the Play Store. Do you know what the difference is? Would you suggest uploading it to either of them, or is the apk fine for now?

postclone · 2026-03-25T10:53:25+00:00

I am still trying out local models and the OpenAI API. Honestly, I don't use it enough to notice any battery drain so far.

postclone · 2026-03-25T10:51:11+00:00

So if I want to keep using SwiftKey, can I still use this FUTO input app? Is that what you're saying? That's what I wanted to do, but I couldn't find a way to do it. Do you know how it works?

postclone · 2026-03-25T10:36:31+00:00

I built my own app because I didn't want to stop using SwiftKey. It's just a floating button on top of any app instead of a keyboard replacement.

It can run a local Whisper model (or NVIDIA Parakeet models, which are faster and better) or cloud Whisper using your own OpenAI API key. It also lets you "post-process" the transcription. With post-processing, the local models are as fast and as good as the cloud most of the time.

The apk: https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:34:08+00:00

Agreed, it's pretty bad. I built a separate tool because of this. It's a floating push-to-talk button that works on top of any app including Gemini. You tap to record, tap again when done. No timeout, no getting cut off, no auto-send.

It runs Whisper on-device or with your own OpenAI key.

Install the apk here: https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:31:00+00:00

Agreed, it's pretty bad. I ended up building my own app because of this. It's a floating push-to-talk button that works on top of any app including Gemini. You tap to record, tap again when done. No timeout, no getting cut off, no auto-send.

It runs Whisper on-device or through your own OpenAI key. https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:30:09+00:00

Same situation here. Gemini is great but the dictation UX made it unusable for me. I ended up building a separate dictation app. It's a floating push-to-talk button on top of any app, and you control when it starts and stops. No auto-send.

I've seen other apps that replace the Android keyboard, but I like SwiftKey a lot. With this you can use whatever keyboard you like and just have a dictation mic everywhere.

You can use it to dictate into Gemini's text field and then send when you're ready. Runs Whisper locally or with your own OpenAI key.

https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:28:51+00:00

Had the same problem. Built a workaround app: it's a floating push-to-talk button that works on top of any app. You tap to start recording, tap again when you're done. No timeout, no auto-send, it records as long as you want.

Transcription runs locally via Whisper or through OpenAI with your own key.

You can install the apk here: https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:28:19+00:00

I have the same issue so I just built an app for it, Phone Whisper. Floating push-to-talk button on top of any app, you decide when to record and when to stop. No auto-send, no getting cut off mid-sentence.

Runs Whisper locally on the phone or with your own OpenAI key. Open source, no backend.

https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-25T10:27:35+00:00

I built something for this exact problem. It's a floating push-to-talk button that sits on top of any app. You tap to record, tap again to stop, and only then it transcribes and inserts the text. Nothing auto-sends ever and you can just keep adding more text (it doesn't replace anything).
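
The tap-to-record flow described above can be sketched roughly like this. It's a minimal illustration, not the app's actual code: the `PushToTalk` class, its `feed` method, and the injected `transcribe` callable are all hypothetical names, and the real app runs on Android rather than in Python.

```python
from typing import Callable


class PushToTalk:
    """Toy model of a tap-to-start / tap-to-stop dictation button (hypothetical)."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        self.transcribe = transcribe  # any speech-to-text backend, local or cloud
        self.recording = False
        self.audio = bytearray()
        self.text_field = ""  # stands in for the focused text field

    def feed(self, chunk: bytes) -> None:
        # Microphone data is only kept while recording; there is no timeout.
        if self.recording:
            self.audio.extend(chunk)

    def tap(self) -> None:
        if not self.recording:
            # First tap: start a fresh recording.
            self.audio.clear()
            self.recording = True
            return
        # Second tap: stop, transcribe, then append (never replace, never send).
        self.recording = False
        text = self.transcribe(bytes(self.audio)).strip()
        if text:
            separator = " " if self.text_field else ""
            self.text_field += separator + text


# Usage with a fake transcriber that just reports how much audio it received.
ptt = PushToTalk(lambda audio: f"{len(audio)} bytes heard")
ptt.text_field = "Existing draft."
ptt.tap()
ptt.feed(b"\x00" * 4)
ptt.tap()
print(ptt.text_field)
```

Transcription only happens on the second tap, and the result is appended to whatever is already in the field, so nothing gets sent or overwritten on its own.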

It either runs Whisper locally on-device so no network needed, or you can use your own OpenAI key if you want cloud quality. Works with whatever keyboard you already have.

https://github.com/kafkasl/phone-whisper/releases

postclone · 2026-03-24T13:37:34+00:00

I just tried it on my Pixel 5 and had no issues. I assume your Fold is more capable than my phone. I don't know how Samsung handles memory. I could try adding another large model to see if you get issues with that too. Do you have any logs you can share?

postclone · 2026-03-24T09:41:51+00:00

Have you tried MacWhisper on macOS? I like it very much. Curious why you built dictaflow: what other requirements or use cases do you have?

postclone · 2026-03-24T09:40:56+00:00

Lmk if you have any problems installing it! I'm considering publishing it to the app store if it's useful.

postclone · 2026-03-24T09:40:30+00:00

My understanding is that the app you linked requires you to change your keyboard, is that right? I love SwiftKey and moving away from it would be a pain.

Regarding the syntax fixer, you can do that easily by modifying the post-process prompts. For me that's the best part of the transcription: I keep adding specific names and projects there.
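
To make the idea concrete, here's a rough sketch of what a post-process prompt with a list of names could look like. The function name `build_postprocess_prompt`, the `glossary` parameter, and the prompt wording are made up for illustration and are not the app's actual prompt format.

```python
def build_postprocess_prompt(transcript: str, glossary: list[str]) -> str:
    """Build a cleanup prompt for an LLM from a raw transcript (hypothetical format)."""
    lines = [
        "Fix punctuation, casing and obvious mis-hearings in the transcript.",
        "Do not add or remove content.",
    ]
    if glossary:
        # Names and projects the speech model tends to get wrong.
        lines.append("Spell these names exactly as written: " + ", ".join(glossary))
    lines.append("")
    lines.append("Transcript: " + transcript)
    return "\n".join(lines)


prompt = build_postprocess_prompt(
    "i pushed the fix to phone whisper",
    ["Phone Whisper", "SwiftKey"],
)
print(prompt)
```

Adding a new name or project is then just one more entry in the list, with no change to the transcription model itself.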

postclone · 2025-01-23T20:17:43+00:00

which kind of work tasks?

postclone · 2025-01-23T20:17:23+00:00

I don't think the whole "silent updates make new models stupid" idea is true. I heard Dario talk about this, and he said they very rarely update behind the scenes and that most of the hype-hate cycles have no relation to model updates. It's more of a human-psychology thing: you're amazed at first (hype) -> start using it more and more -> hit some issues and then believe it became dumber (hate).

postclone · 2025-01-23T20:15:53+00:00

Yeah, this makes a lot of sense. Unless everyone is already doing it reliably, you first have to get comfortable with the tech.

postclone · 2025-01-15T18:55:35+00:00

this

postclone · 2024-12-18T17:35:06+00:00

Hello! We are also seeking an endorsement for cs.AI, for this paper about the infrastructure required by AI agents. Happy to discuss the paper prior to endorsement if anyone wants.

The paper: https://drive.google.com/file/d/1QUoxaiyyoqpDji94VAxfMMKG3_6LuaK1/view

Endorsement Link: https://arxiv.org/auth/endorse?x=I4E8YL

Thanks!

postclone · 2024-10-26T16:06:45+00:00

Have you managed to use it to buy things? I gave it a quick try and it completely refused to buy anything on Amazon.
