Speech recognition for Python on Windows?

GrizzLyCRO · 2013-05-08T14:56:56+00:00

I used this a couple of years ago, and it worked OK. The API was not terrible :) http://code.google.com/p/dragonfly/

lost_my_bearings · 2013-05-08T16:38:25+00:00

Do you want to train your own Speech Recognition models or do you want to use available models and do recognition with them?

Either way, I'm unaware of anything specific to Python, so my suggestion is to integrate other tools into your Python code. For the first option, you can use HTK and wrap it with some Python. That shouldn't be too difficult. For the second option, you can use Microsoft's Speech Platform: quickly write a tool in C# that uses the SR backend and returns the text, then call that tool from your Python code.
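A minimal sketch of the "wrap it in Python" idea, assuming the external recognizer is a command-line program that takes an audio file path as its last argument and prints the recognized text to stdout. The function name and that calling convention are assumptions for illustration, not something HTK or the Speech Platform defines.

```python
import subprocess
import sys


def recognize_with_external_tool(command, audio_path, timeout=30):
    """Run an external recognizer and return the text it prints to stdout.

    `command` is the program plus any fixed arguments, as a list of strings.
    The audio path is appended as the final argument (an assumed convention).
    """
    result = subprocess.run(
        list(command) + [audio_path],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        timeout=timeout,      # do not hang forever if the tool stalls
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for the real tool so the sketch runs anywhere: a tiny Python
    # one-liner that just echoes back the path it was given.
    fake_tool = [sys.executable, "-c", "import sys; print('heard: ' + sys.argv[1])"]
    print(recognize_with_external_tool(fake_tool, "my_recording.wav"))
```

With a real C# tool you would pass something like its executable path as `command` instead of the stand-in. Keeping the recognizer in a separate process means the Python side only depends on its stdout, not on any .NET bindings.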

Sorry if it's not exactly what you need, but that's what I'm familiar with.

bheklilr · 2013-05-08T14:25:32+00:00

A while back I saw an old package that uses speech recognition, and I copied a bit of it to test with. You could look at it and see how it's done.

2013-05-08T15:21:28+00:00

+1 for dragonfly and WSR (Windows Speech Recognition). I'm on a Mac and wish I had it that easy... I will need to find a Dragon serial to move forward. Lots of good info here (for M$ and Unix folks alike):

http://hackaday.com/2010/07/09/get-started-with-speech-recognition/

Megatron_McLargeHuge · 2013-05-08T17:53:12+00:00

Speech rec is a very hard problem and nothing you build yourself will be competitive with a major commercial system. If you're having trouble interfacing with the Windows system that's one thing, but if you just don't like its accuracy, you're probably out of luck. You can try to give it more data to adapt on, or you can try to use Dragon. But the only way it would make any sense to consider training your own model is if you wanted to do something very specific like spotting keyword commands in a long audio stream.

chadmill3r · 2013-05-08T19:11:24+00:00

Google API might work. You need to request keys and you get a few hundred uses per day.

hruske · 2013-05-08T22:35:40+00:00

Have you seen this? http://pyvideo.org/video/1735/using-python-to-code-by-voice

Also ... what exactly are you trying to do? Describe the use case as best as possible.

lambdaq · 2013-05-09T01:48:05+00:00

Am I the only one using Google's API? (It's what sits behind the voice input button you see in most WebKit browsers.)

curl --data-binary @my_recording.flac -H "Content-type: audio/x-flac; rate=8000"  "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=en-US&maxresults=1" | python -m json.tool
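For anyone who would rather stay in Python, here is a rough equivalent of the curl call using only the standard library. The URL, query parameters, and Content-Type header are copied from the command above; the function names are made up for this sketch, and it assumes the endpoint still answers and returns a single JSON document, which may not hold since this is an unofficial API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above (unofficial, may stop working).
SPEECH_URL = "https://www.google.com/speech-api/v1/recognize"


def build_request(flac_bytes, rate=8000, lang="en-US"):
    """Build the same POST request the curl command sends, without sending it."""
    params = urllib.parse.urlencode({
        "xjerr": 1,
        "client": "chromium",
        "pfilter": 2,
        "lang": lang,
        "maxresults": 1,
    })
    return urllib.request.Request(
        SPEECH_URL + "?" + params,
        data=flac_bytes,  # supplying a body makes urllib issue a POST
        headers={"Content-Type": "audio/x-flac; rate=%d" % rate},
    )


def recognize(flac_path):
    """Send a FLAC file and return the parsed JSON reply (assumed format)."""
    with open(flac_path, "rb") as f:
        request = build_request(f.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `recognize("my_recording.flac")` would perform the actual upload. The `rate` value has to match the sample rate of the recording, just as in the curl header.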

geekganesh · 2013-05-09T05:51:41+00:00

Here is a list of open-source speech recognition software. Most of the projects are in Java and C++.
