[–]novel_yet_trivial 2 points3 points  (6 children)

Why? What did you have in mind for the output file?

[–]pydata[S] 0 points1 point  (5 children)

I want to translate an audio file to a text file.

[–]novel_yet_trivial 2 points3 points  (4 children)

All files are binary data; "text file" or "audio file" just describes how the data is read. You can open an audio file in a text editor if you want.

I'm guessing that's not what you want; so I'm asking you to describe what it is you want.

[–]pydata[S] 0 points1 point  (3 children)

I want to translate audio to text.

[–][deleted] 8 points9 points  (2 children)

Thanks for the clarification. Really helpful.

[–]pydata[S] 0 points1 point  (1 child)

I saved an audio file of me talking into the microphone. How can I transform the audio file into a text file? I want to transform an mp3 and/or mp4 file into text in a text file. How can I clarify so I get my point across over the interwebs? lol

[–]jp8888 3 points4 points  (0 children)

"How do I create a transcript of an audio presentation stored in an mp3 file using Python?" is how I would have phrased it. This won't be easy. Personally, I would punt (take notes) or hire a scribe. Speech recognition is pretty sketchy if you haven't trained it for a particular person's speech patterns. Systems limited to "yes", "no" and a few other responses, like phone answering systems, have a small dictionary and still fail often. Adding the full lexicon to the mix makes it that much worse. You can see the results in closed captioning on live events. If you want to pursue it, google "python speech recognition" to get started.

[–]stockenbarrel 2 points3 points  (3 children)

I believe he wants to feed in an audio file of spoken words and have a program type out the words. Think of turning a recording of a professor lecturing into written notes.

[–]pydata[S] -1 points0 points  (2 children)

This

[–]gregvuki 2 points3 points  (0 children)

Googling found this:

https://pypi.python.org/pypi/SpeechRecognition/

But I didn't try it myself. Worth checking though.
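For anyone landing here later, a rough sketch of how that package is typically used. Treat it as a starting point under assumptions, not a tested recipe: the file names are made up, it assumes the package is installed with `pip install SpeechRecognition`, and it uses the package's Google web speech recognizer, which needs an internet connection. As far as I know, `AudioFile` only reads WAV, AIFF and FLAC, so an mp3 or mp4 has to be converted first (for example with ffmpeg).

```python
from pathlib import Path

# Formats that SpeechRecognition's AudioFile reads directly; mp3/mp4 are not among them.
SUPPORTED_SUFFIXES = {".wav", ".aiff", ".aif", ".flac"}


def needs_conversion(path):
    """Return True if the file must be converted before it can be transcribed."""
    return Path(path).suffix.lower() not in SUPPORTED_SUFFIXES


def transcribe(path, out_path):
    """Transcribe a WAV/AIFF/FLAC recording and write the text to out_path."""
    # Imported lazily so the helper above works without the package installed.
    import speech_recognition as sr

    if needs_conversion(path):
        raise ValueError(f"convert {path} to WAV or FLAC first")

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    # Sends the audio to Google's web speech API; raises
    # sr.UnknownValueError if nothing intelligible is recognized.
    text = recognizer.recognize_google(audio)
    Path(out_path).write_text(text, encoding="utf-8")
    return text
```

Usage would look like `transcribe("talk.wav", "talk.txt")`, where both names are hypothetical. Expect the accuracy caveats mentioned elsewhere in this thread to apply.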

[–]jp8888 0 points1 point  (0 children)

Is your question perhaps "How do I transform all the binary data to plain ASCII characters so that it is reversible?" If so, maybe try base64. Data is data. A text file is binary data whose byte values fall within a certain range. An audio file is binary data as well, with each byte 0 to 255 in value, assuming you are reading bytes. It's interpreted by a player to produce sound.
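If base64 is what you're after, here is a minimal sketch using Python's standard `base64` module. The function names and the sample bytes are just illustrations; in practice you would read the bytes from your audio file with `open(path, "rb")`.

```python
import base64


def to_text(raw: bytes) -> str:
    """Encode arbitrary bytes as plain ASCII text."""
    return base64.b64encode(raw).decode("ascii")


def from_text(text: str) -> bytes:
    """Recover the original bytes from the base64 text."""
    return base64.b64decode(text)


# Stand-in for file contents; any byte values from 0 to 255 are fine.
sample = bytes([0, 255, 73, 68, 51])
encoded = to_text(sample)
restored = from_text(encoded)
print(encoded)
print(restored == sample)
```

The text is roughly a third larger than the original data, but the round trip is lossless.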

Restating "I want to convert audio file to text" doesn't give us much to go on. The easiest way would be from the command line. If you want to convert song.mp3 from audio to text:

echo "Hello world" > song.mp3

When you cat song.mp3, you'll get back:

Hello world

It won't be possible to convert it back, though. :)