Sometimes in life, you run into situations where you need to turn a voice recording into a text document. Perhaps it’s an interview for a news publication, or perhaps you need to transcribe a lecture from school. On Windows and OS X, there are a number of software programs that can help with this. Yet for Linux users, the options feel a bit sparse by comparison.
Today’s tip addresses this issue. I’ll show you how to combine Google’s Web Speech API with PulseAudio, the Linux sound server, so that audio played on your computer is fed to the speech recognizer as if it came from a microphone.
Ready to get started? Great, here’s what you’re going to do:
1) Install pavucontrol (PulseAudio Volume Control). It’s available from most software repositories.
2) Open pavucontrol and click the Input Devices tab. At the bottom, set Show to Monitors. Select the monitor of the audio device that will play the recording by clicking the box next to the padlock on the right side. In my case, this was the monitor for my USB speakers.
3) Now go to the Output Devices tab and make sure the matching output device is selected by clicking the box next to the padlock on the right side. Leave this app open for troubleshooting.
4) Install and open Chrome, then browse to Google’s Web Speech API Demonstration page.
5) Now open the audio player you’ll use to play the audio file. Get the file ready to play, but don’t hit play just yet.
6) Back on the API Demonstration page in Chrome, click on the microphone icon in the right center of the page.
7) Now in the audio player, hit play.
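If you prefer the terminal to clicking through pavucontrol, steps 2 and 3 can be roughly approximated with the pactl command that ships with PulseAudio. The sketch below is not part of the original tip. It assumes pactl is installed, that `pactl list short sources` prints tab-separated rows with the source name in the second column, and that monitor sources have names ending in `.monitor`. The helper names are made up for illustration.

```python
import subprocess
from typing import List


def find_monitor_sources(pactl_output: str) -> List[str]:
    """Return the names of monitor sources found in the text printed by
    `pactl list short sources` (assumed: tab-separated, name in column 2)."""
    monitors = []
    for line in pactl_output.splitlines():
        fields = line.split("\t")
        # Assumption: PulseAudio names monitor sources "<sink name>.monitor".
        if len(fields) >= 2 and fields[1].endswith(".monitor"):
            monitors.append(fields[1])
    return monitors


def list_sources() -> str:
    """Run pactl and return its raw listing of input sources."""
    result = subprocess.run(
        ["pactl", "list", "short", "sources"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def set_default_source(source_name: str) -> None:
    """Make the given source the default recording device."""
    subprocess.run(["pactl", "set-default-source", source_name], check=True)
```

A possible way to use it: call `find_monitor_sources(list_sources())`, pick the entry that matches your speakers, and pass it to `set_default_source()`. Chrome should then pick up that monitor as its microphone, though you may still need to confirm the choice in pavucontrol.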
If everything went well, you should start seeing text appear on the Chrome page. If it isn’t working, re-check your settings in pavucontrol. Music or other background noise in the recording can also make the speech difficult to detect.
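One quick way to re-check the settings without the GUI is to ask PulseAudio which source is currently the default. The following is only a sketch under a few assumptions: pactl is installed, `pactl info` prints a line beginning with `Default Source:` (the text is localized, so the command is run with `LC_ALL=C`), and monitor source names end in `.monitor`. The function names are hypothetical.

```python
import os
import subprocess
from typing import Optional


def default_source_from_info(pactl_info: str) -> Optional[str]:
    """Extract the default source name from the text printed by `pactl info`."""
    for line in pactl_info.splitlines():
        if line.startswith("Default Source:"):
            return line.split(":", 1)[1].strip()
    return None  # no such line, e.g. unexpected or localized output


def is_default_source_monitor(pactl_info: str) -> bool:
    """True if the default recording source looks like a monitor."""
    name = default_source_from_info(pactl_info)
    return name is not None and name.endswith(".monitor")


def read_pactl_info() -> str:
    """Run `pactl info` with an English locale and return its output."""
    env = {**os.environ, "LC_ALL": "C"}
    result = subprocess.run(
        ["pactl", "info"], capture_output=True, text=True, check=True, env=env
    )
    return result.stdout
```

If `is_default_source_monitor(read_pactl_info())` returns False, the browser is probably still listening to a real microphone rather than to the monitor you selected in step 2.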
Bonus fun: This also makes for a fun game of Mad Libs, by using a separate tab for YouTube podcasts. Some of the results are quite funny!