Day 21 of 100 Days Coding Challenge: Python
When I saw “audio-to-text converter” on the list, my brain did a dramatic slow turn and whispered, “Where were you twenty years ago?” I mean, really—this could’ve saved me hours of repeating the word “vegetable” into a cassette recorder, trying to sound less like an anime character doing karaoke. Back then, I was on a one-woman mission to master English pronunciation, and a tool like this would’ve felt like cheating—but the good kind. Now, with AI and a little Python magic, I have made a program that converts my voice into readable text. Only catch? It plays favorites with WAV files. No love for MP3s. So naturally, I whipped up an MP3-to-WAV converter on the side, because why not? I recorded my voice straight from my laptop, half-expecting it to transcribe me saying something profound. Instead, I got back “this is a test.” Which is fair. It was.
Today’s Motivation / Challenge
Today’s project is the kind of tool you don’t know you need until you really need it—like when you’re trying to capture a brilliant idea mid-shower, but this time, it’s your voice memos from a walk, an interview, or that rambling TED Talk you gave to your cat. Audio-to-text isn’t just practical—it’s empowering, especially for learners, note-takers, and people who speak faster than they type.
Purpose of the Code (Object)
This simple program listens to your audio file and writes down what you said. That’s it. You talk, it types. It works best with clear speech and doesn’t judge your accent (too much). You can even record yourself with a laptop mic and see your words appear as text. It’s like magic—but with WAV files.
AI Prompt:
Create a Python script using the SpeechRecognition library that takes an audio file (preferably WAV), converts it to text, and prints the result. Add basic error handling. Bonus: include a simple MP3-to-WAV converter using pydub.
Functions & Features
- Load an audio file (WAV preferred)
- Convert speech in the file to text
- Display transcription
- Convert MP3 to WAV if needed
Requirements / Setup
pip install SpeechRecognition
pip install pydub
You’ll also need ffmpeg installed and added to PATH for MP3 support.
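Since the recognizer only takes WAV, here is a minimal sketch of what the MP3-to-WAV side converter might look like with pydub. The names wav_path_for and mp3_to_wav are made up for illustration, and the sketch assumes ffmpeg is already installed and on PATH.

```python
from pathlib import Path


def wav_path_for(mp3_path: str) -> Path:
    """Return the output path: same folder and file name, .wav extension."""
    return Path(mp3_path).with_suffix(".wav")


def mp3_to_wav(mp3_path: str) -> Path:
    """Convert an MP3 file to WAV with pydub (assumes ffmpeg is on PATH)."""
    # Imported here so the path helper above works even without pydub installed.
    from pydub import AudioSegment

    out_path = wav_path_for(mp3_path)
    audio = AudioSegment.from_mp3(mp3_path)    # decoding is delegated to ffmpeg
    audio.export(str(out_path), format="wav")  # write the WAV next to the MP3
    return out_path
```

Calling mp3_to_wav("memo.mp3") (a placeholder file name) should leave a memo.wav next to the original, ready for the recognizer.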
Minimal Code Sample
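import speech_recognition as sr

recognizer = sr.Recognizer()

# The r prefix keeps the backslashes in the Windows path literal
with sr.AudioFile(r"C:\path\to\your.wav") as source:
    audio = recognizer.record(source)  # read the whole file into memory

# Sends the audio to Google's web speech API, so it needs internet
text = recognizer.recognize_google(audio)
print(text)  # This prints your spoken words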
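The prompt asked for basic error handling, which the minimal sample leaves out. One way it could look is sketched below. The function name transcribe and the wording of the messages are assumptions, not the original script. The exceptions are the ones SpeechRecognition raises when speech is unintelligible (UnknownValueError) or the web service cannot be reached (RequestError).

```python
import os


def transcribe(wav_path: str) -> str:
    """Return the transcription, or a readable message if something fails."""
    if not os.path.isfile(wav_path):
        return f"File not found: {wav_path}"

    # Imported here so the file check above still runs without the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    try:
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source)
        return recognizer.recognize_google(audio)
    except ValueError:
        # AudioFile could not read the file (for example, an unsupported format)
        return "Unsupported or corrupted audio file."
    except sr.UnknownValueError:
        # The service could not make out any words
        return "Could not understand the audio."
    except sr.RequestError as err:
        # No internet connection, or the web API refused the request
        return f"Speech service error: {err}"
```

Returning a message instead of raising keeps the sketch short. A larger program would probably want to let the caller decide what to do with each failure.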
Notes / Lessons Learned
So here’s the thing: I wrote audio_file = "C:\Users\tinyt\OneDrive\Desktop\Test\Recording.wav" and, well… Python had a meltdown. Apparently, inside a regular string the backslash is not just a cute path separator: \t is a tab, and \U kicks off a Unicode escape that Python 3 rejects outright. Classic Windows-path betrayal. Once I added a humble little r in front, like this: r"C:\Users\tinyt\…", the program stopped being dramatic and worked like a charm.
But let’s be real: rewriting the file path in code every time? That’s a cry for a GUI. So, I made one. And it was glorious. Click, select, transcribe, done.
Also, a fun fact: if you skip articles in your speech, the program skips them in your text too. No article, no mercy. And if your grammar is questionable? The output will happily reflect that. It’s an honest mirror, not a grammar teacher.
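Back to the path trouble for a moment, since it is easier to see in code. The first part below shows the same made-up path written three ways. The second part is a guess at the "click, select" step using tkinter from the standard library. The post does not show the actual GUI, so choose_audio_file is a hypothetical stand-in.

```python
# A made-up path with two "\t" pairs, just for the demo.
plain = "C:\temp\talk.wav"      # each "\t" is read as a tab character
raw = r"C:\temp\talk.wav"       # raw string: backslashes stay backslashes
escaped = "C:\\temp\\talk.wav"  # doubling the backslashes also works


def choose_audio_file() -> str:
    """Open a file dialog and return the chosen path (empty if cancelled)."""
    # Imported here so the path demo above runs on machines without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialog
    path = filedialog.askopenfilename(
        title="Select an audio file",
        filetypes=[("WAV files", "*.wav"), ("All files", "*.*")],
    )
    root.destroy()
    return path
```

A path picked through the dialog arrives as an ordinary string at runtime, so the escape problem never comes up. It only bites when the path is typed into the source code as a literal.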
Optional Ideas for Expansion
- Add a “Save as TXT” button for transcribed text
- Let users choose between multiple languages for transcription
- Record audio directly in the app, no file needed
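The first two ideas are small enough to sketch. The function names save_transcript and transcribe_in are hypothetical. The language value is passed to the language parameter of recognize_google, which defaults to "en-US".

```python
from pathlib import Path


def save_transcript(text: str, txt_path: str) -> Path:
    """Write the transcription to a UTF-8 text file and return its path."""
    out = Path(txt_path)
    out.write_text(text, encoding="utf-8")
    return out


def transcribe_in(wav_path: str, language: str = "en-US") -> str:
    """Transcribe a WAV file in the given language (for example "es-ES")."""
    # Imported here so save_transcript works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio, language=language)
```

A "Save as TXT" button would then only need to call save_transcript with whatever text is currently on screen.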
