Day 21 of 100 Days Coding Challenge: Python
When I saw "audio-to-text converter" on the list, my brain did a dramatic slow turn and whispered, "Where were you twenty years ago?" I mean, really: this could've saved me hours of repeating the word "vegetable" into a cassette recorder, trying to sound less like an anime character doing karaoke. Back then, I was on a one-woman mission to master English pronunciation, and a tool like this would've felt like cheating, but the good kind. Now, with AI and a little Python magic, I have made a program that converts my voice into readable text. Only catch? It plays favorites with WAV files. No love for MP3s. So naturally, I whipped up an MP3-to-WAV converter on the side, because why not? I recorded my voice straight from my laptop, half-expecting it to transcribe me saying something profound. Instead, I got back "this is a test." Which is fair. It was.
Today's Motivation / Challenge
Today's project is the kind of tool you don't know you need until you really need it, like when you're trying to capture a brilliant idea mid-shower, but this time, it's your voice memos from a walk, an interview, or that rambling TED Talk you gave to your cat. Audio-to-text isn't just practical; it's empowering, especially for learners, note-takers, and people who speak faster than they type.
Purpose of the Code (Object)
This simple program listens to your audio file and writes down what you said. That's it. You talk, it types. It works best with clear speech and doesn't judge your accent (too much). You can even record yourself with a laptop mic and see your words appear as text. It's like magic, but with WAV files.
AI Prompt:
Create a Python script using the SpeechRecognition library that takes an audio file (preferably WAV), converts it to text, and prints the result. Add basic error handling. Bonus: include a simple MP3-to-WAV converter using pydub.
Functions & Features
- Load an audio file (WAV preferred)
- Convert speech in the file to text
- Display transcription
- Convert MP3 to WAV if needed
Requirements / Setup
pip install SpeechRecognition
pip install pydub
You'll also need ffmpeg installed and added to PATH for MP3 support.
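Since the MP3 side of things leans on ffmpeg, here is a rough sketch of what the side converter could look like. It is not necessarily the exact script I ended up with: the function names (ffmpeg_available, wav_path_for, convert_mp3_to_wav) are made up for illustration, and the pydub import sits inside the function only so the small helpers still work before pydub is installed.

```python
import shutil
from pathlib import Path


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def wav_path_for(mp3_path: str) -> Path:
    """Hypothetical helper: same folder and file name, with a .wav extension."""
    return Path(mp3_path).with_suffix(".wav")


def convert_mp3_to_wav(mp3_path: str) -> Path:
    """Convert an MP3 to WAV with pydub and return the new file's path."""
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg was not found on PATH; pydub needs it to read MP3 files")

    # Imported here (an assumption of this sketch) so the helpers above
    # can be used even before pydub is installed.
    from pydub import AudioSegment

    wav_path = wav_path_for(mp3_path)
    # Decode the MP3 and write it back out as WAV
    AudioSegment.from_mp3(mp3_path).export(str(wav_path), format="wav")
    return wav_path
```

Calling convert_mp3_to_wav("memo.mp3") should leave a memo.wav next to the original, ready for the transcriber.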
Minimal Code Sample
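import speech_recognition as sr

recognizer = sr.Recognizer()

# The r"..." raw string keeps the backslashes in a Windows path intact
with sr.AudioFile(r"C:\path\to\your.wav") as source:
    audio = recognizer.record(source)  # read the whole file into memory

# Sends the audio to Google's web speech service, so it needs an internet connection
text = recognizer.recognize_google(audio)
print(text)  # This prints your spoken words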
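The prompt also asked for basic error handling, which the minimal sample leaves out. A hedged sketch of how that might look: transcribe_wav is a hypothetical name, the messages are my own wording, and the import is placed inside the function only so the missing-file check runs even without the library installed.

```python
from pathlib import Path


def transcribe_wav(wav_path: str) -> str:
    """Return the transcription, or a short message describing what went wrong."""
    if not Path(wav_path).is_file():
        return f"File not found: {wav_path}"

    # Imported here (an assumption of this sketch) so the check above
    # still works when SpeechRecognition is not installed yet.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    try:
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source)  # read the whole file
        return recognizer.recognize_google(audio)
    except ValueError:
        # As far as I know, AudioFile raises ValueError for formats it
        # cannot read, such as MP3
        return "Could not read the file. Is it really a WAV?"
    except sr.UnknownValueError:
        # The service got the audio but could not make out any words
        return "Speech was not clear enough to transcribe."
    except sr.RequestError as err:
        # Network trouble or the service refused the request
        return f"Could not reach the recognition service: {err}"
```

Returning a message instead of crashing keeps the script friendly for quick experiments; a bigger program would probably raise or log instead.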
Notes / Lessons Learned
So here's the thing: I wrote audio_file = "C:\Users\tinyt\OneDrive\Desktop\Test\Recording.wav" and, well… Python had a meltdown. Apparently, \t is not just a cute path separator; it's a tab. (And \U, as in \Users, starts a Unicode escape, which Python 3 rejects with a syntax error of its own.) Classic Windows-path betrayal. Once I added a humble little r in front like this, r"C:\Users\tinyt\…", the program stopped being dramatic and worked like a charm.
But let's be real: rewriting the file path in code every time? That's a cry for a GUI. So, I made one. And it was glorious. Click, select, transcribe, done. Also, a fun fact: if you skip articles in your speech, the program skips them in your text too. No article, no mercy. And if your grammar is questionable? The output will happily reflect that. It's an honest mirror, not a grammar teacher.
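To make the backslash lesson concrete, here is a small sketch using a made-up path (C:\temp\new.wav, not my real one) so that both \t and \n get a chance to misbehave. The pick_wav_file function underneath is only a guess at the simplest possible file picker, assuming tkinter; it is not the actual GUI from this project.

```python
# In a normal string literal, "\t" and "\n" are escape sequences
plain = "C:\temp\new.wav"
raw = r"C:\temp\new.wav"      # raw string: backslashes stay backslashes
forward = "C:/temp/new.wav"   # forward slashes also work for Windows paths in Python

print("\t" in plain)  # True: the path now contains a real tab character
print("\t" in raw)    # False
print(len(plain), len(raw))  # 13 15: two backslash pairs collapsed into single characters


def pick_wav_file() -> str:
    """Open a file dialog and return the chosen path ('' if cancelled)."""
    # Imported here so the demo above also runs on machines without a display
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialog shows
    path = filedialog.askopenfilename(
        title="Choose a WAV file",
        filetypes=[("WAV files", "*.wav"), ("All files", "*.*")],
    )
    root.destroy()
    return path
```

A path that comes back from the dialog is already a proper string, so there is nothing to escape and no r prefix to forget.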
Optional Ideas for Expansion
- Add a “Save as TXT” button for transcribed text
- Let users choose between multiple languages for transcription
- Record audio directly in the app, no file needed
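Two of these ideas are small enough to sketch right away. Treat this as a starting point under assumptions rather than finished features: save_transcript and transcribe_in_language are hypothetical names, and the language tags shown ("en-US", "ja-JP") are examples of the kind of value recognize_google accepts through its language parameter. Recording in the app is left out here because it needs a microphone and extra setup.

```python
from pathlib import Path


def save_transcript(text: str, txt_path: str) -> Path:
    """Write the transcription to a UTF-8 text file and return its path."""
    path = Path(txt_path)
    path.write_text(text + "\n", encoding="utf-8")
    return path


def transcribe_in_language(wav_path: str, language: str = "en-US") -> str:
    """Transcribe a WAV file using a language tag such as "en-US" or "ja-JP"."""
    # Imported here (an assumption of this sketch) so save_transcript
    # works even when SpeechRecognition is not installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file
    # The language keyword tells the service which language to expect
    return recognizer.recognize_google(audio, language=language)
```

A "Save as TXT" button would then only need to call save_transcript with whatever text is on screen.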
