Why Didn’t I Think of This in 2005?

Day 21 of 100 Days Coding Challenge: Python

When I saw “audio-to-text converter” on the list, my brain did a dramatic slow turn and whispered, “Where were you twenty years ago?” I mean, really—this could’ve saved me hours of repeating the word “vegetable” into a cassette recorder, trying to sound less like an anime character doing karaoke. Back then, I was on a one-woman mission to master English pronunciation, and a tool like this would’ve felt like cheating—but the good kind. Now, with AI and a little Python magic, I have made a program that converts my voice into readable text. Only catch? It plays favorites with WAV files. No love for MP3s. So naturally, I whipped up an MP3-to-WAV converter on the side, because why not? I recorded my voice straight from my laptop, half-expecting it to transcribe me saying something profound. Instead, I got back “this is a test.” Which is fair. It was.

Today’s Motivation / Challenge

Today’s project is the kind of tool you don’t know you need until you really need it—like when you’re trying to capture a brilliant idea mid-shower, but this time, it’s your voice memos from a walk, an interview, or that rambling TED Talk you gave to your cat. Audio-to-text isn’t just practical—it’s empowering, especially for learners, note-takers, and people who speak faster than they type.

Purpose of the Code (Object)

This simple program listens to your audio file and writes down what you said. That’s it. You talk, it types. It works best with clear speech and doesn’t judge your accent (too much). You can even record yourself with a laptop mic and see your words appear as text. It’s like magic—but with WAV files.

AI Prompt:

Create a Python script using the SpeechRecognition library that takes an audio file (preferably WAV), converts it to text, and prints the result. Add basic error handling. Bonus: include a simple MP3-to-WAV converter using pydub.

Functions & Features

  • Load an audio file (WAV preferred)
  • Convert speech in the file to text
  • Display transcription
  • Convert MP3 to WAV if needed

Requirements / Setup

pip install SpeechRecognition

pip install pydub

You’ll also need ffmpeg installed and added to PATH for MP3 support.
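For the MP3 side quest, a converter built on pydub might look roughly like the sketch below. The function names wav_path_for and mp3_to_wav and the file name in the usage comment are made up for illustration, not the ones from my actual script.

```python
from pathlib import Path


def wav_path_for(mp3_path):
    """Return a .wav path that sits next to the given .mp3 file."""
    return Path(mp3_path).with_suffix(".wav")


def mp3_to_wav(mp3_path):
    """Convert an MP3 file to WAV and return the new file's path."""
    # Imported here so wav_path_for still works without pydub installed.
    from pydub import AudioSegment

    wav_path = wav_path_for(mp3_path)
    # Decoding MP3 is handed off to ffmpeg, hence the PATH requirement.
    sound = AudioSegment.from_mp3(str(mp3_path))
    sound.export(str(wav_path), format="wav")
    return wav_path


# Usage (placeholder file name):
# mp3_to_wav("Recording.mp3")  # writes Recording.wav next to the MP3
```

The WAV file lands in the same folder as the MP3, which keeps things easy to find when feeding it to the transcriber.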

Minimal Code Sample

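import speech_recognition as sr

recognizer = sr.Recognizer()

# Raw string (r"...") keeps the backslashes in a Windows path literal
with sr.AudioFile(r"C:\path\to\your.wav") as source:
    audio = recognizer.record(source)  # read the whole file into memory

# Sends the audio to Google's Web Speech API, so it needs internet
text = recognizer.recognize_google(audio)
print(text)  # This prints your spoken words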
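The prompt also asked for basic error handling, which the minimal sample leaves out. Here is one way it could be sketched, assuming a helper I'm calling transcribe_wav (a made-up name) that returns a friendly message instead of crashing. The two exceptions are the ones SpeechRecognition raises for unintelligible audio and for a failed request to the speech service.

```python
from pathlib import Path


def transcribe_wav(wav_path):
    """Return the recognized text, or a short message about what went wrong."""
    if not Path(wav_path).is_file():
        return f"File not found: {wav_path}"

    # Imported late so the missing-file check works without the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole file

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The service could not make out any words.
        return "Could not understand the audio."
    except sr.RequestError as error:
        # No internet connection, or the service refused the request.
        return f"Speech service request failed: {error}"
```

Returning a message rather than raising keeps the calling code tiny, though a bigger program would probably want real exceptions.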

[Image: Audio to Text GUI]

Notes / Lessons Learned


So here’s the thing: I wrote audio_file = "C:\Users\tinyt\OneDrive\Desktop\Test\Recording.wav" and, well… Python had a meltdown. Apparently, a backslash inside an ordinary string starts an escape sequence: \t is not just a cute path separator, it’s a tab, and \U kicks off a Unicode escape that Python can’t finish, so the script dies with a syntax error before it even runs. Classic Windows-path betrayal. Once I added a humble little r in front, like this: r"C:\Users\tinyt\…", the program stopped being dramatic and worked like a charm.

But let’s be real: rewriting the file path in code every time? That’s a cry for a GUI. So, I made one. And it was glorious. Click, select, transcribe, done. Also, a fun fact: if you skip articles in your speech, the program skips them in your text too. No article, no mercy. And if your grammar is questionable? The output will happily reflect that. It’s an honest mirror, not a grammar teacher.
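The backslash trouble from the notes above is easy to see in a tiny experiment. The path here is hypothetical, picked only because \t and \n are both real escape sequences.

```python
from pathlib import PureWindowsPath

# Hypothetical path chosen because \t and \n are both real escape sequences.
plain = "C:\temp\notes.wav"     # \t turns into a tab, \n into a newline
raw = r"C:\temp\notes.wav"      # raw string: every backslash is kept
forward = "C:/temp/notes.wav"   # forward slashes need no escaping at all

print("\t" in plain, "\n" in plain)   # True True: the path is mangled
print(raw.count("\\"))                # 2: both separators survived

# pathlib treats both spellings as the same Windows path.
print(PureWindowsPath(raw) == PureWindowsPath(forward))   # True
```

So the raw string is one fix, and plain forward slashes are another that is just as valid on Windows.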

Optional Ideas for Expansion

  • Add a “Save as TXT” button for transcribed text
  • Let users choose between multiple languages for transcription
  • Record audio directly in the app, no file needed
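Two of these ideas are small enough to sketch right away. The function names below are invented for illustration. The language parameter of recognize_google is part of the SpeechRecognition library and takes a language tag such as "en-US" or "fr-FR".

```python
from pathlib import Path


def save_transcript(text, txt_path):
    """Write the transcription to a UTF-8 text file and return its path."""
    path = Path(txt_path)
    path.write_text(text, encoding="utf-8")
    return path


def transcribe_in_language(wav_path, language="en-US"):
    """Transcribe a WAV file, telling the service which language to expect."""
    # Imported late so save_transcript works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio, language=language)


# Usage (placeholder file names):
# text = transcribe_in_language("Recording.wav", language="fr-FR")
# save_transcript(text, "Recording.txt")
```

In a GUI, save_transcript would sit behind the "Save as TXT" button and the language tag would come from a dropdown.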
