Audio-Transcript

Python ★ 25 updated 21d ago

A simple command-line tool that converts audio files into text using OpenAI's speech-to-text service. Point it at a file or folder, get a transcript in your terminal or saved as a .txt file. Supports long files, multiple languages, and cost-saving model options.

Python · OpenAI API · ffmpeg · setup: moderate · complexity 1/5

Audio Transcript is a small command-line tool that converts audio files into text using OpenAI's speech-to-text service. You point it at an audio file or a folder full of audio files, and by default it prints the transcript in your terminal. That is the whole thing.

By default it uses OpenAI's higher-quality transcription model. If you have a lot of audio and want to keep costs and processing time down, a single flag switches it to a faster, cheaper model. You can also pass a language code to improve accuracy when the spoken language is not English, or give the model a short context hint with names, acronyms, or topic keywords that might otherwise get transcribed incorrectly.
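As a rough illustration of how those three options might map onto a single request, here is a minimal sketch. The model names, the `build_options` and `transcribe` helpers, and their parameter names are assumptions made for illustration and are not taken from the tool's source. The `language` and `prompt` fields are real parameters of OpenAI's transcription endpoint.

```python
from pathlib import Path

# Assumed model names; the tool's actual default and "fast" models may differ.
DEFAULT_MODEL = "gpt-4o-transcribe"
FAST_MODEL = "gpt-4o-mini-transcribe"


def build_options(fast=False, language=None, hint=None):
    """Collect keyword arguments for one transcription request."""
    options = {"model": FAST_MODEL if fast else DEFAULT_MODEL}
    if language:
        options["language"] = language  # ISO 639-1 code, e.g. "de"
    if hint:
        options["prompt"] = hint  # names, acronyms, topic keywords
    return options


def transcribe(path, **kwargs):
    """Send one audio file to the API and return the transcript text."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with Path(path).open("rb") as audio:
        result = client.audio.transcriptions.create(
            file=audio, **build_options(**kwargs)
        )
    return result.text
```

Keeping the option-building step separate from the network call makes it easy to check what would be sent without spending API credit.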

Long audio files are handled by splitting them into chunks before uploading. Audio APIs have limits on both file size and how much text they return in one response, so splitting avoids transcripts that cut off partway through. The default chunk length is five minutes, and you can adjust it downward if you still get incomplete results. Files in formats that the API does not accept directly are automatically converted to MP3 using ffmpeg before they are sent.
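The splitting step can be pictured roughly as follows. This is a sketch under stated assumptions, not the tool's actual implementation: the helper names and the output file naming are hypothetical, and the recording's duration is assumed to be known already (for example from ffprobe). Only standard ffmpeg options are used.

```python
import subprocess


def chunk_spans(duration_s, chunk_s=300.0):
    """Return (start, length) pairs in seconds covering the whole recording."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    spans = []
    start = 0.0
    while start < duration_s:
        spans.append((start, min(chunk_s, duration_s - start)))
        start += chunk_s
    return spans


def ffmpeg_chunk_command(src, dst, start, length):
    """Build an ffmpeg call that cuts one span and re-encodes it as MP3."""
    return [
        "ffmpeg", "-y",         # overwrite the output file if it exists
        "-ss", str(start),      # seek to the start of the span
        "-t", str(length),      # read this many seconds
        "-i", src,
        "-vn",                  # drop any video stream
        "-c:a", "libmp3lame",   # encode the audio as MP3
        dst,
    ]


def split_to_mp3(src, duration_s, chunk_s=300.0):
    """Cut src into MP3 chunks and return their paths (hypothetical naming)."""
    paths = []
    for i, (start, length) in enumerate(chunk_spans(duration_s, chunk_s)):
        dst = f"{src}.part{i:03d}.mp3"
        subprocess.run(ffmpeg_chunk_command(src, dst, start, length), check=True)
        paths.append(dst)
    return paths
```

With the five-minute default, a recording of 10 minutes 20 seconds yields two full chunks and one 20-second remainder. Lowering `chunk_s` produces more, shorter uploads.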

Transcripts print to the terminal by default. Passing a flag saves a .txt file next to each audio file instead. The tool does not do anything else: no database, no web interface, no account system.
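The two output modes could look something like this. The function names and the `save` parameter are hypothetical stand-ins for whatever flag the tool actually exposes.

```python
from pathlib import Path


def transcript_path(audio_path):
    """Return the .txt path that sits next to the given audio file."""
    return Path(audio_path).with_suffix(".txt")


def emit(transcript, audio_path, save=False):
    """Print the transcript, or write it beside the audio file when save is set."""
    if save:
        target = transcript_path(audio_path)
        target.write_text(transcript, encoding="utf-8")
        return target
    print(transcript)
    return None
```

For example, `transcript_path("memos/call.m4a")` gives `memos/call.txt`, so saved transcripts stay alongside their recordings.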

To use it you need Python 3.10 or later, an OpenAI API key, and ffmpeg installed on your machine. The README includes clear setup instructions and a troubleshooting section covering the most common failure cases. The code is released under the MIT license.

Where it fits

Transcribe recorded meetings, interviews, or podcasts into text you can search or edit.
Batch-convert a whole folder of voice memos or lecture recordings to text files automatically.
Transcribe audio in languages other than English with improved accuracy using a language hint.
Keep transcription costs low on large audio libraries by switching to the faster, cheaper model.
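For the batch case, accepting either a file or a folder might be handled along these lines. This is a sketch only: the `collect_audio` helper and the extension set are assumptions, and the tool's real list of accepted formats may differ.

```python
from pathlib import Path

# Assumed set of extensions; not taken from the tool's source.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def collect_audio(target):
    """Return the audio files named by target, which may be a file or a folder."""
    path = Path(target)
    if path.is_file():
        return [path]
    # Non-recursive: only files directly inside the folder are considered.
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )
```

Sorting the result keeps the processing order stable between runs, which helps when a long batch has to be resumed.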
