The script converts speech from an audio file to text using the OpenAI API. It splits the audio file into parts, sends each part to OpenAI for transcription, combines the resulting text, and optionally creates a summary and timestamps. Supported file types are mp3, mp4, mpeg, mpga, m4a, wav, and webm.
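The file-type check and the splitting step can be pictured with a minimal sketch. This is not the script's actual code: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `chunk_ranges`, and the 10-minute chunk length, are assumptions chosen for illustration. The sketch only plans the chunk boundaries in milliseconds, so it runs without an audio file; with pydub, each range could then be cut by slicing an `AudioSegment`.

```python
from __future__ import annotations

from pathlib import Path

# File types listed as supported in this README.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the supported types."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def chunk_ranges(total_ms: int, chunk_ms: int = 10 * 60 * 1000) -> list[tuple[int, int]]:
    """Split a duration into consecutive (start_ms, end_ms) ranges.

    The 10-minute default is an assumed value, not taken from the script.
    """
    if total_ms < 0 or chunk_ms <= 0:
        raise ValueError("total_ms must be >= 0 and chunk_ms must be > 0")
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


if __name__ == "__main__":
    print(is_supported("audio.m4a"))
    # A 25-minute recording yields three ranges; the last one is shorter.
    for start, end in chunk_ranges(25 * 60 * 1000):
        print(start, end)
```

Planning the ranges separately from the audio handling keeps the boundary logic easy to test on its own.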
Install Python:
- macOS: Open Terminal and run:
brew install python
- Windows: Download and install from the Python website, and check "Add Python to PATH".
Install Libraries:
pip install pydub openai
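To confirm the installation worked, a small check like the following may help. It is only a sketch: the helper name `missing_requirements` is hypothetical, and the `ffmpeg` check is an assumption added here because pydub generally relies on ffmpeg to decode formats other than WAV. The README itself does not mention ffmpeg.

```python
import importlib.util
import shutil


def missing_requirements(modules=("pydub", "openai"), tools=("ffmpeg",)):
    """Return the names of Python modules and command-line tools that were not found."""
    # find_spec returns None when a top-level module is not installed.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when a program is not on PATH.
    missing += [name for name in tools if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All requirements found.")
```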
- Sign up and log in at OpenAI.
- Generate a new API key in your dashboard.
macOS:
- Open Terminal.
- Edit your shell profile (use ~/.zshrc instead if your shell is zsh, the default on macOS Catalina and later):
nano ~/.bash_profile
- Add:
export OPENAI_API_KEY='your_openai_api_key'
- Save and apply changes:
source ~/.bash_profile
Windows:
- Open Command Prompt.
- Run:
setx OPENAI_API_KEY "your_openai_api_key"
- Restart Command Prompt.
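Once the variable is set, a script can read it from the environment. The sketch below shows one way to do that with a clear error when the key is missing. The function name `get_api_key` and the error message are assumptions for illustration, not the script's actual code. The official openai Python package also picks up `OPENAI_API_KEY` by default, so an explicit check like this mainly serves to fail early with a readable message.

```python
import os


def get_api_key(env=None) -> str:
    """Return the API key from OPENAI_API_KEY, or raise with a hint if it is missing."""
    # Accept a mapping for testing; fall back to the real environment.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Set it as described above, "
            "then open a new terminal window."
        )
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("API key found.")
    except RuntimeError as error:
        print(error)
```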
- Save the script as transcribe.py.
- Open Command Prompt or Terminal and navigate to the script directory.
- Run:
python transcribe.py path_to_your_audio_file
If the python command does not start Python 3 on your system, use:
python3 transcribe.py path_to_your_audio_file
Example:
python transcribe.py audio.m4a
--sum: Create a summary.
--time: Include timestamps.
Example:
python transcribe.py audio.m4a --sum --time
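The command-line interface described above could be parsed along these lines with the standard `argparse` module. This is a sketch under assumptions, not the script's actual implementation: the function name `build_parser` and the positional argument name `audio_file` are hypothetical, while the `--sum` and `--time` flags come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: transcribe.py AUDIO_FILE [--sum] [--time]."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe an audio file to text.",
    )
    parser.add_argument("audio_file", help="path to the audio file")
    # Both flags are optional and default to False when omitted.
    parser.add_argument("--sum", action="store_true", help="create a summary")
    parser.add_argument("--time", action="store_true", help="include timestamps")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.audio_file, args.sum, args.time)
```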