VideoToText

VideoToText downloads a video from YouTube or takes a local video file, extracts the audio track, and transcribes it to text with OpenAI's Whisper model. The project runs in Docker for easy setup and deployment.

Features

Download video from YouTube or use a local video file
Extract audio from the video
Transcribe audio to text using OpenAI's Whisper model
Save transcriptions in SRT and JSON formats
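
As an illustration of the last feature, here is a minimal sketch of turning transcription segments into SRT and JSON text. It assumes each segment is a dict with "start" and "end" in seconds plus a "text" string, which is the shape of the "segments" list in Whisper's transcription result. The function names are hypothetical and are not taken from this project's utils.py, which may do this differently.

```python
import json


def format_srt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


def segments_to_json(segments: list) -> str:
    """Serialize segments as indented JSON, keeping non-ASCII text readable."""
    return json.dumps(segments, ensure_ascii=False, indent=2)
```

The two strings could then be written to files in the downloads folder, for example with pathlib.Path.write_text and encoding="utf-8".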

Technologies Used

yt-dlp: A command-line program to download videos from YouTube and other video sites.
ffmpeg: A complete, cross-platform solution to record, convert and stream audio and video.
whisper: OpenAI's Whisper model for state-of-the-art speech recognition.
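
To show how the first two tools fit together, the sketch below builds the yt-dlp and ffmpeg command lines from Python. The helper names, the output template, and the choice of 16 kHz mono audio are assumptions made for illustration; the project's own main.py and utils.py may invoke these tools differently.

```python
import subprocess
from pathlib import Path


def build_download_command(url: str, out_dir: str) -> list:
    """Build a yt-dlp call that saves the video into out_dir."""
    # -o sets the output template; yt-dlp fills in %(id)s and %(ext)s itself.
    template = str(Path(out_dir) / "%(id)s.%(ext)s")
    return ["yt-dlp", "-o", template, url]


def build_extract_audio_command(video_path: str, audio_path: str) -> list:
    """Build an ffmpeg call that writes only the audio track of video_path."""
    # -y overwrites an existing output file, -vn drops the video stream,
    # -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz.
    return ["ffmpeg", "-y", "-i", video_path, "-vn",
            "-ac", "1", "-ar", "16000", audio_path]


def run(command: list) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with URLs and file names that contain spaces or ampersands.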

Project Structure

.
├── Dockerfile
├── docker-compose.yml
├── main.py
├── utils.py
├── README.md
├── /downloads  # Folder where the processed results will be saved
└── /files      # Folder where the videos to be processed should be placed

Getting Started

Clone the Repository

git clone https://github.com/vshloda/VideoToText.git
cd VideoToText

Build the Docker Image

docker compose build

Run the Docker Container

For processing YouTube videos:

docker compose run --rm app python main.py --url "https://www.youtube.com/watch?v=example"

For processing local video files:

docker compose run --rm app python main.py --file "files/video.mp4"
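
The two commands above imply that main.py accepts either --url or --file. A minimal sketch of such an interface with argparse might look as follows; this is an assumption about how the options could be parsed, not the project's actual code, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create a parser that requires exactly one of --url or --file."""
    parser = argparse.ArgumentParser(
        description="Transcribe a YouTube video or a local video file."
    )
    # A required mutually exclusive group rejects calls that pass both
    # options or neither of them.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--url", help="URL of the YouTube video to download")
    source.add_argument("--file", help="path to a local video file")
    return parser
```

Calling build_parser().parse_args() in main.py would then yield an object whose url or file attribute is set, with the other left as None.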

Changing the Model

To change the Whisper model used for transcription, edit the Dockerfile and rebuild the image (docker compose build). Update the line that defines the model build argument:

# Build argument that selects the Whisper model.
# Change 'small' to 'medium', 'large', etc.
ARG WHISPER_MODEL=small

Available models include:

tiny
base
small
medium
large
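
A Dockerfile ARG exists only at build time, so application code can read it only if the Dockerfile also exports it, for example through an ENV instruction. The sketch below assumes, without confirmation from the source, that the value is available at run time as an environment variable named WHISPER_MODEL. It checks the value against the names listed above; resolve_model_name is a hypothetical helper.

```python
import os

# Mirrors the names listed in this README; Whisper also ships further
# variants that this tuple does not cover.
KNOWN_MODELS = ("tiny", "base", "small", "medium", "large")


def resolve_model_name(env=None, default="small"):
    """Return the configured model name, or raise if it is not recognized."""
    # Assumption: the model name is exposed as the WHISPER_MODEL variable.
    env = os.environ if env is None else env
    name = env.get("WHISPER_MODEL", default)
    if name not in KNOWN_MODELS:
        raise ValueError(
            f"Unknown Whisper model {name!r}; expected one of {KNOWN_MODELS}"
        )
    return name
```

The returned name could then be passed to whisper.load_model. Failing early on a misspelled name gives a clearer error than a failed model download.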

License

This project is licensed under the MIT License.
