https://youtu.be/SDG8TWBiLbc?si=E0YTwIlI5t0L-kOw
This Flask application transcribes the speech in a video file and generates tags from the transcript. It uses MoviePy to extract the audio from the video, Whisper for speech-to-text transcription, and the Gemini API to generate the tags.
- MoviePy: A Python library used for video editing, including extracting audio from video files.
- Whisper: An automatic speech recognition (ASR) model by OpenAI, used to transcribe audio to text.
- Gemini API: Google's generative AI service used for generating text, including extracting keywords from the transcribed text.
Before running the application, make sure you have the following installed:
- Python 3.x
- Flask
- MoviePy
- Whisper
- Google Generative AI (Gemini API)
- Python dotenv
You can install the required packages using pip:
pip install flask moviepy openai google-generativeai python-dotenv
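- Create a `.env` file in the project directory with the following content: `GOOGLE_API_KEY=your_google_api_key`. Replace `your_google_api_key` with your actual Google API key.
- Set up Whisper: Make sure the Whisper model is available for transcription. Note that the `openai` package in the pip command above is OpenAI's API client. The open-source Whisper model is published on PyPI as `openai-whisper` and needs `ffmpeg` on the system PATH. If app.py loads Whisper locally, install it with `pip install openai-whisper`.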
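A minimal sketch of how the key might be read at startup, assuming app.py uses `load_dotenv` from python-dotenv and `configure` from google-generativeai. The helper names `load_google_api_key` and `configure_gemini` are hypothetical and are not taken from app.py.

```python
import os


def load_google_api_key(env_file=".env"):
    """Return GOOGLE_API_KEY, reading env_file first when python-dotenv is installed."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
        load_dotenv(env_file)  # by default, does not override variables already set
    except ImportError:
        pass  # fall back to whatever is already in the process environment

    api_key = os.getenv("GOOGLE_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or the environment")
    return api_key


def configure_gemini(api_key):
    """Pass the key to the Gemini client (assumes google-generativeai is installed)."""
    import google.generativeai as genai
    genai.configure(api_key=api_key)
    return genai
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication error raised later, in the middle of a request.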
- app.py: The main Flask application script.
- templates/index.html: The HTML form for uploading video files.
- Upload a Video: Use the web form to upload a video file.
- Video to Audio: The application extracts the audio from the video using MoviePy.
- Audio to Text: The audio file is transcribed into text using Whisper.
- Generate Tags: The transcribed text is processed to extract keywords using Gemini API.
- Display Results: The generated tags are displayed on the webpage.
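The steps above can be sketched as one orchestration function. This is an illustration under assumptions, not the code in app.py: `process_video` and `parse_tags` are hypothetical names, the three processing steps are passed in as callables, and the model reply is assumed to be a comma- or newline-separated list of keywords.

```python
import os
import tempfile


def parse_tags(raw_text):
    """Split a comma- or newline-separated reply into a de-duplicated tag list."""
    tags = []
    for piece in raw_text.replace("\n", ",").split(","):
        tag = piece.strip().lstrip("-*# ").strip()  # drop bullet or hashtag prefixes
        if tag and tag.lower() not in (t.lower() for t in tags):
            tags.append(tag)
    return tags


def process_video(video_path, video_to_audio, audio_to_text, generate_tags):
    """Run the pipeline: video -> audio -> transcript -> tags."""
    with tempfile.TemporaryDirectory() as work_dir:
        audio_path = os.path.join(work_dir, "audio.wav")
        video_to_audio(video_path, audio_path)  # extract the audio track
        transcript = audio_to_text(audio_path)  # speech-to-text
    raw_tags = generate_tags(transcript)        # ask the model for keywords
    return parse_tags(raw_tags)                 # list ready to render in the template
```

Passing the steps in as arguments keeps the orchestration testable with stand-in functions, and the temporary directory removes the intermediate audio file once transcription finishes.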
- Run the Application: `python app.py`
- Access the Web Interface: Open a web browser and go to http://127.0.0.1:5000/.
- Upload a Video File:
  - Choose a video file and submit it through the form.
  - The application processes the file and displays the generated tags.
- `read_text_from_file(file_path)`: Reads the content of a text file.
- `video_to_audio(video_path, audio_path)`: Converts video to audio using MoviePy.
- `audio_to_text(audio_path)`: Transcribes audio to text using Whisper.
- `generate_tags_chat(prompt_text)`: Uses Gemini API to generate keywords from the transcribed text.
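For orientation, here is a hedged sketch of what these four functions could look like. The bodies are assumptions rather than the contents of app.py: the Whisper model size (`base`), the Gemini model name (`gemini-1.5-flash`), the prompt wording, and the use of `generate_content` instead of a chat session are all placeholders. Third-party imports sit inside the functions so the module can be imported even when a dependency is missing.

```python
def read_text_from_file(file_path):
    """Return the full contents of a UTF-8 text file."""
    with open(file_path, "r", encoding="utf-8") as handle:
        return handle.read()


def video_to_audio(video_path, audio_path):
    """Write the video's audio track to audio_path using MoviePy."""
    try:
        from moviepy import VideoFileClip  # MoviePy 2.x import path
    except ImportError:
        from moviepy.editor import VideoFileClip  # MoviePy 1.x import path
    clip = VideoFileClip(video_path)
    try:
        if clip.audio is None:
            raise ValueError("video has no audio track")
        clip.audio.write_audiofile(audio_path)
    finally:
        clip.close()  # release the underlying ffmpeg reader


def audio_to_text(audio_path, model_name="base"):
    """Transcribe an audio file with a locally loaded Whisper model."""
    import whisper  # from the openai-whisper package; needs ffmpeg on PATH
    model = whisper.load_model(model_name)  # model size is an assumption
    return model.transcribe(audio_path)["text"]


def generate_tags_chat(prompt_text, model_name="gemini-1.5-flash"):
    """Ask Gemini for keywords; assumes genai.configure(api_key=...) was called earlier."""
    import google.generativeai as genai
    model = genai.GenerativeModel(model_name)  # model name is a placeholder
    prompt = "Extract the main keywords from this transcript as a comma-separated list:\n\n"
    return model.generate_content(prompt + prompt_text).text
```

Loading the Whisper model on every call is simple but slow. A long-running Flask process would probably load it once at startup and reuse it.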