Flask Video to Text Application

Explanation Video

https://youtu.be/SDG8TWBiLbc?si=E0YTwIlI5t0L-kOw

Flowchart

This Flask application transcribes an uploaded video and generates tags from the transcript. It uses MoviePy to extract the audio track from the video, Whisper for speech-to-text transcription, and the Gemini API to generate the tags.

Overview

MoviePy: A Python library used for video editing, including extracting audio from video files.
Whisper: An automatic speech recognition (ASR) model by OpenAI, used to transcribe audio to text.
Gemini API: Google's generative AI service used for generating text, including extracting keywords from the transcribed text.

Prerequisites

Before running the application, make sure you have the following installed:

Python 3.x
Flask
MoviePy
Whisper
Google Generative AI (Gemini API)
Python dotenv

You can install the required packages using pip:

pip install flask moviepy openai-whisper google-generativeai python-dotenv

Configuration

Create a .env file in the project directory with the following content:
```
GOOGLE_API_KEY=your_google_api_key
```
Replace your_google_api_key with your actual Google API key.
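Set up Whisper: Make sure the Whisper package is installed (it is published on PyPI as openai-whisper) and that the ffmpeg command-line tool is available on your PATH, since Whisper relies on it to decode audio. The model weights are downloaded the first time a model is loaded.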
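As a rough illustration of how the key in .env might reach the Gemini client, here is a minimal sketch. The helper names `get_google_api_key` and `configure_gemini` are hypothetical and not taken from the project. The fallback to plain environment variables when python-dotenv is missing is also an assumption made for this example.

```python
import os


def get_google_api_key(env_path=".env"):
    """Return GOOGLE_API_KEY, reading the .env file first when python-dotenv is available."""
    try:
        from dotenv import load_dotenv  # installed by the python-dotenv package
    except ImportError:
        load_dotenv = None  # assumption: fall back to the process environment
    if load_dotenv is not None:
        # By default, variables already set in the environment are not overridden
        load_dotenv(env_path)
    key = os.environ.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to the .env file")
    return key


def configure_gemini():
    """Hand the key to the google-generativeai client and return the module."""
    import google.generativeai as genai

    genai.configure(api_key=get_google_api_key())
    return genai
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication error raised later by the API call.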

Application Structure

app.py: The main Flask application script.
templates/index.html: The HTML form for uploading video files.

How It Works

Upload a Video: Use the web form to upload a video file.
Video to Audio: The application extracts the audio from the video using MoviePy.
Audio to Text: The audio file is transcribed into text using Whisper.
Generate Tags: The transcribed text is sent to the Gemini API, which extracts keywords from it.
Display Results: The generated tags are displayed on the webpage.
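The steps above can be sketched as a single function that chains the three stages. This is only an illustration: `process_video` is a hypothetical name, and passing the stage functions in as arguments is a choice made here so the flow can be followed (and tested) without MoviePy, Whisper, or Gemini installed. Using a temporary .wav file for the intermediate audio is also an assumption.

```python
import os
import tempfile


def process_video(video_path, video_to_audio, audio_to_text, generate_tags):
    """Run video -> audio -> text -> tags and return (transcript, tags)."""
    # Reserve a temporary path for the extracted audio
    fd, audio_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        video_to_audio(video_path, audio_path)  # step 2: extract audio
        transcript = audio_to_text(audio_path)  # step 3: transcribe
    finally:
        # The audio file is only an intermediate artifact, so clean it up
        if os.path.exists(audio_path):
            os.remove(audio_path)
    tags = generate_tags(transcript)  # step 4: keywords from the transcript
    return transcript, tags
```

In the real application the three callables would be the MoviePy, Whisper, and Gemini helpers; the web route would then pass the returned tags to the template for display.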

Usage

Run the Application:
```
python app.py
```
Access the Web Interface:

Open a web browser and go to http://127.0.0.1:5000/.
Upload a Video File:
- Choose a video file and submit it through the form.
- The application processes the file and displays the generated tags.
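For readers curious how the upload step might look on the server side, the following is a hedged sketch rather than the project's actual route. The form field name `video`, the `uploads` directory, the list of accepted extensions, and the `create_app` factory are all assumptions. Flask is imported inside the factory only so the extension check can be used on its own.

```python
import os

# Assumed set of accepted extensions; adjust to whatever the app should allow
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def is_allowed_video(filename):
    """Return True when the file name ends in one of the accepted video extensions."""
    return os.path.splitext(filename)[1].lower() in ALLOWED_EXTENSIONS


def create_app(process_video, upload_dir="uploads"):
    """Build a Flask app whose index route accepts a video and renders the tags."""
    from flask import Flask, render_template, request
    from werkzeug.utils import secure_filename

    app = Flask(__name__)
    os.makedirs(upload_dir, exist_ok=True)

    @app.route("/", methods=["GET", "POST"])
    def index():
        if request.method == "GET":
            return render_template("index.html")
        file = request.files.get("video")  # assumed form field name
        if file is None or not is_allowed_video(file.filename or ""):
            return render_template("index.html", error="Please choose a video file."), 400
        # secure_filename strips path separators and other unsafe characters
        path = os.path.join(upload_dir, secure_filename(file.filename))
        file.save(path)
        tags = process_video(path)  # hypothetical callable that returns the tags
        return render_template("index.html", tags=tags)

    return app
```

Calling `create_app(my_pipeline).run()` would serve the form on Flask's default address, http://127.0.0.1:5000/, where `my_pipeline` stands for any function that takes a saved video path and returns tags.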

Code Explanation

read_text_from_file(file_path): Reads the content of a text file.
video_to_audio(video_path, audio_path): Converts video to audio using MoviePy.
audio_to_text(audio_path): Transcribes audio to text using Whisper.
generate_tags_chat(prompt_text): Uses the Gemini API to generate keywords from the transcribed text.
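The four helpers could look roughly like the sketch below. The function names and parameters come from the list above, but the bodies are guesses rather than the project's code. The Whisper model size `base`, the Gemini model name `gemini-1.5-flash`, the use of a chat session (suggested only by the function's name), and the optional `model_name` parameters are assumptions.

```python
import os
from pathlib import Path


def read_text_from_file(file_path):
    """Return the full contents of a UTF-8 text file."""
    return Path(file_path).read_text(encoding="utf-8")


def video_to_audio(video_path, audio_path):
    """Write the video's audio track to audio_path using MoviePy."""
    try:
        from moviepy import VideoFileClip  # MoviePy 2.x import path
    except ImportError:
        from moviepy.editor import VideoFileClip  # MoviePy 1.x import path
    clip = VideoFileClip(video_path)
    try:
        if clip.audio is None:
            raise ValueError(f"{video_path} has no audio track")
        clip.audio.write_audiofile(audio_path)
    finally:
        clip.close()  # release the underlying ffmpeg reader


def audio_to_text(audio_path, model_name="base"):
    """Transcribe an audio file with a locally loaded Whisper model."""
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(model_name)  # assumed model size
    result = model.transcribe(audio_path)
    return result["text"].strip()


def generate_tags_chat(prompt_text, model_name="gemini-1.5-flash"):
    """Send the prompt to Gemini in a chat session and return the reply text."""
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)  # assumed model name
    chat = model.start_chat(history=[])
    response = chat.send_message(prompt_text)
    return response.text
```

Loading the Whisper model on every call keeps the sketch short; a long-running server would more likely load it once at startup and reuse it.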
