Voice-Over Vision

Demos | Features | Installation | Contribution | Acknowledgments | Citation

Voice-Over Vision: The future of the internet is accessible

We present Voice-Over Vision, a tool that transforms YouTube watching for the visually impaired, making every video more accessible and enjoyable. Like a friend sitting next to you, this Chrome Extension narrates the unseen parts of a video, filling in the blanks where audio alone falls short. It smartly sifts through videos, picking out details that you might miss otherwise, and uses text-to-speech technology to bring those visuals to life through vivid descriptions. With Voice-Over Vision, every story is fully told, ensuring everyone gets the complete picture, no matter what.

🎬 Demos

Voice-Over Vision Demo

🚀 Features

  • Real-Time Audio Description: Generates audio descriptions for YouTube videos, offering a comprehensive viewing experience for visually impaired users.
  • Ask The Video: Answers questions about the video at any time. With the click of a button (or a keyboard shortcut), the video pauses and a chat opens where you can ask anything about the video!
  • More coming soon!
Work In Progress
  • Customizable Speech Parameters: Adjust voice selection, speech rate, and volume to tailor the audio descriptions to your preferences.
  • Detail Level Settings: Choose the level of detail for descriptions, from basic overviews to in-depth analysis of physical appearances and emotions.
  • Interruption Frequency Control: Select how often you'd like the video's original audio to be interrupted with descriptions, ensuring a balanced experience.

💻 Installation

Instructions for installing and running Voice-Over Vision (soon to be released on the Chrome Web Store).

Prerequisites

  • Google Chrome or any Chromium-based browser (except for Brave, for now).
  • Git installed and configured on your machine
  • Python version: 3.11.8
  • Pip: 24.0
  • Docker installed and running (used to build and run the back-end in steps 4 and 5)
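
Before starting, it can help to confirm that the command-line tools are available. The following is a minimal sketch, not part of the project: the function names `missing_tools` and `python_matches` are hypothetical, and the tool list assumes `git` (for cloning) and `docker` (for the build and run steps further down) are the only external commands needed.

```python
import shutil
import sys

# Assumption: these are the external commands the installation steps invoke.
REQUIRED_TOOLS = ("git", "docker")
# Major and minor version taken from the prerequisites list (3.11.8).
EXPECTED_PYTHON = (3, 11)


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_matches(expected=EXPECTED_PYTHON, actual=None):
    """Return True when the major.minor of `actual` equals `expected`.

    `actual` defaults to the running interpreter's version.
    """
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)[:2]
    return actual == tuple(expected)


if __name__ == "__main__":
    for tool in missing_tools():
        print(f"Missing from PATH: {tool}")
    if not python_matches():
        print(f"Found Python {sys.version.split()[0]}; the prerequisites list 3.11.8")
```

Running the script prints nothing when every tool is found and the interpreter is a 3.11 release.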

Installing the back-end

1. Clone the repository:

git clone https://github.com/voice-over-vision/vov-backend.git
cd vov-backend

2. Install dependencies

# Create a virtual environment and activate it

## Linux
python3 -m venv env
source env/bin/activate

## Windows
python -m venv env
env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

3. Configure the OpenAI key

# Change directories into the vov_backend app
cd vov_backend

# Create an environment file

# Linux
touch .env

# Windows
cd . > .env
  • The .env file should contain:
OPENAI_API_KEY={OPENAI_API_KEY} # OPENAI_API_KEY should be replaced by your API_KEY from OpenAI
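
A common mistake at this step is leaving the `{OPENAI_API_KEY}` placeholder in the file. The sketch below is an optional sanity check under stated assumptions: `read_env` and `key_is_configured` are hypothetical helpers, the parser handles only simple `KEY=VALUE` lines, and it is not necessarily how the back-end itself loads the file.

```python
from pathlib import Path


def read_env(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment, then surrounding whitespace and quotes.
        value = value.split(" #", 1)[0].strip().strip("\"'")
        values[key.strip()] = value
    return values


def key_is_configured(path, name="OPENAI_API_KEY"):
    """Return True when `name` is set, non-empty, and not a {placeholder}."""
    value = read_env(path).get(name, "")
    return bool(value) and not (value.startswith("{") and value.endswith("}"))


if __name__ == "__main__":
    # Location from step 3, relative to the repository root.
    env_file = Path("vov_backend") / ".env"
    if not env_file.exists():
        print(f"{env_file} not found")
    elif key_is_configured(env_file):
        print("OPENAI_API_KEY looks configured")
    else:
        print("OPENAI_API_KEY is missing or still the placeholder")
```

The check only looks at the shape of the value; it does not contact OpenAI to validate the key.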

4. Build the Docker image

cd ../ # return to the project's root directory
docker build -t vov-backend .

5. Run the Docker image

# Run from any directory; the image was already built in step 4
docker run -p 8000:8000 vov-backend

After a few minutes, everything should be ready to use!
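
To find out when the container has started listening, one option is to probe the published port. This is a rough sketch that assumes the back-end accepts TCP connections on `localhost:8000`, as implied by the `-p 8000:8000` mapping; `port_is_open` is a hypothetical helper, and an open port does not guarantee that the application has finished loading.

```python
import socket


def port_is_open(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on port 8000")
    else:
        print("Nothing is listening on port 8000 yet")
```

Re-run the script every so often until it reports that the port is open.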

Installing the Chrome Extension

1. Clone the repository:

git clone https://github.com/voice-over-vision/vov-chrome-extension.git

2. Load the extension in Chrome (detailed information here):

  • Open the Manage Extensions page by navigating to chrome://extensions/ in your Chrome browser.



  • Enable Developer mode by toggling the switch at the top-right corner.



  • Click on Load unpacked and select the directory of your cloned repository.



  • The extension should now be installed and visible in your Extensions list. You can pin it by clicking the Pin icon.



3. Enjoy the magic of Voice-Over Vision!✨

🌟 Contribution

Davi Giordano
Guilherme Mariano
Mariana Serrão
Murillo Teixeira

💎 Acknowledgments

Chroma DB

We extend our heartfelt thanks to the developers and community behind Chroma DB for their exceptional AI-native open-source embedding database, a crucial component in our mission to create an accessibility tool for the visually impaired. Chroma DB's robust and efficient data management capabilities have been pivotal in our efforts to make a positive impact.

GPT-4

Our appreciation goes to the OpenAI team for providing foundational AI technology for our project. The robustness of GPT-4 was instrumental in our project's natural language processing and image processing capabilities.

📄 Citation

@software{voice-over-vision,
  author = {Davi Giordano and Guilherme Mariano and Mariana Serrao and Murillo Teixeira},
  title = {Voice-Over Vision: The future of the internet is accessible},
  month = {March},
  year = {2024},
  url = {https://github.com/voice-over-vision}
}
