SceneSifter is a prototype for interactive video information extraction and retrieval, designed to help content creators organize and search video clips. Leveraging speech-to-text, image-to-text, text-to-vector, and vector similarity search technologies, it processes user-supplied videos and, given a fuzzy search query, returns clickable, relevant timestamps from those videos.
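At its core, the retrieval step ranks stored segment embeddings by similarity to the embedded query. A minimal sketch in plain JavaScript (the three-dimensional embeddings and the `rank` helper are illustrative stand-ins; in the actual project, similarity search is delegated to pgvector):

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: rank candidate segments by similarity to the query.
function rank(queryEmbedding, segments) {
  return segments
    .map((s) => ({ ...s, score: cosineSimilarity(queryEmbedding, s.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy data: each segment carries the timestamp to jump to on click.
const segments = [
  { timestamp: 12, embedding: [0.9, 0.1, 0.0] },
  { timestamp: 47, embedding: [0.1, 0.9, 0.2] },
];
console.log(rank([1, 0, 0], segments)[0].timestamp); // → 12
```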
    ├── backend
    │   ├── model                        # Vosk model
    │   ├── src
    │   │   ├── application.properties.ini  # Configuration properties
    │   │   ├── chatgpt.mjs              # ChatGPT service
    │   │   ├── index.mjs
    │   │   ├── itt.mjs                  # Image-to-text conversion
    │   │   ├── postgres_service.mjs     # Postgres service
    │   │   ├── process_video.mjs        # Audio extraction, frame extraction, etc.
    │   │   ├── stt.mjs                  # Speech-to-text conversion
    │   ├── uploaded_videos              # User-uploaded videos
    │   ├── package.json
    ├── example_videos                   # Videos used for testing
    ├── frontend                         # React app
    ├── images                           # Images for README.md
    ├── sql
    │   ├── video_listing.sql            # Database creation
    │   ├── video_embedding.csv          # Some dummy data
    └── README.md
FFmpeg is required for audio and frame extraction. Download and install it from https://ffmpeg.org/download.html.
We used vosk-model-en-us-0.22-lgraph. To use a different Vosk model, download one from https://alphacephei.com/vosk/models and unpack it into backend/model/ to overwrite the current one.
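Vosk can return word-level timing alongside the transcript (a `result` array of `{word, start, end}` entries). A small sketch of how such output could be mapped to clickable timestamps — the JSON shape below follows Vosk's word-timing format, but the `toClickableTimestamps` helper is hypothetical, not the project's actual code:

```javascript
// Example shape of a Vosk recognizer result with word timings enabled.
const voskResult = {
  text: 'hello my name is matthew',
  result: [
    { word: 'hello', start: 0.3, end: 0.7 },
    { word: 'my', start: 0.8, end: 0.95 },
    { word: 'name', start: 0.95, end: 1.2 },
    { word: 'is', start: 1.2, end: 1.35 },
    { word: 'matthew', start: 1.4, end: 1.9 },
  ],
};

// Hypothetical helper: index each word by its start time, labeled mm:ss.
function toClickableTimestamps(result) {
  return result.result.map(({ word, start }) => ({
    word,
    seconds: start,
    label: `${Math.floor(start / 60)}:${String(Math.floor(start % 60)).padStart(2, '0')}`,
  }));
}

console.log(toClickableTimestamps(voskResult)[0]);
// → { word: 'hello', seconds: 0.3, label: '0:00' }
```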
- Launch a Postgres instance using the Docker image with pgvector:

      mkdir ~/postgres-volume/
      docker run --name postgres \
        -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=password \
        -p 5432:5432 \
        -v ~/postgres-volume/:/var/lib/postgresql/data \
        -d ankane/pgvector:latest
- Copy the video listing schema to the container:

      docker cp sql/video_listing.sql postgres:/home
- Create the video listing table from the schema:

      docker exec -it postgres psql -U postgres -c '\i /home/video_listing.sql'
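Once the pgvector-enabled database is up, embeddings are typically stored and queried as vector literals of the form `[0.1,0.2,...]`. A small sketch of building such a literal in plain JavaScript — the table and column names in the query string are illustrative assumptions, not taken from video_listing.sql:

```javascript
// Format a numeric array as a pgvector literal, e.g. "[0.1,0.2,0.3]".
function toPgvectorLiteral(embedding) {
  return `[${embedding.join(',')}]`;
}

const literal = toPgvectorLiteral([0.1, 0.2, 0.3]);
console.log(literal); // → [0.1,0.2,0.3]

// Illustrative query text: `<->` is pgvector's L2 distance operator.
// With the `pg` client, the literal would be passed as a string
// parameter and cast to `vector` server-side.
const queryText =
  'SELECT video_id, ts FROM video_embedding ORDER BY embedding <-> $1::vector LIMIT 5';
```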
For security reasons, we did not include our OpenAI key in the config file. To enable the ChatGPT service, you need to add your own OpenAI key: in backend/src/application.properties.ini, set

    CHATGPT_API_KEY=<your own key>
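A minimal sketch of how such a KEY=VALUE properties file could be read from Node — the parsing logic here is an assumption for illustration; the real backend may load the file differently:

```javascript
// Parse a simple KEY=VALUE properties file, skipping blanks and comments.
function parseProperties(text) {
  const props = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#') || trimmed.startsWith(';')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    props[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return props;
}

// Example contents (placeholder values, not real keys).
const text = 'CHATGPT_API_KEY=sk-example\nHUGGINGFACE_TOKEN=hf-example\n';
console.log(parseProperties(text).CHATGPT_API_KEY); // → sk-example
```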
The same applies to Hugging Face. To enable image-to-text conversion via the Inference API, in backend/src/application.properties.ini, set

    HUGGINGFACE_TOKEN=<your own key>
To run the backend code:

    cd backend
    npm i
    npm start
To run the frontend code:

    cd frontend
    npm i
    npm start
| File | Length | Description |
| --- | --- | --- |
| 2010_elevator_Pitch_Winner.mp4 | 1m15s | An elevator pitch |
| children_new_billionaires.mp4 | 2m27s | A short documentary featuring some self-introductions |
| Garr_Reynolds_Introduction.mp4 | 1m03s | A self-introduction |
| Matthew_introduction.mp4 | 1m07s | A self-introduction |
| ted_talk.mp4 | 1m20s | A TED Talk clip |
| ted_youth_unemployment.mp4 | 1m12s | A TED Talk clip |
| tokyo1.mp4 | 2m01s | A best-of-Tokyo-2020 compilation |
| tokyo2.mp4 | 2m01s | A best-of-Tokyo-2020 compilation |
| tokyo3.mp4 | 2m01s | A best-of-Tokyo-2020 compilation |
- Header background image: Photo by HD Wallpapers on StockSnap
- TV icon