markgir/whisper-asr-webapp

 
 

whisper-asr-webapp

A web app for automatic speech recognition using OpenAI's Whisper model running locally.

# Quickstart with Docker:
docker run --rm -it -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main

Features

  • Customize the model, language, and initial prompt
  • Enable per-word timestamps (visible in downloaded JSON output)
  • Runs Whisper locally
  • Pre-packaged into a single Docker image
  • View timestamped transcripts in the app
  • Download transcripts in plain text, VTT, SRT, TSV, or JSON formats
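
As an illustration of what the SRT export involves, here is a minimal sketch that turns timestamped segments into SRT text. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field, which is the shape Whisper's own transcription result uses; the app's downloaded JSON may differ in detail. The function names `srt_timestamp` and `segments_to_srt` are hypothetical and not taken from this project.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render a list of {"start", "end", "text"} dicts as SRT cues.

    Assumption: times are in seconds, as in Whisper's result["segments"].
    """
    cues = []
    for index, segment in enumerate(segments, start=1):
        start = srt_timestamp(segment["start"])
        end = srt_timestamp(segment["end"])
        text = segment["text"].strip()
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Calling `segments_to_srt(data["segments"])` on a parsed JSON download should give text that most subtitle players accept, provided the JSON follows the assumed shape.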

Architecture

The frontend is written in Svelte and compiles to static HTML, CSS, and JS.

The backend is built with FastAPI. The main endpoint, /transcribe, pipes an uploaded file into ffmpeg to decode the audio, then passes the result to Whisper. Once transcription is complete, the transcript is returned as a JSON payload.
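
The ffmpeg step can be sketched roughly as follows. This is not the project's actual code. It assumes the upload is available as bytes and that ffmpeg is on the PATH, and it decodes to 16 kHz mono PCM, which is the input format Whisper works with. The helper names (`build_ffmpeg_command`, `pcm16_to_floats`, `decode_upload`) are hypothetical.

```python
import array
import subprocess
import sys

SAMPLE_RATE = 16_000  # Whisper operates on 16 kHz mono audio


def build_ffmpeg_command(sample_rate=SAMPLE_RATE):
    """Build an ffmpeg command that reads any audio from stdin and
    writes raw signed 16-bit little-endian mono PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", "pipe:0",          # read the uploaded file from stdin
        "-f", "s16le",           # raw PCM container
        "-acodec", "pcm_s16le",  # 16-bit little-endian samples
        "-ac", "1",              # downmix to mono
        "-ar", str(sample_rate), # resample
        "pipe:1",                # write to stdout
    ]


def pcm16_to_floats(raw):
    """Convert 16-bit little-endian PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian regardless of host
    return [sample / 32768.0 for sample in samples]


def decode_upload(data):
    """Pipe uploaded bytes through ffmpeg and return normalized samples.

    Assumption: ffmpeg is installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        build_ffmpeg_command(),
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return pcm16_to_floats(result.stdout)
```

A real backend would more likely build a NumPy float32 array from the same bytes, since that is what Whisper's `transcribe` accepts. The pure-Python conversion here only keeps the sketch dependency-free.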

In a containerized environment, the static assets from the frontend build are served by the same FastAPI (Uvicorn) server that handles transcription.

Running

  1. Pull and run the image with Docker.
    • Run in an interactive terminal: docker run --rm -it -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main
    • Run in the background: docker run -d -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main
  2. Visit http://localhost:8000 in a web browser
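
When the container is started in the background, it can help to wait until the server answers before opening the browser. The following is a small sketch using only the Python standard library. The helper name `wait_until_ready` and its default timings are assumptions for illustration and are not part of this project.

```python
import time
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local container.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_until_ready(url, timeout=60.0, interval=1.0):
    """Poll `url` until the server responds or `timeout` seconds elapse.

    Returns True as soon as any HTTP response arrives, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused or timed out: not up yet, retry.
            time.sleep(interval)
    return False
```

For example, `wait_until_ready("http://localhost:8000")` returns `True` once the app is reachable on the port published by the `docker run` commands above.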

Development

The easiest way to get started is by using Docker. You can use the premade run.sh shell script or the following commands in the root of the project:

docker build . -t fluxcapacitor2/whisper-asr-webapp:local-dev
docker run -p 8000:8000 --rm -it fluxcapacitor2/whisper-asr-webapp:local-dev

This will build and run a Docker container that hosts both the frontend and backend on port 8000. Navigate to http://localhost:8000 in a web browser to start using the app.

Note: After any code change, you need to rebuild the image and restart the container. Thanks to Docker's build cache, rebuilds should still be reasonably fast.
