Web application that converts audio and video to text using AI, supporting various formats and self-hosting.
Updated Apr 7, 2025 - Python
A compact (offline) GUI media transcriber that enables you to search for local content based on its spoken words.
Takes an audio file (MP3) and a text string, then force-aligns the text to the audio. Uses stable-ts and whisperx.
🎬 CLI tool to auto-transcribe, romanize & merge subtitles from MKV files, powered by Whisper
Mac-friendly CLI for speech-to-text with stable-ts (stable_whisper): transcription, forced alignment, SRT/VTT/TXT, CJK-aware wrapping, MPS acceleration.
🗣️ Align audio with text seamlessly on macOS, generating accurate timestamps and subtitles in multiple formats for better accessibility.