Web application that converts audio and video to text using AI, supporting various formats and self-hosting.
Updated Apr 7, 2025 - Python
Takes an audio file (MP3) and a text string and force-aligns the text to the audio. Uses stable-ts and WhisperX.
🎬 CLI tool to auto-transcribe, romanize & merge subtitles from MKV files, powered by Whisper
Mac-friendly CLI for speech-to-text with stable-ts (stable_whisper): transcription, forced alignment, SRT/VTT/TXT, CJK-aware wrapping, MPS acceleration.
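Several of the tools listed here export SRT subtitles from timestamped segments. As a rough illustration of that output step only, the sketch below formats `(start, end, text)` tuples as SRT. The helper names `srt_timestamp` and `to_srt` are hypothetical and are not part of stable-ts or of any listed repository, which ship their own exporters.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as SRT: numbered cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical segments, standing in for the output of an aligner.
    print(to_srt([(0.0, 1.25, "Hello"), (1.25, 2.0, "world")]))
```

In practice the segments would come from a transcription or forced-alignment step; the formatting logic stays the same regardless of which model produced the timestamps.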
🗣️ Align audio with text seamlessly on macOS, generating accurate timestamps and subtitles in multiple formats for better accessibility.