subtitle line remains stuck for 30 mins | awk script > 2mins length #975
I'm in the process of correcting the problem ones, and here's an instance:
02:42:11.810 --> 02:42:13.840
02:42:21.600 --> 02:42:23.430
02:42:23.430 --> 02:42:25.820
To fix it, I changed it to:
02:42:21.600 --> 02:42:23.430
OK, I tried many combinations with max length, both with and without split on word. It definitely looks like some calculation bug with max-length. Still, I have to use it, because without it a line sometimes goes way over 100 characters. So I just have to scan for the problem ones and fix them manually. It only occurs in about 2% of the files I've done, so it's hard to say what exactly the culprit is. Once it was music, but many other problematic ones had no music at all. It seems that without max-length it sets the limit to around 100 characters. maxlength80sow-example.vtt I tried looking at a lot of code, but most of it is way over my head.
This part is from openai-whisper and seems to indicate that word timestamps are required when using max length. However, I don't want VTT or SRT subtitles with word timestamps, as they significantly increase file size. They're definitely useful for karaoke or language learning, I suppose.
Here's an example; the corrected timecodes are in ( ):
02:34:48.430 --> 02:34:49.280
02:34:49.280 --> 03:41:15.260 (02:34:54.260)
(02:34:54.260) 03:41:15.260 --> 02:34:55.120
02:34:55.120 --> 02:34:59.840
These are just random notes on code I was looking at, but like I said, it's beyond my ability. I think this one is from openai-whisper, if I remember correctly:
https://github.com/openai/whisper/blob/main/whisper/transcribe.py
https://github.com/openai/whisper/blob/main/whisper/decoding.py
I had this happen yesterday too; I think it was in a 48-hour audiobook I was doing. It just happened again today with a 10-hour audiobook. What happens is that a line such as
this text remains for 32 mins in subs
stays on screen constantly for 32 mins, and as many of the new subtitles as can fit show just above it.
Obviously, to correct this, just change the two affected timecode lines as follows:
05:14:42.820 --> 05:46:50.320
to
05:14:42.820 --> 05:14:50.320
05:46:50.320 --> 05:14:50.420
to
05:14:50.320 --> 05:14:50.420
In both cases, the fix is just changing xx:46:xx.xxx to xx:14:xx.xxx.
My current command, which pipes WAV into whisper.cpp with max length 78 and split on word:
for f in *.opus ; do
  ffmpeg -i "$f" -f wav -ar 16000 -ac 1 - |
    ~/whisper/whisper.cpp/./main -m ~/whisper/whisper.cpp/models/ggml-medium.en.bin - -ovtt -of "$f" -l en -ml 78 -sow -t 8
  for f in *.vtt ; do
    sed -r -i .bak -e 's|Yellow|yellow|g' -e 's|blue|Blue|g' -e 's|Pink|pink|g' "$f"
  done &&
  for i in *opus.vtt ; do
    mv -i -- "$i" "$(printf '%s\n' "$i" | sed '1s/.opus.vtt/.vtt/')"
    mkdir vttsubs/
    mv *.vtt vttsubs/
  done &&
  rm *.bak
done
I'll try to figure out an awk script that automatically checks the duration of each subtitle line and flags any that exceed, say, 2 mins.
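A rough sketch of what such a check could look like, not a tested fix for the underlying bug. It assumes each cue timing line has the form `HH:MM:SS.mmm --> HH:MM:SS.mmm` (as in the examples above). The function name `find_long_cues` and the 120-second default are made up for illustration. It also flags negative durations, since the stuck line is followed by a cue whose end time is earlier than its start.

```shell
# Hypothetical helper: list VTT cues that last longer than a limit (in seconds)
# or that end before they start. Usage: find_long_cues FILE [LIMIT_SECONDS]
find_long_cues() {
  limit="${2:-120}"   # default threshold: 2 minutes
  awk -v limit="$limit" '
    # Convert "HH:MM:SS.mmm" (or "MM:SS.mmm") to seconds.
    function to_seconds(ts,    parts, n) {
      n = split(ts, parts, ":")
      if (n == 3) return parts[1] * 3600 + parts[2] * 60 + parts[3]
      return parts[1] * 60 + parts[2]
    }
    # Timing lines have "-->" as their second field.
    $2 == "-->" {
      duration = to_seconds($3) - to_seconds($1)
      if (duration > limit || duration < 0)
        printf "%s:%d: %s (%.3f s)\n", FILENAME, FNR, $0, duration
    }
  ' "$1"
}
```

Something like `for f in vttsubs/*.vtt ; do find_long_cues "$f" 120 ; done` would then print the file name, line number, and duration of each suspicious cue, so only those need checking by hand.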