Inferencing result different from original whisper with GPU even when using same model #256
Comments
I don't know this in detail, but it's a different implementation; I found this bit from the original announcement:
(That's from October though, so I'm not sure if it still applies... things move fast)
I also found differences in WER between the PyTorch large model and whisper.cpp: whisper.cpp got a worse WER score in my tests on the large model (i.e. 12% vs 18% WER). Is there any way to bring whisper.cpp to the same level of accuracy through settings? Naive question, but I am only learning this recently.
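For context, a WER comparison like the one above can be scripted in a few lines of Python. This is only a sketch under assumptions not stated in the thread: it uses the jiwer package and illustrative file names for the reference transcript and the two hypotheses.

```python
import jiwer  # pip install jiwer

# Illustrative file names (assumptions): a ground-truth transcript plus one
# transcript produced by PyTorch whisper and one produced by whisper.cpp.
reference = open("reference.txt").read().lower()
pytorch_hyp = open("whisper_pytorch.txt").read().lower()
cpp_hyp = open("whisper_cpp.txt").read().lower()

# jiwer.wer returns the word error rate as a fraction (0.12 == 12%).
print("PyTorch WER:    ", jiwer.wer(reference, pytorch_hyp))
print("whisper.cpp WER:", jiwer.wer(reference, cpp_hyp))
```

Note that text normalization (casing, punctuation) strongly affects WER, so both outputs should be normalized the same way before comparing.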
The decoding strategy in ... @RYucel
I've encountered this as well with the whisper command line vs. using whisper from a Python script (both have different defaults), see here: The default parameters that the Python whisper command-line tool uses are:
The biggest difference is that the Python whisper decoder does beam search, conditions each segment on the preceding ones, and backs off to a higher temperature when the compression ratio signals likely faulty output (see the example in the whisper discussion link). whisper.cpp already mentions it doesn't do beam search; my guess is it doesn't do any of the other stuff either. You can also check whether the outputs become more similar if you set best_of=1, beam_size=1 or best_of=None, beam_size=None, basically making Python whisper do greedy decoding too; a minimal sketch of this follows below.
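As a concrete version of that suggestion, here is a minimal sketch of forcing Python whisper into greedy decoding. The model name and audio path are placeholders, and the exact set of options may differ between openai-whisper versions.

```python
import whisper

# Placeholder model size and audio path -- use the same model whisper.cpp was run with.
model = whisper.load_model("large")

# Disable the extra decoding machinery so Python whisper behaves more like a
# plain greedy decoder (closer to what whisper.cpp is described as doing here).
result = model.transcribe(
    "audio.wav",
    temperature=0.0,                   # single temperature, no back-off schedule
    best_of=None,                      # no best-of-N sampling
    beam_size=None,                    # no beam search -> greedy decoding
    condition_on_previous_text=False,  # don't condition segments on preceding ones
)
print(result["text"])
```

If the outputs then line up much more closely, the remaining gap is likely down to the decoding strategy rather than the model conversion.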
With the latest version the ... By default, the ...
You can enable beam search via ...
Hi @ggerganov, you have done something phenomenal with this work! Sorry to comment on a closed issue, but I was wondering if there is any switch to set ...
@crisdosyago ...
Thank you, sir! |
Is there any parameter that needs to be added to the implementation, like in https://github.com/openai/whisper/tree/main/whisper/assets/multilingual ?
I've tested all the models and found that the inference results differ from those of the original whisper running on GPU, even when using the same model.
I'm wondering what is missing in my setup, or whether there is some difference in the implementation of this project.