Describe the bug

I installed the latest version via:
pip install git+https://github.com/aarnphm/whispercpp.git -vv
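As a sanity check that the interpreter actually picks up this freshly installed build, plain Python introspection works (nothing library-specific, just the standard module attribute):

import whispercpp

# Path of the imported package; confirms the pip-installed build above is the one in use.
print(whispercpp.__file__)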
Then I have this 1-minute-long wav file. Here's the output of the original whisper.cpp command:
./main -m models/ggml-small.bin -f out.wav --language auto --max-len 1

whisper_init_from_file_no_state: loading model from 'models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3
whisper_model_load: mem required = 743.00 MB (+ 16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 464.68 MB
whisper_model_load: model size = 464.44 MB
whisper_init_state: kv self size = 15.75 MB
whisper_init_state: kv cross size = 52.73 MB

system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | COREML = 0 |

main: processing 'out.wav' (958952 samples, 59.9 sec), 4 threads, 1 processors, lang = auto, task = transcribe, timestamps = 1 ...

whisper_full_with_state: auto-detected language: ru (p = 0.993210)

# recognition results go here ...

whisper_print_timings: load time = 571.00 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 310.98 ms
whisper_print_timings: sample time = 378.25 ms / 426 runs ( 0.89 ms per run)
whisper_print_timings: encode time = 11926.32 ms / 4 runs ( 2981.58 ms per run)
whisper_print_timings: decode time = 9821.29 ms / 425 runs ( 23.11 ms per run)
whisper_print_timings: total time = 23272.83 ms
Total execution time is 23 seconds.
Here's my Python code, which uses this library:
import time
from whispercpp import Whisper

start = time.time()
w = Whisper.from_pretrained(model_name="/whispercpp/models/ggml-small.bin")
w.params.with_language("auto")
print(w.transcribe_from_file("out.wav"))
end = time.time()
print(end - start)
And here's the output on the same file:

whisper_init_from_file_no_state: loading model from '/whispercpp/models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 3
whisper_model_load: mem required = 608.00 MB (+ 16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 464.56 MB
whisper_model_load: model size = 464.44 MB
whisper_init_state: kv self size = 15.75 MB
whisper_init_state: kv cross size = 52.73 MB

whisper_full_with_state: auto-detected language: ru (p = 0.993206)

# recognition results go here...

183.6768798828125
Total execution time is 183 seconds.

That's roughly an 8x difference (23 s vs. 183 s) for the same model and file.
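One possibly relevant observation: the CLI run prints a system_info line (n_threads = 4 / 12, AVX2 = 1, FMA = 1, BLAS = 1, ...), while the binding prints nothing equivalent, so I can't tell whether the extension was built with the same SIMD/BLAS flags or how many threads it uses. For reference, here's a variant of the script above that at least times model loading and transcription separately; the thread-count setter is only a guess on my part and I haven't checked whether the params builder actually exposes one, so it's left commented out.

import time
from whispercpp import Whisper

# Time model loading separately from transcription, so it's clear where the extra ~160 s goes.
t0 = time.time()
w = Whisper.from_pretrained(model_name="/whispercpp/models/ggml-small.bin")
print(f"model load: {time.time() - t0:.1f} s")

w.params.with_language("auto")
# Hypothetical: pin the thread count to match the CLI run (4 threads).
# Unverified -- I don't know whether the params builder has such a setter.
# w.params.with_num_threads(4)

t1 = time.time()
print(w.transcribe_from_file("out.wav"))
print(f"transcription: {time.time() - t1:.1f} s")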
To reproduce

No response

Expected behavior

I'd expect performance on par with the original whisper.cpp.

Environment

MacBook Pro 16, 2.6 GHz 6-core Intel Core i7; 32 GB RAM
Python 3.10
Latest versions of whisper.cpp and this library as of 5 June 2023.
I have also experienced very slow transcription. Did you come up with a solution to this problem? Thanks.