
Unable to generate large-v3 quantized coreml model #2042

Open
dhruv-anand-aintech opened this issue Apr 11, 2024 · 2 comments


@dhruv-anand-aintech

I get the following error when trying to generate the large-v3 quantized coreml model:

$ ./models/generate-coreml-model.sh large-v3-q5_0
scikit-learn version 1.3.0 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
XGBoost version 1.7.6 has not been tested with coremltools. You may run into unexpected errors. XGBoost 1.4.2 is the most recent version that has been tested.
Traceback (most recent call last):
  File "/Users/dhruvanand/Code/whisper/whisper.cpp/models/convert-whisper-to-coreml.py", line 293, in <module>
    raise ValueError("Invalid model name")
ValueError: Invalid model name
coremlc: error: Model does not exist at models/coreml-encoder-large-v3-q5_0.mlpackage -- file:///Users/dhruvanand/Code/whisper/whisper.cpp/
mv: rename models/coreml-encoder-large-v3-q5_0.mlmodelc to models/ggml-large-v3-q5_0-encoder.mlmodelc: No such file or directory

I have ggml-large-v3-q5_0.bin present in my ./models.

Can someone help figure this out?
I looked at a related issue (#1437) for the main large-v3 model, but the error in that is different from mine.

@ialshjl commented Apr 28, 2024

I'm encountering the same issue!

@DainisGorbunovs

Whisper's CoreML model does not support quantization. See the related question in the #1829 discussion. Additionally, #548 (comment) and #548 (comment) indicate that quantization would not yield a performance boost.

For now, the generate-coreml-model script only generates a CoreML encoder for the regular (non-quantized) models. The decoder is not generated because running it on the CPU is faster than on the ANE, according to pull request #566 and the #548 (comment) discussion.

You can use the quantized decoder (large-v3-q5_0) with the regular CoreML encoder (large-v3). In fact, whisper.cpp is already looking for the regular CoreML encoder rather than the quantized one, as seen in whisper_get_coreml_path_encoder. A generated regular CoreML encoder is available in ggerganov's Hugging Face repository.
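To illustrate the lookup, here is a minimal Python sketch of how whisper.cpp derives the CoreML encoder path from the ggml model path, assuming (as the whisper_get_coreml_path_encoder behavior described above suggests) that it drops the quantization suffix and swaps the .bin extension for -encoder.mlmodelc. This is an illustration, not the actual C implementation:

```python
import re

def coreml_encoder_path(ggml_model_path: str) -> str:
    """Sketch of whisper_get_coreml_path_encoder: strip the .bin extension
    and any trailing quantization tag, then append -encoder.mlmodelc."""
    path = ggml_model_path
    if path.endswith(".bin"):
        path = path[: -len(".bin")]
    # drop a trailing quantization tag such as -q5_0 or -q8_0, if present
    path = re.sub(r"-q[0-9]_[0-9]$", "", path)
    return path + "-encoder.mlmodelc"

print(coreml_encoder_path("models/ggml-large-v3-q5_0.bin"))
# models/ggml-large-v3-encoder.mlmodelc
```

So both the quantized and the regular large-v3 ggml model resolve to the same regular CoreML encoder path, which is why pairing them works.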

Example command for using the quantized decoder with the regular CoreML encoder:

./main -m models/ggml-large-v3-q5_0.bin -f samples/jfk.wav

Regarding the error message: the generate-coreml-model script is failing because convert-whisper-to-coreml tries to load a PyTorch (PT) model, and none exists for large-v3-q5_0. Although there is a ggml_to_pt.py script, it does not work for quantized models.
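The "Invalid model name" traceback is consistent with a name check that only accepts the official Whisper checkpoint names, so any quantized variant is rejected before conversion starts. A hypothetical sketch of such a check (the set of names and the helper are assumptions for illustration, not the script's actual code):

```python
# Hypothetical whitelist of official OpenAI Whisper checkpoint names;
# quantized variants like "large-v3-q5_0" are not on it.
VALID_MODELS = {
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large-v1", "large-v2", "large-v3",
}

def check_model_name(name: str) -> None:
    # Mirrors the failure seen in the traceback above.
    if name not in VALID_MODELS:
        raise ValueError("Invalid model name")

check_model_name("large-v3")  # accepted
try:
    check_model_name("large-v3-q5_0")
except ValueError as e:
    print(e)  # Invalid model name
```

This matches the observed behavior: the script never reaches the conversion step for large-v3-q5_0.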
