
Unable to generate large-v3 quantized coreml model #2042

Open
dhruv-anand-aintech opened this issue Apr 11, 2024 · 2 comments


@dhruv-anand-aintech

I get the following error when trying to generate the large-v3 quantized coreml model:

$ ./models/generate-coreml-model.sh large-v3-q5_0
scikit-learn version 1.3.0 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
XGBoost version 1.7.6 has not been tested with coremltools. You may run into unexpected errors. XGBoost 1.4.2 is the most recent version that has been tested.
Traceback (most recent call last):
  File "/Users/dhruvanand/Code/whisper/whisper.cpp/models/convert-whisper-to-coreml.py", line 293, in <module>
    raise ValueError("Invalid model name")
ValueError: Invalid model name
coremlc: error: Model does not exist at models/coreml-encoder-large-v3-q5_0.mlpackage -- file:///Users/dhruvanand/Code/whisper/whisper.cpp/
mv: rename models/coreml-encoder-large-v3-q5_0.mlmodelc to models/ggml-large-v3-q5_0-encoder.mlmodelc: No such file or directory

I have ggml-large-v3-q5_0.bin present in my ./models.

Can someone help figure this out?
I looked at a related issue (#1437) for the main large-v3 model, but the error in that is different from mine.

@ialshjl commented Apr 28, 2024

I'm encountering the same issue!

@DainisGorbunovs

Whisper's CoreML model does not support quantization. See the related question in the #1829 discussion. Additionally, #548 (comment) and #548 (comment) indicate that quantization would not yield a performance boost.

For now, the generate-coreml-model script only generates a CoreML encoder for the regular (non-quantized) models. The decoder is not generated because running it on the CPU is faster than on the ANE, according to pull request #566 and the #548 (comment) discussion.

You can use the quantized decoder (large-v3-q5_0) with the regular CoreML encoder (large-v3). In fact, whisper.cpp is already looking for the regular CoreML encoder rather than the quantized one, as seen in whisper_get_coreml_path_encoder. A generated regular CoreML encoder is available in ggerganov's Hugging Face repository.
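To illustrate the lookup, here is a minimal Python sketch of how whisper.cpp derives the CoreML encoder path from the ggml model path, assuming (as the whisper_get_coreml_path_encoder behavior described above suggests) that it drops the quantization suffix and swaps the .bin extension for -encoder.mlmodelc. This is an illustration, not the actual C implementation:

```python
import re

def coreml_encoder_path(ggml_model_path: str) -> str:
    """Sketch of whisper_get_coreml_path_encoder: strip the .bin extension
    and any trailing quantization tag, then append -encoder.mlmodelc."""
    path = ggml_model_path
    if path.endswith(".bin"):
        path = path[: -len(".bin")]
    # drop a trailing quantization tag such as -q5_0 or -q8_0, if present
    path = re.sub(r"-q[0-9]_[0-9]$", "", path)
    return path + "-encoder.mlmodelc"

print(coreml_encoder_path("models/ggml-large-v3-q5_0.bin"))
# models/ggml-large-v3-encoder.mlmodelc
```

So both the quantized and the regular large-v3 ggml model resolve to the same regular CoreML encoder path, which is why pairing them works.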

Example command for using the quantized decoder with the regular CoreML encoder:

./main -m models/ggml-large-v3-q5_0.bin -f samples/jfk.wav

Regarding the error message: the generate-coreml-model script is failing because convert-whisper-to-coreml tries to load a PyTorch (PT) model, and none exists for large-v3-q5_0. Although there is a ggml_to_pt.py script, it does not work for quantized models.
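The "Invalid model name" traceback is consistent with a name check that only accepts the official Whisper checkpoint names, so any quantized variant is rejected before conversion starts. A hypothetical sketch of such a check (the set of names and the helper are assumptions for illustration, not the script's actual code):

```python
# Hypothetical whitelist of official OpenAI Whisper checkpoint names;
# quantized variants like "large-v3-q5_0" are not on it.
VALID_MODELS = {
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large-v1", "large-v2", "large-v3",
}

def check_model_name(name: str) -> None:
    # Mirrors the failure seen in the traceback above.
    if name not in VALID_MODELS:
        raise ValueError("Invalid model name")

check_model_name("large-v3")  # accepted
try:
    check_model_name("large-v3-q5_0")
except ValueError as e:
    print(e)  # Invalid model name
```

This matches the observed behavior: the script never reaches the conversion step for large-v3-q5_0.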
