Distil-Whisper Models: 6 times speed up and 49% smaller size? #1414
-
Exciting! Edit: initial support has been added in this PR: #1424. Please note that the current implementation is not optimal, since it does not support the proposed chunking strategy (see #1414 (reply in thread) for more info). Therefore, the transcription quality using the current implementation may be degraded.
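For context, the chunking strategy being referred to is the chunked long-form transcription approach on the Transformers side, roughly along these lines (just a sketch; the 15 s chunk length, batch size, and audio path are illustrative placeholders):

```python
# Rough sketch of chunked long-form transcription with the Transformers pipeline.
# The model id is the public distil-whisper repo; chunk length, batch size, and
# the audio path are illustrative values, not a recommended configuration.
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-medium.en",
    chunk_length_s=15,  # split long audio into short chunks
    batch_size=4,       # transcribe several chunks in parallel
)

print(pipe("audio.wav")["text"])
```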
-
They should be released now. Curious to see performance metrics 🙂
-
They are released: https://www.linkedin.com/posts/sanchit-gandhi_distil-whisper-is-now-available-in-transformers-activity-7125902709794189313-aumY?utm_source=share&utm_medium=member_desktop Has anyone already run the model? Would love to see some results :)
-
How do I convert this to something whisper.cpp can use? I cannot find any .pt files here: https://huggingface.co/distil-whisper/distil-medium.en/tree/main
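In case it helps, here is a rough sketch of what a conversion might look like. The repo ships Hugging Face-format weights (pytorch_model.bin / safetensors) rather than OpenAI-style .pt checkpoints, so I'd guess whisper.cpp's models/convert-h5-to-ggml.py (not convert-pt-to-ggml.py) is the relevant script; the exact arguments and the need for a local openai/whisper checkout are assumptions, not a confirmed recipe:

```python
# Unverified sketch: download the HF-format distil weights and feed them to
# whisper.cpp's convert-h5-to-ggml.py. Paths and arguments are assumptions.
import subprocess
from huggingface_hub import snapshot_download

# Fetch the Hugging Face repo locally (returns the local snapshot directory).
model_dir = snapshot_download("distil-whisper/distil-medium.en")

# The converter is assumed to take: <hf-model-dir> <openai/whisper checkout> <output-dir>.
subprocess.run(
    [
        "python", "models/convert-h5-to-ggml.py",
        model_dir,      # downloaded HF model directory
        "../whisper",   # local clone of https://github.com/openai/whisper
        ".",            # where the ggml model file would be written
    ],
    check=True,
)
```

If that produces a ggml model file, it should in principle load like any other whisper.cpp model, but I haven't verified it.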
-
I ran a few initial tests using …
-
Amazing that you're already on it! Linking the other issue here: #1423 |
-
Tomorrow, HuggingFace's team is set to release its distilled Whisper models, which are claimed to be "6 times faster, 49% smaller, and perform within 1% WER on out-of-distribution evaluation sets." You should be able to find them here tomorrow: HuggingFace Collection and GitHub.
If they're in PT format, I assume this script should do the trick; perhaps better minds can confirm tomorrow: Convert PT to GGML Script. This should bring even further speed improvements. For those of us (like me) with limited hardware, we might get high-quality ASR on our own devices that's faster than real-time and doesn't rely on network connectivity. So, tomorrow should be a good day!
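Once a converted model exists, a quick way to sanity-check the "faster than real-time" claim is to compare transcription time against audio duration (a real-time factor below 1.0 means faster than real time). A minimal sketch, assuming whisper.cpp's main binary is already built, a hypothetical ggml-distil-medium.en.bin conversion, and a 16 kHz WAV sample:

```python
# Rough real-time-factor check. The model and audio paths are hypothetical
# placeholders; `./main -m <model> -f <audio>` is the usual whisper.cpp CLI.
import subprocess
import time
import wave

AUDIO = "samples/jfk.wav"
MODEL = "models/ggml-distil-medium.en.bin"

# Audio duration straight from the WAV header (standard library only).
with wave.open(AUDIO, "rb") as w:
    audio_seconds = w.getnframes() / w.getframerate()

# Time a single whisper.cpp transcription run.
start = time.time()
subprocess.run(["./main", "-m", MODEL, "-f", AUDIO], check=True)
elapsed = time.time() - start

# RTF < 1.0 means the transcription finished faster than real time.
print(f"audio: {audio_seconds:.1f}s, transcription: {elapsed:.1f}s, RTF: {elapsed / audio_seconds:.2f}")
```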