MLX for significantly faster inference on Apple Silicon (M1 - M4) #124

menelic · 2025-02-03T00:02:28Z

Please consider implementing MLX, which promises significantly faster inference on Apple Silicon Macs.

Find the version prepared for Whisper here https://github.com/ml-explore/mlx-examples/tree/main/whisper

Implementation info for MLX: https://github.com/ml-explore/mlx

kaixxx · 2025-02-03T12:16:00Z

It would be interesting to use this. But noScribe is deeply integrated with faster-whisper, so we have to take a closer look at how much the two interfaces deviate from each other. Until then, I have some other things to finish. But I will keep this on my list, thank you.
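As a rough illustration of how far the interfaces deviate, here is a minimal sketch, not noScribe's actual code. It assumes faster-whisper's `WhisperModel.transcribe` returns a generator of segment objects with `.start`, `.end` and `.text`, and that `mlx_whisper.transcribe` returns a dict with a `"segments"` list of dicts. The `Segment` type, the helper names, and the model identifiers are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping


@dataclass
class Segment:
    """Backend-neutral segment (hypothetical type, not part of either library)."""
    start: float
    end: float
    text: str


def from_faster_whisper(segments: Iterable[Any]) -> List[Segment]:
    # faster-whisper yields objects exposing .start, .end and .text attributes.
    return [Segment(s.start, s.end, s.text) for s in segments]


def from_mlx_whisper(result: Mapping[str, Any]) -> List[Segment]:
    # mlx_whisper.transcribe returns a dict; its "segments" entry is a list of dicts.
    return [Segment(s["start"], s["end"], s["text"]) for s in result["segments"]]


def transcribe(path: str, backend: str = "faster-whisper") -> List[Segment]:
    # Imports are deferred so that only the selected backend has to be installed.
    if backend == "mlx":
        import mlx_whisper
        # Assumption: this Hugging Face repo name is only an example.
        result = mlx_whisper.transcribe(
            path, path_or_hf_repo="mlx-community/whisper-large-v3-mlx"
        )
        return from_mlx_whisper(result)
    from faster_whisper import WhisperModel
    model = WhisperModel("large-v3")  # assumption: example model size
    segments, _info = model.transcribe(path)
    return from_faster_whisper(segments)
```

If both backends are normalized to one segment type like this, the rest of the application would only depend on `Segment`, and the backend choice stays in a single function. Options such as VAD filtering or word timestamps may not map one-to-one and would need a separate look.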

menelic · 2025-02-03T22:31:08Z

The reason I am suggesting this is that noScribe could apparently be optimized further for Apple silicon. On a MacBook M4 with 24 GB, noScribe shows nothing close to the inference speedup I observed in LM Studio after they implemented MLX, which leads me to assume that many more optimizations are possible for noScribe on Apple hardware. This seems to be about unlocking the parts of the M4 architecture that are designed for inference, so faster-whisper could benefit from it as well.

kaixxx added the enhancement New feature or request label Feb 3, 2025
