Faiss runs very slowly on M1 Mac #2386
Comments
@SupreethRao99 It seems that you built faiss in Debug mode.
@wx257osn2 Thank you. I tried the approach that you suggested by adding
@SupreethRao99 Thanks for trying. Hmm... which BLAS library did you install? It seems that
Indeed, the speed of flat search is dominated by the BLAS sgemm. Maybe the CMake logs indicate which BLAS version is used.
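For context, searching an IndexFlat is essentially one large matrix product between the query batch and the stored vectors, which is exactly what sgemm computes. A minimal sketch (illustrative sizes, not from this thread) comparing the two:

```python
# Flat search vs. the raw BLAS sgemm it reduces to (illustrative sizes).
import time

import faiss
import numpy as np

d, nb, nq = 128, 100_000, 1_000
rng = np.random.default_rng(42)
xb = rng.random((nb, d), dtype=np.float32)  # database vectors
xq = rng.random((nq, d), dtype=np.float32)  # query vectors

index = faiss.IndexFlatIP(d)  # inner-product flat index
index.add(xb)

t0 = time.perf_counter()
D, I = index.search(xq, 10)
t1 = time.perf_counter()

t2 = time.perf_counter()
scores = xq @ xb.T  # the sgemm that dominates the search above
t3 = time.perf_counter()

print(f"flat search: {t1 - t0:.3f}s  sgemm: {t3 - t2:.3f}s")
```

If faiss and NumPy link the same BLAS, the two timings degrade together, which is why the linked BLAS matters so much here.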
Thank you @mdouze @wx257osn2, the CMake logs are as follows:
Also, is there any way in which I can help build and upload FAISS to conda, so that people will not have to build from source? Furthermore, are there plans to support GPU acceleration on M1 processors?
Ah, that explains it. According to this and this, Apple's Accelerate framework for M1 runs on the AMX coprocessor. It seems that the coprocessor is good at power efficiency, but not at runtime speed, especially in multi-threaded execution.
I'm not a Meta employee, so as for implementing
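One way to test the multi-threading hypothesis from Python is to time the same search at different OpenMP thread counts, using the omp_set_num_threads / omp_get_max_threads helpers exposed by the faiss Python bindings. A small sketch (illustrative sizes):

```python
# Time the same flat search at several OpenMP thread counts; if the BLAS
# degrades under multi-threading, the timings will not scale with threads.
import time

import faiss
import numpy as np

d, nb, nq = 128, 100_000, 1_000
xb = np.random.rand(nb, d).astype(np.float32)
xq = np.random.rand(nq, d).astype(np.float32)
index = faiss.IndexFlatL2(d)
index.add(xb)

for nt in (1, 2, 4, faiss.omp_get_max_threads()):
    faiss.omp_set_num_threads(nt)
    t0 = time.perf_counter()
    index.search(xq, 10)
    print(f"{nt:2d} thread(s): {time.perf_counter() - t0:.3f}s")
```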
Thank you @wx257osn2, I will rebuild FAISS with OpenBLAS and give it a try.
Right, as @wx257osn2 says, it is a significant effort to support the GPU version of FAISS in CUDA, so adding support for other types of GPUs, like the M1's, Intel's, or AMD's, is not planned.
Let me add a brief note about porting GPGPU code from CUDA to other environments. The cost of porting to AMD GPUs on Linux is actually relatively low compared with M1, Intel, or AMD GPUs on Windows. AMD is developing a GPGPU environment called ROCm, along with HIP, a wrapper over both CUDA and ROCm. The HIP API largely mirrors CUDA's, so the porting cost to HIP is not high, and HIP code can run on both CUDA and ROCm. There are also high-performance libraries and wrappers such as rocBLAS and hipBLAS. CuPy, a GPU-accelerated NumPy/SciPy-compatible Python module, is a well-known product written in HIP; it was originally written in CUDA and was ported to HIP a few years ago. However, the porting cost is only relatively low, not zero; CuPy's experience shows that even porting to HIP is not easy.
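At the user level, that portability looks like this: the same CuPy code runs unchanged whether CuPy was installed with a CUDA backend or a ROCm/HIP backend, since the backend is chosen at install time. A tiny illustrative sketch (not tied to faiss):

```python
# Identical source on CUDA and ROCm builds of CuPy; the matmul dispatches
# to cuBLAS or rocBLAS/hipBLAS depending on which backend was installed.
import cupy as cp

x = cp.random.random((4096, 4096), dtype=cp.float32)
y = x @ x.T
cp.cuda.Stream.null.synchronize()  # wait for the GPU before reading results
print(float(y.sum()))
```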
Thanks @wx257osn2 for the overview. To summarize, the issues for us to support alternative hardware are:
So if anyone is willing to take ownership of other hardware accelerators, we'd be very happy to collaborate ;-)
On my Mac M1, when I search in a Jupyter notebook it kills my session. I've tried it with Python 3.7 and 3.9; both kill the session. However, it does work when I start a Python app from PyCharm or the terminal. Using faiss-cpu.
@kaanbursa This issue is about performance, not about whether it works or not. You should create a new issue about your problem, with more information about the installation method, error messages, and so on.
@SupreethRao99 Do you have any update? Has OpenBLAS helped you?
@wx257osn2 Yes, OpenBLAS does give a good speedup. Thank you!
Potentially this should be run on the GPU on M1, but that requires porting the CUDA kernels over to Metal.
@SupreethRao99 How did you rebuild FAISS with OpenBLAS? I'm a newbie to GenAI stuff.
@SupreethRao99 Can you share how to rebuild FAISS using OpenBLAS to make it faster on M1?
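For the two questions above, here is a sketch of a from-source build pointed at OpenBLAS, following the general steps in INSTALL.md; the Homebrew path and exact flags are assumptions to verify against your faiss version:

```sh
# Sketch: build faiss against OpenBLAS instead of Accelerate on macOS.
# -DBLA_VENDOR=OpenBLAS asks CMake's FindBLAS to pick OpenBLAS, and a
# Release build avoids the Debug-mode slowdown noted earlier in the thread.
brew install openblas
cmake -B build . \
  -DCMAKE_BUILD_TYPE=Release \
  -DFAISS_ENABLE_GPU=OFF \
  -DFAISS_ENABLE_PYTHON=ON \
  -DBLA_VENDOR=OpenBLAS \
  -DCMAKE_PREFIX_PATH="$(brew --prefix openblas)"
make -C build -j faiss
make -C build -j swigfaiss
(cd build/faiss/python && python setup.py install)
```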
Summary
Running inference on a saved index is painfully slow on an M1 Pro (10-core CPU, 16-core GPU). The index is about 3.4 GB in size, and search takes 1.5 seconds on the CPU backend on Colab but >20 minutes on the M1 CPU. What would be the reason for such slow performance?
Platform
OS: macOS 12.4
Faiss version: 1.7.2
Installed from: compiled from source following INSTALL.md, and this issue
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
The code that I'm running is as follows:
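(The original snippet did not survive; below is a hypothetical reconstruction of the kind of code the summary describes, with a made-up index path.)

```python
# Hypothetical repro: load a saved ~3.4 GB index and time a single search.
import time

import faiss
import numpy as np

index = faiss.read_index("index.faiss")  # made-up path
xq = np.random.rand(1, index.d).astype(np.float32)

t0 = time.perf_counter()
D, I = index.search(xq, 10)
print(f"search took {time.perf_counter() - t0:.2f}s")
```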