Newest version not working; OpenBLAS Warining : The number of CPU/Cores(144) is beyond the limit(64). Terminated. #356
Comments
I've been able to run it using the git checkout (I've been using it for a while too, so I have most of the tools installed manually!), so it's not urgent over here, but it would be good to document somewhere how to solve it.
Hi, I think this isn't an OrthoFinder issue as such. As I understand it, your OpenBLAS library is parallelising and this is interacting with a job manager you have running? If this combines with OrthoFinder's parallelisation, then you exceed a limit on the number of cores and your job manager throws an error? There was some discussion of it here: #323 (which I think you may have seen already). Setting this to 1 appears to be the solution to this issue. I don't have a system configured in this way to test on myself, but I will put this info in a FAQ, and any extra information you have would also be useful. Do you have a sysadmin for this computer? They would be a good person to approach about this and I'd be interested to know what they have to say. And could you also confirm if All the best
Ah yes, setting OPENBLAS_NUM_THREADS=1 worked. But so did running the GitHub tagged release version directly via Python. So I'm wondering if it's something to do with how you are compiling the Python version to a standalone executable? No job manager, just running from the command line. It's definitely a weird/unique situation.
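For anyone wanting to try the same workaround without exporting the variable in their shell, here is a minimal sketch that launches a command with OPENBLAS_NUM_THREADS=1 in a copy of the environment. The helper name `run_single_threaded_blas` is hypothetical, and the child process below is only a stand-in that echoes the variable; the real command would be whatever you normally run (for example the `./orthofinder -h` call mentioned in this thread).

```python
import os
import subprocess
import sys


def run_single_threaded_blas(cmd):
    """Run cmd with OPENBLAS_NUM_THREADS=1 added to a copy of the environment."""
    env = dict(os.environ)             # copy, so the parent process is unchanged
    env["OPENBLAS_NUM_THREADS"] = "1"  # the setting reported to work in this thread
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in child process that just prints the variable it sees.
# Replace with the real command, e.g. ["./orthofinder", "-h"].
result = run_single_threaded_blas(
    [sys.executable, "-c", "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"]
)
print(result.stdout.strip())
```

Copying the environment rather than modifying `os.environ` keeps the change limited to the launched process.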
That's interesting about the compiled version. A question for you, but only if you have time: could you confirm that the difference in behaviour between the compiled and Python versions is observed consistently over a number of repetitions? My first guess is that it would actually be a race condition from the parallelisation (two parallel tasks happen to multiply matrices at the same point in time). I see now what you mean about no job manager; it looks like it's actually OpenBLAS that is terminating it. I will try to have a look at their issues page/documentation to see if anything needs to be reported to them or anything changed in how OrthoFinder behaves. Currently it only uses numpy operations and I assume that under the hood numpy is using OpenBLAS. In fact, it could be that the compiled version is using an old version of numpy that is missing a more recent fix. An old version is currently used to provide support for users on old machines, but that might need to be changed in some way. Thanks for your help
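One way to compare the numpy side of the two setups is to print the numpy version and its build configuration from the Python interpreter in question. This is only a sketch: `describe_numpy` is a hypothetical helper, and it reports on whichever numpy the current interpreter imports, which is not necessarily the one bundled into the standalone executable.

```python
import importlib


def describe_numpy():
    """Return the numpy version string, or None if numpy is not importable."""
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return None
    np.show_config()  # prints build information, including BLAS/LAPACK details
    return np.__version__


version = describe_numpy()
print("numpy version:", version)
```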
I don't think it's a race condition, since I can get the error with the help command (./orthofinder -h). And I do seem to get the error every time I try to run it (10 times so far). How are you compiling the Python and the lib? Could be worth checking out. Edit: Never mind on the JAX front, it'd be too much effort and would probably have the same issues.
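The repetition check described above can be scripted. The sketch below runs a command a fixed number of times and counts the failures, on the assumption that the error shows up as a non-zero exit code. `count_failures` is a hypothetical helper, and the two stand-in commands only demonstrate the counting; the real command would be something like `["./orthofinder", "-h"]`.

```python
import subprocess
import sys


def count_failures(cmd, repeats=10):
    """Run cmd `repeats` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(repeats):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if proc.returncode != 0:
            failures += 1
    return failures


# Stand-in commands: one that always succeeds and one that always fails.
always_ok = [sys.executable, "-c", "raise SystemExit(0)"]
always_bad = [sys.executable, "-c", "raise SystemExit(1)"]
print(count_failures(always_ok, repeats=3), count_failures(always_bad, repeats=3))
```

A failure count equal to the number of repeats, as reported here, points away from an intermittent race condition.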
Hi, that's really useful info, thanks! That'll save me from a long excursion in the wrong direction. I use the pyinstaller program. You can recreate it yourself by downloading the OrthoFinder source code and running it on the orthofinder.py file:
All the best
There is some good information on this issue here: I've now hard-coded "OPENBLAS_NUM_THREADS=1" into OrthoFinder, which I think is the best thing to do. If someone who previously had the issue could try running the latest version of the code from the master branch, that would be really useful. If it works, then I'll create a new release package with this fix in. Many thanks
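For reference, hard-coding the setting in a Python program generally means setting the environment variable before numpy is first imported, on the assumption that OpenBLAS reads it when the library is loaded. The sketch below illustrates that pattern; it is not the actual OrthoFinder change, and `pin_openblas_threads` is a hypothetical name.

```python
import os


def pin_openblas_threads(n=1):
    """Set OPENBLAS_NUM_THREADS for this process; call before numpy is imported."""
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)


pin_openblas_threads(1)

# numpy (and with it OpenBLAS) is imported only after the variable is set.
try:
    import numpy as np
except ImportError:
    np = None  # numpy not installed; the environment variable is still set
```

Note that this overrides any value the user has already exported; using `os.environ.setdefault` instead would respect an existing setting.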