Too many threads. #323
Comments
Hi, Do you know what stage OrthoFinder got to and what its most recent output was? I'm not sure how these systems are interacting, but OrthoFinder only controls how many processes it launches and what they do; it doesn't have control over the CPUs as such. In terms of what OrthoFinder will do with your command: it will run 12 blast processes at once. It might be worth checking that the individual blast processes aren't each running in parallel. You can see the blastp command OrthoFinder uses by passing the '-op' option instead of '-og'. When I wrote this command, there was no need to specify the number of threads, as blastp would run in serial; you could check whether this is still the case for the blast version you are using, as I don't know if it has changed. Edit: the blast command looks like this: In terms of how your job scheduler counts CPUs, there will be a thread for the main OrthoFinder process and there might be threads associated with running each of the 12 blast processes, but these will all be inactive; you can confirm this by looking at top, where there will only be 12 processes actively using CPU cycles at any one time. I don't know how the job scheduler works, but maybe it could be counting threads rather than CPU usage? Those are the best ideas I can come up with for why PBS is reporting that. But the command is good as far as I'm concerned, and (unless blast has started running in parallel) it should only be using 12 CPUs at once, which can be confirmed using top. Let me know if you find anything. All the best |
Thank you David, I have launched the job myself in order to test more deeply and be able to give you more details. My job is still killed by the PBS enforcement system. It happens during the
What can I do in order to understand what is happening? Fred |
Hi Fred, I've checked the code to confirm that OrthoFinder will only be using 1 thread here; that's the "1 thread(s) for OrthoFinder algorithm" bit in the output you posted above. You should be able to confirm this by monitoring 'top' at this point. You can restart at almost exactly this stage (i.e. from the completed all-versus-all sequence search) by using the '-b' option and specifying the results directory from your attempted run above. Unfortunately I don't know why PBS is behaving this way, but top should confirm that OrthoFinder isn't using 34 CPUs. All the best |
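The distinction David draws between threads that merely exist and threads that are actively using CPU can be checked directly on Linux. The following is only a sketch, assuming a Linux system with /proc mounted; the helper names `count_threads` and `count_running_threads` are made up for illustration and are not part of OrthoFinder or PBS.

```shell
#!/usr/bin/env bash
# Sketch: compare how many threads a process owns with how many are
# actually running. Linux only (reads /proc); helper names are hypothetical.

# Print the total number of threads of process $1, as reported by the kernel.
count_threads() {
    awk '/^Threads:/ {print $2}' "/proc/$1/status"
}

# Print how many threads of process $1 are currently in the running (R) state.
count_running_threads() {
    local pid=$1 n=0 state stat
    for stat in /proc/"$pid"/task/*/stat; do
        # The state is the first field after the "(command name)" part.
        state=$(sed 's/^.*) //' "$stat" 2>/dev/null | cut -d' ' -f1)
        if [ "$state" = "R" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example: inspect the current shell (replace $$ with the PID of interest).
echo "threads: $(count_threads $$), running: $(count_running_threads $$)"
```

If the first number is much larger than the second for the OrthoFinder or blast processes, that would be consistent with a scheduler counting threads rather than CPU usage. For a live view of the same information, `top -H -p <pid>` lists the individual threads of one process.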
Hi David, not to hijack this thread, but I'm finding that OrthoFinder does not appear to be passing the
|
Hi, Could you check its CPU usage using top or some other program? OrthoFinder shouldn't be running more than 20 threads with high core usage at once, although there may be some idle threads at the same time that won't be using CPU cycles. The '-t' option is used to control how many parallel tasks are run at once (diamond searches, alignments, tree inference) and it is definitely being propagated through the program. Do you have any info as to when in the run this occurred? My only guess is that your BLAS library could be doing some parallelisation itself under the hood. Have you googled the error message it returned? I don't know much about it, but could any of the results be relevant to your situation? All the best |
Hi David, thanks for your speculation. My colleague pointed out that the version of OrthoFinder that I downloaded from the tutorial was old (v2.3.1). I do not get this error when using v2.3.7. |
Hi David, I'm currently using the latest version, v2.3.11, on CentOS 7 and I'm getting the same error as posted before: "OpenBLAS Warning : The number of CPU/Cores(96) is beyond the limit(64). Terminated." The command line that I'm using to submit my job is Orthofinder/orthofinder -f ~/data/2020/test1. Kind regards. Ivan |
Hi Ivan, Thanks for confirming you're seeing similar behaviour on v2.3.11. Could you check the suggestions in my previous reply and let me know how they relate to your case? Many thanks |
Hi David. I managed to run OrthoFinder 2.3.11 via a bash script using export OPENBLAS_NUM_THREADS=1, as explained in this link: https://fossies.org/linux/OpenBLAS/USAGE.md. At the time of this post I'm still waiting for the whole run to finish, but I will report back when it has. Kind regards Ivan |
That's great, thanks! |
OK, the OrthoFinder run completed without any issues, so for other users: if you encounter the OpenBLAS warning, the way to solve it is to use export OPENBLAS_NUM_THREADS=1. Regards Ivan |
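For anyone wanting to reproduce Ivan's workaround, a wrapper script along these lines might do it. This is a minimal sketch, not Ivan's actual script: the OrthoFinder path and input directory are the ones from his command line above, and the `ORTHOFINDER` and `INPUT_DIR` variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper that limits OpenBLAS to one thread before starting
# OrthoFinder. OPENBLAS_NUM_THREADS is read by OpenBLAS itself.
export OPENBLAS_NUM_THREADS=1

# Hypothetical variable names; the defaults are taken from the command
# line quoted earlier in this thread.
ORTHOFINDER="${ORTHOFINDER:-Orthofinder/orthofinder}"
INPUT_DIR="${INPUT_DIR:-$HOME/data/2020/test1}"

if command -v "$ORTHOFINDER" >/dev/null 2>&1; then
    # Child processes inherit the exported variable.
    "$ORTHOFINDER" -f "$INPUT_DIR"
else
    echo "orthofinder not found at '$ORTHOFINDER'; nothing was run" >&2
fi
```

The variable has to be exported before OrthoFinder starts, so that every process it launches inherits it.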
Hi,
we are using PBS Pro as our job scheduler. We enforce memory and CPU usage limits.
When we launch like this:
PBS kills the job:
It means that we asked for 24 CPUs and OrthoFinder used at least(!) 544!
So there is something that we don't understand. Is our command line correct?
thank you for your ideas.