LightGBM hangs when loading a model file in a subprocess #6137
Comments
@assassin5615 Thanks for using LightGBM. Did you try setting the environment variable |
@shiyu1994 In my environment, OMP_NUM_THREADS is always 1, because I ran into other issues that required setting OMP_NUM_THREADS to 1. So yes.
I also tried printing the value of OMP_NUM_THREADS in the script; it is 1 before calling train and before prediction.
I encountered the very same problem. Any solutions so far?
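For anyone who wants to repeat that check, here is a minimal sketch of pinning and printing the variable. It assumes the variable has to be in place before lightgbm (or any other OpenMP-backed library) is imported, since OpenMP runtimes typically read it when they initialize. The helper name `omp_threads_setting` is made up for this example.

```python
import os

# Assumption: this runs before `import lightgbm`, so the OpenMP runtime
# sees the value when it starts up. setdefault keeps an existing value.
os.environ.setdefault("OMP_NUM_THREADS", "1")


def omp_threads_setting() -> str:
    """Return the OMP_NUM_THREADS value visible to this process, or "<unset>"."""
    return os.environ.get("OMP_NUM_THREADS", "<unset>")


# Printing the pid makes it easy to tell parent and worker output apart
# when the same line is also called from inside a pool worker.
print(f"pid={os.getpid()} OMP_NUM_THREADS={omp_threads_setting()}")
```

Calling `omp_threads_setting()` inside the worker function as well shows whether the child process really inherited the value.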
OK, I just found a hack after three hours of struggling: I put the training stages into a subprocess instead of running them in the main process, then the subprocess loads the model by |
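A rough sketch of what that hack might look like, not the commenter's actual code. The idea is that the main process never trains (and never even imports LightGBM), so it has no OpenMP state to pass on when it later forks workers. `run_in_child` and `train_and_save` are hypothetical names, the toy data is only for illustration, and the "fork" context assumes Linux as in the report.

```python
import multiprocessing as mp


def run_in_child(target, *args):
    """Run target(*args) in a child process and return the child's exit code."""
    # "fork" is fine here: hangs like the one in this issue are usually
    # attributed to forking after the parent has already used OpenMP,
    # which is exactly what this pattern tries to avoid.
    ctx = mp.get_context("fork")  # assumption: Linux
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()
    return proc.exitcode


def train_and_save(model_path):
    """Hypothetical training step that runs entirely inside the child."""
    # Imported here so the parent process never loads LightGBM.
    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X = rng.random((200, 5))  # toy data, for illustration only
    y = rng.random(200)
    booster = lgb.train(
        {"objective": "regression", "num_threads": 1, "verbose": -1},
        lgb.Dataset(X, label=y),
        num_boost_round=5,
    )
    booster.save_model(model_path)
```

From the main process, `run_in_child(train_and_save, "model1.txt")` would train and write the file, and a return value of 0 means the child exited cleanly. Whether this avoids the hang in every setup is not confirmed in this thread.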
Description
Train two models in the main process and save them to two model files.
Then use multiprocessing.Pool to load these two model files in subprocesses; the subprocesses hang.
Part of the stack trace, captured with pyrasite-shell, is below:
File "simple_lgbm.py", line 77, in predict
x = lgb.Booster(model_file=file_name)
File ".../lightgbm/basic.py", line 2087, in __init__
_safe_call(_LIB.LGBM_BoosterCreateFromModelfile(
gdb shows more detail: the CreateBoosting function calls something like __kmp_api_GOMP_parallel_40_alias() and hangs in __kmp_suspend_64().
The LightGBM FAQ mentions that, due to an OpenMP bug, LightGBM can hang when multithreading is combined with fork on Linux, and it suggests setting nthreads=1 to disable multithreading. However, setting nthreads=1 has no effect on lgb.Booster when loading a model file.
Is there a workaround or fix for this?
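One thing that may be worth trying, as a sketch only and not verified against this issue: start the pool workers with the "spawn" start method instead of fork, so each worker is a fresh interpreter that inherits no OpenMP state from the parent. The function names below are hypothetical; `lgb.Booster(model_file=...)` is the same call as in the stack trace.

```python
import multiprocessing as mp


def load_and_describe(model_file):
    """Worker: load one model file and report how many trees it contains."""
    # Imported inside the worker; with "spawn" each worker starts clean.
    import lightgbm as lgb

    booster = lgb.Booster(model_file=model_file)
    return model_file, booster.num_trees()


def load_models_with_spawn(model_files, processes=2):
    """Load every model file in a pool whose workers are spawned, not forked."""
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(load_and_describe, model_files)
```

With "spawn", the calling script needs the usual `if __name__ == "__main__":` guard around the call to `load_models_with_spawn`, and the worker function has to live at module level so it can be pickled. Spawned workers are slower to start than forked ones because each one re-imports its modules.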
Reproducible example
The code is based on simple_example.py from the LightGBM repo.
Environment info
LightGBM version or commit hash: 4.0.0
Command(s) you used to install LightGBM
Additional Comments