When attempting to train a model (classification or regression) on a dataset containing a categorical feature with 1035 or more categories, LGBM hangs without displaying any messages. The system monitor shows the process doing no work, and the hang lasts a very long time: the longest I have waited is 2 hours, and training still had not finished.
This issue only occurs with LGBM versions 4.2.0 and 4.3.0.
Reproducible example
import numpy as np
import pandas as pd
import lightgbm as lgb

num_categories = 1034  # this works
num_categories = 1035  # this hangs lgbm infinitely

X = pd.DataFrame(
    np.random.random((10000, 5)),
    columns=[f'num_{i}' for i in range(5)]
)
X["cat"] = np.arange(10000) % num_categories
y = (np.random.random(10000) > 0.5).astype(int)
dset = lgb.Dataset(X, y, categorical_feature=["cat"])
model = lgb.train({'objective': 'binary', 'verbose': 2}, dset, num_boost_round=10)
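Because the hang produces no output and never returns, it can help to run the reproduction in a child process with a time limit, so that a stuck run is reported as a failure instead of blocking the session. The following is a minimal sketch using only the standard library; the helper name `finishes_within` and the timeout values are illustrative assumptions, not part of LightGBM.

```python
import subprocess
import sys


def finishes_within(code: str, timeout_s: float) -> bool:
    """Run `code` in a fresh Python process; return False if it exceeds timeout_s.

    Hypothetical helper for illustration. subprocess.run kills the child
    process when the timeout expires, so a hung run does not linger.
    """
    try:
        subprocess.run([sys.executable, "-c", code], timeout=timeout_s, check=True)
    except subprocess.TimeoutExpired:
        return False
    return True


# Stand-in snippets: one that returns quickly and one that simulates a hang.
print(finishes_within("print('done')", timeout_s=30))
print(finishes_within("import time; time.sleep(60)", timeout_s=1))
```

To apply this to the reproduction, the script above could be passed as the `code` string with a generous limit, on the assumption that ten boosting rounds on 10,000 rows normally finish well within it.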
Environment info
LightGBM version or commit hash: 4.2.0 and 4.3.0
Command(s) you used to install LightGBM
conda install -c conda-forge lightgbm==4.3.0
I tested it in OSX: 14.4 (23E214)
Additional Comments
I also tested it in Kaggle (original environment 2024-02-27) and got the same issue.
jameslamb changed the title from "LGBM hangs with high number of categories" to "[python-package] LGBM hangs with high number of categories" on Apr 1, 2024
Thanks for using LightGBM, and for the excellent write-up!
This looks identical to the issue reported in #6273, and we have an in-progress pull request to fix it: #6394.
Sorry you're experiencing this. This is a bug that was introduced around lightgbm==4.2.0. You could try downgrading to lightgbm==4.1.0 to work around it until a release with that fix is published.
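As a rough aid for the workaround above, the sketch below checks whether the installed lightgbm is one of the two versions this report names as affected (4.2.0 and 4.3.0). The helper names are hypothetical, and the affected set is taken only from this thread; it says nothing about later releases.

```python
import re
from importlib import metadata

# Versions named in this report as affected. Later releases are not covered
# by the thread, so this set is an assumption limited to what was reported.
AFFECTED_VERSIONS = {(4, 2, 0), (4, 3, 0)}


def parse_version(version: str) -> tuple:
    """Return the first three numeric components, padded with zeros."""
    numbers = re.findall(r"\d+", version)[:3]
    numbers += ["0"] * (3 - len(numbers))
    return tuple(int(n) for n in numbers)


def is_affected_version(version: str) -> bool:
    """True if `version` matches one of the versions reported as hanging."""
    return parse_version(version) in AFFECTED_VERSIONS


def installed_lightgbm_version():
    """Return the installed lightgbm version string, or None if not installed."""
    try:
        return metadata.version("lightgbm")
    except metadata.PackageNotFoundError:
        return None


installed = installed_lightgbm_version()
if installed is None:
    print("lightgbm is not installed in this environment")
elif is_affected_version(installed):
    print(f"lightgbm {installed} is one of the versions reported to hang")
else:
    print(f"lightgbm {installed} is not one of the versions named in this report")
```

If the check reports an affected version, pinning to 4.1.0 with the same install command form used in the report (`conda install -c conda-forge lightgbm==4.1.0`) is the downgrade suggested here.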
I'm going to close this as a duplicate of #6273 and add a comment there mentioning it. If you think they are different issues, please let me know.
I also want to say... I REALLY appreciate the effort you put into making this write-up clear and the example minimal and reproducible. Made it very easy to understand what was being reported and connect it to that existing bug.