Exception: 'access violation reading' while training with init_model #2249

Closed
MaxtoqOV opened this issue Jul 4, 2019 · 8 comments · Fixed by #2251

MaxtoqOV commented Jul 4, 2019

Hi, I get an 'OSError: exception: access violation reading 0x000001C0D2851338' when calling lightgbm.train with an already existing Booster as 'init_model' (see code below).
The error occurs after a few training iterations. The number of iterations varies: I've seen it fail after 4, 7 and 3598 iterations, and sometimes training completes without any error. It looks completely random.
I had never seen this error before I started using the 'init_model' feature, so I assume the bug is related to it.

Environment info

Operating System: Windows 10
CPU/GPU model: Intel i7 6700HQ
C++/Python/R version: Python 3.7.1
LightGBM version or commit hash: 2.2.3

Error message

OSError Traceback (most recent call last)
<ipython-input-197-4c6e4d4696e6> in <module>
29 valid_sets=lgb_eval,
30 early_stopping_rounds=100,
---> 31 init_model=gbm
32 )
33
~\Anaconda3\lib\site-packages\lightgbm\engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
216 evaluation_result_list=None))
217
--> 218 booster.update(fobj=fobj)
219
220 evaluation_result_list = []
~\Anaconda3\lib\site-packages\lightgbm\basic.py in update(self, train_set, fobj)
1800 _safe_call(_LIB.LGBM_BoosterUpdateOneIter(
1801 self.handle,
-> 1802 ctypes.byref(is_finished)))
1803 self._is_predicted_cur_iter = [False for _ in range(self.__num_dataset)]
1804 return is_finished.value == 1
OSError: exception: access violation reading 0x000001C0D2851338

Reproducible examples

```python
import lightgbm as lgb
from sklearn.model_selection import train_test_split

# X, y: the feature matrix and target; they are not shown in this report.
gbm = None  # no initial model on the first pass
for i in (0.4, 0.3, 0.2):
    train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=i, shuffle=False)

    # Create LGB datasets
    lgb_train = lgb.Dataset(train_X, train_y)
    lgb_eval = lgb.Dataset(test_X, test_y, reference=lgb_train)

    params = {
        'learning_rate': 0.005,
        'boosting_type': 'dart',
        'objective': 'regression',
        'metric': {'l2', 'l1'},
        'feature_fraction': 0.8,
        'bagging_fraction': 0.8,
        'bagging_freq': 5,
        'max_bin': 1000,
        'num_leaves': 100
    }

    # Train the model, continuing from the booster of the previous pass
    gbm = lgb.train(
        params,
        lgb_train,
        num_boost_round=10000,
        valid_sets=lgb_eval,
        early_stopping_rounds=100,
        init_model=gbm
    )
```

Steps to reproduce

Run the code above.
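
As a reading aid for the loop in the report: with `shuffle=False`, each pass trains on a longer prefix of the data and evaluates on the remaining tail, and the booster from the previous pass is handed over as `init_model`. The sketch below only computes those index ranges. It assumes scikit-learn rounds the test count up for a float `test_size`, and the helper names are hypothetical (they are not part of LightGBM or scikit-learn).

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a float test_size.

    Assumption: the test count is rounded up and the train set takes the rest.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


def growing_prefix_schedule(n_samples, test_sizes=(0.4, 0.3, 0.2)):
    """Return one (train_slice, test_slice) pair per pass of an unshuffled split."""
    schedule = []
    for test_size in test_sizes:
        n_train, _ = split_sizes(n_samples, test_size)
        # Without shuffling, train is the leading block and test is the tail.
        schedule.append((slice(0, n_train), slice(n_train, n_samples)))
    return schedule


if __name__ == "__main__":
    for train_idx, test_idx in growing_prefix_schedule(1000):
        print(train_idx, test_idx)
```

So every continued-training call sees a Dataset with more rows than the one its `init_model` was trained on.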

@guolinke guolinke added the bug label Jul 5, 2019

guolinke (Collaborator) commented Jul 5, 2019

@MaxtoqOV Did you try it without dart?

MaxtoqOV (Author) commented Jul 5, 2019

@guolinke I've just tried with the other available algorithms and the error doesn't occur.

MaxtoqOV (Author) commented Jul 5, 2019

Also, I noticed that early stopping does not work with dart. I don't know whether that is expected; I didn't see anything about it in the docs.

guolinke (Collaborator) commented Jul 5, 2019

For early stopping, refer to #1893; I think some documentation was added.
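
A caller who wants to catch the dart and early-stopping combination before training could inspect the parameters up front. This is a minimal sketch, assuming the boosting type may be passed under any of the names `boosting`, `boosting_type` or `boost`; the helpers are hypothetical and are not part of LightGBM.

```python
# Names under which the boosting type may appear in params (assumed alias list).
BOOSTING_ALIASES = ("boosting", "boosting_type", "boost")


def uses_dart(params):
    """Return True if any boosting alias in params selects dart."""
    return any(params.get(alias) == "dart" for alias in BOOSTING_ALIASES)


def effective_early_stopping(params, early_stopping_rounds):
    """Return early_stopping_rounds, or None when dart is selected.

    Hypothetical guard: it only makes explicit that the setting
    has no effect in dart mode, as reported in this thread.
    """
    if early_stopping_rounds and uses_dart(params):
        return None
    return early_stopping_rounds
```

With the parameters from the report, `effective_early_stopping(params, 100)` returns `None`, which makes it explicit that all `num_boost_round` iterations will run.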

guolinke (Collaborator) commented Jul 5, 2019

@MaxtoqOV Did you train the init_model with dart as well?

MaxtoqOV (Author) commented Jul 5, 2019

@guolinke Yes. As you can see in the code (despite the display bug, sorry about that), the booster is created during the first iteration of the loop, so the init_model has exactly the same parameters as the models trained after it.

guolinke (Collaborator) commented Jul 5, 2019

Indeed, I found a bug. The fix is in #2251; can you give it a try?

MaxtoqOV (Author) commented Jul 8, 2019

Thanks for the quick fix. It seems to be working well now.

@MaxtoqOV MaxtoqOV closed this as completed Jul 8, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020