Exception: 'access violation reading' while training with init_model #2249

MaxtoqOV · 2019-07-04T09:21:51Z

Hi, I get an 'OSError: exception: access violation reading 0x000001C0D2851338' when calling lightgbm.train with an existing Booster passed as 'init_model' (see code below).
The error occurs after a few training iterations. The number of iterations varies: I've seen it fail after 4, 7 and 3598, and sometimes training completes without any error, so it appears to be random.
I had never seen this error before I started using the 'init_model' feature, so I assume the bug is related to it.

Environment info

Operating System: Windows 10
CPU/GPU model: Intel i7 6700HQ
C++/Python/R version: Python 3.7.1
LightGBM version or commit hash: 2.2.3

Error message

OSError                                   Traceback (most recent call last)
<ipython-input-197-4c6e4d4696e6> in <module>
     29     valid_sets=lgb_eval,
     30     early_stopping_rounds=100,
---> 31     init_model=gbm
     32 )
     33

~\Anaconda3\lib\site-packages\lightgbm\engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
    216                 evaluation_result_list=None))
    217
--> 218         booster.update(fobj=fobj)
    219
    220         evaluation_result_list = []

~\Anaconda3\lib\site-packages\lightgbm\basic.py in update(self, train_set, fobj)
   1800             _safe_call(_LIB.LGBM_BoosterUpdateOneIter(
   1801                 self.handle,
-> 1802                 ctypes.byref(is_finished)))
   1803             self._is_predicted_cur_iter = [False for _ in range(self.__num_dataset)]
   1804             return is_finished.value == 1

OSError: exception: access violation reading 0x000001C0D2851338

Reproducible examples

`import lightgbm as lgb
from sklearn.model_selection import train_test_split

# X and y (features and target) are assumed to be loaded already.
gbm = None  # no initial model on the first pass
for i in (0.4, 0.3, 0.2):
    # Chronological split: each pass trains on a larger share of the data
    train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=i, shuffle=False)

    # Create LGB datasets
    lgb_train = lgb.Dataset(train_X, train_y)
    lgb_eval = lgb.Dataset(test_X, test_y, reference=lgb_train)

    params = {
        'learning_rate': 0.005,
        'boosting_type': 'dart',
        'objective': 'regression',
        'metric': {'l2', 'l1'},
        'feature_fraction': 0.8,
        'bagging_fraction': 0.8,
        'bagging_freq': 5,
        'max_bin': 1000,
        'num_leaves': 100
    }

    # Train the model, continuing from the booster of the previous pass
    gbm = lgb.train(
        params,
        lgb_train,
        num_boost_round=10000,
        valid_sets=lgb_eval,
        early_stopping_rounds=100,
        init_model=gbm
    )`

Steps to reproduce

See the code above.


guolinke · 2019-07-05T02:24:00Z

@MaxtoqOV did you try it without dart?

MaxtoqOV · 2019-07-05T08:06:24Z

@guolinke I've just tried with the other available algorithms and the error doesn't occur.

MaxtoqOV · 2019-07-05T08:10:53Z

Also, I noticed that early stopping does not work with dart. I don't know if that's expected; I didn't see anything about it in the docs.

guolinke · 2019-07-05T08:19:58Z

For early stopping, refer to #1893; I think some documentation was added there.
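As a rough illustration of the point above, a small pre-flight check can flag the combination of dart and early stopping before training starts. This is only a sketch: `warn_if_dart_early_stopping` is a hypothetical helper and not part of LightGBM, and the warning text is my own wording.

```python
import warnings


def warn_if_dart_early_stopping(params, early_stopping_rounds=None):
    """Warn and return True if early stopping is requested together with dart.

    Hypothetical helper for illustration; it is not a LightGBM function.
    """
    # The boosting parameter may be given under any of these names.
    boosting = None
    for key in ("boosting", "boosting_type", "boost"):
        if key in params:
            boosting = params[key]
            break

    uses_dart = boosting == "dart"
    wants_early_stop = early_stopping_rounds is not None and early_stopping_rounds > 0

    if uses_dart and wants_early_stop:
        warnings.warn(
            "early stopping was requested, but it is not expected to work "
            "with boosting_type='dart'"
        )
        return True
    return False
```

Calling it with the parameters from the report, for example `warn_if_dart_early_stopping(params, 100)`, would emit the warning, while the same call with `'boosting_type': 'gbdt'` would stay silent.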

guolinke · 2019-07-05T08:37:06Z

@MaxtoqOV did you train the init_model with dart as well?

MaxtoqOV · 2019-07-05T08:44:50Z

@guolinke Yes, as you can see in the code (despite the display bug, sorry about that), the booster is created during the first iteration of the loop. Thus the init_model has exactly the same parameters as the boosters trained after it.
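The pattern described here (no initial model on the first pass, then each booster fed into the next call as `init_model`) can be sketched independently of LightGBM. In the sketch below, `train_incrementally`, `make_datasets` and `fake_train` are hypothetical names used only for illustration; in the real code `train_fn` would be `lgb.train`, and the stub stands in for it only so the control flow can be run without a dataset.

```python
def train_incrementally(train_fn, test_sizes, make_datasets, params):
    """Run one training stage per test size, chaining boosters via init_model."""
    booster = None  # the first stage starts from scratch
    for test_size in test_sizes:
        train_set, valid_set = make_datasets(test_size)
        # Every later stage continues from the booster of the previous stage.
        booster = train_fn(
            params,
            train_set,
            valid_sets=valid_set,
            init_model=booster,
        )
    return booster


def fake_train(params, train_set, valid_sets=None, init_model=None):
    """Stub standing in for lgb.train: only counts accumulated rounds."""
    rounds_so_far = 0 if init_model is None else init_model["rounds"]
    return {"rounds": rounds_so_far + 10, "boosting": params["boosting_type"]}


def fake_datasets(test_size):
    """Stub standing in for the train_test_split + lgb.Dataset step."""
    return ("train", test_size), ("valid", test_size)


final = train_incrementally(
    fake_train, (0.4, 0.3, 0.2), fake_datasets, {"boosting_type": "dart"}
)
```

Because the same `params` dict is passed to every stage, the initial model and all continued models share one configuration, which is the situation the report describes.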

guolinke · 2019-07-05T08:49:10Z

Indeed, I found a bug. The fix is in #2251; can you give it a try?

MaxtoqOV · 2019-07-08T09:08:45Z

Thanks for the quick fix. It seems to be working well now.

guolinke added the bug label Jul 5, 2019

guolinke mentioned this issue Jul 5, 2019

fix bug when using dart with init_model #2251

Merged

MaxtoqOV closed this as completed Jul 8, 2019

lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020
