Large bagging is very slow #628

Closed
Laurae2 opened this issue Jun 16, 2017 · 12 comments
@Laurae2
Contributor

Laurae2 commented Jun 16, 2017

Bagging is very slow, and I am not sure what is causing it. See #562 for the dataset. The issue appears with a subsampling fraction of 0.40, but it is not reproducible when subsampling is 0.60. I suspect bagging uses only one core, although I do not see this behavior at 0.60.

Using a DLL compiled with Visual Studio 2017.

(screenshot attached)

@guolinke
Collaborator

Can you change the 0.5 in this line to 1e-6 and try again? https://github.com/Microsoft/LightGBM/blob/master/src/boosting/gbdt.cpp#L150

@guolinke
Collaborator

Can you also try the 'bagging' branch?

@Laurae2
Contributor Author

Laurae2 commented Jun 17, 2017

@guolinke Switching to 1e-6 seems to fix the issue.

On the bagging branch it was the same (has the branch been deleted now?).

@Laurae2 Laurae2 closed this as completed Jun 17, 2017
@guolinke
Collaborator

@Laurae2 can you try the latest master branch?

@guolinke guolinke reopened this Jun 17, 2017
@guolinke
Collaborator

@Laurae2
I added some timing output in the "bagging" branch. Can you run with it and paste the logs (with verbose enabled)?

@Laurae2
Contributor Author

Laurae2 commented Jun 17, 2017

Here are some logs. Some of them seem out of place; I have no idea why. I'll retry with the CLI.

> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[1]:	test's auc:0.501172 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[2]:	test's auc:0.501379 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=8
[3]:	test's auc:0.502558 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=12
[4]:	test's auc:0.502981 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[5]:	test's auc:0.50398 
The function ran in 37827.132 milliseconds.
[1] 37827.13
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.027171
[LightGBM] [Info] GBDT::train_score costs 0.012051
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.006769
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 0.000003
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 27.997018
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.393088
[LightGBM] [Info] SerialTreeLearner::init_split costs 12.377513
[LightGBM] [Info] SerialTreeLearner::hist_build costs 10.837631
[LightGBM] [Info] SerialTreeLearner::find_split costs 2.226329
[LightGBM] [Info] SerialTreeLearner::split costs 0.070301
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 14.763519
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  692175 37.0    1168576 62.5  1168576 62.5
Vcells 3516642 26.9    5133766 39.2  4078954 31.2
> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25,
+                                                            bagging_freq = 1,
+                                                            bagging_seed = 1,
+                                                            bagging_fraction = 0.6),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=12
[1]:	test's auc:0.500272 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[2]:	test's auc:0.500702 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[3]:	test's auc:0.501856 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[4]:	test's auc:0.503777 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[5]:	test's auc:0.50587 
The function ran in 24566.072 milliseconds.
[1] 24566.07
> 
> 
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.079639
[LightGBM] [Info] GBDT::train_score costs 0.025451
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.065390
[LightGBM] [Info] GBDT::valid_score costs 0.016837
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 0.013739
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 50.325836
[LightGBM] [Info] SerialTreeLearner::init_train costs 6.186171
[LightGBM] [Info] SerialTreeLearner::init_split costs 20.463570
[LightGBM] [Info] SerialTreeLearner::hist_build costs 18.954661
[LightGBM] [Info] SerialTreeLearner::find_split costs 4.454569
[LightGBM] [Info] SerialTreeLearner::split costs 0.111960
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 26.636777
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  694113 37.1    1168576 62.5  1168576 62.5
Vcells 3522613 26.9    6240519 47.7  4383248 33.5
> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25,
+                                                            bagging_freq = 1,
+                                                            bagging_seed = 1,
+                                                            bagging_fraction = 0.4),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Debug] use subset for bagging
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[1]:	test's auc:0.501405 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=13
[2]:	test's auc:0.502849 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[3]:	test's auc:0.504528 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=9
[4]:	test's auc:0.506207 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=13
[5]:	test's auc:0.506727 
The function ran in 90240.890 milliseconds.
[1] 90240.89
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.165529
[LightGBM] [Info] GBDT::train_score costs 0.124710
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.065391
[LightGBM] [Info] GBDT::valid_score costs 0.023227
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 76.685486
[LightGBM] [Info] GBDT::bagging_subset_time costs 28.801937
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 47.856741
[LightGBM] [Info] GBDT::sub_gradient costs 0.007842
[LightGBM] [Info] GBDT::tree costs 61.569484
[LightGBM] [Info] SerialTreeLearner::init_train costs 7.088206
[LightGBM] [Info] SerialTreeLearner::init_split costs 26.233300
[LightGBM] [Info] SerialTreeLearner::hist_build costs 21.110779
[LightGBM] [Info] SerialTreeLearner::find_split costs 6.784831
[LightGBM] [Info] SerialTreeLearner::split costs 0.141303
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 33.290067
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  694853 37.2    1168576 62.5  1168576 62.5
Vcells 3523226 26.9    6240519 47.7  4389355 33.5

@Laurae2
Contributor Author

Laurae2 commented Jun 17, 2017

@guolinke Better logs below:

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.466967 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.083114 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.534374 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.767502 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.838336 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.916650 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.028511
[LightGBM] [Info] GBDT::train_score costs 0.009519
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002431
[LightGBM] [Info] GBDT::metric costs 0.005669
[LightGBM] [Info] GBDT::bagging costs 0.000002
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 15.870458
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.178705
[LightGBM] [Info] SerialTreeLearner::init_split costs 3.308739
[LightGBM] [Info] SerialTreeLearner::hist_build costs 9.902303
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.456796
[LightGBM] [Info] SerialTreeLearner::split costs 0.021500
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.481398

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 6.990656 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669724
[LightGBM] [Info] 3.145600 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.656882
[LightGBM] [Info] 5.911323 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649771
[LightGBM] [Info] 8.587378 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645973
[LightGBM] [Info] 11.349179 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64391
[LightGBM] [Info] 14.342477 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.027847
[LightGBM] [Info] GBDT::train_score costs 0.008080
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.015252
[LightGBM] [Info] GBDT::valid_score costs 0.002572
[LightGBM] [Info] GBDT::metric costs 0.003098
[LightGBM] [Info] GBDT::bagging costs 0.013149
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 14.272429
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.820590
[LightGBM] [Info] SerialTreeLearner::init_split costs 2.218611
[LightGBM] [Info] SerialTreeLearner::hist_build costs 7.742144
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.451989
[LightGBM] [Info] SerialTreeLearner::split costs 0.017599
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 6.033382

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.311082 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669779
[LightGBM] [Info] 15.066362 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65695
[LightGBM] [Info] 32.950422 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649819
[LightGBM] [Info] 47.337700 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645965
[LightGBM] [Info] 60.975131 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64389
[LightGBM] [Info] 73.995944 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.027556
[LightGBM] [Info] GBDT::train_score costs 0.034221
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002442
[LightGBM] [Info] GBDT::metric costs 0.003057
[LightGBM] [Info] GBDT::bagging costs 68.762191
[LightGBM] [Info] GBDT::bagging_subset_time costs 28.349320
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 40.400013
[LightGBM] [Info] GBDT::sub_gradient costs 0.007574
[LightGBM] [Info] GBDT::tree costs 5.158859
[LightGBM] [Info] SerialTreeLearner::init_train costs 0.882581
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.307824
[LightGBM] [Info] SerialTreeLearner::hist_build costs 2.537724
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.419427
[LightGBM] [Info] SerialTreeLearner::split costs 0.008666
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 2.188091

@Laurae2
Contributor Author

Laurae2 commented Jun 17, 2017

@guolinke This is with 1e-6 fix:

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.291155 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.088217 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.512720 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.766369 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.856576 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.938305 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.031661
[LightGBM] [Info] GBDT::train_score costs 0.009531
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002449
[LightGBM] [Info] GBDT::metric costs 0.003105
[LightGBM] [Info] GBDT::bagging costs 0.000002
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 15.891500
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.169989
[LightGBM] [Info] SerialTreeLearner::init_split costs 3.359920
[LightGBM] [Info] SerialTreeLearner::hist_build costs 9.847964
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.481699
[LightGBM] [Info] SerialTreeLearner::split costs 0.029584
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.522944

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.777340 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669724
[LightGBM] [Info] 2.760088 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.656882
[LightGBM] [Info] 5.420121 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649771
[LightGBM] [Info] 7.992326 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645973
[LightGBM] [Info] 10.777746 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64391
[LightGBM] [Info] 13.426416 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.026495
[LightGBM] [Info] GBDT::train_score costs 0.008024
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.015815
[LightGBM] [Info] GBDT::valid_score costs 0.002523
[LightGBM] [Info] GBDT::metric costs 0.003107
[LightGBM] [Info] GBDT::bagging costs 0.012842
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 13.357567
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.719762
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.805211
[LightGBM] [Info] SerialTreeLearner::hist_build costs 7.380577
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.435972
[LightGBM] [Info] SerialTreeLearner::split costs 0.014171
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.518769

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.265347 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669779
[LightGBM] [Info] 2.088153 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65695
[LightGBM] [Info] 4.460803 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649819
[LightGBM] [Info] 6.536185 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645965
[LightGBM] [Info] 8.671731 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64389
[LightGBM] [Info] 10.821296 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.030278
[LightGBM] [Info] GBDT::train_score costs 0.006794
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.022166
[LightGBM] [Info] GBDT::valid_score costs 0.002548
[LightGBM] [Info] GBDT::metric costs 0.003119
[LightGBM] [Info] GBDT::bagging costs 0.013182
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 10.743162
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.490598
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.424444
[LightGBM] [Info] SerialTreeLearner::hist_build costs 5.369412
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.439387
[LightGBM] [Info] SerialTreeLearner::split costs 0.016834
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 4.910205

@guolinke
Collaborator

@Laurae2 Thanks for the help.
I think the latest master branch has fixed this.

@Laurae2
Contributor Author

Laurae2 commented Jun 17, 2017

@guolinke I am getting a segmentation fault now instead.

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.356559 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.090586 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.522369 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.768736 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.862355 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.948677 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.292828 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
Segmentation fault

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.078148 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
Segmentation fault

@guolinke
Collaborator

@Laurae2 Sorry, it had a bug. I just force-pushed ("push -f") a fix.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023