
Binary format needed - slow model reading #372

Closed
gugatr0n1c opened this issue Mar 29, 2017 · 14 comments

Comments

@gugatr0n1c

I have built many models; some of them are big, a 300 MB file or larger (i.e. ~10k trees).

In such cases the predict phase is slow (not dramatically, but when using stacking with cross-validation this can slow down the prediction calculation badly, by roughly 2 hours).

I found out the problem is not calling model.predict(), which is reasonably fast.
The problem is loading the model from disk:

model = lg.Booster(model_file = workingDir + '/modely/model_' + str(cv) + '_' + str(sc) + '.txt')

Is there any way to speed this up?
There is a save_binary parameter, but only for datasets.

I am saving models with:
model.save_model(working_dir + '/modely/model_' + str(cv) + '_' + str(sc) + '.txt', num_iteration = model.best_iteration)

Thanks for any help.
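Until loading itself gets faster, one workaround for the stacking/CV use case is to keep each booster in memory and read every model file from disk only once per process. A minimal stdlib sketch (the `load_fn` callable is a stand-in for the real call, e.g. `lambda p: lg.Booster(model_file=p)`):

```python
from functools import lru_cache

# Hypothetical sketch: cache loaded models so each (cv, sc) model file
# is read from disk only once per process. `load_fn` stands in for the
# real lightgbm call, e.g. lambda p: lg.Booster(model_file=p).
def make_cached_loader(load_fn, maxsize=32):
    @lru_cache(maxsize=maxsize)
    def load(path):
        return load_fn(path)
    return load

# Usage with a stand-in loader that just counts "disk reads":
reads = []
load = make_cached_loader(lambda p: (reads.append(p), p)[1])
for _ in range(3):                 # three stacking passes
    model = load("model_0_0.txt")  # actually loaded only the first time
print(len(reads))  # -> 1
```

This does not make a single cold load faster, but it removes the repeated loads when the same fold models are scored many times.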

@Laurae2
Contributor

Laurae2 commented Mar 29, 2017

@guolinke Do you think a binary format for models is appropriate? (and to add to the API)

This is what xgboost does for speeding up model saving/loading.

Just for comparison:

  • xgboost uses a binary format for fast save/load
  • LightGBM does not have (?) a binary format for fast load

I tested on a private dataset (10,000 iterations, 256 leaves), using a 2 GB/s PCI-E SSD to make sure the SSD is not a bottleneck; each operation was run 10 times, with times reported in milliseconds:

| Model | Binary | Time to load (ms) | Time to save (ms) | Prediction time (ms) | File size |
| --- | --- | --- | --- | --- | --- |
| xgboost (.save) | Yes (any) | 598 | 411 | 3383 | 127,273,371 bytes |
| xgboost (.RDS) | Yes (R) | 789 | 6592 | 3568 | 46,122,319 bytes |
| LightGBM (.save) | No | 9206 | 24289 | 2147 | 133,351,088 bytes |
| LightGBM (.RDS) | Yes (R) | 15451 | 34731 | 2146 | 47,424,458 bytes |

N.B.: the xgboost RDS model showed a consistent prediction-speed loss (prediction time tested 50 times for both .save and .RDS), but it is usable as-is, unlike LightGBM, which has to go through .save/.load indirectly via RDS to be re-usable.
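For reproducing numbers like the ones above, a repeated-run timing harness can be sketched with the stdlib; the lambda below is a placeholder workload, to be replaced by the actual save/load/predict call being measured:

```python
import statistics
import time

def bench(fn, runs=10):
    """Return the median wall time in milliseconds over `runs` calls."""
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(times)

# Stand-in workload; substitute the real model save/load call here.
ms = bench(lambda: sum(range(10000)))
```

Using the median over several runs damps out cold-cache effects, which matter a lot when disk I/O is part of what is measured.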

@guolinke
Collaborator

@Laurae2 Yes, the binary model format is needed, but I am busy with other things at the moment.
So I will put out a call for contributions first.
Contributions are welcome 😄

@gugatr0n1c gugatr0n1c changed the title [Question] - slow model reading Binary format needed - slow model reading Jun 14, 2017
@guolinke guolinke added this to the v3.0 milestone Aug 3, 2017
@limexp
Contributor

limexp commented Aug 8, 2017

@guolinke
Do you have any suggestions about the binary file format? Is this open for discussion?

There is no flag in the current interface of the GBDT::SaveModelToFile(int num_iteration, const char* filename) method to indicate whether binary or text mode is required.
How should the code decide which format to use?

  1. Always use binary format
  2. Add new method for binary saving
  3. Change interface and add a parameter with a default value
  4. Use new/existing config parameter like save_binary (is_save_binary_file)
  5. ...
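Option 3 above (a parameter with a default value) keeps existing callers unchanged; sketched in Python terms, with the parameter name `format` and the byte payloads purely hypothetical:

```python
# Hypothetical sketch of option 3: extend the save interface with a
# defaulted parameter so existing callers keep producing text models.
def save_model(filename, num_iteration=-1, format="text"):
    if format == "text":
        data = b"tree\n..."       # stands in for the current text serialization
    elif format == "binary":
        data = b"\x89LGB..."      # stands in for a new binary serialization
    else:
        raise ValueError(f"unknown model format: {format!r}")
    return data  # a real implementation would write `data` to `filename`

print(save_model("m.txt")[:4])    # -> b'tree'  (default behavior unchanged)
```

The same default-argument idea carries over to the C++ interface, where a defaulted parameter on SaveModelToFile would preserve source compatibility for existing callers.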

@guolinke
Collaborator

guolinke commented Aug 9, 2017

@limexp
We only have the text format now. The binary format is a to-do item.

@limexp
Contributor

limexp commented Aug 9, 2017

@guolinke
I understand this, and I want to clarify the task and estimate its impact on existing programs before starting to code. It is better to decide beforehand than to modify afterwards.

It is hard to change a binary file format later without breaking existing saved models; doing so would require adding version support and would make the code more complex. Of course, there are universal solutions like protobuf.

The interface decision is not as critical, but it has a direct impact on the codebase, tests, and compatibility.

I'm not asking for a final solution, just looking for a direction if you have one.
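The versioning concern can be addressed up front with a small fixed header; a stdlib sketch (the magic bytes and layout are made up for illustration) showing how a version field lets a reader reject or branch on old formats instead of silently misreading them:

```python
import struct

# Hypothetical sketch: a fixed header with magic bytes and a version
# number lets future format changes be detected instead of silently
# misreading old files.
MAGIC = b"LGBM"
HEADER = struct.Struct("<4sI")   # magic + little-endian uint32 version

def write_header(version):
    return HEADER.pack(MAGIC, version)

def read_header(blob):
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a binary model file")
    return version

print(read_header(write_header(2)))  # -> 2
```

A reader that finds a version it does not know can then fail with a clear error, which is exactly the compatibility story protobuf provides for free at the cost of an extra dependency.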

@i3v
Contributor

i3v commented Aug 29, 2017

AFAIU, concerning the benchmark:

  • One possible reason for the high "Time to save" is stringstream performance. It looks like it can be ~10x slower than += (even though that is a bit counter-intuitive). This may vary across OSes and compilers, though (that benchmark uses gcc 4.6). += also doesn't suffer from the "2GB limit" in MSVS.

  • The existing save/load code is single-threaded. It should be fairly easy to parallelize (easier than designing a binary format). More than 99% of the file is occupied by Tree parts, and it should be easy to create them in parallel. (No one is going to use LGBM to create just one insanely huge tree, right?) Still, it would be hard to beat xgboost this way.

  • The existing "conversion to string" is also used for other purposes (e.g. here), not just for saving to a file. It might be a good idea to use the new serialization method in those cases as well, but that would make it even more complicated. (If it were only for saving to files, HDF5 might look like a good choice.)
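The per-tree parallelization idea is structurally simple: each tree serializes independently and the parts are joined in order. A Python sketch of the shape only (`tree_to_str` is a stand-in; a real speedup would come from doing this in the C++ code, e.g. with OpenMP, since pure-Python string building under threads is GIL-bound):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the real per-tree text dump.
def tree_to_str(tree_id):
    return f"Tree={tree_id}\nnum_leaves=256\n"

def dump_model(num_trees, workers=4):
    # map() preserves input order, so trees come out in tree-id order
    # even though they are serialized concurrently.
    with ThreadPoolExecutor(max_workers=workers) as ex:
        parts = list(ex.map(tree_to_str, range(num_trees)))
    return "".join(parts)

text = dump_model(8)
print(text.count("Tree="))  # -> 8
```

Joining pre-built per-tree strings also sidesteps the stringstream cost noted above, since each worker builds its own buffer.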

@wxchan
Contributor

wxchan commented Sep 13, 2017

Anyone interested in helping test loading and saving models with protobuf?

It's in this branch: https://github.com/wxchan/LightGBM/tree/proto — you can cmake with -DUSE_PROTO=ON to build with protobuf, and add model_format=proto to the config to load and save models with protobuf.

A simple test (text -> proto):
save: 0.612300 s -> 0.023627 s
load: 0.471999 s -> 0.024712 s
size: 13 MB -> 5.9 MB

@AbdealiLoKo

I personally find the text format the best feature of LightGBM. You can easily check things like how many trees are being used without any additional commands, which is complicated with binary models.

@AbdealiLoKo

It seems the protobuf model format has been merged, so can this issue be closed?

@wxchan
Contributor

wxchan commented Nov 5, 2017

It might be reverted; we are looking for a better solution. @AbdealiJK

@guolinke
Collaborator

guolinke commented Jan 5, 2018

I think model read/write is much faster now. Please have a try.

@guolinke guolinke closed this as completed Jan 5, 2018
@gugatr0n1c
Author

yes, it is much faster now... great job

@ericwang915

Any solution to this? I have trained and saved a LGB model and the file is almost 18GB.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023