[R-package] lgb.convert functions should warn on unconverted columns of unsupported types #2681

Closed
jameslamb opened this issue Jan 12, 2020 · 2 comments

jameslamb commented Jan 12, 2020

Summary

lgb.convert() and lgb.convert_with_rules() should tell users in a log message if any columns remain unconverted because they are of a type that the function does not support. The names of those columns should be logged.

Motivation

The R package currently exports two functions that can be used to convert tabular datasets into model-ready form:

  • lgb.convert(): converts columns of type "character" and "factor" to "integer"
  • lgb.convert_with_rules(): similar to lgb.convert(), but also returns a set of "rules" describing how non-numeric values were mapped to integer values. It also accepts user-provided rules, which is useful when you want to be sure the encoding is the same across multiple datasets (e.g. training, test, and validation datasets)

These functions are intended to make it easier to create a model-ready dataset (all columns numeric or integer). Users likely expect that, after calling one of these functions on a dataset, the dataset is ready to use in a model. If the dataset still contains columns of other types (not integer or numeric), the user should be notified in a log message.

Column types that these functions are unlikely to support:

  • POSIX*
  • Date
  • list
  • data.frame
  • data.table
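
To make the request concrete, here is a minimal sketch of the kind of check being proposed. It is not the package's implementation: `warn_unconverted_columns()` is a hypothetical helper name, the message wording is invented, and it uses base R's `warning()` rather than whatever logging the package uses internally.

```r
# Hypothetical helper (not part of lightgbm): after conversion, report any
# columns that are still neither numeric nor integer.
warn_unconverted_columns <- function(data) {
    # is.numeric() is TRUE for integer and double columns and FALSE for
    # factor, character, Date, POSIXct, and list columns
    is_model_ready <- vapply(data, is.numeric, logical(1L))
    unconverted <- names(data)[!is_model_ready]
    if (length(unconverted) > 0L) {
        # first class of each column, to make the message more useful
        col_classes <- vapply(data, function(col) class(col)[1L], character(1L))
        warning(
            "Columns left unconverted (unsupported type): ",
            paste0(
                unconverted, " (", col_classes[!is_model_ready], ")",
                collapse = ", "
            )
        )
    }
    # return the column names invisibly so callers can act on them
    invisible(unconverted)
}

# Example: "d" is a Date column, so it should be reported
df <- data.frame(
    x = c(1.5, 2.5),
    y = c(1L, 2L),
    d = as.Date(c("2020-01-12", "2020-08-06"))
)
unconverted <- warn_unconverted_columns(df)
```

In this sketch the check runs on the already-converted dataset, so it could be called at the end of either conversion function without changing how the conversion itself works.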
@jameslamb
Collaborator Author

Closed in favor of being in #2302. We decided to keep all feature requests in one place.

Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.

@jameslamb jameslamb changed the title [R-package] lgb.prepare functions should warn on unconverted columns of unsupported types [R-package] lgb.convert functions should warn on unconverted columns of unsupported types Aug 1, 2020
@jameslamb
Collaborator Author

I just updated this description and title now that #3095 has been merged.

jameslamb added a commit that referenced this issue Aug 6, 2020
#2678, #2681) (#3269)

* [R-package] improvements to lgb.convert() functions (fixes #2678, #2681)

* more stuff

* update docs

* remove lgb.convert()

* put internal functions back

* update index