[R-package] lgb.prepare2() and lgb.prepare_rules2() should convert numeric columns to integer #2680

Closed
jameslamb opened this issue Jan 12, 2020 · 2 comments

Comments

@jameslamb
Collaborator

Summary

lgb.prepare2() and lgb.prepare_rules2() should convert columns of type "numeric" to type "integer".

Motivation

The R package currently exports two functions that can be used to convert non-integer columns in tabular datasets to integer.

  • lgb.prepare2(): converts columns of type "character" and "factor" to "integer"
  • lgb.prepare_rules2(): similar to lgb.prepare2(), but also returns a set of "rules" describing how non-integer values were mapped to integer values. It also accepts user-provided rules, which is useful when you want to be sure the encoding is the same across multiple datasets (e.g. training, test, and validation datasets)

These functions are intended to make it easier to create a model-ready dataset (all integer). The fact that they do not convert numeric columns to integer could cause issues in programs that require every column to be integer.
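To make the gap concrete, here is a minimal base R sketch of the rules-based encoding described above. It does not call the lightgbm functions: `encode_with_rules()` and the toy data frames are hypothetical names made up for illustration, and the actual package may encode values differently. Under those assumptions, it shows character and factor columns becoming integer while a numeric column is left untouched.

```r
# Hypothetical helper, NOT part of lightgbm: encodes character and factor
# columns as integers and returns the mapping ("rules") that was used.
encode_with_rules <- function(df, rules = list()) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.character(x) || is.factor(x)) {
      x <- as.character(x)
      # Build rules for this column only if the caller did not supply them
      if (is.null(rules[[col]])) {
        lv <- sort(unique(x[!is.na(x)]))
        rules[[col]] <- setNames(seq_along(lv), lv)
      }
      # Values not covered by the rules become NA
      df[[col]] <- unname(rules[[col]][x])
    }
  }
  list(data = df, rules = rules)
}

# Toy data, assumed for illustration only
train <- data.frame(
  color = c("red", "blue", "red"),
  size = factor(c("S", "L", "M")),
  price = c(1.5, 2.0, 3.25),
  stringsAsFactors = FALSE
)
test <- data.frame(
  color = c("blue", "red"),
  size = factor(c("M", "S")),
  price = c(4.0, 0.5),
  stringsAsFactors = FALSE
)

fit <- encode_with_rules(train)                    # learn rules on training data
out <- encode_with_rules(test, rules = fit$rules)  # reuse them on test data

# "color" and "size" are now integer; "price" is still numeric (double)
sapply(fit$data, class)
```

Reusing `fit$rules` on the second data frame is what keeps the encoding consistent across datasets. The `price` column passes through unchanged, which is the behavior this issue proposes to change.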

@jameslamb
Collaborator Author

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing this feature.

@jameslamb
Collaborator Author

This is now irrelevant since #3095 has been merged, so I'm marking it wontfix.

lgb.convert() (the function that replaces lgb.prepare2()) will now take in a data frame and guarantee that it returns one that has only integer and numeric columns. It is completely fine for a training dataset to mix integer and numeric values.
