Feat: Better Preprocessing #79

Merged: 24 commits, Sep 17, 2024

@akashsaravanan-georgian commented Sep 6, 2024

  • Feat: Resolve #76 ("Is there a way to save the preprocessing objects for inference? (OneHotEncoder, Scaler)") by saving the numerical and categorical transformers for use at inference.
  • Feat: CategoricalFeatures now exposes fit(), transform(), and fit_transform() methods.
  • Feat: Decouple CategoricalFeatures from the dataset. The object no longer depends on the dataset; it transforms data using only the information learned in the fit() step, so it can be used on its own for inference.
  • Feat: Created a new NumericalFeatures object with the same functions as above for consistency in use.
  • Feat: Update NaN handling for both categorical and numerical features. Users can specify whether NaNs should be handled and what they should be replaced with. NaNs in numerical features can be replaced with the median, the mean, or a custom value; NaNs in categorical features can be replaced with a custom value only. Also resolves #69 ("Imputation of numerical data").
  • Feat: Update tests & main.py to support new features.
  • Feat: Update default types for several variables such as categorical_cols and label_list to use lists instead of None.
  • Feat: Resolve #66 ("Pass through argument 'handle_unknown' of sklearn OneHotEncoder") by adding a handle_unknown argument for OneHotEncoders to the config.
  • Feat: Add a new inference.py script to showcase how to use the saved feature transformers.
  • Fix: Miscellaneous bug fixes.
  • Build: Update requirements to resolve dependabot alerts.
  • Refactor: Rename the notebooks folder into an examples folder.
  • Refactor: Update all function calls to explicitly name parameters to avoid confusion.
  • Style: Reformat entire library with black.
  • Docs: Update repository maintainers as Kyryl is no longer involved.
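
For readers who want a feel for the fit/transform pattern described above, here is a minimal sketch using only the Python standard library. It is not the library's code: the class names `SketchNumericalFeatures` and `SketchCategoricalFeatures`, their parameters (`na_strategy`, `na_value`), and the `save_state`/`load_state` helpers are all hypothetical, and the real NumericalFeatures and CategoricalFeatures objects may be configured and persisted differently. Only the `handle_unknown` name and the median/mean/custom-value options come from this PR's description.

```python
import math
import pickle
import statistics


class SketchNumericalFeatures:
    """Hypothetical stand-in for a numerical transformer (not the library's class)."""

    def __init__(self, handle_na=True, na_strategy="median", na_value=0.0):
        self.handle_na = handle_na
        self.na_strategy = na_strategy  # assumed options: "median", "mean", "value"
        self.na_value = na_value
        self.fill_ = None  # learned during fit()

    def fit(self, values):
        observed = [v for v in values if not math.isnan(v)]
        if self.na_strategy == "median":
            self.fill_ = statistics.median(observed)
        elif self.na_strategy == "mean":
            self.fill_ = statistics.fmean(observed)
        else:
            self.fill_ = self.na_value
        return self

    def transform(self, values):
        if not self.handle_na:
            return list(values)
        return [self.fill_ if math.isnan(v) else v for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


class SketchCategoricalFeatures:
    """Hypothetical one-hot encoder that depends only on what fit() saw."""

    def __init__(self, handle_unknown="error", na_value="missing"):
        self.handle_unknown = handle_unknown  # "error" or "ignore"
        self.na_value = na_value
        self.categories_ = []

    def _clean(self, value):
        # Missing categories are replaced by a custom value only.
        return self.na_value if value is None else value

    def fit(self, values):
        self.categories_ = sorted({self._clean(v) for v in values})
        return self

    def transform(self, values):
        rows = []
        for value in values:
            value = self._clean(value)
            if value not in self.categories_ and self.handle_unknown == "error":
                raise ValueError(f"unknown category: {value!r}")
            # With "ignore", an unseen category becomes an all-zero row.
            rows.append([1 if value == c else 0 for c in self.categories_])
        return rows

    def fit_transform(self, values):
        return self.fit(values).transform(values)


def save_state(transformer):
    """Serialize only the fitted attributes (a plain dict) to bytes."""
    return pickle.dumps(vars(transformer))


def load_state(cls, blob):
    """Rebuild a transformer from bytes produced by save_state()."""
    restored = cls()
    restored.__dict__.update(pickle.loads(blob))
    return restored


# Training time: fit on training data and keep the fitted state.
num = SketchNumericalFeatures(na_strategy="median")
train_num = num.fit_transform([1.0, float("nan"), 3.0, 10.0])
cat = SketchCategoricalFeatures(handle_unknown="ignore")
train_cat = cat.fit_transform(["red", "blue", None])
num_blob, cat_blob = save_state(num), save_state(cat)

# Inference time: no dataset object needed, only the saved state.
infer_num = load_state(SketchNumericalFeatures, num_blob).transform([float("nan"), 2.0])
infer_cat = load_state(SketchCategoricalFeatures, cat_blob).transform(["green", "red"])
```

In this sketch the fitted state is the only thing carried from training to inference, which is the point of decoupling the transformers from the dataset. The repository's inference.py script is the authoritative example of how the saved transformers are actually loaded and used.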

@akashsaravanan-georgian merged commit b2f05ee into master Sep 17, 2024
2 checks passed