
[python][scikit-learn] Support for multiple evaluation metrics #3165

Closed
wants to merge 3 commits

Conversation

giresg
Contributor

@giresg giresg commented Jun 14, 2020

This PR:

  • train, cv: Allows for multiple custom evaluation metrics
  • scikit-learn API: Allows multiple custom evaluation metrics
  • scikit-learn API: Allows using a mix of custom evaluation functions and LGBM's evaluation metrics as defined here
  • scikit-learn API: Fixes a bug that prevented the use of a list of LGBM's evaluation metrics in LGBMClassifier

The train and cv APIs allow the use of multiple of LGBM's built-in evaluation metrics (defined here) but allow only one custom evaluation function. With these changes, the user will be able to monitor multiple custom evaluation functions.

A similar situation exists in the scikit-learn API. The eval_metric parameter in the fit function accepts a list of LGBM's evaluation metrics or a single callable. With these changes, the user will be able to monitor multiple of LGBM's evaluation metrics, multiple custom metrics, or a mix of both.

@ghost

ghost commented Jun 14, 2020

CLA assistant check
All CLA requirements met.

@guolinke
Collaborator

guolinke commented Jul 7, 2020

@gramirezespinoza can you merge the latest master branch to pass CI?
@StrikerRUS what is your opinion for this PR?

@StrikerRUS
Collaborator

@StrikerRUS what is your opinion for this PR?

@gramirezespinoza Can you please clarify the intent of this PR? Is it a refactoring? Because all points listed in the starting comment for this PR, like multiple custom metrics and the mix of custom and built-in metrics, are already supported in LightGBM.

@giresg
Contributor Author

giresg commented Jul 7, 2020

hey @StrikerRUS, @guolinke thanks for having a look at the PR!

The idea of the PR is to give the user total flexibility in the number and types of metrics that can be monitored during training. To the best of my knowledge, the functionality in this PR is not yet implemented.

Happy to answer more questions if needed.

All details
Will start with the most important: the PR fixes a small bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings. This is meant to work (as referenced in the documentation) and does work in the other subclasses of LGBMModel, but not in LGBMClassifier. The issue is in the following code in the master branch:

if self._n_classes > 2:
    # Switch to using a multiclass objective in the underlying LGBM instance
    ova_aliases = {"multiclassova", "multiclass_ova", "ova", "ovr"}
    if self._objective not in ova_aliases and not callable(self._objective):
        self._objective = "multiclass"
    if eval_metric in {'logloss', 'binary_logloss'}:
        eval_metric = "multi_logloss"
    elif eval_metric in {'error', 'binary_error'}:
        eval_metric = "multi_error"
else:
    if eval_metric in {'logloss', 'multi_logloss'}:
        eval_metric = 'binary_logloss'
    elif eval_metric in {'error', 'multi_error'}:
        eval_metric = 'binary_error'

With the fix, the following line works as expected (see the tests in the PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=['fair', 'error'], **params_fit)

The second change adds functionality to allow a mix of "string" metrics and custom metrics in the sklearn API. Now it is possible to do this (see the tests in the PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=[custom_recall, custom_precision, "fair"], **params_fit)

where custom_recall and custom_precision are callables.

Finally, the train and cv python APIs are modified to allow multiple feval like this:

model = lgb.train(
    ...
    feval=[custom_recall, custom_precision],
    ...)

For any other examples, please see the tests in the PR.

@giresg
Contributor Author

giresg commented Jul 7, 2020

@gramirezespinoza can you merge the latest master branch to pass CI?

done

@StrikerRUS
Collaborator

@gramirezespinoza OK, seems I got it! This PR is a bug fix for

Will start with the most important: the PR fixes a small bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings.

and a refactoring for better usage of mix of custom and built-in metrics in one parameter (namely, feval/eval_metric), because right now one should use different arguments to achieve the goal: #2182 (comment).

Would it make sense to split this PR into two?

@giresg
Contributor Author

giresg commented Jul 8, 2020

@StrikerRUS sure, will split this into two PRs.

@giresg
Contributor Author

giresg commented Jul 12, 2020

Bugfix of LGBMClassifier in this PR

@giresg
Contributor Author

giresg commented Jul 27, 2020

The second part of this PR is implemented in #3254.

Closing this PR.

@giresg giresg closed this Jul 27, 2020
@giresg giresg deleted the feature/multiple_eval_metrics branch July 27, 2020 14:13
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023