
[python][scikit-learn] Support for multiple evaluation metrics #3165

Closed
wants to merge 3 commits

Conversation

giresg
Contributor

@giresg giresg commented Jun 14, 2020

This PR:

  • train, cv: Allows for multiple custom evaluation metrics
  • scikit-learn API: Allows multiple custom evaluation metrics
  • scikit-learn API: Allows using a mix of custom evaluation functions and LGBM's evaluation metrics as defined here
  • scikit-learn API: Fixes a bug that prevented the use of a list of LGBM's evaluation metrics in LGBMClassifier

The train and cv APIs allow the use of multiple of LGBM's built-in evaluation metrics (defined here) but allow only one custom evaluation function. With these changes, the user will be able to monitor multiple custom evaluation functions.

A similar situation exists in the scikit-learn API. The eval_metric parameter in the fit function accepts a list of LGBM's evaluation metrics or a single callable. With these changes, the user will be able to monitor multiple of LGBM's evaluation metrics, multiple custom metrics, or a mix of both.

@ghost

ghost commented Jun 14, 2020

CLA assistant check
All CLA requirements met.

@guolinke
Collaborator

guolinke commented Jul 7, 2020

@gramirezespinoza can you merge the latest master branch to pass CI?
@StrikerRUS what is your opinion for this PR?

@StrikerRUS
Collaborator

@StrikerRUS what is your opinion for this PR?

@gramirezespinoza Can you please clarify the intent of this PR? Is it a refactoring? Because all points listed in the starting comment for this PR, like multiple custom metrics and the mix of custom and built-in metrics, are already supported in LightGBM.

@giresg
Contributor Author

giresg commented Jul 7, 2020

hey @StrikerRUS, @guolinke thanks for having a look at the PR!

The idea of the PR is to give the user total flexibility in the number and types of metrics that can be monitored during training. To the best of my knowledge, the functionality in this PR is not yet implemented.

Happy to answer more questions if needed.

All details
Will start with the most important: the PR fixes a small bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings. This is meant to work (as referenced in the documentation) and does work in the other subclasses of LGBMModel, but not in LGBMClassifier. The issue is in the following code in the master branch:

if self._n_classes > 2:
    # Switch to using a multiclass objective in the underlying LGBM instance
    ova_aliases = {"multiclassova", "multiclass_ova", "ova", "ovr"}
    if self._objective not in ova_aliases and not callable(self._objective):
        self._objective = "multiclass"
    if eval_metric in {'logloss', 'binary_logloss'}:
        eval_metric = "multi_logloss"
    elif eval_metric in {'error', 'binary_error'}:
        eval_metric = "multi_error"
else:
    if eval_metric in {'logloss', 'multi_logloss'}:
        eval_metric = 'binary_logloss'
    elif eval_metric in {'error', 'multi_error'}:
        eval_metric = 'binary_error'

With the fix, the following line works as expected (see the tests in the PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=['fair', 'error'], **params_fit)

The second change adds functionality to allow a mix of "string" metrics and custom metrics in the sklearn API. Now it is possible to do this (see the tests in the PR):

gbm = lgb.LGBMClassifier(**params).fit(eval_metric=[custom_recall, custom_precision, "fair"], **params_fit)

where custom_recall and custom_precision are callables.

Finally, the train and cv python APIs are modified to allow multiple feval like this:

model = lgb.train(
    ...
    feval=[custom_recall, custom_precision],
    ...)

For any other examples, please see the tests in the PR.

@giresg
Contributor Author

giresg commented Jul 7, 2020

@gramirezespinoza can you merge the latest master branch to pass CI?

done

@StrikerRUS
Collaborator

@gramirezespinoza OK, seems I got it! This PR is a bug fix for

Will start with the most important: the PR fixes a small bug in the LGBMClassifier class. The code as-is breaks when eval_metric is a list of strings.

and a refactoring for better usage of mix of custom and built-in metrics in one parameter (namely, feval/eval_metric), because right now one should use different arguments to achieve the goal: #2182 (comment).

Would it make sense to split this PR into two?

@giresg
Contributor Author

giresg commented Jul 8, 2020

@StrikerRUS sure, will split this into two PRs.

@giresg
Contributor Author

giresg commented Jul 12, 2020

Bugfix of LGBMClassifier in this PR

@giresg
Contributor Author

giresg commented Jul 27, 2020

The second part of this PR is implemented in #3254.

Closing this PR.

@giresg giresg closed this Jul 27, 2020
@giresg giresg deleted the feature/multiple_eval_metrics branch July 27, 2020 14:13
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023