
Add an example with binary classification #1

Closed
wants to merge 3 commits

Conversation

Vincent-Maladiere (Member)
Here is a first quick example that doesn't use Mandr for now. It extends one of the scikit-learn examples introducing the TunedThresholdClassifierCV.

It displays some plots a user might want to see for this task, although many are missing (especially since there is no EDA part yet). It notably features plots comparing multiple models, which could be a good candidate for Mandr's aggregation feature (something MLflow doesn't have).

Feedback on the methodology (or anything else really) is more than welcome.

'logisticregression__C': np.logspace(0.0001, 1, 10),
}

with warnings.catch_warnings(action="ignore"):
Contributor

If this was a production setting ... why would you ignore the warnings? Wouldn't we want these to appear in our logs? These warnings may give us a hint when something breaks down completely.

Vincent-Maladiere (Member Author), Jun 20, 2024

Yeah, I should have mentioned that I was worried about the readability of the notebook, as the same (low-priority) warning was printed repeatedly across the cross-validation folds.
So, this is not a production setting, this is an exploratory notebook setting :)
But I can try to fix the root cause of this warning.
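For reference, a middle-ground option (a minimal sketch, not from the PR, and assuming the noisy warning is scikit-learn's ConvergenceWarning) is to silence only that category, so anything unexpected still reaches the output:

import warnings

from sklearn.exceptions import ConvergenceWarning  # assumption: this is the repeated warning


def run_quietly(cv_callable):
    """Run a cross-validation callable while hiding only one known, low-priority warning."""
    with warnings.catch_warnings():
        # Filter a single category instead of everything, so unrelated
        # warnings still show up in the notebook output or in logs.
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        return cv_callable()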


y_proba = named_results["y_proba"][:, 0]

CalibrationDisplay.from_predictions(
Contributor

I have not implemented it yet, but when I looked at this guide I got the impression that we might also be able to do this:

disp = CalibrationDisplay(...)

Then we could store the disp onto the mander and from a template we might do something like:

<sklearn-display disp='@mander.disp'/>

How would that feel for you?

Member Author

Sure, that's an interesting option! To play the devil's advocate here, should we generalize this and try using e.g. matplotlib figures instead?

<matplotlib fig='@mander.my_calibration_fig'>

Ideally, I'd like frequently used plots to be built into Mandr, so that I don't need to write CalibrationDisplay(...) myself. How do you think we could achieve that with Mandr?
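For reference, a minimal sketch of what that could look like on the scikit-learn side; the Mandr call at the end is purely hypothetical and not settled by this thread:

import numpy as np
from sklearn.calibration import CalibrationDisplay

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)   # toy labels
y_proba = rng.uniform(0, 1, size=200)   # toy predicted probabilities for the positive class

disp = CalibrationDisplay.from_predictions(y_true, y_proba, n_bins=10)
fig = disp.figure_  # a plain matplotlib Figure that a store or template could reference
# mander.insert("my_calibration_fig", fig)  # hypothetical Mandr call, shown only to illustrate the idea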



from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_predict
Member

brrrrr :)

I see that you don't use it so this is fine :)

Member Author

Haha good catch, I need to remove it.
A good follow-up on this example would be to add confidence intervals on the ROC curve, and error bars on the metrics. What would be the best way to achieve uncertainty quantification here?
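For reference, one common way to get such error bars (a sketch, not part of this PR) is a percentile bootstrap over the held-out predictions; the same resampling idea extends to the ROC curve itself:

import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_ci(y_true, y_proba, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for ROC AUC on held-out predictions."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))  # resample with replacement
        if len(np.unique(y_true[idx])) < 2:
            continue  # skip degenerate resamples containing a single class
        scores.append(roc_auc_score(y_true[idx], y_proba[idx]))
    return np.quantile(scores, [alpha / 2, 1 - alpha / 2])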

@jeremiedbb left a comment

quick pass, I'm still reading :)

Comment on lines +51 to +52
tnr = tn / (tn + fp)
return 1 - tnr


same but more straightforward

Suggested change
tnr = tn / (tn + fp)
return 1 - tnr
return fp / (fp + tn)
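For context, a self-contained version of that metric (a sketch, assuming the surrounding function unpacks a scikit-learn confusion matrix as elsewhere in the example) would read:

from sklearn.metrics import confusion_matrix


def false_positive_rate(y_true, y_pred):
    """False positive rate, i.e. 1 - TNR, written directly as fp / (fp + tn)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return fp / (fp + tn)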

#
# The goal of this example, based on [an existing scikit-learn example](https://scikit-learn.org/stable/auto_examples/model_selection/plot_cost_sensitive_learning.html), is to showcase various baseline models and evaluation metrics for binary classification. The use-case we are dealing with is predicting the acceptance of candidates applying for a loan. The application can either be marked as good or bad, hence the binary nature of the task.
#
# We start by loading the dataset from openML. Note that in a real-life setting, we would probably have to join and aggregate multiple tables from multiple sources before landing a neat dataframe ready to use for ML. We would also need to worry about data quality, feature selection, missing values, etc. If we wanted to deploy this model into some production system, the availability of our features at prediction time would also have to be checked.


Suggested change
# We start by loading the dataset from openML. Note that in a real-life setting, we would probably have to join and aggregate multiple tables from multiple sources before landing a neat dataframe ready to use for ML. We would also need to worry about data quality, feature selection, missing values, etc. If we wanted to deploy this model into some production system, the availability of our features at prediction time would also have to be checked.
# We start by loading the dataset from OpenML. Note that in a real-life setting, we would probably have to join and aggregate multiple tables from multiple sources before landing a neat dataframe ready to use for ML. We would also need to worry about data quality, feature selection, missing values, etc. If we wanted to deploy this model into some production system, the availability of our features at prediction time would also have to be checked.
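For reference, the loading step itself is a single fetch_openml call; a minimal sketch, assuming the same German credit dataset (credit-g) as the linked scikit-learn example:

from sklearn.datasets import fetch_openml

# Assumed dataset: German credit, where each row is a loan application labelled "good" or "bad".
credit = fetch_openml("credit-g", version=1, as_frame=True)
X, y = credit.data, credit.target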


# We then add our utility function "cost gain". Unlike the previous metrics, which are generic to the binary classification setting, this utility function is specific to the problem at hand and must be carefully considered.
#
# Here, we set this utility function using a cost matrix, which indicates how much error costs, and how much correct predictions yield. This is where a strong intuition of the use-case is needed. We considered the coefficients to be fixed, but note that we could also pass variables from our dataframe (e.g. a False positive could be proportional to the amount of the loan).


Suggested change
# Here, we set this utility function using a cost matrix, which indicates how much error costs, and how much correct predictions yield. This is where a strong intuition of the use-case is needed. We considered the coefficients to be fixed, but note that we could also pass variables from our dataframe (e.g. a False positive could be proportional to the amount of the loan).
# Here, we set this utility function using a cost matrix, which indicates how much errors cost, and how much correct predictions yield. This is where a strong intuition of the use-case is needed. We considered the coefficients to be fixed, but note that we could also pass variables from our dataframe (e.g. a False positive could be proportional to the amount of the loan).



"is where a strong intuition of the use-case is needed."

It's more than intuition. It's knowledge of the domain, of the application, of business constraints, ...

#
# In this example, correct classifications yield 0, false negatives yield -5 and false positives yield -1. We can already understand that penalizing false negatives 5x time more than false positives will put the classification threshold closer to 0 than to 0.5.


What's the unit? I think these numbers should be motivated with concrete examples.
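To make those numbers concrete, here is a minimal sketch of such a gain metric using the coefficients quoted above (0 for correct predictions, -1 per false positive, -5 per false negative); the unit would be whatever business or monetary unit those coefficients are expressed in, which is exactly the question being raised:

from sklearn.metrics import confusion_matrix, make_scorer


def cost_gain(y_true, y_pred):
    """Total gain under the example's cost matrix: TP and TN -> 0, FP -> -1, FN -> -5."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return 0 * (tp + tn) - 1 * fp - 5 * fn


# Wrapped as a scorer so it can drive cross-validation or TunedThresholdClassifierCV.
cost_gain_scorer = make_scorer(cost_gain)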

@Vincent-Maladiere (Member Author)

Thanks for your comments, I'm closing this PR in favor of #24, because it deals with a more concrete use case.

@thomass-dev deleted the binary_example branch on September 16, 2024.