Add an example with binary classification #1
Conversation
'logisticregression__C': np.logspace(0.0001, 1, 10),
}

with warnings.catch_warnings(action="ignore"):
If this were a production setting, why would you ignore the warnings? Wouldn't we want these to appear in our logs? These warnings may give us a hint when something breaks down completely.
Yeah, I should have mentioned that I was worried about the readability of the notebook, as the same (low-priority) warning was printed repeatedly across the CV folds.
So this is not a production setting; this is an exploratory notebook setting :)
But I can try to fix the root cause of this warning.
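A middle ground between silencing everything and letting the notebook drown in repeats is to filter only the one warning category that was inspected. A minimal sketch, assuming the noisy warning is something like a `UserWarning` (the actual category raised by scikit-learn here is an assumption; `noisy_fit` is a made-up stand-in for the CV loop):

```python
import warnings


def noisy_fit():
    # Stand-in for a model fit that emits the same low-priority
    # warning on every CV split.
    warnings.warn("low-priority convergence hint", UserWarning)
    return "fitted"


# Silence only the category we have looked at; any other warning
# (e.g. a new one hinting that something broke) still surfaces.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    result = noisy_fit()

print(result)
```

This keeps the notebook readable while staying closer to the "don't hide signals" concern raised above.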
y_proba = named_results["y_proba"][:, 0]

CalibrationDisplay.from_predictions(
I have not implemented it yet, but when I looked at this guide I got the impression that we might also be able to do this:
disp = CalibrationDisplay(...)
Then we could store the disp onto the mander, and from a template we might do something like:
<sklearn-display disp='@mander.disp'/>
How would that feel for you?
Sure, that's an interesting option! To play the devil's advocate here, should we generalize this and try using e.g. matplotlib figures instead?
<matplotlib fig='@mander.my_calibration_fig'>
Ideally, I'd like frequently used plots to be built into Mandr, so that I don't need to write CalibrationDisplay(...) myself. How do you think we could achieve that with Mandr?
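A toy sketch of the flow discussed above, with a plain dict standing in for the mander and naive string substitution standing in for the template engine. Every name here is hypothetical (`mander`, the `@mander.*` reference syntax, the `<matplotlib fig=...>` tag); this only illustrates the store-then-reference idea, not any real Mandr API:

```python
# Hypothetical: the mander as a plain store of named artifacts.
# In practice this would hold a real matplotlib Figure or sklearn display.
mander = {"my_calibration_fig": "<Figure: calibration curve>"}

template = "<matplotlib fig='@mander.my_calibration_fig'/>"


def render(template: str, store: dict) -> str:
    # Naive substitution: replace each '@mander.<key>' reference
    # with the stored artifact's representation.
    out = template
    for key, value in store.items():
        out = out.replace(f"@mander.{key}", str(value))
    return out


print(render(template, mander))
```

The design question then becomes whether the template layer resolves sklearn displays specially (`<sklearn-display>`) or only generic matplotlib figures, as debated above.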
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_predict
brrrrr :)
I see that you don't use it, so this is fine :)
Haha, good catch, I need to remove it.
A good follow-up on this example would be to add confidence intervals on the ROC curve, and error bars on the metrics. What would be the best way to achieve uncertainty quantification here?
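One common answer to the uncertainty question above is a bootstrap over the cross-validated predictions: resample (label, prediction) pairs with replacement, recompute the metric each time, and take percentiles. A minimal stdlib-only sketch with a toy metric (accuracy stands in for AUC, and the data is made up):

```python
import random

random.seed(0)

# Toy cross-validated labels and predictions.
y_true = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0] * 10
y_pred = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0] * 10


def accuracy(yt, yp):
    return sum(t == p for t, p in zip(yt, yp)) / len(yt)


n = len(y_true)
scores = []
for _ in range(1000):
    # Resample pairs with replacement and score the resample.
    idx = [random.randrange(n) for _ in range(n)]
    scores.append(accuracy([y_true[i] for i in idx],
                           [y_pred[i] for i in idx]))

scores.sort()
# Rough 95% interval from the 2.5th and 97.5th percentiles.
lo, hi = scores[int(0.025 * len(scores))], scores[int(0.975 * len(scores))]
print(f"accuracy = {accuracy(y_true, y_pred):.2f}, 95% CI ~ [{lo:.2f}, {hi:.2f}]")
```

The same loop works for ROC AUC or the cost gain; for per-threshold bands on the ROC curve itself, one would interpolate each bootstrapped curve onto a common FPR grid before taking percentiles.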
quick pass, I'm still reading :)
tnr = tn / (tn + fp)
return 1 - tnr
Same, but more straightforward:
return fp / (fp + tn)
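A quick sanity check that the suggested one-liner matches the original two-step version, on a few arbitrary confusion-matrix counts:

```python
def fpr_via_tnr(tn, fp):
    # Original version: compute the true negative rate, then complement it.
    tnr = tn / (tn + fp)
    return 1 - tnr


def fpr_direct(tn, fp):
    # Suggested version: the false positive rate directly.
    return fp / (fp + tn)


for tn, fp in [(90, 10), (1, 99), (50, 50)]:
    assert abs(fpr_via_tnr(tn, fp) - fpr_direct(tn, fp)) < 1e-12
print("both formulations agree")
```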
#
# The goal of this example, based on [an existing scikit-learn example](https://scikit-learn.org/stable/auto_examples/model_selection/plot_cost_sensitive_learning.html), is to showcase various baseline models and evaluation metrics for binary classification. The use case we are dealing with is predicting the acceptance of candidates applying for a loan. The application can either be marked as good or bad, hence the binary nature of the task.
#
# We start with loading the dataset from openML. Note that in a real-life setting, we would probably have to join and aggregate multiple tables from multiple sources before landing a neat dataframe ready to use for ML. We would also need to worry about data quality, feature selection, missing values, etc. If we wanted to deploy this model into some production system, the availability of our features at prediction time would also have to be checked.
Suggested change:
# We start with loading the dataset from OpenML. Note that in a real-life setting, we would probably have to join and aggregate multiple tables from multiple sources before landing a neat dataframe ready to use for ML. We would also need to worry about data quality, feature selection, missing values, etc. If we wanted to deploy this model into some production system, the availability of our features at prediction time would also have to be checked.
# We then add our utility function "cost gain". Unlike the previous metrics, which are generic to the binary classification setting, this utility function is specific to the problem at hand and must be carefully considered.
#
# Here, we set this utility function using a cost matrix, which indicates how much error costs, and how much correct predictions yield. This is where a strong intuition of the use-case is needed. We considered the coefficients to be fixed, but note that we could also pass variables from our dataframe (e.g. a false positive could be proportional to the amount of the loan).
Suggested change:
# Here, we set this utility function using a cost matrix, which indicates how much errors cost, and how much correct predictions yield. This is where a strong intuition of the use-case is needed. We considered the coefficients to be fixed, but note that we could also pass variables from our dataframe (e.g. a false positive could be proportional to the amount of the loan).
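The cost-matrix idea can be made concrete with a small sketch, using the coefficients the example states later (correct predictions yield 0, false negatives -5, false positives -1; the function name and toy data are made up for illustration):

```python
def cost_gain(y_true, y_pred, gain_tp=0, gain_tn=0, gain_fn=-5, gain_fp=-1):
    # Sum the gain of each prediction according to the cost matrix.
    total = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            total += gain_tp
        elif t == 0 and p == 0:
            total += gain_tn
        elif t == 1 and p == 0:
            total += gain_fn  # missed positive: the costly error
        else:
            total += gain_fp  # false alarm: the cheap error
    return total


y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(cost_gain(y_true, y_pred))  # one FN (-5) and one FP (-1): -6
```

One way to see the threshold intuition mentioned in the example: with these coefficients, predicting negative has expected cost 5p while predicting positive has expected cost 1-p, so the break-even probability is p = 1/6, roughly 0.17, indeed closer to 0 than to 0.5.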
"is where a strong intuition of the use-case is needed"
It's more than intuition. It's knowledge of the domain, of the application, of business constraints, ...
#
# In this example, correct classifications yield 0, false negatives yield -5 and false positives yield -1. We can already understand that penalizing false negatives 5x more than false positives will put the classification threshold closer to 0 than to 0.5.
What's the unit? I think these numbers should be motivated with concrete examples.
Thanks for your comments, I'm closing this PR in favor of #24, because it deals with a more concrete use case.
Here is a first quick example that doesn't use Mandr for now. It extends one of the scikit-learn examples introducing the TunedThresholdClassifierCV.
It displays some plots a user might want to see for this task, although many are missing (especially considering the non-existent EDA part). It notably features plots with multiple models, which could be a good candidate for the aggregation feature of Mandr (something MLflow doesn't have).
Feedback on the methodology (or anything else really) is more than welcome.