Partial dependence plots #721
logger.warning('The length of `target_names` does not match the number of predicted outputs. '
               'Ensure that the lengths match, otherwise a call to the `plot_pd` method might '
               'raise an error or produce undesired labeling.')
I don't think it is possible to check whether the number of targets that we computed the PD/ICE for matches the length of target_names without a dummy call. Although the user should be aware of this, I decided to include a warning message in case it happens. Shall we go even further and raise an error?
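For reference, the trade-off in the comment above (warn versus raise) can be sketched as a small check that runs once the number of predicted outputs is known. This is only an illustration: the helper name `target_names_match` and its `n_outputs` argument are hypothetical and not taken from the PR.

```python
import logging
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def target_names_match(target_names: Optional[Sequence[str]], n_outputs: int) -> bool:
    """Hypothetical helper: warn, rather than raise, on a length mismatch.

    `n_outputs` is assumed to be known only after the PD/ICE values have
    been computed, which is why the check cannot run any earlier.
    """
    if target_names is None or len(target_names) == n_outputs:
        return True
    logger.warning(
        "The length of `target_names` (%d) does not match the number of "
        "predicted outputs (%d).", len(target_names), n_outputs
    )
    return False


# One name for two outputs logs a warning and returns False.
ok = target_names_match(["income"], n_outputs=2)
```

Returning a flag keeps the decision open: a caller that prefers the stricter behaviour could raise a `ValueError` when the result is `False`.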
Following an offline discussion, it was decided to simplify the implementation and user interface by splitting the implementation into two distinct classes: PartialDependence and TreePartialDependence.
This allows us to remove all of the slightly confusing arguments discussed previously. Also, @RobertSamoilescu checked that the recursive algorithm returns slightly different values than performing the brute-force PD on the same estimators, which further justifies splitting the implementation into two public classes (similar to …)
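To illustrate how a recursive and a brute-force computation can disagree, here is a toy sketch on a hand-built decision tree. It is not the code from this PR or from sklearn: the `Node` class and both functions are hypothetical, and the weighting by training-sample counts is an assumption about how a recursion-style traversal typically works. The gap shown here comes from correlated features and may not be the cause of the difference observed in the PR.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    n_samples: int                  # training samples that reached this node
    value: float = 0.0              # prediction, used only at leaves
    feature: Optional[int] = None   # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node: Node, x: Sequence[float]) -> float:
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def pd_brute(tree: Node, X: Sequence[Sequence[float]], feature: int, value: float) -> float:
    """Average the prediction over X with `feature` forced to `value`."""
    total = 0.0
    for row in X:
        modified = list(row)
        modified[feature] = value
        total += predict(tree, modified)
    return total / len(X)


def pd_recursion(node: Node, feature: int, value: float) -> float:
    """Weighted tree traversal: follow the target feature's branch, and
    average the other splits by the fraction of training samples per child."""
    if node.feature is None:
        return node.value
    if node.feature == feature:
        child = node.left if value <= node.threshold else node.right
        return pd_recursion(child, feature, value)
    w_left = node.left.n_samples / node.n_samples
    return (w_left * pd_recursion(node.left, feature, value)
            + (1.0 - w_left) * pd_recursion(node.right, feature, value))


# Correlated toy data: x1 is mostly 0 when x0 is 0, and always 1 when x0 is 1.
X = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1)]
inner = Node(3, feature=1, threshold=0.5, left=Node(2, value=0.0), right=Node(1, value=10.0))
tree = Node(6, feature=0, threshold=0.5, left=inner, right=Node(3, value=5.0))

brute_at_0 = pd_brute(tree, X, feature=0, value=0.0)    # uses all six rows of x1
recursive_at_0 = pd_recursion(tree, feature=0, value=0.0)  # uses only the node's counts
```

In this sketch the brute-force value averages over the marginal distribution of `x1`, while the traversal weights by the samples that actually reached the inner node, so the two numbers differ at `x0 = 0` and coincide at `x0 = 1`.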
Nice work, looks great!
* Partial API & docs
* included sanity checks, grid construction for both numerical and categorical features.
* Finalized explain and build_explanation. Included one feature numerical plots.
* included kwargs for all kinds of plots
* Implemented plotting functionality for every feature types and combinations.
* Fixed share y axis
* Minor plot fixes.
* Minor refactoring. Docstring for PartialDependence class
* Docstring for plotting function. Minor grammar corrections.
* Included test for sanity check. To be cleaned and optimized.
* Partial cleaning of the tests.
* Finalized params sanity checks tests.
* Included test for number of features.
* Test explanation shapes for numerical features.
* Included black-box wrapper for classification and regression & corresponding tests
* Included a PD simple example.
* Included Adult example. Minor one way numerical plot fix for labels.
* Solved flake8 warnings
* Solved mypy errors.
* Improved docs for plots.
* Some comments and TODOs.
* included list of TODOs
* Consider by default all single features to compute the pd for.
* Included custom grid_points.
* Updated categorical graph from barplot to lineplot.
* Included two target outputs for binary classification when response_method falls to predict_proba.
* Solved flake8 warnings.
* Solved mypy errors.
* isort and mypy
* Fixed fitted check for older versions of sklearn.
* Cleaned the docs for partial_dependence.py
* Tuple in numpy/list inconsistency
* Reintroducing seaborn because of the heatmap
* Introduced contour plots levels args.
* minor changes to the docs
* Started pdp bike dataset example
* Included argument to display custom number of ice curves.
* Unfinished example PD bike dataset
* Finalized partial dependence bike example.
* Draft method description. Some variable renaming.
* Refactoring - explanation object.
* Updated docs entries.
* Readme indentation
* setup.py minor correction
* Minor documentation corrections
* Removed initial pd example notebook
* Changed links to Introduction section
* Revert "Changed links to Introduction section" This reverts commit 67dce6f.
* Fix problem with equation in method docs
* Included test for pd computation against the sklearn implementation for numerical and adapted categorical features.
* Addressed comments - part 1
* Addressed comments - part 2
* Removed seed.
* Included progress bar while explaining features
* A few changes to the pdp example notebook.
* Removed ipynb for examples.
* Addressed notebook comments.
* Addressed comments regarding the plots.
* Removed meta from data field.
* Literal for some arguments
* Allow 2 way PDP for kind both. Improved method description. Cleaned example.
* Minor cleaning.
* Updated docs in the method description.
* Corrections to the example text.
* Minor punctuation correction in the method description.
* Removed features_list from method notebook
* Update link: latest with stable
* Changed kernel env back to Python.
* Minor docstrings correction.
* Fixed spelling error in example notebook.
* Replaced centered with center. Improved docs for center flag.
* Removed seaborn from dependencies. Implemented on matplotlib heatmap.
* Included clarification for the response_method.
* Replaced sns heatmap plot with the matplotlib heatmap in method description.
* Removed wget.
* Fixed links. Moved model sanity check to constructor. Removed pandas installation. Updated tests.
* Integrate sklearn pd functions. In progress...
* Removed sklearn private methods
* Refactored blackbox case.
* Updated method page.
* Minor clarifications.
* Solved mypy issues.
* Addressed minor comments.
* Minor correction for error messages and comments conventions.
* Fixed test from previous commit.
* isort on utils/visualizations.py
* Fixed deciles computation.
* Fixed for binary classification.
* Add process pd and ice for plotting.
* Included warning for the length.
* Removed auto options
* Removed auto. Fixed tests and docs.
* Minor corrections.
* Revert to deciles on full dataset
* Changed a few error messages. Included one test for unknown kind.
* Improved error messages and included a warning for target_names
* Removed blank lines.
* Fixed tests.
* Split initial implementation into PartialDependence and TreePartialDependence.
* Fixed tests.
* Isort and removed unnecessary fixture from conftest.py
* Updated PD example.
* Updated method description.
* Removed unnecessary classes and imports.
* Minor doc updates. Minor sanity checks refactoring.
* Improved docs for TreePartialDependence.
* Included ABC for the base class and removed for the derived one.
* Fixed deciles display
* Improved plots by adding pd limits
* Fixed min-max pd plots for share_y.
* Improved plots: zoom in for num and cat plots, add decile ticks for num-cat, sharey num-cats, updated images.
* Solved minor display bug in num-cat for deciles.
* Updated linear regression plots.
* Minor docstrings correction: pairs of features -> tuples of features.
* Removed unnecessary argument in _compute_pd_limits and updated return docs for helper plotting functions.

Co-authored-by: Ashley Scillitoe <ashley.scillitoe@seldon.io>
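Several of the commits above touch ICE curves and the center flag. As a rough illustration of how the two quantities relate, the sketch below computes ICE curves for a black-box predictor and averages them into a PD curve. The function names are hypothetical, and the centering convention (subtracting each curve's value at the first grid point) is an assumption based on the common centered-ICE definition, not something confirmed by this PR.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], float]


def ice_curves(predict: Predictor, X: Sequence[Sequence[float]], feature: int,
               grid: Sequence[float], center: bool = False) -> List[List[float]]:
    """One curve per row of X: the prediction as `feature` sweeps over `grid`."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(predict(modified))
        if center:
            # Assumed convention: anchor every curve at its first grid value.
            anchor = curve[0]
            curve = [y - anchor for y in curve]
        curves.append(curve)
    return curves


def pd_from_ice(curves: Sequence[Sequence[float]]) -> List[float]:
    """The PD curve is the point-wise mean of the ICE curves."""
    n = len(curves)
    return [sum(column) / n for column in zip(*curves)]


# Toy black-box model with an interaction between the two features.
def model(x: Sequence[float]) -> float:
    return x[0] * x[1] + x[1]


X = [[0.0, 1.0], [0.0, 3.0]]
grid = [0.0, 1.0, 2.0]

ice = ice_curves(model, X, feature=0, grid=grid)
pd_values = pd_from_ice(ice)
centered_pd = pd_from_ice(ice_curves(model, X, feature=0, grid=grid, center=True))
```

Because of the interaction term, the two ICE curves have different slopes, which the averaged PD curve alone would hide.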
Implementation of partial dependence (PD) and individual conditional expectation (ICE), leveraging the sklearn implementation.
Some functionalities that it includes:
… sklearn estimators)
TODOs: