Download LawSchool dataset directly from SEAPHE #359
Labels: datasets, easy, good first issue
http://www.seaphe.org/databases.php
This way we can remove the dependency on tempeh. We can essentially copy this file, preserving its copyright notice: https://github.com/microsoft/tempeh/blob/main/tempeh/datasets/seaphe_datasets.py
See also meps_datasets.py for another example of downloading/unzipping.
Relevant files:
tempeh_datasets.py
law_school_gpa_dataset.py
See demo_grid_search_reduction_regression_sklearn.ipynb for example usage.
Behavior should be essentially the same as tempeh's, with one exception: rows containing NAs should be kept, since dropping them can be handled later.
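For whoever picks this up, the download/unzip step could look roughly like the sketch below. It is not copied from tempeh or from meps_datasets.py; `fetch_and_extract`, its parameters, and the injectable `opener` are hypothetical names for illustration, and the actual archive URL on the SEAPHE site is deliberately left as an argument rather than guessed here.

```python
import io
import zipfile
from pathlib import Path
from urllib.request import urlopen


def fetch_and_extract(url, dest_dir, opener=urlopen):
    """Download a zip archive from `url` and extract it into `dest_dir`.

    Hypothetical helper: `url` must be supplied by the caller, and `opener`
    defaults to urllib's urlopen but can be swapped out for offline testing.
    Returns the sorted list of extracted file paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Read the whole archive into memory; acceptable for a small dataset.
    with opener(url) as response:
        payload = response.read()

    # Unpack everything and collect the names of the extracted files.
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(dest)
        names = archive.namelist()

    # Skip directory entries, which zip archives list with a trailing slash.
    return sorted(dest / name for name in names if not name.endswith("/"))
```

The loader built on top of this would then read the extracted file into a DataFrame and return it without dropping any rows, so that NA handling stays with the caller as described above.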