Run these files in the following order:
- gather_data.py
- KNN_Modeling.py
- Project-6103_lm.py
- Project-6103-logit.py

What each script does:
- gather_data.py downloads and integrates 26 .dat files hosted on a NOAA GLERL site. This script is prone to connection and timeout errors because of restrictions on that server. To work around these constraints, we also host a copy of the full dataset in a GCP bucket. If gather_data.py fails, the scripts that follow it download the clean dataset directly from that bucket.
- KNN_Modeling.py predicts which lake a set of characteristics belongs to. The model is more than twice as accurate as a naive baseline. It also displays a GUI that lets users enter a k-value and shows the error at that value.
- Project-6103_lm.py predicts ice concentration based on surface temperature.
- Project-6103-logit.py predicts whether ice concentration will exceed a given threshold based on surface temperature and physical characteristics. It also displays a GUI that lets users enter a k-value and shows the model accuracy at that value.
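
The download fallback described for gather_data.py can be sketched as below. This is only an illustration of the pattern, not the code in this repository: the function names `fetch_url` and `load_with_fallback` are hypothetical, and the README does not state the real source or bucket URLs, so none are given here.

```python
import urllib.request
from typing import Callable, Sequence


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL and return its raw bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def load_with_fallback(loaders: Sequence[Callable[[], bytes]]) -> bytes:
    """Try each loader in order and return the first result that succeeds.

    Connection and timeout failures (urllib's URLError, TimeoutError and
    ConnectionError are all subclasses of OSError) move on to the next loader.
    """
    errors = []
    for loader in loaders:
        try:
            return loader()
        except OSError as exc:
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} data sources failed: {errors}")
```

A caller would pass the primary source first and the bucket copy second, for example `load_with_fallback([lambda: fetch_url(primary_url), lambda: fetch_url(bucket_url)])`, where `primary_url` and `bucket_url` are placeholders for addresses you supply.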