Data science project carried out with my friend Lise Aujoulat (https://github.com/Lise-AJT) during our data scientist training. The aim was to develop machine learning and deep learning models to predict rainy days in Australia. This repository contains, in particular, the project code as Jupyter notebooks and the project report in PDF format (WeatherAUS_Rapport.pdf).
We worked from the "Rain in Australia" dataset available on Kaggle: https://www.kaggle.com/jsphyg/weather-dataset-rattle-package.
The project followed these steps:
- Become familiar with the subject and understand the contents of the dataset;
- Exploratory data analysis, including data visualisation;
- Search for the most appropriate machine learning model;
- Data preprocessing and Random Forest modelling;
- Attempt to improve the performance of the model with revised preprocessing;
- Development of a deep learning model.
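
The preprocessing and Random Forest step listed above can be sketched roughly as follows. This is a minimal illustration, not the code from WeatherAUS_ML.ipynb: it assumes pandas and scikit-learn are installed, builds a small synthetic table instead of loading the Kaggle file, and borrows a few column names (Humidity3pm, Pressure3pm, RainToday, RainTomorrow) from the dataset. The helper names `make_toy_weather` and `fit_rain_forest`, the toy labelling rule, and the median imputation are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def make_toy_weather(n_rows=400, seed=0):
    """Build a synthetic stand-in for the weather table (hypothetical data)."""
    rng = np.random.default_rng(seed)
    humidity = rng.uniform(10, 100, n_rows)
    pressure = rng.normal(1015, 7, n_rows)
    rain_today = rng.choice(["Yes", "No"], n_rows, p=[0.25, 0.75])
    # Toy rule (an assumption): humid, low-pressure days are followed by rain.
    rain_tomorrow = np.where((humidity > 65) & (pressure < 1016), "Yes", "No")
    df = pd.DataFrame({
        "Humidity3pm": humidity,
        "Pressure3pm": pressure,
        "RainToday": rain_today,
        "RainTomorrow": rain_tomorrow,
    })
    # Blank out a few values to mimic the gaps found in real observations.
    missing_rows = rng.choice(n_rows, 20, replace=False)
    df.loc[missing_rows, "Humidity3pm"] = np.nan
    return df


def fit_rain_forest(df):
    """Encode Yes/No columns, impute gaps, and fit a Random Forest."""
    X = df[["Humidity3pm", "Pressure3pm"]].copy()
    X["RainToday"] = (df["RainToday"] == "Yes").astype(int)
    y = (df["RainTomorrow"] == "Yes").astype(int)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    model = make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestClassifier(n_estimators=100, random_state=0),
    )
    model.fit(X_train, y_train)
    # Accuracy on the held-out quarter of the toy data.
    return model, model.score(X_test, y_test)


toy = make_toy_weather()
model, accuracy = fit_rain_forest(toy)
print(f"Toy test accuracy: {accuracy:.2f}")
```

On the real data, rainy days are likely a minority class, so accuracy alone can be misleading; metrics such as recall or F1 on the rainy class may be more informative.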
Files in this repository:
- Exploratory data analysis and data visualisation: WeatherAUS_data_description.ipynb;
- Machine learning model (Random Forest): WeatherAUS_ML.ipynb;
- Deep learning model: WeatherAUS_DL.ipynb;
- Streamlit app code: MeteoAUS.py.
Note that the code used to select the best machine learning model and the code for the time series analysis are not available here. The approach taken for both is explained in the PDF project report.
Enjoy your reading! 🐊