This code, gbdt-forecast, is a method for energy and weather forecasting using gradient boosting decision trees. It treats forecasting as a tabular problem, without building spatio-temporal structure into the modelling prior. Instead, spatio-temporal information can be included as (lagged) features in the tabular data. The code integrates the four most popular gradient boosting implementations:
4) scikit-learn
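To illustrate the tabular framing, here is a minimal, dependency-free sketch of turning a time series into rows with lagged features. The function name `add_lagged_features`, the column name `wind_speed`, and the sample values are hypothetical and not taken from this repository; the preprocessing scripts in this repo may build their features differently.

```python
def add_lagged_features(rows, column, lags):
    """Return copies of `rows` with extra `<column>_lag<k>` entries.

    `rows` is a list of dicts ordered in time and `lags` holds positive
    integers. Rows without enough history for a lag get None.
    """
    out = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        for k in lags:
            key = f"{column}_lag{k}"
            new_row[key] = rows[i - k][column] if i - k >= 0 else None
        out.append(new_row)
    return out


# Hypothetical hourly observations; the values are made up for illustration.
series = [{"wind_speed": v} for v in [3.0, 4.5, 5.0, 4.0]]
table = add_lagged_features(series, "wind_speed", lags=[1, 2])
for row in table:
    print(row)
```

Each output row is an ordinary tabular sample, so any of the gradient boosting implementations can consume it without a dedicated time-series model.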
The solar power forecasting benchmark is performed using the parameters in params/params_gefcom2014_solar_competition.json.
Clone the repository and install the necessary libraries through conda.
git clone git@github.com:greenlytics/gbdt-forecast.git
conda env create -f environment.yml
conda activate gbdt-forecast
Download the GEFCom2014 data and place the file 1-s2.0-S0169207016000133-mmc1.zip in the data folder.
To replicate the results above, run the following scripts:
./run_gefcom2014_load.sh
./run_gefcom2014_solar.sh
./run_gefcom2014_wind.sh
The results will be saved to the results folder and plots will be saved to the plots folder.
Download the GEFCom2014 data and place the file 1-s2.0-S0169207016000133-mmc1.zip in the data folder.
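Before running the extraction step, it can help to confirm that the archive is where the scripts expect it. The following is a small sketch of such a check; the helper name `find_gefcom_zip` is hypothetical, and it assumes the check is run from the repository root so that the relative data folder resolves correctly.

```python
from pathlib import Path

# File name and folder as given in this README.
GEFCOM_ZIP = "1-s2.0-S0169207016000133-mmc1.zip"


def find_gefcom_zip(data_dir="data"):
    """Return the path to the GEFCom2014 archive, or None if it is missing."""
    candidate = Path(data_dir) / GEFCOM_ZIP
    return candidate if candidate.is_file() else None


path = find_gefcom_zip()
if path is None:
    print(f"Missing {GEFCOM_ZIP}; place it in the data folder first.")
else:
    print(f"Found {path}")
```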
Extract the data by running:
python preprocess/extract_gefcom2014_wind_solar_load.py
The raw data files will be saved to:
Wind track data saved to: ./data/raw/gefcom2014-wind-raw.csv
Solar track data saved to: ./data/raw/gefcom2014-solar-raw.csv
Load track data saved to: ./data/raw/gefcom2014-load-raw.csv
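A quick way to sanity-check the extracted files is to read their header and count their rows. The sketch below uses only the standard library; the helper name `summarize_csv` is hypothetical, and it assumes that the first line of each file is a single header row, which may not match the actual layout of the raw files.

```python
import csv
from pathlib import Path

# Output paths as printed by the extraction script.
RAW_FILES = {
    "wind": "./data/raw/gefcom2014-wind-raw.csv",
    "solar": "./data/raw/gefcom2014-solar-raw.csv",
    "load": "./data/raw/gefcom2014-load-raw.csv",
}


def summarize_csv(path):
    """Return (header, number_of_data_rows) for a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # Assumption: one header line
        n_rows = sum(1 for _ in reader)
    return header, n_rows


for track, filename in RAW_FILES.items():
    if Path(filename).is_file():
        header, n_rows = summarize_csv(filename)
        print(f"{track}: {n_rows} rows, first columns: {header[:5]}")
    else:
        print(f"{track}: {filename} not found")
```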
The next step is to preprocess the data with feature extraction relevant to the forecasting task at hand. This repo includes examples of feature extraction for the different GEFCom2014 tracks:
preprocess/preprocess_gefcom2014_wind_example.py
preprocess/preprocess_gefcom2014_solar_example.py
preprocess/preprocess_gefcom2014_load_example.py
These preprocessing scripts should be updated with the relevant feature engineering and take their input from the parameter files. To run the preprocessing script for the wind track (the other tracks are analogous):
python preprocess/preprocess_gefcom2014_wind_example.py params/params_competition_gefcom2014_wind_example.json
The processed data file will be saved to:
Wind track preprocessed data saved to: ./data/gefcom2014/preprocessed/gefcom2014-wind-preprocessed.csv
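Since both preprocessing and training are driven by a JSON parameter file, it can be useful to inspect one before editing it. The sketch below lists the top-level entries of such a file; the helper names `load_params` and `describe` are hypothetical, and the keys in the inline example are placeholders, not the keys used by this repository.

```python
import json


def load_params(path):
    """Read a JSON parameter file and return its contents."""
    with open(path) as f:
        return json.load(f)


def describe(params):
    """Return 'key: type' lines for the top-level entries of a dict."""
    return [f"{key}: {type(value).__name__}" for key, value in params.items()]


# Hypothetical stand-in for a parameter file; the real keys live in params/*.json.
example = json.loads('{"some_path": "./data", "some_list": [1, 2]}')
print("\n".join(describe(example)))
```

To inspect a real file, pass its path to `load_params` and hand the result to `describe`.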
To train models, predict, and save the results, run the following script:
python ./main.py params/params_competition_gefcom2014_wind_example.json
The results will be saved to the results folder. Train models for other tracks by changing the parameters file.
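Switching tracks amounts to passing a different parameter file to main.py. The sketch below builds the command for each track without running it. It assumes that the solar and load parameter files follow the same naming pattern as the wind example shown above, which should be checked against the contents of the params folder; the helper name `training_command` is hypothetical.

```python
import shlex

TRACKS = ["wind", "solar", "load"]


def training_command(track):
    """Build the main.py invocation for one GEFCom2014 track."""
    # Assumption: other tracks mirror the wind parameter file's name.
    params = f"params/params_competition_gefcom2014_{track}_example.json"
    return ["python", "./main.py", params]


for track in TRACKS:
    print(shlex.join(training_command(track)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute the training for that track.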
Lastly, generate plots by running the following:
python ./plots/generate_plots_wind.py
Plots will be saved to the plots folder.
The authors of this code would like to thank the Swedish Energy Agency for their financial support for this research work under the grant VindEL project number: 47070-1.