
Model in Kim (2022), using the R code there for MCMC, for the analysis and prediction of air pollution data in Lombardy (AGRIMONIA data)


CamiloSinningUN/SBART

 
 


SBART Implementation on AgrImOnIA dataset

Group project for the Bayesian Statistics course at POLIMI (Politecnico di Milano), academic year 2023/2024.


Main goal

Apply the model in Kim (2022), using the R code there for MCMC, for the analysis and prediction of air pollution data in Lombardy (AGRIMONIA data)

Get started

To run the project you need R installed on your machine. You can download it from CRAN (https://cran.r-project.org/). You also need the renv package, which manages the project's R dependencies.

Install dependencies

To install the dependencies you need to run the following command in the R console:

 renv::restore()
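If renv itself is not installed yet, the restore step can be wrapped as in the sketch below. The function name setup_project is hypothetical and not part of the repository; the sketch assumes it is called from the project root, where the renv lockfile lives.

```r
# Hypothetical helper (not part of the repository): make sure renv is
# available, then restore the package versions recorded in the lockfile.
setup_project <- function(project = ".") {
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }
  # prompt = FALSE skips the interactive confirmation before installing.
  renv::restore(project = project, prompt = FALSE)
}

# Usage, from the project root:
# setup_project()
```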

Run the project

You can either run the console version by executing the main.R file or run the more user-friendly version.

Console version

 renv::run("main.R") 

Datasets

Remember to put the datasets AGC_Dataset_v_3_0_0.csv and Agrimonia_Dataset_v_3_0_0.csv in the folder data/AgrImOnIA/raw.
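A quick check before running can catch a misplaced file early. The following is a minimal sketch, not part of the repository: the helper name missing_datasets is hypothetical, and the default path assumes the working directory is the project root.

```r
# Hypothetical helper (not part of the repository): return the names of the
# expected raw files that are absent from the given folder.
missing_datasets <- function(raw_dir = file.path("data", "AgrImOnIA", "raw")) {
  expected <- c("AGC_Dataset_v_3_0_0.csv", "Agrimonia_Dataset_v_3_0_0.csv")
  expected[!file.exists(file.path(raw_dir, expected))]
}

missing <- missing_datasets()
if (length(missing) > 0) {
  message("Missing from data/AgrImOnIA/raw: ", paste(missing, collapse = ", "))
}
```

An empty result means both files are where the project expects them.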

Tune it up

Model parameters

You can tune the model by changing the parameters in the config.R file. The parameters are:

  • n_iterations: number of iterations for the MCMC algorithm.
  • n_trees: number of trees in the BART model.
  • warmup: number of warmup iterations for the MCMC algorithm (see Kim (2022) for more information).

Additionally, you can change other parameters like:

  • model_filename: name of the file where the results will be saved.
  • date_begin: starting date to cut the dataset.
  • date_end: ending date to cut the dataset.
  • response_variable: name of the response variable in the dataset.
  • covariates_of_interest: names of the covariates in the dataset.
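As an illustration, the settings above might look like the sketch below. The parameter names are the ones listed; everything else is an assumption: that config.R defines plain R variables, the date format, the output name, and the column names, which are placeholders to be checked against the dataset's own header.

```r
# Illustrative config.R values. The parameter names come from the lists above;
# every value is an assumption chosen only to show the expected shape.
n_iterations <- 2000   # total MCMC iterations
n_trees      <- 50     # trees in the BART model
warmup       <- 500    # initial iterations treated as warmup

model_filename <- "sbart_example"           # hypothetical output name
date_begin     <- as.Date("2019-01-01")     # assumed date format
date_end       <- as.Date("2019-12-31")
response_variable      <- "AQ_pm25"         # placeholder column name
covariates_of_interest <- c("WE_temp_2m", "WE_wind_speed_10m_mean")  # placeholders

# Basic sanity checks on the values above.
stopifnot(
  n_iterations > warmup, warmup >= 0, n_trees >= 1,
  date_begin <= date_end,
  !(response_variable %in% covariates_of_interest)
)
```

The checks at the end are optional, but they fail fast on settings that cannot work, such as a warmup longer than the whole run.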

Results

When the model is run, the results are saved in the output folder. The results consist of:

  • covariates_selection_chain: covariates selection through the iterations.
  • spatial_theta_chain: spatial theta through the iterations.
  • sigma2_chain: sigma2 through the iterations.
  • trees_chain: trees through the iterations.
  • w_selection_chain: Weight matrix selection through the iterations.
  • dt_history: Tree structures history.
  • y_predictions: Final prediction history.
  • y_predictions_history: Predictions history.

The results are saved in a .RData file; its name can be changed through model_filename in the config.R file.
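A saved file can be inspected afterwards along the lines of the sketch below. It runs on a synthetic chain, not on model output, and rests on assumptions the source does not confirm: that sigma2_chain is a numeric vector with one entry per iteration and that the warmup draws are included in the saved chain. The helper names load_results and drop_warmup are hypothetical.

```r
# Load a saved .RData file into its own environment so the stored objects
# do not overwrite anything in the global workspace.
load_results <- function(path) {
  env <- new.env()
  load(path, envir = env)
  as.list(env)
}

# Drop the first `warmup` draws from a chain stored as a vector.
drop_warmup <- function(chain, warmup) {
  stopifnot(warmup >= 0, warmup < length(chain))
  chain[seq.int(warmup + 1, length(chain))]
}

# Demonstration on a synthetic chain written to a temporary file; with real
# output, point `path` at the file in the output folder instead.
path <- tempfile(fileext = ".RData")
sigma2_chain <- c(5, 4, 3, 1, 1, 1)   # made-up values, not model output
save(sigma2_chain, file = path)

results <- load_results(path)
kept <- drop_warmup(results$sigma2_chain, warmup = 3)
posterior_mean <- mean(kept)
```

Loading into a separate environment keeps several runs side by side, since each call returns its own list of chains.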
