Vignette on bootstrapping and applications in machine learning.

PSTAT197-F23/vignette-bootstrapping

vignette-bootstrapping

Vignette on implementing bootstrapping and bagging and demonstrating their effects on data modeled with a random forest; created as a class project for PSTAT197A in Fall 2023.

Contributors: Sarah Liang, Sharanya Sharma, Dannah Golich, Jason Siu

Vignette abstract: This vignette covers the basics of bootstrapping, its applications to machine learning, and related resampling methods. A data set containing grade information for UC Santa Barbara students from 2009 to 2023 is used to demonstrate the impact of bootstrapping on estimating sampling distributions and on GPA prediction.
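As a rough illustration of the idea named in the abstract (using the bootstrap to approximate a sampling distribution), here is a minimal Python sketch. It is not taken from the vignette: the GPA-like values are synthetic, and the helper names `bootstrap_means` and `percentile_interval` are hypothetical, chosen only for this example.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Return the means of n_resamples bootstrap resamples of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n observations from the sample with replacement.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means


def percentile_interval(values, level=0.95):
    """Simple percentile interval read off the sorted bootstrap statistics."""
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0
    lo = ordered[int(alpha * (len(ordered) - 1))]
    hi = ordered[int((1.0 - alpha) * (len(ordered) - 1))]
    return lo, hi


# Synthetic GPA-like values clipped to [0, 4]; made up for illustration only.
data_rng = random.Random(42)
gpas = [min(4.0, max(0.0, data_rng.gauss(3.1, 0.5))) for _ in range(200)]

boot = bootstrap_means(gpas)
low, high = percentile_interval(boot)
print(f"sample mean: {statistics.fmean(gpas):.3f}")
print(f"bootstrap standard error: {statistics.stdev(boot):.3f}")
print(f"95% percentile interval: ({low:.3f}, {high:.3f})")
```

The spread of the resampled means stands in for the sampling distribution of the mean. Bagging applies the same resampling step to model fitting, training one model per resample and averaging their predictions.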

Repository contents: The completed vignette, in qmd and html formats, is in the root directory. The data folder contains both the raw and preprocessed data used in the vignette. The scripts folder includes the preprocessing script, all code for the final vignette, and drafts completed by each contributor. Finally, the images folder contains the images used in the vignette.

References

Biswal, A. (2023, Aug 10). Bagging in Machine Learning: Step to Perform And Its Advantages. Retrieved from https://www.simplilearn.com/tutorials/machine-learning-tutorial/bagging-in-machine-learning

Mwiti, D. (2023, Sept 1). Random Forest Regression: When Does It Fail and Why?. Retrieved from https://neptune.ai/blog/random-forest-regression-when-does-it-fail-and-why

Random Forests. (n.d.). AFIT Data Science Lab R Programming Guide. Retrieved from https://afit-r.github.io/random_forests#basic

Hesterberg, Tim C. "What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum." The American Statistician, vol. 69, no. 4, 2015, pp. 371-86, https://doi.org/10.1080/00031305.2015.1089789.

What is Bootstrapping?. (n.d.). Retrieved from https://www.mastersindatascience.org/learning/machine-learning-algorithms/bootstrapping/

Yu, Guo. "Lecture 8: Cross-Validation & Bootstrap." PSTAT-131/231: Introduction to Statistical Machine Learning, Oct. 26, 2023, UC Santa Barbara.
