mine-cetinkaya-rundel/tidymodels-uscots-2021

Tidy up your models

USCOTS 2021
Monday, June 28th, 2021
2:30 pm – 3:45 pm ET


💻 Slides

🖊️ Case studies (these are also available on RStudio Cloud; they're provided here for participants who choose to use their local setup)


The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles. In this session, we'll introduce the tidymodels framework and discuss how it can be incorporated into an introductory data science curriculum. We will walk through case studies for building linear and logistic regression models with the goals of prediction and classification, performing cross-validation, and evaluating model performance. We will also compare the tidymodels approach to more traditional modeling frameworks in R as well as touch on how to do statistical inference within the tidymodels framework.

During the session, we will make use of polling questions and live coding. Participants will have the option to follow along with the exercises on RStudio Cloud.

The intended audience for the session is anyone who is interested in an introduction to tidymodels, either for their own use or for their teaching. The session will assume familiarity with R and the basics of the tidyverse (e.g., the pipe operator, dplyr, ggplot2) as well as with modeling (linear and logistic regression).

Agenda

Instructors

Mine Çetinkaya-Rundel (Duke University, RStudio) is Professor of the Practice at the Department of Statistical Science at Duke University and Data Scientist and Professional Educator at RStudio. Mine’s work focuses on innovation in statistics and data science pedagogy, with an emphasis on computing, reproducible research, student-centered learning, and open-source education as well as pedagogical approaches for enhancing retention of women and under-represented minorities in STEM. Mine works on integrating computation into the undergraduate statistics curriculum, using reproducible research methodologies and analysis of real and complex datasets. She also organizes ASA DataFest, an annual two-day competition in which teams of undergraduate students work to reveal insights into a rich and complex dataset. Mine has been working on the OpenIntro project since its founding, and as part of this project she co-authored four open-source introductory statistics textbooks. She is also the creator and maintainer of datasciencebox.org, and she teaches the popular Statistics with R MOOC on Coursera.

Debbie Yuster (Ramapo College) is an Assistant Professor of Data Science and Mathematics at Ramapo College of New Jersey. At Ramapo College, Debbie has developed new courses including Introduction to Data Science and Ethics for Data Science. Debbie enjoys participating in communities of practice around curricula and open-source tools, contributing by employing her eagle-eye editing skills to refine instructional materials, as well as by helping fellow newcomers through onboarding and troubleshooting. Debbie received a Ph.D. in Mathematics from Columbia University, and a B.A. in Mathematics (with Computer Science concentration) from Cornell University.