Applying Statistical Concepts: Linear regression, classification, and resampling

Content

Description
Learning Outcomes
Assignments
Contacts
Delivery of the Learning Module
Schedule
Requirements
Resources
How to get help
Folder Structure

Description

This module introduces the skills required to design, implement, and test regression and classification models, and to validate them with resampling. It contrasts modelling for prediction with modelling for inference, and explores the trade-off between prediction accuracy and model interpretability as well as the bias-variance trade-off.

Learning Outcomes

By the end of the module, participants will be able to:

Implement and interpret the results from several supervised learning approaches for regression and classification
Use resampling methods to select a model
Determine the requirements for reproducible learning
Analyze the uncertainties associated with model results and the ethical consequences of acting on these results
Explain the different trade-offs and considerations for the statistical methods in their toolkit to both technical and non-technical audiences
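
As a rough illustration of the second outcome (using resampling to select a model), the sketch below compares a mean-only baseline against a simple linear regression using 5-fold cross-validation. It is not taken from the course materials: the data are synthetic, the helper names (`k_fold_indices`, `fit_mean`, `fit_line`, `cv_mse`) are hypothetical, and it uses only the Python standard library so that it runs without extra packages. The module's own notebooks may use different tools.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def cv_mse(fit, xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)  # fit on the training folds only
        errors = [(ys[i] - model(xs[i])) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

# Both models are scored on the same folds because they share the same seed.
scores = {
    "mean only": cv_mse(fit_mean, xs, ys),
    "linear": cv_mse(fit_line, xs, ys),
}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

The model with the lower cross-validated error is selected. Because each fold's error is computed on observations the model did not see during fitting, the comparison reflects expected performance on new data rather than fit to the training set.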

Assignments

Participants should review the Assignment Submission Guide for instructions on how to complete assignments in this module.

Assignment 1

Assignment 2

Assignment 3

Assignment Due Dates

Assessment	Content	Due Date
Assignment 1	Sessions 1, 2, 3	June 2
Assignment 2	Sessions 4, 5, 6	June 9
Assignment 3	Sessions 7, 8, 9	June 16

Contacts

Questions can be submitted to the #cohort-3-help channel on Slack

Technical Facilitator: Holly Xie (She/Her). Emails can be sent to xhonglei2007@gmail.com
Learning Support Staff: Ananya Jha (She/Her). Emails can be sent to ananya.jha@mail.utoronto.ca
Learning Support Staff: Amanda Ng (She/Her). Emails can be sent to waiyuamanda.ng@mail.utoronto.ca
Learning Support Staff: Vishnou Vinayagame (He/Him). Emails can be sent to vishnouvina@cs.toronto.edu

Delivery of the Learning Module

This module will include live learning sessions and optional, asynchronous work periods.

During live learning sessions, the Technical Facilitator will introduce and explain key concepts and demonstrate core skills. Learning is facilitated during this time. Before and after each live learning session, the instructional team will be available for questions related to the core concepts of the module.

Optional work periods are for seeking help from peers and the Learning Support team, and for working through the homework and assignments in the learning module with access to live help. Content is not facilitated during these periods; the time should be driven by participants. We encourage participants to come to these work periods with questions and problems to work through.

Participants are encouraged to engage actively during the learning module. The key to developing the core skills in each learning module is practice. The more participants code along with the instructional team and apply the skills in each module, the more likely it is that these skills will solidify.

The Technical Facilitator will introduce the concepts through collaborative live coding sessions using the Python notebooks found under ./01_slides, and will upload any live coding files to this repository under ./06_cohort_three/live_code for participants to revisit.

Schedule

Session	Date	Topic	ISLP Chapter
1	May 28	Key Concepts of Statistical Analysis	2
2	May 29	Simple linear regression	3
3	May 30	Multiple linear regression, interactions, and qualitative predictors	3
4	June 4	Classification vs Regression	4
5	June 5	Classification (Logistic Regression)	4
6	June 6	Classification (Generalized Linear Model)	4
7	June 11	Resampling Methods (Leave One Out Cross Validation)	5
8	June 12	Resampling Methods (K-fold Cross Validation)	5
9	June 13	Resampling Methods (Bootstrap)	5

Requirements

Participants are expected to have completed Shell, Git, and Python learning modules.
Participants are encouraged to ask questions, and collaborate with others to enhance learning.
Participants must have a computer and an internet connection to participate in online activities.
Participants must not use generative AI such as ChatGPT to generate code in order to complete assignments. It should be used as a supportive tool to seek out answers to questions participants may have.
We expect participants to have completed the steps in the onboarding repo.
We encourage participants to default to having their camera on at all times, and turning the camera off only as needed. This will greatly enhance the learning experience for all participants and provides real-time feedback for the instructional team.

Resources

Feel free to use the following as resources:

Documents

Introduction to Statistical Learning with Python Documentation (ISLP)

Videos

Introduction to Statistical Learning with Python Video Playlist

Simple Linear Regression

Multiple linear regression, interactions, qualitative predictors

Multiple Regression, Clearly Explained!!!

Classification (logistic regression, generative models)

Resampling methods (CV, bootstrap) and Linear model selection and regularization

Alternative Textbook: Data Science: A First Introduction (Chapters 5-10)

How to get help

Folder Structure

.
├── 01_slides
├── 02_assignments
├── 03_exercises
├── 04_homework
├── 05_instructors
├── 06_additional_resources
├── LICENSE
├── README.md
├── requirements.txt
└── steps_to_ask_for_help.png

slides: Module slides as PDF files.
exercises: Work to be done alongside learning sessions.
homework: Homework to practice concepts covered in learning modules.
assignments: Assignments.
additional resources: Extra material not covered by the module.
instructors: This folder provides guidance for Technical Facilitators and the Learning Support team on teaching methodologies and content delivery.
README: This file!
.gitignore: Files to exclude from this folder, specified by the Technical Facilitator
