Medical School Admissions Dataset Curation via Web Scraping and Exploratory Data Analysis

The average acceptance rate across the ~170 medical schools in the U.S. is around 5.5%. Airfare for interviewing alone can exceed $500, on top of the hundreds of dollars it costs to submit primary and secondary applications to just a single school. Despite these expenses, applicants often need to apply to 20-30 schools to get an acceptance, and many cannot afford, both literally and figuratively, to be rejected and reapply the following year. How do you pick a list of schools that maximizes your chances of acceptance?

In this final project, medical school admission statistics are scraped from the web and compiled into a dataset. The dataset includes MCAT/GPA quantiles, in-state and out-of-state acceptance rates and bias, demographics, geography, funding and institution type, and residency match rates by specialty, among other fields. Exploratory data analysis is then performed to determine which schools my fiancée, given her background, should apply to in order to maximize her chances of acceptance this cycle.
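As a rough illustration of the kind of filtering such an analysis involves, here is a minimal Python sketch. It is not the code or data used in this project: the `School` fields, the `shortlist` function, the 10th-percentile cutoff rule, and the three placeholder rows are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class School:
    # Every field is a hypothetical stand-in for a column in the dataset.
    name: str
    state: str
    mcat_p10: float        # assumed 10th-percentile MCAT of matriculants
    gpa_p10: float         # assumed 10th-percentile GPA of matriculants
    in_state_rate: float   # acceptance rate for in-state applicants
    out_state_rate: float  # acceptance rate for out-of-state applicants


def shortlist(schools, mcat, gpa, home_state, top_n=25):
    """Keep schools where the applicant meets both 10th-percentile cutoffs,
    ranked by the acceptance rate that applies to her residency."""
    def rate(s):
        return s.in_state_rate if s.state == home_state else s.out_state_rate

    eligible = [s for s in schools if mcat >= s.mcat_p10 and gpa >= s.gpa_p10]
    return sorted(eligible, key=rate, reverse=True)[:top_n]


# Placeholder rows invented purely for illustration (not real admissions data).
schools = [
    School("School A", "PA", 505, 3.4, 0.12, 0.03),
    School("School B", "NY", 512, 3.7, 0.08, 0.05),
    School("School C", "OH", 503, 3.3, 0.15, 0.04),
]

picks = shortlist(schools, mcat=508, gpa=3.6, home_state="PA")
print([s.name for s in picks])
```

Ranking by the residency-specific rate is only one possible choice; a fuller analysis would likely weigh the other fields as well, such as institution type or match rates by specialty.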

Drexel class: Data Science 521, Data Analysis and Interpretation

Data Usage

Due to AAMC data usage policies, I am not allowed to share or distribute the data I collected or the code used to parse and transform it into a dataset. I can, however, share my exploratory data analysis and presentation, which I hope you'll enjoy.