In this project, I aim to identify the factors in this dataset that best predict a high school student's eventual college GPA. I then use these factors to build a model that can predict college GPA for other students.
According to the regression analysis, and holding all other variables constant, a student with higher SAT scores, or a student who identifies as female, is predicted to have a higher college GPA. The opposite holds for a student from a larger high school graduating class, with a higher high school class rank, or with a higher high school percentile value: each of these is associated with a lower predicted college GPA. Being an athlete is not, on its own, a significant predictor of college GPA. However, the interaction between athlete status and the student's high school percentile is useful: a student-athlete who also holds a top position in the high school percentile is predicted to have a high college GPA. Lastly, despite these results, the model is not a strong predictor of college GPA overall, since it explains only 29.7% of the variation in college GPA.
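
To make the role of the interaction term concrete, here is a minimal Python sketch of how a linear model with an athlete-by-percentile interaction produces a prediction. Everything in it is an assumption for illustration: the variable names (`sat`, `female`, `hs_size`, `hs_rank`, `hs_perc`, `athlete`) and all coefficient values are hypothetical, chosen only to mirror the directions described above, and are not the fitted estimates from this analysis.

```python
# Hypothetical coefficients: illustrative values only, NOT the fitted
# estimates from this analysis. Signs follow the summary above.
COEF = {
    "intercept": 2.0,
    "sat": 0.001,                # higher SAT -> higher predicted GPA
    "female": 0.15,              # female -> higher predicted GPA
    "hs_size": -0.0002,          # larger graduating class -> lower predicted GPA
    "hs_rank": -0.001,           # higher class rank value -> lower predicted GPA
    "hs_perc": -0.01,            # higher percentile value -> lower predicted GPA
    "athlete": 0.0,              # athlete main effect assumed negligible
    "athlete_x_hs_perc": 0.005,  # assumed sign of the interaction
}


def predict_gpa(sat, female, hs_size, hs_rank, hs_perc, athlete):
    """Predicted college GPA from a linear model with one interaction term."""
    return (
        COEF["intercept"]
        + COEF["sat"] * sat
        + COEF["female"] * female
        + COEF["hs_size"] * hs_size
        + COEF["hs_rank"] * hs_rank
        + COEF["hs_perc"] * hs_perc
        + COEF["athlete"] * athlete
        + COEF["athlete_x_hs_perc"] * athlete * hs_perc
    )


def hs_perc_slope(athlete):
    """Change in predicted GPA per one-unit change in hs_perc.

    With an interaction term, this slope depends on athlete status.
    """
    return COEF["hs_perc"] + COEF["athlete_x_hs_perc"] * athlete


if __name__ == "__main__":
    # Same hypothetical student profile, differing only in athlete status.
    profile = dict(sat=1200, female=1, hs_size=300, hs_rank=30, hs_perc=10)
    print(predict_gpa(athlete=0, **profile))
    print(predict_gpa(athlete=1, **profile))
    print(hs_perc_slope(0), hs_perc_slope(1))
```

The point of the sketch is that athlete status can matter through the interaction even when its own coefficient is not significant: the effect of percentile on predicted GPA differs between athletes and non-athletes, so the two variables have to be read together.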