Data Mining Project 2022

Business objective

Our objective is to identify US-produced movies whose copyrights are likely to yield a high ROI if we invest in them. We use popularity amongst movie-goers as the measure of expected return.

General approach

In this project, we tested multiple supervised predictive models and examined the top three in detail: XGBRegressor, GradientBoostingRegressor, and RandomForestRegressor. We measure performance using adjusted R² (which accounts for the number of features) and RMSE. Based on our analysis, our XGBoost model explains 69% of the variation in the log-transformed target variable, as measured by adjusted R².
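For readers unfamiliar with the two metrics, the following is a minimal sketch of how adjusted R² and RMSE can be computed from predictions. It is not taken from the project notebooks: the function names `rmse` and `adjusted_r2`, the `n_features` argument, and the toy arrays are assumptions made for illustration. In this project's setting, both metrics would be computed on the log-transformed target.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2: penalizes plain R^2 for the number of predictors.

    adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    where n is the number of observations and p the number of features.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    if n - n_features - 1 <= 0:
        raise ValueError("need more observations than features + 1")
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1))


# Toy values for illustration only (not project data).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
print(rmse(y_true, y_pred))
print(adjusted_r2(y_true, y_pred, n_features=2))
```

Because adjusted R² shrinks as features are added without a matching gain in fit, it is a fairer basis than plain R² for comparing models trained on different numbers of predictors.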

The data directory has the small datasets used. The ipynb and html versions of the code are in 'notebooks'.