The Movie Recommendation System is designed to predict user preferences for movies based on historical rating data. The project involves cleaning and preparing data, implementing collaborative filtering techniques, and evaluating model performance using different estimation methods and similarity measures.
- Collaborative Filtering: Implemented both item-based and user-based collaborative filtering techniques.
- Similarity Measures: Used Cosine Similarity and Pearson Correlation for measuring similarity.
- Estimation Methods: Compared the Standard Estimate method and SVD Estimate method for rating predictions.
- Performance Evaluation: Calculated Mean Absolute Error (MAE) to evaluate prediction accuracy.
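The two similarity measures can be illustrated with a minimal NumPy sketch. It assumes dense rating vectors with no missing values, and the function names and sample ratings are hypothetical rather than taken from the project code. Pearson correlation is written as cosine similarity of the mean-centered vectors and checked against NumPy's built-in `np.corrcoef`.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    return cosine_similarity(a - a.mean(), b - b.mean())

# Hypothetical ratings given to two movies by the same five users.
movie_a = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
movie_b = np.array([4.0, 5.0, 3.0, 1.0, 2.0])

cos_ab = cosine_similarity(movie_a, movie_b)
pear_ab = pearson_correlation(movie_a, movie_b)

# Built-in counterpart for the Pearson case.
builtin_pear_ab = float(np.corrcoef(movie_a, movie_b)[0, 1])
```

Because cosine similarity does not subtract the mean, it is usually higher than Pearson correlation when all ratings are positive, as is the case for these two vectors.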
The system uses the MovieLens dataset, which contains 100,836 ratings across 9,742 movies from 610 users. The dataset includes:
- movies.csv: Movie information
- ratings.csv: User ratings
- tags.csv: User-generated tags
- links.csv: External movie information
Data Source: MovieLens
- Merged the datasets (movies.csv, ratings.csv, links.csv).
- Filtered out movies with fewer than 100 ratings, reducing the dataset to 138 movies.
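The merge and filter steps might look like the following pandas sketch. The tiny in-memory frames stand in for the CSV files so the example runs on its own. Joining on `movieId`, the `MIN_RATINGS` name, and the final pivot into a user-by-movie matrix are assumptions, not details confirmed by the project.

```python
import pandas as pd

# Small stand-ins for movies.csv, ratings.csv and links.csv
# (the real files would be loaded with pd.read_csv).
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "genres": ["Drama", "Comedy", "Action"],
})
ratings = pd.DataFrame({
    "userId":  [1, 2, 3, 1, 2, 3],
    "movieId": [1, 1, 1, 2, 2, 3],
    "rating":  [4.0, 5.0, 3.0, 2.0, 4.5, 1.0],
})
links = pd.DataFrame({
    "movieId": [1, 2, 3],
    "imdbId": [111, 222, 333],
    "tmdbId": [11, 22, 33],
})

# Assumed join key: movieId, which all three files share.
merged = ratings.merge(movies, on="movieId").merge(links, on="movieId")

# The project uses a threshold of 100; 2 keeps this toy example non-empty.
MIN_RATINGS = 2
rating_counts = merged.groupby("movieId")["rating"].transform("count")
filtered = merged[rating_counts >= MIN_RATINGS]

# User-by-movie matrix; missing ratings appear as NaN.
rating_matrix = filtered.pivot_table(index="userId", columns="movieId", values="rating")
```

In this toy data, movie 3 has a single rating and is dropped, leaving a 3 x 2 matrix.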
- Implemented item-based filtering with 8 combinations of estimation method, similarity measure, and implementation:
  - Standard Estimate Method | Cosine Similarity | Manual function
  - Standard Estimate Method | Cosine Similarity | Inbuilt function
  - Standard Estimate Method | Pearson Correlation | Manual function
  - Standard Estimate Method | Pearson Correlation | Inbuilt function
  - SVD Estimate Method | Cosine Similarity | Manual function
  - SVD Estimate Method | Cosine Similarity | Inbuilt function
  - SVD Estimate Method | Pearson Correlation | Manual function
  - SVD Estimate Method | Pearson Correlation | Inbuilt function
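As an illustration of one combination (Standard Estimate, Cosine Similarity, manual function), here is a minimal sketch. The README does not define the Standard Estimate, so the code assumes it is a similarity-weighted average of the user's ratings on other movies. The matrix `R`, the use of 0 for a missing rating, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical user-by-movie rating matrix; 0 marks a missing rating.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

def item_cosine_matrix(ratings):
    """Cosine similarity between every pair of movie columns."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for unrated movies
    normalized = ratings / norms
    return normalized.T @ normalized

def standard_estimate(ratings, sim, user, item):
    """Similarity-weighted average of the user's ratings on other movies."""
    rated = ratings[user] > 0
    rated[item] = False  # never use the target rating itself
    weights = sim[item, rated]
    total = np.abs(weights).sum()
    if total == 0:
        return float(ratings[ratings > 0].mean())  # fall back to the global mean
    return float(weights @ ratings[user, rated] / total)

sim = item_cosine_matrix(R)
prediction = standard_estimate(R, sim, user=0, item=2)
```

Movie 2 is most similar to movie 3, which user 0 rated 1.0, so the estimate is pulled toward the low end (about 2.09). Under the same assumptions, passing `R.T` and swapping the user and item indices would give a user-based variant.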
- User-based filtering was evaluated using approaches similar to those used for item-based filtering.
- Compared the Standard Estimate and SVD Estimate methods using MAE.
- Results indicated that the SVD method outperformed the Standard Estimate in computational efficiency.
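The sketch below shows one way an SVD-based estimate and the MAE comparison could be written. The README does not describe how the SVD Estimate is computed, so this assumes a rank-k truncated SVD of the rating matrix after filling missing entries with each movie's mean rating. The matrix `R`, the choice `k=2`, and the function names are hypothetical.

```python
import numpy as np

def svd_estimate(ratings, k):
    """Rank-k reconstruction of a rating matrix in which 0 marks a missing rating."""
    mask = ratings > 0
    # Fill gaps with each movie's mean observed rating before factorizing.
    col_means = ratings.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    filled = np.where(mask, ratings, col_means)
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Hypothetical user-by-movie rating matrix.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

approx = svd_estimate(R, k=2)
observed = R > 0
mae = mean_absolute_error(R[observed], approx[observed])
```

Here the MAE is computed on the same ratings used to build the factorization, so it measures reconstruction error only. A fair comparison between estimation methods would hold out a set of ratings and score predictions on those.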
- Python 3.x
- NumPy, Pandas, SciPy for data manipulation
- Scikit-learn for similarity calculations
- The system provided accurate movie recommendations, and the different approaches yielded comparable results.
- SVD Estimate significantly reduced computation time compared to the Standard Estimate method.
The project showcases how collaborative filtering can be applied to build an efficient recommendation system. Various methods and similarity measures were explored to optimize performance.