Merge three .dat files on their primary keys, then apply exploratory data analysis and visualisations to surface insights and opportunities.
- Introduction
- Data Cleaning
- Missing Data
- Data Stories and Visualisations
Conduct an Exploratory Data Analysis (EDA) on the MovieLens dataset to gain insights, clean the data and answer some questions. The data is split across three files, so these must be joined before they can be analysed.
• Finding patterns in data.
• Determining relationships in data.
• Checking assumptions.
• Drawing conclusions.
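The join described above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: it assumes the three files use a `::` delimiter with no header row and the column order shown, and it reads made-up sample rows from in-memory strings so that it runs without the dataset. The helper name `read_dat` and the column name `ZipCode` are hypothetical.

```python
import io
import pandas as pd

# Made-up sample rows in the assumed "::"-delimited layout. Replace the
# StringIO objects with the real file paths when running on the dataset.
movies_raw = io.StringIO(
    "1::Toy Story (1995)::Animation|Children's|Comedy\n"
    "2::Jumanji (1995)::Adventure|Children's|Fantasy\n"
)
ratings_raw = io.StringIO(
    "1::1::5::978824268\n"
    "2::1::4::978299000\n"
    "2::2::3::978300000\n"
)
users_raw = io.StringIO(
    "1::F::1::10::48067\n"
    "2::M::56::16::70072\n"
)


def read_dat(source, columns, **kwargs):
    """Read a headerless '::'-delimited file (hypothetical helper)."""
    # A multi-character separator needs the slower Python parsing engine.
    return pd.read_csv(source, sep="::", engine="python",
                       header=None, names=columns, **kwargs)


movies = read_dat(movies_raw, ["MovieID", "Title", "Genres"])
ratings = read_dat(ratings_raw, ["UserID", "MovieID", "Rating", "Timestamp"])
# Keep zip codes as text so leading zeros and hyphens survive.
users = read_dat(users_raw, ["UserID", "Gender", "Age", "Occupation", "ZipCode"],
                 dtype={"ZipCode": str})

# The ratings table holds both keys, so it sits in the middle of the join.
# validate="many_to_one" raises if MovieID or UserID is not unique on the
# right-hand side, i.e. if it does not behave like a primary key.
merged = (
    ratings
    .merge(movies, on="MovieID", how="left", validate="many_to_one")
    .merge(users, on="UserID", how="left", validate="many_to_one")
)

# A left join leaves NaN where a key has no match, so this doubles as a
# first missing-data check.
missing_per_column = merged.isna().sum()
print(merged.shape)
print(missing_per_column[missing_per_column > 0])
```

Using left joins from the ratings table keeps every rating even when a movie or user record is absent, which makes unmatched keys visible as missing values rather than silently dropping rows.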
- Determine the user rating for a single movie, or compare performance between sequels.
- Understand viewership by age group for targeted marketing, retention activities or market expansion.
- Gain insight into the top 25 movies by viewership rating, indicating which types of movies or sequels perform better.
- At a user level, gauge the movie genres the customer enjoyed in order to recommend similar genres.
- Surface user-level insights that can lead to more movies being watched and enjoyed, with potential for higher NPS scores.
- Provide possible future insights for the movie production industry on whom to market to, including cinema insight for their websites.
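Two of the questions above, the top movies by rating and viewership by age group, could be approached along these lines. This is a sketch under stated assumptions: "top" is taken to mean the highest mean rating among titles with at least `min_ratings` ratings (a threshold introduced here, not taken from the project), the function name `top_movies` is hypothetical, and the small `merged` table holds made-up rows standing in for the joined dataset.

```python
import pandas as pd

# Made-up rows standing in for the merged ratings/movies/users table.
merged = pd.DataFrame({
    "Title": ["Movie A", "Movie A", "Movie A", "Movie B", "Movie B", "Movie C"],
    "Rating": [5, 4, 5, 3, 4, 5],
    "Age": [18, 25, 25, 18, 35, 25],
})


def top_movies(df, n=25, min_ratings=2):
    """Rank titles by mean rating, ignoring titles with too few ratings."""
    stats = df.groupby("Title")["Rating"].agg(["mean", "count"])
    # Without a minimum count, a title with a single 5 would top the list.
    popular = stats[stats["count"] >= min_ratings]
    return popular.sort_values(["mean", "count"], ascending=False).head(n)


top = top_movies(merged)

# Number of ratings and mean rating per age value, as a starting point
# for the age-group question.
by_age = merged.groupby("Age")["Rating"].agg(["mean", "count"])

print(top)
print(by_age)
```

The `min_ratings` value is a judgment call: on the full dataset a much larger threshold would likely be needed for the ranking to be meaningful.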
MovieID – numeric identifier assigned to the movie, ranging from 1 to 3952
Title – movie title (and year)
Genres – classification of movie type
UserID – numeric identifier assigned to the user, ranging from 1 to 6040
Rating – categorical rating of a movie, on a whole-number scale from 1 to 5
Timestamp – time of the rating, measured in seconds since the Unix epoch
Gender – male or female
Age – age of the user
Occupation – occupation of the user
Zip Code – postal/area code of the user
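Several of these fields usually need a small transformation before analysis. The sketch below shows one possible way to do this with pandas; it assumes the year appears in parentheses at the end of the title, that multiple genres are separated by `|`, and that the timestamp counts seconds since the Unix epoch. The rows and the derived column names (`Year`, `Genre`, `RatedAt`, `RatingCat`) are illustrative only.

```python
import pandas as pd

# Illustrative rows only; column names follow the field list above.
movies = pd.DataFrame({
    "MovieID": [1, 2],
    "Title": ["Toy Story (1995)", "Jumanji (1995)"],
    "Genres": ["Animation|Children's|Comedy", "Adventure|Children's|Fantasy"],
})
ratings = pd.DataFrame({
    "UserID": [1],
    "MovieID": [1],
    "Rating": [5],
    "Timestamp": [978300760],
})

# Title: pull the four-digit year out of the trailing parentheses.
year_text = movies["Title"].str.extract(r"\((\d{4})\)\s*$", expand=False)
movies["Year"] = pd.to_numeric(year_text)

# Genres: split the "|"-separated string and give each genre its own row,
# which makes per-genre counts and averages straightforward.
genres_long = (
    movies.assign(Genre=movies["Genres"].str.split("|"))
    .explode("Genre")
    .drop(columns="Genres")
)

# Timestamp: interpret the integer as seconds since the Unix epoch.
ratings["RatedAt"] = pd.to_datetime(ratings["Timestamp"], unit="s")

# Rating: store as an ordered categorical so 1 < 2 < ... < 5 is preserved.
ratings["RatingCat"] = pd.Categorical(
    ratings["Rating"], categories=[1, 2, 3, 4, 5], ordered=True
)

print(movies[["Title", "Year"]])
print(genres_long)
print(ratings[["Timestamp", "RatedAt", "RatingCat"]])
```

Keeping the exploded genre table separate from `movies` avoids duplicating movie rows in later joins against the ratings.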