Capstone-III-Movielens-EDA-Visualisation

Merge 3 .dat files on their primary keys, then apply exploratory data analysis and visualisations to surface insights and opportunities

CONTENTS

  1. Introduction
  2. Data Cleaning
  3. Missing Data
  4. Data Stories and Visualisations

Conduct an Exploratory Data Analysis (EDA) on the MovieLens dataset: clean the data, extract insights and answer the questions listed under the approach. The data is split across 3 different files, which must be joined before it can be analysed.
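As a rough illustration of the join step, here is a minimal pandas sketch. It makes several assumptions: the three files use a `::` delimiter and no header row, the column names follow the Feature Attributes list, and the rows are tiny in-memory stand-ins rather than real data. The helper `read_dat` is hypothetical and is not taken from the project notebook.

```python
import io
import pandas as pd

# Illustrative stand-ins for the three .dat files (assumed "::"-delimited, no header).
MOVIES = "1::Toy Story (1995)::Animation|Comedy\n2::Jumanji (1995)::Adventure|Fantasy\n"
USERS = "1::F::1::10::48067\n2::M::56::16::02460\n"
RATINGS = "1::1::5::978824268\n2::1::4::978299000\n2::2::3::978299500\n"


def read_dat(text, names, dtype=None):
    """Hypothetical helper: parse "::"-delimited text into a DataFrame."""
    # A multi-character separator requires the Python parsing engine.
    return pd.read_csv(io.StringIO(text), sep="::", engine="python",
                       header=None, names=names, dtype=dtype)


movies = read_dat(MOVIES, ["MovieID", "Title", "Genres"])
# Read zip codes as strings so that leading zeros survive.
users = read_dat(USERS, ["UserID", "Gender", "Age", "Occupation", "ZipCode"],
                 dtype={"ZipCode": str})
ratings = read_dat(RATINGS, ["UserID", "MovieID", "Rating", "Timestamp"])

# Ratings is the fact table: join users on UserID, then movies on MovieID.
# validate= raises if a key is unexpectedly duplicated on the lookup side.
merged = (
    ratings
    .merge(users, on="UserID", how="left", validate="many_to_one")
    .merge(movies, on="MovieID", how="left", validate="many_to_one")
)
print(merged.head())
```

With files on disk, a path would replace the `io.StringIO(text)` argument; depending on the copy of the data, an explicit `encoding` may also be needed.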

Approach

• Finding patterns in data.

• Determining relationships in data.

• Checking of assumptions.

• Drawing conclusions.

What question(s) are you trying to solve (or prove wrong)?

  1. Determine user ratings for a single movie, or compare performance between sequels.

  2. Understand viewership by age group for targeted marketing, retention activities or market expansion.

  3. Identify the top 25 movies by viewer rating, indicating which types of movies or sequels perform better.

  4. At a user level, gauge which movie genres the customer enjoyed, so that recommendations can be made around similar genres.

  5. Use user-level detail to find insights that can lead to more movies being watched and enjoyed, with potential for higher Net Promoter Scores (NPS).

  6. Provide possible future insights for the movie production industry on whom to market to, including insights cinemas could use for their websites.
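Two of the questions above (viewership by age group, and the top movies by rating) reduce to simple group-by aggregations. The sketch below assumes a merged table with `Title`, `Rating` and `Age` columns. The toy rows, the `top_movies` helper and its `min_ratings` threshold are illustrative assumptions and not part of the original analysis. The threshold is there because a plain mean would let a movie with a single 5-star rating top the list.

```python
import pandas as pd

# Toy stand-in for the merged ratings table (made-up rows, for illustration only).
merged = pd.DataFrame({
    "Title": ["A", "A", "A", "B", "B", "C"],
    "Rating": [5, 4, 5, 3, 4, 5],
    "Age": [18, 25, 25, 18, 35, 25],
})


def top_movies(df, n=25, min_ratings=2):
    """Rank movies by mean rating, ignoring titles with too few ratings."""
    stats = df.groupby("Title")["Rating"].agg(mean_rating="mean", n_ratings="count")
    stats = stats[stats["n_ratings"] >= min_ratings]
    return stats.sort_values(["mean_rating", "n_ratings"], ascending=False).head(n)


# Top movies by average rating (movie "C" is dropped: only one rating).
top = top_movies(merged, n=2, min_ratings=2)

# Viewership by age group: number of ratings and average rating per Age value.
by_age = merged.groupby("Age")["Rating"].agg(n_ratings="count", mean_rating="mean")

print(top)
print(by_age)
```

On the full dataset, `n=25` with a higher `min_ratings` would be a reasonable starting point; the right threshold is a judgment call.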

Feature Attributes:

MovieID – numeric identifier of the movie, ranging from 1 to 3952

Title – movie title (and year)

Genres – classification of movie type

UserID – numeric identifier assigned to the user, ranging from 1 to 6040

Rating – categorical rating of a movie, on a scale from 1 to 5

Timestamp – time of the rating, measured in seconds since the Unix epoch

Gender – male or female

Age – age of the user

Occupation – occupation of the user

Zip Code – postal/area code of the user
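Several of these attributes are easier to analyse once converted to derived columns. The sketch below is one possible approach. It assumes that Timestamp holds seconds since the Unix epoch, that Title ends with the release year in parentheses, and that Genres is a `|`-separated string. The derived column names (`RatedAt`, `Year`, `Genre`) are hypothetical.

```python
import pandas as pd

# Illustrative rows only; the column layout follows the attribute list above.
df = pd.DataFrame({
    "Title": ["Toy Story (1995)", "Jumanji (1995)"],
    "Genres": ["Animation|Comedy", "Adventure|Fantasy"],
    "Timestamp": [978300760, 978302109],
})

# Seconds since the epoch -> datetime (UTC, timezone-naive).
df["RatedAt"] = pd.to_datetime(df["Timestamp"], unit="s")

# Pull the four-digit year out of the trailing "(YYYY)" in the title.
df["Year"] = df["Title"].str.extract(r"\((\d{4})\)\s*$", expand=False).astype(int)

# One row per (movie, genre) pair, convenient for per-genre counts and plots.
genre_rows = (
    df.assign(Genre=df["Genres"].str.split("|"))
      .explode("Genre")
      .reset_index(drop=True)
)

print(df[["Title", "Year", "RatedAt"]])
print(genre_rows[["Title", "Genre"]])
```

Exploding genres duplicates each rating once per genre, so per-genre aggregates should be computed on `genre_rows` and overall aggregates on the original frame.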