Project 1: Investigate-a-Dataset

Udacity Data Analyst Nanodegree

Table of contents

Introduction
Data Wrangling
Exploratory Data Analysis
Installation
Requirements
Author

Introduction

This project investigates the TMDb Movie Dataset, which contains over 10,000 movies released between 1960 and 2015.

The project investigates two questions:

How has the popularity of movies trended over time, from 1960 to 2015?
How do popularity, budget, and vote count correlate with revenue?

Data-Wrangling

The dataset was loaded into a Jupyter Notebook and checked for missing values; the check showed that several variables had them. All rows with missing values were dropped so that every variable has the same number of data points. Columns initially judged not useful for the investigation were also dropped. The release date was converted to datetime so that it could be used to plot time series showing how some of the variables trend over time. Finally, the dataframe was sorted in ascending order by release date, which made the time series graphs easier to plot.
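The wrangling steps described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline sample data, the column names, the choice of `homepage` as an unused column, and the helper name `wrangle` are all assumptions made for the example.

```python
import io

import pandas as pd

# Tiny stand-in for tmdb-movies.csv. The column names and values here are
# assumptions for illustration only, not taken from the real file.
SAMPLE_CSV = """original_title,release_date,popularity,budget,revenue,vote_count,homepage
Movie A,2015-06-09,3.2,150000000,500000000,5000,http://example.com
Movie B,1999-03-31,1.1,63000000,,1200,
Movie C,1977-05-25,2.5,11000000,775000000,4000,http://example.com
"""


def wrangle(df, unused_columns=("homepage",)):
    """Hypothetical helper that applies the steps in the order described above."""
    # Drop every row that has at least one missing value.
    df = df.dropna()
    # Drop columns that are not needed for the investigation.
    df = df.drop(columns=list(unused_columns), errors="ignore")
    # Convert the release date from text to datetime.
    df = df.assign(release_date=pd.to_datetime(df["release_date"]))
    # Sort in ascending order of release date and renumber the rows.
    return df.sort_values("release_date").reset_index(drop=True)


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(raw.isnull().sum())  # missing values per column, before cleaning

movies = wrangle(raw)
print(movies)
```

With the real file, `pd.read_csv("tmdb-movies.csv")` would take the place of the inline sample. Dropping rows before dropping columns follows the order given above; reversing the two steps would keep rows whose only missing values are in the unused columns.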

Exploratory-Data-Analysis

Statistics were computed and visualizations were created to address the research questions in the Introduction section. Specifically, time series plots and scatter plots were used.
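As a rough sketch of this kind of analysis, the example below computes a yearly popularity trend and the correlation of each variable with revenue. The sample values, the column names, the function names, grouping by release year, and the use of the Pearson coefficient (the pandas default) are assumptions for illustration; the notebook may compute these differently.

```python
import pandas as pd

# Hypothetical cleaned data; the values are made up for illustration.
movies = pd.DataFrame({
    "release_date": pd.to_datetime(
        ["1977-05-25", "1999-03-31", "1999-10-15", "2015-06-09"]
    ),
    "popularity": [2.0, 1.0, 3.0, 4.0],
    "budget": [10, 60, 65, 150],
    "vote_count": [4000, 1200, 3000, 5000],
    "revenue": [700, 460, 100, 1500],
})


def yearly_mean_popularity(df):
    """Mean popularity per release year, suitable for a time series plot."""
    return df.groupby(df["release_date"].dt.year)["popularity"].mean()


def revenue_correlations(df, columns=("popularity", "budget", "vote_count")):
    """Pearson correlation of each listed column with revenue."""
    return df[list(columns)].corrwith(df["revenue"])


def plot_trend(series):
    """Draw the yearly trend as a line plot (requires Matplotlib)."""
    # Imported here so the statistics above can be computed without a display.
    import matplotlib.pyplot as plt

    ax = series.plot(marker="o")
    ax.set_xlabel("Release year")
    ax.set_ylabel("Mean popularity")
    plt.show()


trend = yearly_mean_popularity(movies)
correlations = revenue_correlations(movies)
print(trend)
print(correlations)
```

Calling `plot_trend(trend)` draws the time series, and a scatter plot for one pair of variables can be produced with `movies.plot.scatter(x="budget", y="revenue")`.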

Installation

git clone https://github.com/imukoki/Investigate-a-Dataset.git
cd Investigate-a-Dataset
jupyter notebook

Requirements

pandas
NumPy
Matplotlib

Author

👤 Innocent Mukoki

GitHub: Innocent Mukoki
LinkedIn: Innocent Mukoki
