Collects data from Metacritic, reads the file into a Jupyter Notebook, and performs an analysis.


tiffanivick/metacritic-movies


metacritic-movies

This project uses Python and regular expressions to build a web scraper that collects movie titles, release dates, descriptions, metascores, and images from Metacritic. The scraper builds the Metacritic URL for a given year and page, constructs a list of the movies found on that page, and writes the list to a CSV file. The analysis step then reads the file and analyzes the data.
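The scrape-and-save flow described above can be sketched as follows. This is only an illustration under stated assumptions: the HTML in `SAMPLE_HTML`, the pattern `MOVIE_PATTERN`, and the helpers `parse_movies` and `movies_to_csv` are hypothetical and are not taken from the project; the real Metacritic markup is different and the project's own patterns may look nothing like this.

```python
import csv
import io
import re

# Hypothetical markup for illustration only; real Metacritic HTML differs.
SAMPLE_HTML = """
<div class="movie"><h3>Example Film A</h3><span class="date">January 5, 2020</span><span class="score">91</span></div>
<div class="movie"><h3>Example Film B</h3><span class="date">March 12, 2020</span><span class="score">78</span></div>
"""

# One named group per field we want to capture (assumed field layout).
MOVIE_PATTERN = re.compile(
    r'<h3>(?P<title>.*?)</h3>'
    r'<span class="date">(?P<date>.*?)</span>'
    r'<span class="score">(?P<score>\d+)</span>'
)


def parse_movies(html):
    """Return one dict per movie matched in the HTML string."""
    return [
        {
            "title": match["title"],
            "release_date": match["date"],
            "metascore": int(match["score"]),
        }
        for match in MOVIE_PATTERN.finditer(html)
    ]


def movies_to_csv(movies):
    """Serialize the list of movie dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "release_date", "metascore"]
    )
    writer.writeheader()
    writer.writerows(movies)
    return buffer.getvalue()


movies = parse_movies(SAMPLE_HTML)
csv_text = movies_to_csv(movies)
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of side effects; to produce a file instead, the same `csv.DictWriter` can be pointed at a file opened with `open(path, "w", newline="")`.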

The project is built using Python and regular expressions in Jupyter Notebook.

Built With

  • Visual Studio Code
  • Jupyter Notebook
  • Python
  • Pandas
  • Matplotlib
  • MongoDB

Getting Started

Imports used to run this program:

  • re
  • urllib3
  • certifi
  • json
  • pymongo
  • time
  • pandas
  • matplotlib (pyplot and FormatStrFormatter)
  • seaborn

To install in terminal:

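  1. Open a terminal
  2. From the project directory, run: pip3 install {package to install}

The re, json, and time modules are part of the Python standard library and do not need to be installed.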

How To Use

This project uses two files, one for the scraper and another for the analysis.

metacritic-scraper

Load the MongoDB connection string from the credentials file

import json

# Read the connection string stored under the "mongodb" key
with open("/fileLocation/credentialsFileName.json") as f:
  data = json.load(f)
mongo_connection_string = data['mongodb']
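Since the snippet above reads `data['mongodb']`, the credentials file needs to be a JSON object with a `mongodb` key holding the connection string. A minimal sketch of that shape follows; the file name `credentials.json` and the connection string value are placeholders assumed for illustration, not values from the project.

```python
import json
import os
import tempfile

# Placeholder value only; substitute your own MongoDB connection string.
example_credentials = {
    "mongodb": "mongodb+srv://<user>:<password>@<cluster-host>/"
}

# Write the example to a temporary file, then read it back the same way
# the project snippet does.
with tempfile.TemporaryDirectory() as tmp_dir:
    credentials_path = os.path.join(tmp_dir, "credentials.json")
    with open(credentials_path, "w") as f:
        json.dump(example_credentials, f)

    with open(credentials_path) as f:
        data = json.load(f)

mongo_connection_string = data["mongodb"]
print(mongo_connection_string)
```

Keeping the connection string in a separate file outside the repository avoids committing credentials alongside the notebooks.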

Connect to MongoDB and select your collection

client = pymongo.MongoClient(mongo_connection_string, tlsCAFile=certifi.where())
db1_database = client['databaseName']
metacritic_data = db1_database['collectionName']

Get the Metacritic URL, replacing (year) and (page) with the year and page to scrape

url = "https://www.metacritic.com/browse/movies/score/metascore/year/filtered?year_selected=(year)&sort=desc&view=detailed&page=(page)"
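One way to fill in the `(year)` and `(page)` placeholders is with `str.format`. This is a sketch rather than the project's own code: `BASE_URL` and `build_url` are hypothetical names, and starting the page numbering at 0 is an assumption that should be checked against the site.

```python
# Same URL as above, with the placeholders written as format fields.
BASE_URL = (
    "https://www.metacritic.com/browse/movies/score/metascore/year/filtered"
    "?year_selected={year}&sort=desc&view=detailed&page={page}"
)


def build_url(year, page):
    """Return the browse URL for one year and one results page."""
    return BASE_URL.format(year=year, page=page)


# URLs for the first three pages of 2020 (page numbering from 0 is assumed).
urls = [build_url(2020, page) for page in range(3)]
for url in urls:
    print(url)
```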

metacritic-analysis

Retrieve credentials from the JSON credentials file stored on your local computer and fetch the MongoDB collection

# Imports needed by this snippet
import json
import certifi
import pandas as pd
import pymongo

# Retrieve credentials
with open("/fileLocation/credentialsFileName.json") as f:
  data = json.load(f)
mongo_connection_string = data['mongodb']

# Connect to MongoDB, select the collection, and load its documents into a DataFrame
client = pymongo.MongoClient(mongo_connection_string, tlsCAFile=certifi.where())
db1_database = client['databaseName']
metacritic_data = db1_database['collectionName']
metacritic = pd.DataFrame(metacritic_data.find())

Add year and month columns to the DataFrame

metacritic['year'] = metacritic.release_date.dt.year
metacritic['month'] = metacritic.release_date.dt.month
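The `.dt` accessor only works on a datetime column. If `release_date` is stored as text, it has to be converted first. The sketch below assumes a text format like "January 5, 2020"; the sample rows and the format string are illustrative assumptions, not data from the project.

```python
import pandas as pd

# Illustrative rows; the real DataFrame comes from the MongoDB collection.
metacritic = pd.DataFrame(
    {
        "title": ["Example Film A", "Example Film B"],
        "release_date": ["January 5, 2020", "March 12, 2019"],
    }
)

# Convert the text column to datetime (format string is an assumption).
metacritic["release_date"] = pd.to_datetime(
    metacritic["release_date"], format="%B %d, %Y"
)

# With a datetime column in place, the year and month can be extracted.
metacritic["year"] = metacritic["release_date"].dt.year
metacritic["month"] = metacritic["release_date"].dt.month
print(metacritic)
```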

License

Distributed under the MIT license. See LICENSE.txt for more information.
