The dataset we use comes from IMDB, a platform where critics review the quality of movies, for better or worse. By investigating movies across various genres, this project aims to find out what makes a good movie. We will analyse this dataset to answer that question.
Given the large amount of movie information available, it is interesting to understand which factors make one movie more successful than another. We would therefore like to analyse what kinds of movies are more successful, in other words, which achieve a higher IMDB score and profit. In this project, we take the IMDB score as the response variable and focus on making predictions by analysing the remaining variables in the IMDB 5000 movie data. The results can help film companies understand what goes into a commercially successful movie.
To guide the making of a good movie, we define a good movie as one that scores 4.0 or above out of 5.0. We will use the data to analyse what is needed to create such a movie.
Scope 1 – Predict budget wastage based on sentiment analysis
• Descriptive Analytics
We perform sentiment analysis on the plots of movies that people like. The plot text is tokenized, and the resulting tokens can be related to the movie's genre. The approach uses the positive or negative sentiment of the plot to analyse the movie's profit and loss. This analysis is important because it reveals the trend and shows which movies received heavy investment but performed poorly. From the results, we can examine why those movies performed poorly and whether their plots were positive or negative. We can then avoid wasting money on plots that people do not like and put more effort into what they do like.
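The descriptive step above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the word lists, the record fields (`plot`, `budget`, `gross`) and the sample movies are all made up for the example and are not taken from the dataset, and profit is assumed to be gross minus budget. A real analysis would likely use a full sentiment lexicon or a trained model.

```python
import re
from collections import defaultdict

# Tiny illustrative lexicons (assumption: a real project would use a full sentiment lexicon).
POSITIVE = {"love", "friendship", "hope", "hero", "rescue"}
NEGATIVE = {"murder", "revenge", "death", "war", "betrayal"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def plot_sentiment(text):
    """Label a plot 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def average_profit_by_sentiment(movies):
    """Group movies by plot sentiment and return the mean profit (gross - budget) per group."""
    profits = defaultdict(list)
    for movie in movies:
        profits[plot_sentiment(movie["plot"])].append(movie["gross"] - movie["budget"])
    return {label: sum(values) / len(values) for label, values in profits.items()}

# Made-up records for illustration only; not taken from the dataset.
movies = [
    {"title": "A", "plot": "A hero finds hope and friendship", "budget": 100, "gross": 250},
    {"title": "B", "plot": "Revenge, murder and betrayal in a war", "budget": 200, "gross": 150},
    {"title": "C", "plot": "Love story with a rescue", "budget": 50, "gross": 90},
]
summary = average_profit_by_sentiment(movies)
print(summary)
```

Comparing the mean profit of the positive and negative groups gives a first view of which kind of plot tends to lose money relative to its budget.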
• Predictive Analytics
Based on the tokenized plots and the genre-related factors, we can predict how much of a budget is likely to be wasted. We can then minimize that wastage and concentrate on what brings in more profit. Budget is closely tied to profit and loss, so minimizing wasted budget is a key to success. For example, if we know that 20% of people find a negative plot less acceptable, we can focus on the other 80%. To further confirm that a movie performed poorly, we use its social media likes and score as descriptive evidence of the money wastage and of why people like or dislike the movie.
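One possible way to turn this into a prediction is sketched below. It is an illustration under assumptions, not the project's chosen model: it fits a simple least-squares line of gross against budget for one group of movies (for example, those with a negative plot) and treats "wastage" as the part of the budget that the predicted gross does not recover. The definition of wastage, the function names and the sample figures are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_wastage(budget, slope, intercept):
    """Assumed definition: the share of the budget not recovered by the predicted gross."""
    predicted_gross = slope * budget + intercept
    return max(0.0, budget - predicted_gross)

# Made-up (budget, gross) pairs for one sentiment group; illustration only.
budgets = [10.0, 20.0, 30.0, 40.0]
grosses = [12.0, 18.0, 24.0, 30.0]
slope, intercept = fit_line(budgets, grosses)

# Predicted wastage for a new movie in the same group with a budget of 50.
print(predict_wastage(50.0, slope, intercept))
```

Fitting one line per sentiment group or genre would let the predicted wastage differ between, say, negative and positive plots, which is the comparison the example in the text relies on.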