Movie Analysis

Author

NM

Introduction:

Movies are used for storytelling, evoking emotions, sparking conversations, and shaping cultural trends. They take people to different worlds, allowing them to escape reality, explore diverse cultures, and reflect on their own lives. One of my favorite animated movies is Grave of the Fireflies, a heart-wrenching story that left me deeply moved (and forever sad) and grateful for the family bonds I have. Movies like this one remind us of the impact that cinema can have.

But with millions of films produced worldwide, what makes a movie truly successful? Is it the plot, the budget, the actors, or even the runtime that plays a role? In this project, I aim to explore the key factors that drive a movie’s success. Specifically, I want to understand whether variables like budget, genre, and movie length can help predict a film’s popularity and financial outcome.

Data Introduction:

For this analysis, I will be using a dataset from Kaggle containing detailed information about 4,803 movies. After data cleaning, 3,766 movies remained: I removed the movies with a reported budget of $0, since it does not make sense for a movie to have no budget. The dataset includes variables such as budget, genres, production companies, release date, revenue, runtime, and popularity. It offers a comprehensive view of different movie characteristics, allowing for in-depth analysis. The data was last updated two months ago.

Here is the link to the data: https://myxavier-my.sharepoint.com/:x:/g/personal/munnichas_xavier_edu/EfECKAQGCF5JlalfLQq4GvoBVcLGZJMp5DUqsYNoZvYmfA?download=1
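As a rough illustration of the cleaning step described above, here is a minimal Python sketch. It is not the code used for this report: the rows are made-up toy values, and the column names (title, budget, revenue) and the helper drop_zero_budget are assumptions chosen for the example.

```python
# Toy rows standing in for the Kaggle file; the values are made up for illustration.
movies = [
    {"title": "Movie A", "budget": 50_000_000, "revenue": 120_000_000},
    {"title": "Movie B", "budget": 0, "revenue": 3_000_000},
    {"title": "Movie C", "budget": 8_000_000, "revenue": 0},
    {"title": "Movie D", "budget": 0, "revenue": 0},
]


def drop_zero_budget(rows):
    """Keep only rows whose reported budget is greater than zero."""
    return [row for row in rows if row["budget"] > 0]


cleaned = drop_zero_budget(movies)
print(f"{len(movies)} rows before cleaning, {len(cleaned)} after")
```

Note that only the budget is filtered here. A movie with $0 revenue stays in the data, which matches the minimum revenue of 0 in the summary statistics.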

Additionally, I chose to scrape data from an IMDB list that ranks the top 100 greatest films of all time, offering a benchmark for comparison. This supplementary dataset helps show how the best-rated movies compare to those in the larger Kaggle dataset.

Data Dictionary:

  • Title: Original title of the movie

  • Budget: Budget of the movie

  • Genres: Genres the movie falls under

  • Revenue: Amount the movie generated

  • Vote average: Average rating given to the movie by users

  • Vote Count: Number of votes received by the movie

  • Runtime: Duration of the movie in minutes

  • Popularity: Popularity score of movie

  • Production Company: Which companies were involved in the making

Research Question: What are some key factors that drive the success of a movie, and can people predict which movies will be a hit based on variables like budget, genre, and other characteristics?

IMDB List: Top 100 Greatest Films of All Time (Secondary Source)

To supplement the primary dataset, I collected data from the IMDB website, which ranks the top 100 greatest movies of all time.

Here is the link to the data: https://myxavier-my.sharepoint.com/:x:/g/personal/munnichas_xavier_edu/Ed7etyOCi6NBpN798oxocLQB_RmwKQr5mI026Z9qTYligQ?download=1

Here is the link to the web page: https://www.imdb.com/list/ls031185008/

Data Dictionary (variables I collected):

  • Title: Movie name

  • Release Year: Year the movie was released

  • Duration: Total movie runtime in minutes

  • Star Rating: IMDB user rating

  • Number of votes: Total user votes

  • Movie description: Plot summary

Summary Statistics

     budget             revenue             runtime       vote_average   
 Min.   :        1   Min.   :0.000e+00   Min.   :  0.0   Min.   : 0.000  
 1st Qu.:  8000000   1st Qu.:6.010e+06   1st Qu.: 95.0   1st Qu.: 5.700  
 Median : 23000000   Median :3.883e+07   Median :105.5   Median : 6.300  
 Mean   : 37042838   Mean   :1.040e+08   Mean   :109.3   Mean   : 6.226  
 3rd Qu.: 50000000   3rd Qu.:1.221e+08   3rd Qu.:120.0   3rd Qu.: 6.900  
 Max.   :380000000   Max.   :2.788e+09   Max.   :338.0   Max.   :10.000  
                                         NA's   :2                       
   vote_count     
 Min.   :    0.0  
 1st Qu.:  114.0  
 Median :  365.5  
 Mean   :  856.5  
 3rd Qu.:  969.2  
 Max.   :13752.0  
                  

Revenue vs Budget

Because the plot includes many different genres, it is hard to tell them apart. However, there is a slight upward trend: as budget increases, revenue also tends to increase. Spending more on a movie could result in higher box office earnings. Some outliers show, however, that a high-budget movie can still flop and that a low-budget movie can still earn a lot. Looking at genres, action and adventure movies tend to have higher budgets, probably because of all the stunt doubles and CGI.
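One way to put a number on the trend in this plot is a Pearson correlation between budget and revenue, along with a simple check for flops. The sketch below is only an illustration under stated assumptions: the figures are made-up toy values in millions of dollars, the pearson helper is written by hand for the example, and "flop" is defined here as revenue below budget, which is my own simplification.

```python
from math import sqrt

# Toy values in millions of dollars, made up for illustration only.
movies = [
    {"title": "Movie A", "budget": 10, "revenue": 30},
    {"title": "Movie B", "budget": 50, "revenue": 40},
    {"title": "Movie C", "budget": 100, "revenue": 300},
    {"title": "Movie D", "budget": 200, "revenue": 150},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


budgets = [row["budget"] for row in movies]
revenues = [row["revenue"] for row in movies]
r = pearson(budgets, revenues)

# Assumed definition of a flop: the movie earned less than it cost.
flops = [row["title"] for row in movies if row["revenue"] < row["budget"]]

print(f"correlation between budget and revenue: {r:.2f}")
print("flops:", flops)
```

A value of r near 0 would mean almost no linear relationship, while a value near 1 would mean revenue rises almost in step with budget.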

Popularity vs Revenue

This graph helps us understand whether popularity is a good predictor of revenue. Generally, movies with higher popularity scores tend to have greater audience engagement and visibility. When a movie gets significant attention and becomes widely talked about, it naturally attracts more viewers, leading to increased ticket sales, streaming views, and overall earnings. By examining the correlation between popularity and revenue, we can infer that movies with strong audience engagement are more likely to become box office hits. Popularity can therefore be considered a useful indicator of a movie’s financial success, as higher engagement often translates to greater revenue potential.

Average Popularity by Genre

This bar plot shows immediately which genre has the highest average popularity score: family comes first, science fiction second, and adventure third. This suggests that family-friendly content has broad appeal, since everyone (kids, parents, and other adults) wants to watch something wholesome and heart-warming.
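The averages behind a bar plot like this can be computed by grouping movies by genre and taking the mean popularity of each group. The sketch below is a hedged example rather than the code behind the plot: the rows are made-up toy values, and it assumes each movie's genres are already available as a list, so a movie with several genres counts toward each of them.

```python
from collections import defaultdict

# Toy rows, made up for illustration; a movie can belong to several genres.
movies = [
    {"title": "Movie A", "genres": ["Family", "Adventure"], "popularity": 80.0},
    {"title": "Movie B", "genres": ["Drama"], "popularity": 20.0},
    {"title": "Movie C", "genres": ["Adventure"], "popularity": 40.0},
    {"title": "Movie D", "genres": ["Family"], "popularity": 60.0},
]


def mean_popularity_by_genre(rows):
    """Return a dict mapping each genre to its mean popularity score."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        for genre in row["genres"]:
            totals[genre] += row["popularity"]
            counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


averages = mean_popularity_by_genre(movies)

# Sort genres from most to least popular on average.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
for genre, score in ranked:
    print(f"{genre}: {score:.1f}")
```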

Runtime by Genre

It’s interesting to see all of the genres side by side and compare how they vary. I felt it was important to visualize these differences because the typical runtime for each genre can provide insight into what audiences expect and into industry trends. Looking at the boxplot, foreign films and TV movies tend to have the shortest runtimes, which may indicate a different production style or different audience expectations. The genre with the longest median runtime is adventure, which could reflect the genre’s focus on storytelling and complex plots.
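The median line in each box of the boxplot can be reproduced by collecting runtimes per genre and taking the median. This is a small sketch under assumptions, not the report's actual code: the rows are made-up toy values, genres are assumed to be stored as lists, and rows with a missing runtime (the summary statistics show two) are skipped.

```python
from collections import defaultdict
from statistics import median

# Toy rows, made up for illustration; None marks a missing runtime.
movies = [
    {"genres": ["Adventure"], "runtime": 130},
    {"genres": ["Adventure", "Action"], "runtime": 110},
    {"genres": ["Foreign"], "runtime": 90},
    {"genres": ["Foreign"], "runtime": None},
    {"genres": ["Action"], "runtime": 100},
]


def median_runtime_by_genre(rows):
    """Return a dict mapping each genre to its median runtime in minutes."""
    runtimes = defaultdict(list)
    for row in rows:
        if row["runtime"] is None:
            continue  # skip movies with no recorded runtime
        for genre in row["genres"]:
            runtimes[genre].append(row["runtime"])
    return {genre: median(values) for genre, values in runtimes.items()}


medians = median_runtime_by_genre(movies)
for genre, minutes in sorted(medians.items(), key=lambda item: item[1]):
    print(f"{genre}: {minutes} min")
```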

Vote Average vs. Revenue

This graph helps us understand whether movies with higher average ratings also earn higher revenue. Based on the graph, there is not much of a correlation, except that action and adventure movies stand out with slightly higher revenue than the rest. Many of the foreign, history, and horror movies tend to receive average ratings but do not bring in much revenue.

Top 5 Movies Based on Highest Popularity Score

Based on popularity score, Minions is the most popular movie. This reinforces what we saw in the previous graphs: people enjoy family movies, since both children and adults can enjoy them. Another noteworthy film is Interstellar (one of my all-time favorites), which has a compelling storyline and undertones of family as well. The other three films are action movies, which makes sense given their intense plot twists and high energy.
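A ranking like this one comes from sorting the movies by a single column and keeping the first five. The sketch below shows one way to do that in Python with heapq.nlargest from the standard library. The titles and scores are made-up placeholders, and the top_n helper is a hypothetical name introduced for the example.

```python
from heapq import nlargest

# Toy rows, made up for illustration only.
movies = [
    {"title": "Movie A", "popularity": 12.5},
    {"title": "Movie B", "popularity": 95.0},
    {"title": "Movie C", "popularity": 40.2},
    {"title": "Movie D", "popularity": 7.1},
    {"title": "Movie E", "popularity": 63.3},
    {"title": "Movie F", "popularity": 88.8},
]


def top_n(rows, column, n=5):
    """Return the n rows with the largest value in the given column."""
    return nlargest(n, rows, key=lambda row: row[column])


top_five = top_n(movies, "popularity")
for rank, row in enumerate(top_five, start=1):
    print(rank, row["title"], row["popularity"])
```

The same helper would work for the vote average ranking by passing a different column name, assuming that column exists in the rows.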

Top 5 Movies Based on Highest Vote Average

Based on vote average, “Me You and Five Bucks” has the highest rating. This romance may owe its rating to a relatable storyline that resonated with its audience. Interestingly, the other four movies are all dramas, which dive deeper into human complexity and experience. Movies that create a stronger emotional connection tend to influence audiences’ perceptions and ratings.

IMDB: Top 5 Movies Based on Star Rating

On the other hand, the top 5 movies based on star rating on IMDB are The Godfather, The Godfather Part II, Star Wars: Empire Strikes Back, Casablanca, and Citizen Kane. All of these are known for their storytelling and compelling characters. They belong to different genres, but each was groundbreaking in its own way.

Conclusion

Based on the analysis, some key factors that may influence a movie’s success include budget, popularity, genre, and audience engagement. There is a general trend that higher budgets are associated with increased revenue, but that is not always the case. Popularity does tend to be a significant indicator of financial success. For instance, Minions did really well because it reached a broad audience of both adults and children. Genre can also play a role, as family-oriented films are more popular than others. To succeed, a movie must stand out in some area, whether it’s the storytelling, filmmaking, music, or CGI.