Final Project – BAIS 462

Author

Katie Morrill

The Power of Movies

Movies are more than a source of entertainment; they are a multi-billion-dollar global industry. Many iconic movies did not have the biggest budget or the most famous cast. Understanding what makes a movie financially successful is fascinating from both a business and an artistic perspective. As a movie fan, I’m curious about what drives a film’s box office performance. I wanted to investigate how genre, production budget, star power, and release timing affect a movie’s success.

Introduction to Box Office Mojo

Box Office Mojo is a website that publishes box office data. You can explore movies by year, country, release date, and more. I found the dataset fascinating and would encourage other people to investigate movie trends. A link to the data can be found at https://myxavier-my.sharepoint.com/:x:/g/personal/morrillk_xavier_edu/EW3CbU6yKeFJvJv7cK-BX6IBXkfJ9_f0sTzv6WYc5z6adA?download=1

The data set includes all the movies from the past 10 years, 2015-2025, and contains eight variables that are used for analysis. Please note that Genre, Budget, and Running Time appear on the website but could not be collected, so they are unavailable for use in the datasheet. I scraped the information from Box Office Mojo, cleaned it, and exported it to a CSV file. My hypothesis is that a higher budget, a larger number of theaters, and the distributor name are associated with higher box office success. I intend to explore how those variables correlate with box office success.

Data Dictionary

Release: The name of the movie.

Rank: Box office rank that year

Genre: Movie category

Total Gross: The total revenue of the movie

Theaters: The number of theaters the movie played in

Release Date: The date the movie came out.

Distributor: The name of the company that distributed the movie

Year: The year the movie came out

IMDb Data Set

I also dove into the IMDb data set, as it provided deeper insight into movies. I liked this data set because it provided information, such as budget and genre, that Box Office Mojo did not. I specifically wanted to see how budget affects a movie’s success. The data set includes several variables that are useful for analysis, such as:

Names: The title of the movie

date_x: The release date

score: Rating of the movie out of 100

genre: genre of the movie

overview: A description of the movie

crew: crew members

orig_title: Original title of the movie

status: release status

orig_lang: Original language of the movie

budget_x: Movie budget

# A tibble: 1 × 1
  AverageBudget
          <dbl>
1     64882379.
# A tibble: 1 × 1
  AverageScore
         <dbl>
1         63.5

SUMMARY STATISTICS

Average movie budget: $64,882,379

Average Score: 63.5
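
As a rough illustration, averages like the two above could be computed along these lines. This is a minimal Python sketch, not the code used for this report. The sample rows and the helper name column_mean are placeholders made up for the example; only the column names score and budget_x come from the data set.

```python
from statistics import mean

# Placeholder rows standing in for the IMDb file. These values are
# invented for illustration and are NOT figures from the real data set.
movies = [
    {"names": "Movie A", "score": 70.0, "budget_x": 100_000_000.0},
    {"names": "Movie B", "score": 55.0, "budget_x": 20_000_000.0},
    {"names": "Movie C", "score": 64.0, "budget_x": 60_000_000.0},
]


def column_mean(rows, column):
    """Average a numeric column, skipping rows where the value is missing."""
    values = [row[column] for row in rows if row.get(column) is not None]
    return mean(values)


average_budget = column_mean(movies, "budget_x")
average_score = column_mean(movies, "score")
print(f"Average movie budget: ${average_budget:,.0f}")
print(f"Average score: {average_score:.1f}")
```

Skipping missing values before averaging matters here, because a blank budget treated as zero would pull the average down.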

Top 10 Distributors by Gross

I wanted to look at which distributors have earned the most from movies over the past 10 years. Looking at this graph, you can see that Walt Disney has earned the most from its movies, grossing over 20 billion dollars. Universal Pictures and Warner Bros. follow, but there is a noticeable drop compared to Disney. Total gross declines steeply for Sony Pictures, Paramount, and Lionsgate. Disney is dominating the movie industry in terms of revenue, most likely because of its Marvel and Star Wars franchises.
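
The ranking behind a chart like this comes from summing Total Gross per distributor and sorting the totals. The Python sketch below shows one way to do that, assuming each movie is a dictionary keyed by the Distributor and Total Gross columns from the data dictionary. The function name top_distributors and the sample rows are hypothetical, and the amounts are placeholders rather than real box office figures.

```python
from collections import defaultdict


def top_distributors(rows, n=10):
    """Sum Total Gross per distributor and return the n largest totals."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Distributor"]] += row["Total Gross"]
    # Sort by total gross, largest first, and keep the first n entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Placeholder rows for illustration only, not real data.
sample_rows = [
    {"Release": "Movie A", "Distributor": "Studio X", "Total Gross": 300.0},
    {"Release": "Movie B", "Distributor": "Studio Y", "Total Gross": 250.0},
    {"Release": "Movie C", "Distributor": "Studio X", "Total Gross": 150.0},
]

for distributor, total in top_distributors(sample_rows, n=2):
    print(distributor, total)
```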

This graph explores the relationship between the number of theaters a movie is shown in and its total box office revenue. The x-axis represents the number of theaters, while the y-axis shows the gross revenue in dollars. Each dot represents an individual movie, color-coded by its release year. Overall, the trend suggests that movies released in more theaters tend to earn higher box office revenue. This is expected, as wider distribution generally leads to a larger audience and higher earnings. Notably, revenue often increases sharply once a movie is shown in more than 3,000 theaters. This observation also raised questions for me about how streaming-only movies generate revenue, since they don’t earn income from traditional box office sales.

Total Box Office Success by Year

This graph examines the relationship between total box office revenue and year. It displays the combined gross revenue of all movies released annually from 2015 to 2025. The data shows that box office revenue peaked around 2018–2019, reaching nearly $12 billion. A sharp decline occurred in 2020, corresponding with the COVID-19 pandemic and widespread theater closures. Since then, revenue has gradually rebounded, indicating that audiences are returning to theaters. The noticeable drop in 2025 is likely due to the fact that only a few months of data are available so far.

Number of Movies by Distributor from 2015-2025

The graph shows the number of movies each distributor has released in the past ten years. The number of movies is on the x-axis, and the distributor name is on the y-axis. Warner Bros. has released the most movies during this time, with 191 titles. Universal is close behind with 184 movies. A24 and Fathom Events are smaller distributors, but they still made the top ten. Although Walt Disney ranked first in total revenue by distributor, it ranked fourth in the number of movies released over the ten years. This suggests that Disney earns more per release than its competitors.
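
One way to check the revenue-per-release comparison directly is to divide each distributor's total gross by its number of releases. The Python sketch below is an assumption-laden illustration: the function name gross_per_release and the sample rows are hypothetical, the amounts are placeholders, and only the Distributor and Total Gross column names come from the data dictionary.

```python
from collections import defaultdict


def gross_per_release(rows):
    """Return average Total Gross per movie for each distributor."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["Distributor"]] += row["Total Gross"]
        counts[row["Distributor"]] += 1
    # Every distributor in totals has at least one release, so no zero division.
    return {name: totals[name] / counts[name] for name in totals}


# Placeholder rows for illustration only, not real data.
sample_rows = [
    {"Distributor": "Studio X", "Total Gross": 400.0},
    {"Distributor": "Studio X", "Total Gross": 200.0},
    {"Distributor": "Studio Y", "Total Gross": 100.0},
    {"Distributor": "Studio Y", "Total Gross": 100.0},
    {"Distributor": "Studio Y", "Total Gross": 100.0},
]

for name, average in gross_per_release(sample_rows).items():
    print(name, average)
```

In this toy example, Studio Y releases more movies but Studio X earns more per release, which is the same pattern described for Disney and Warner Bros.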

Count of Gross by Movie Ranking

This graph shows the total gross at each box office rank, summed over the past ten years.

Movie Revenue by Score

The visualization below uses data from an IMDb file from Kaggle. The graph shows the relationship between audience/critic score (x-axis) and movie revenue in USD (y-axis). Each dot on the graph represents a movie. Scores range up to 100, the highest a movie can receive. Most movies score between 50 and 80, and it is rare for a movie to score above 80. A movie generating over $1 billion is also rare, though a few outliers do. The regression line trends slightly upward, which suggests a weak positive correlation between score and revenue.
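
The strength of a relationship like this can be summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The Python sketch below computes it from scratch; it is an illustration rather than the calculation used for this report. The function name pearson_r is hypothetical, and the scores and revenues are placeholder values, not numbers from the IMDb file.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Placeholder values for illustration only (revenue in millions of USD).
scores = [52.0, 60.0, 65.0, 71.0, 78.0]
revenues = [120.0, 40.0, 300.0, 90.0, 250.0]
print(f"r = {pearson_r(scores, revenues):.2f}")
```

A value near 0 means the points are widely scattered around the line, while a value near 1 means they sit close to an upward-sloping line.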

Budget by Revenue

This graph illustrates the relationship between a movie’s budget (x-axis) and its revenue (y-axis). The blue line is the line of best fit, and it indicates a positive trend: as a movie’s budget increases, its revenue also tends to rise. However, there are notable outliers: some lower-budget films generated billions in revenue, while some high-budget films underperformed financially. This shows that while budget is a factor in box office success, it does not guarantee it.
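
A line of best fit is usually found with ordinary least squares, which picks the slope and intercept that minimize the squared vertical distances from the points to the line. The Python sketch below shows that calculation under the assumption of a simple one-variable linear fit; whether the blue line in the graph was fitted this way is not confirmed here. The function name fit_line is hypothetical, and the budgets and revenues are placeholders, not values from the data set.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder values for illustration only (both in millions of USD).
budgets = [10.0, 50.0, 100.0, 200.0]
revenues = [30.0, 120.0, 260.0, 450.0]

slope, intercept = fit_line(budgets, revenues)
print(f"revenue = {slope:.2f} * budget + {intercept:.2f}")
```

A positive slope matches the upward trend described above. The outliers are the movies that sit far from the fitted line.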

Conclusion

In this project, I combined two data sets to explore factors that may influence a movie’s financial success. The primary data set, scraped from Box Office Mojo, contained detailed box office performance data from 2015 to 2025. The secondary data set, from IMDb via Kaggle, included key attributes such as movie scores, genres, and production budgets. One particularly interesting insight was the performance of Disney compared to its competitors. Despite releasing significantly fewer films, Disney consistently produced box office hits. For example, Warner Bros. released 61 more movies than Disney during this period, but Disney generated over $10 billion more in total gross revenue. This highlights Disney’s efficiency in creating movies that are box office successes.

Additionally, the data revealed that a high production budget does not always guarantee box office success. While some high-budget films performed well, others did not, demonstrating a complex relationship between budget size and box office performance. My hypothesis was supported because there is a positive correlation between budget and revenue, but a high budget is not a guarantee. I also believed that the number of theaters a movie is shown in and the distributor name play a part in a movie’s success. The analysis supports that as well.