Introduction
Our project focuses on IMDb, the online movie database and review service. We look specifically at the years 2011 to 2021 and examine how several variables progressed over the last decade. We are also interested in how 2020 and 2021 compare to the prior years, given the impact of COVID-19 on movie theaters and the film industry as a whole. Our overall goal is to see how much the movie industry has changed from the 2010s until now, and to uncover trends in the variables we focus on.
Data Collection
We scraped our data directly from IMDb into R using the SelectorGadget tool. We visited the webpage associated with each year (for example, 2021: https://www.imdb.com/search/title/?year=2021&title_type=feature&sort=boxoffice_gross_us,desc) and used the read_html function to acquire our desired variables.
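As an illustration of the per-year pages described above, the search URL can be generated for every year in the study. This is a minimal sketch in Python rather than the R used for the project; the names BASE, build_url, and YEARS are hypothetical, and only the URL pattern is taken from the 2021 example.

```python
# Hypothetical helper: builds the IMDb search URL for one year,
# following the pattern of the 2021 example URL given in the text.
BASE = "https://www.imdb.com/search/title/"


def build_url(year: int) -> str:
    """Return the search URL for feature films of `year`, sorted by US box office gross."""
    return f"{BASE}?year={year}&title_type=feature&sort=boxoffice_gross_us,desc"


# The project covers 2011 through 2021 inclusive.
YEARS = range(2011, 2022)
urls = [build_url(year) for year in YEARS]

print(len(urls))  # one URL per year in the study
print(urls[-1])   # the 2021 page
```

Each URL in the list could then be passed to whatever page-reading function is in use (read_html in the original R workflow).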
Variables
The variables we will be analyzing are the following:
title: The listed title name on IMDb
genre: The listed genre (up to 3 can be stated) on IMDb
genre1: A separate variable derived from the original genre variable, containing only the first listed genre
gross: The listed box office gross ($USD) on IMDb
starRating: The listed star rating (scale of 1-10) on IMDb
ageRating: The listed age rating (R, PG-13, PG, G) on IMDb
length: The listed length of movies (in minutes) on IMDb
year: The listed year of release on IMDb
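The genre1 variable above can be derived from genre with a simple split. The sketch below is in Python rather than the R used for the project, and it assumes the raw genre value is a comma-separated string such as "Action, Adventure, Sci-Fi"; that format and the function name first_genre are assumptions, not details confirmed by the data.

```python
def first_genre(genre: str) -> str:
    """Return the first listed genre from a comma-separated genre string.

    Assumes genres are separated by commas (an assumption about the
    scraped format); surrounding whitespace is removed.
    """
    return genre.split(",")[0].strip()


# Illustrative inputs only, not rows from the project's dataset.
print(first_genre("Action, Adventure, Sci-Fi"))
print(first_genre("  Animation "))
```

A film with a single listed genre is returned unchanged, so the same function works for every row.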
Top Genres, Gross, and Age Rating
This graph represents the relationship between the top genres and their movies’ respective gross over the past decade. As you can see, the majority of top movies in each year are action films. It is also interesting to note that almost every action movie is rated PG-13 or R. The other key point to notice is the drop in gross after 2019. This is likely due to COVID-19: it became very hard for producers to make good-quality movies in a short span of time, and it was harder for the average movie-going family to actually watch a movie in a theater.
We can delve deeper into the age rating information presented in the previous graph. It is important to note that IMDb allows films to have up to 3 listed genres, but we only look at the first listed genre for each film. As previously stated, action films are mostly rated PG-13, but they also have the most age-rating variety compared to other genres. Animation seems to be the most family-friendly genre, with only PG and G ratings. Interestingly, comedy, crime, and horror are all rated exclusively R. However, this may simply mean that very few films have comedy, crime, or horror as their first listed genre.
Comparing the Top Grossing Year (2015) vs the Pandemic (2020) and the Following Year (2021)
We can take a closer look at the effects of the pandemic. In 2015, Star Wars: The Force Awakens grossed nearly a billion dollars, while Bad Boys for Life in 2020 grossed a little over 200 million dollars. These two movies are the top earners in their respective years, yet their results are vastly different. Other factors influence what we see (for example, Star Wars is a vastly popular series), but the remaining movies tell the same story. All of the movies shown for 2015 grossed over 200 million dollars, while only the top movie of 2020 accomplished the same feat.
Even as pandemic restrictions loosened slightly in 2021, we still see fairly low gross earnings overall. While most people are able to go out to theaters now, some may not feel safe viewing movies in person. There is another culprit: streaming services. Some production companies (like Disney) released their films on their own services first, then in theaters afterward. As a result, people were able to view the films online first, eliminating the need to go to theaters.
Do Longer Movies Make More Money?
This next graph shows the relationship between the length of the top movies and their respective gross. The first thing to point out is that there isn’t much data for the less popular genres. However, the genres with a lot of data are very interesting to look at. For action movies, it looks like films that are either very short or very long do either very well or very poorly at the box office. For example, there is an action movie around 165 minutes long that did not do well, but there is an action movie at 180 minutes that did extremely well. It is also important to note that horror movies do worse at the box office as they get longer.
Relationship Between Star Rating and Gross
This graph highlights the relationship between the IMDb star rating and the movies’ gross. The first thing to note is that as the ratings increase, the gross also tends to increase (with some variation). However, the outliers are very intriguing. Some films received poor ratings yet still did better at the box office than movies that were rated much higher. It is also important to note that many of the movies that neither did well at the box office nor received a good star rating were R-rated. Comparing R-rated films to PG-13 films, it is easy to see that PG-13 films are often more successful in both star rating and gross.
Examining star rating more closely, we can look at the lowest and highest rated movies according to IMDb users. Four movies share the highest rating of 8.4, and one movie has the lowest rating of 4.9. Interestingly, Joker and Twilight: Breaking Dawn made close to the same amount in gross earnings, despite being rated on opposite ends of the spectrum. Overall, we can see that being rated highly by IMDb users doesn’t necessarily mean higher gross earnings.
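One way to put a number on the relationship between star rating and gross is a Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is in Python rather than the R used for the project; the function name pearson is hypothetical, and the sample values are made up purely to show the call, not taken from our data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values (NOT from the project's dataset):
star_rating = [5.0, 6.5, 7.0, 8.0]
gross_millions = [150.0, 210.0, 190.0, 400.0]
print(round(pearson(star_rating, gross_millions), 2))
```

A value near 1 would support the upward trend seen in the graph, while a value closer to 0 would reflect the outliers discussed above. Correlation alone says nothing about why the two variables move together.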
Conclusion
With so many movies being produced and released every year, it’s hard to keep up with the continuously growing industry. Fortunately, this project allowed us to take a closer look into specific movies that performed the best and worst from the past decade, as well as how different variables (such as age rating or length) have an effect on gross earnings or star rating. We were also able to visually see how much of an impact the pandemic had on gross earnings. Will box office gross earnings improve in the years going forward, or will streaming services from the comfort of home continue to hold their grasp on movie-watchers? How will these low box office earnings continue to affect the movie theaters, and the movie industry as a whole? Hopefully, as we continue to work around COVID-19, the situation can improve.