I am interested in how movies have changed over time. As I’ve gotten older, I’ve noticed that movies have gotten substantially longer. For example, Avatar: The Way of Water and John Wick: Chapter 4 both ran close to or over 3 hours, and both were released within the past year or so (I’m writing this in 2023). I’m curious to see how movies have gained or lost run-time over the years, and whether those changes in run-time have influenced other aspects of movies as well. I am also interested in how some variables affect others, such as run-time’s effect on ratings.
The raw data was sourced from https://www.imdb.com/search/title/?groups=top_1000, but the data we will work with is hosted by me. If you’d like my version of the data, just ask!
To answer these questions, I will use a data set scraped from the top 1000 movies on IMDb. The data set contains information on titles, years, genres, run-times, IMDb ratings, Metascores, and descriptions, which covers what we need to analyze the relationships between run-times, IMDb ratings, Metascores, and genres for the top 1000 movies. Here’s a table that helps explain the column headers:
| id | years | titles | genres | rated | runtime | ratings | metascores | descriptions |
|---|---|---|---|---|---|---|---|---|
| Movie ID | Release Year | Movie Name | Action, Crime, Thriller, etc. | R, PG, etc. | Movie Length | Ratings, 0-10 | Metascore Rating, 0-100 | Movie Info |
As I mentioned earlier, movies have seemingly gotten much longer over the years. So, let’s see if this is a figment of our imaginations, or if movies really are getting longer and longer.
My method was to group the movies by release year and then calculate the average duration for each year. The results are very interesting, with the longest-duration “era” coming right before the 1970s. This was a time when most movies were either WWII-esque or Westerns. According to the graph, these movies averaged just under 180 minutes.
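The group-and-average step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not necessarily the code behind the graph. The column names `years` and `runtime` come from the table above, the rows are made-up toy values, and I am assuming `runtime` has already been parsed into whole minutes.

```python
from collections import defaultdict
from statistics import mean

# Toy rows with made-up values; only the column names come from the table.
rows = [
    {"years": 1968, "runtime": 160},
    {"years": 1968, "runtime": 140},
    {"years": 2016, "runtime": 130},
    {"years": 2016, "runtime": 120},
    {"years": 2022, "runtime": 170},
]

def average_runtime_by_year(rows):
    """Return {year: mean runtime in minutes}, ordered by year."""
    by_year = defaultdict(list)
    for row in rows:
        # Collect every runtime under its release year.
        by_year[row["years"]].append(row["runtime"])
    # Average each year's list; sorting keeps the years chronological.
    return {year: mean(times) for year, times in sorted(by_year.items())}

avg_runtime = average_runtime_by_year(rows)
for year, minutes in avg_runtime.items():
    print(year, minutes)
```

Plotting `avg_runtime` with years on the x-axis and minutes on the y-axis would give a trend line like the one discussed here.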
After the 70s, the average duration of movies decreased to around 130-140 minutes. As we move through the 2000s, the trend stayed fairly flat until around 2016. This was when Marvel really started to ramp up their movies, and the new Star Wars trilogy was released around these years as well. Both of these franchises have longer run-times than usual, so it’s not an unexpected uptick in duration. However, around 2022, the duration of movies skyrocketed to a point we haven’t seen since the 1970s.
There does not seem to be much of a correlation between the run-time (duration) of a movie and its Metascore. I used Metascore here because it is based on reviews from professional movie critics rather than general audience votes. While the fitted curve between the two variables rises slightly, the standard error is quite high (shown by the shaded band). Even across the years, this trend has stayed the same.
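One way to put a number on "not much of a correlation" is a Pearson correlation coefficient. To be clear, this is a different technique from the smoothed curve in the graph; it is only a rough sketch of how such a check might look. The rows are made-up toy values, and skipping movies without a Metascore (stored here as `None`) is my assumption about how missing values would be handled.

```python
from math import sqrt

# Toy rows with made-up values; None marks a missing Metascore (assumption).
rows = [
    {"runtime": 100, "metascores": 70},
    {"runtime": 120, "metascores": 75},
    {"runtime": 140, "metascores": 72},
    {"runtime": 160, "metascores": None},
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

def runtime_metascore_correlation(rows):
    """Correlate runtime with Metascore, skipping rows with no Metascore."""
    pairs = [(r["runtime"], r["metascores"])
             for r in rows if r["metascores"] is not None]
    runtimes, scores = zip(*pairs)
    return pearson(runtimes, scores)

r = runtime_metascore_correlation(rows)
print(round(r, 3))
```

A value near 0 would agree with the weak relationship described above, while values near 1 or -1 would indicate a strong linear relationship.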
I needed to multiply the IMDb ratings by 10; otherwise, the graph would have been all over the place. Since Metascore is graded on 0-100, I put the IMDb ratings on the same scale. This is an interesting result overall. First, the average runtime again seems to have little influence on the ratings/Metascores of movies. However, there are some instances, such as right after 1985, where there was an inverse relationship between runtime and ratings. The confidence in that is fairly low, though. What caught my eye was how the average IMDb ratings and the average Metascores intertwined. They mostly go hand-in-hand, but since the 2000s, IMDb ratings have plateaued. Unlike IMDb ratings, Metascores have been on a roller coaster throughout the 2000s, peaking in 2020. 2023 seems to be a good year for movies, as both the average IMDb ratings and the average Metascores have started to rise!
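Here is a small sketch of that rescaling step: IMDb ratings (0-10) are multiplied by 10 so they share the 0-100 scale with Metascores, and both are then averaged per year. The rows are made-up toy values, the output key names `imdb_x10` and `metascore` are hypothetical, and treating a missing Metascore as `None` is my assumption.

```python
from collections import defaultdict
from statistics import mean

# Toy rows with made-up values; column names follow the table above.
rows = [
    {"years": 2019, "ratings": 8.0, "metascores": 90},
    {"years": 2019, "ratings": 9.0, "metascores": 70},
    {"years": 2020, "ratings": 8.5, "metascores": None},
]

def yearly_scores(rows):
    """Return per-year averages of IMDb rating x 10 and Metascore."""
    imdb = defaultdict(list)
    meta = defaultdict(list)
    for row in rows:
        # Rescale the 0-10 IMDb rating to the 0-100 Metascore scale.
        imdb[row["years"]].append(row["ratings"] * 10)
        if row["metascores"] is not None:
            meta[row["years"]].append(row["metascores"])
    return {
        year: {
            "imdb_x10": mean(imdb[year]),
            # A year with no Metascores at all gets None instead of a mean.
            "metascore": mean(meta[year]) if year in meta else None,
        }
        for year in sorted(imdb)
    }

scores = yearly_scores(rows)
for year, values in scores.items():
    print(year, values)
```

With both series on the same scale, they can share one y-axis, which makes the kind of side-by-side comparison described above possible.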
From this graph, there hardly seems to be any correlation between genre and ratings. However, when I ran this next graph to include only the top-rated genres, things were a bit different:
Now, we see that Westerns have the best overall IMDb ratings, followed distantly by horror, drama, and comedy.
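A ranking like this could be produced along the following lines. This is only a sketch: I am assuming the `genres` column stores a movie's genres as one comma-separated string (as the table's "Action, Crime, Thriller" example suggests), that a movie counts once toward each of its genres, and the rows are made-up toy values.

```python
from collections import defaultdict
from statistics import mean

# Toy rows with made-up values; genres are assumed to be comma-separated.
rows = [
    {"genres": "Western", "ratings": 9.0},
    {"genres": "Drama, Western", "ratings": 8.0},
    {"genres": "Drama, Comedy", "ratings": 7.0},
]

def top_genres(rows, n=4):
    """Return the n genres with the highest average IMDb rating."""
    by_genre = defaultdict(list)
    for row in rows:
        # Split the combined genre string so each genre gets the rating.
        for genre in row["genres"].split(","):
            by_genre[genre.strip()].append(row["ratings"])
    averages = {genre: mean(vals) for genre, vals in by_genre.items()}
    # Sort by average rating, best first, and keep the top n.
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

best = top_genres(rows, n=2)
print(best)
```

One caveat with this approach is that a genre with only a handful of movies can rank highly on the strength of a few titles, so it may be worth checking how many movies sit behind each average.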
There is a lot of interesting information in the IMDb data set that is hidden from the naked eye. Personally, I was happy to see that movies have indeed increased in run-time in recent years. It is a noticeable increase too, and people are definitely starting to pick up on it with the release of these newer, longer movies. Unfortunately, there did not seem to be a strong correlation between run-times and ratings. I was hoping for one, as that would have massive implications for the movie industry, and would also help explain why movies have gotten longer. Overall, there was a lot of interesting data here that helped to clear things up.
Thanks for reading!