Introduction

This lab explores whether 1980s movies with more “action elements” (explosions, stunts, and car chases) performed better commercially or critically. The intended audience is casual movie fans who believe that “more explosions = better movie.” Using ggplot2 and other tidyverse tools, I compare how explosion counts are distributed across the three most explosive genres, how explosions relate to box office revenue, and how explosions and stunts vary together within each genre.

Libraries and Global Theme

library(readxl)
library(ggplot2)
library(dplyr)
library(RColorBrewer)
library(here)

theme_set(theme_minimal(base_size = 14))
theme_update(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(margin = margin(b = 6)),
plot.caption = element_text(size = 10, color = "grey40"),
legend.position = "top",
panel.grid.minor = element_blank()
)

pal_fill  <- scale_fill_brewer(palette = "Set2")
pal_color <- scale_color_brewer(palette = "Set1")

Load Data

# Load the movie dataset from Excel
# here() is used so the path works on any computer when data is in the project folder
movies <- readxl::read_xlsx(here("data", "80s_Movie_Database.xlsx"))
# Print data types and the first few rows
glimpse(movies)
## Rows: 400
## Columns: 20
## $ Title                       <chr> "The Untouchables (1)", "Scarface (2)", "B…
## $ Year                        <dbl> 1984, 1989, 1985, 1983, 1989, 1983, 1987, …
## $ Director                    <chr> "James Cameron", "Steven Spielberg", "Ridl…
## $ Genre                       <chr> "Adventure", "Horror", "Sci-Fi", "Sci-Fi",…
## $ `Box Office (USD Millions)` <dbl> 281.4, 161.4, 684.1, 187.5, 183.5, 257.0, …
## $ `IMDb Rating`               <dbl> 6.2, 7.9, 9.5, 6.7, 8.6, 9.4, 7.2, 6.6, 8.…
## $ `Rotten Tomatoes (%)`       <dbl> 54, 67, 70, 74, 41, 83, 96, 43, 90, 59, 43…
## $ `Main Actor`                <chr> "Arnold Schwarzenegger", "Tom Cruise", "Ma…
## $ `Main Actress`              <chr> "Molly Ringwald", "Sigourney Weaver", "Lin…
## $ `Filming Location`          <chr> "Chicago", "Los Angeles", "London", "Tokyo…
## $ `Number of Oscars`          <dbl> 3, 0, 6, 4, 6, 5, 8, 4, 5, 2, 5, 1, 6, 1, …
## $ `Runtime (min)`             <dbl> 146, 144, 125, 89, 128, 83, 104, 92, 122, …
## $ `Budget (Millions)`         <dbl> 48.8, 8.2, 27.5, 83.6, 51.3, 68.5, 21.7, 6…
## $ Tagline                     <chr> "I'll be back.", "Roads? Where we’re going…
## $ `Villain Name`              <chr> "Darth Vader", "The Thing", "The Thing", "…
## $ `Famous Quote`              <chr> "Yippee-ki-yay!", "Say hello to my little …
## $ `Stunt Count`               <dbl> 7, 1, 1, 16, 13, 19, 38, 40, 16, 18, 5, 41…
## $ `Car Chase Scenes`          <dbl> 3, 1, 1, 5, 1, 6, 5, 9, 3, 10, 7, 9, 6, 3,…
## $ Explosions                  <dbl> 6, 4, 22, 1, 22, 21, 15, 0, 23, 1, 9, 13, …
## $ `Fun Fact`                  <chr> "The director almost cast a completely dif…

Plot 1 - Distribution of Explosions Across the Top 3 Genres

# Identify which genres have the highest average explosions
# slice_max() returns the top 3 genres with the highest mean explosions
top_genres <- movies %>% 
  group_by(Genre) %>% 
  summarize(mean_explosions = mean(Explosions, na.rm = TRUE)) %>% 
  slice_max(order_by = mean_explosions, n = 3, with_ties = FALSE) %>% # with_ties = FALSE keeps exactly 3 genres even if means tie
  pull(Genre)

# Create density plot showing the distribution of explosions for the 3 genres
Plot_Explosions_Distribution <- movies %>%
  filter(Genre %in% top_genres) %>%
  ggplot(aes(x = Explosions, fill = Genre)) +
  geom_density(alpha = 0.6, linewidth = 1) +
  pal_fill +
  labs(
    title = "How Explosive Were 80s Movies?",
    subtitle = "Distribution of explosions among the three most explosive genres",
    x = "Explosion Count",
    y = "Density",
    caption = "Figure 1. Some genres use explosions more consistently than others. Higher peaks = more common explosion levels."
  )

Plot_Explosions_Distribution

Figure 1. This plot shows the distribution of explosion counts in the three genres with the highest average explosion counts. A peak in a curve marks an explosion count that many movies in that genre share. One genre has a tall peak further to the right, meaning its movies more often have high explosion counts. This supports the story that some genres in the 1980s relied heavily on explosions as part of their identity.
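
One way to back up the reading of Figure 1 with numbers is to summarize each genre's explosion counts. The sketch below is illustrative only: `toy_movies` is a hypothetical stand-in with made-up values, not the lab dataset. The same pipeline could be pointed at `movies` after filtering to `top_genres`.

```r
library(dplyr)

# Hypothetical toy data standing in for the `movies` table.
# The values are invented for illustration, not taken from the lab dataset.
toy_movies <- data.frame(
  Genre      = rep(c("Action", "Sci-Fi", "Horror"), each = 4),
  Explosions = c(18, 20, 22, 24,   2, 10, 25, 27,   1, 3, 4, 8)
)

# The mean shows where a genre's curve is centered;
# the standard deviation shows how tightly its movies cluster around it.
explosion_summary <- toy_movies %>%
  group_by(Genre) %>%
  summarize(
    n               = n(),
    mean_explosions = mean(Explosions, na.rm = TRUE),
    sd_explosions   = sd(Explosions, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(mean_explosions))

explosion_summary
```

A genre with a high mean and a small standard deviation corresponds to a tall, narrow peak on the right side of the density plot.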

Plot 2 - Average Box Office Revenue for Top 3 Genres

# Rename column so it's easier to type and work with
movies_boxoffice <- movies %>%
  rename(BoxOfficeMillions = `Box Office (USD Millions)`)

# Filter for top genres again
movies_top_genres <- movies_boxoffice %>%
  filter(Genre %in% top_genres)

# Calculate average revenue grouped by number of explosions and genre
# (Shows if more explosions = more money)
avg_box_office <- movies_top_genres %>%
  group_by(Genre, Explosions) %>% # group movies by genre AND explosion count
  summarize(AverageRevenue = mean(BoxOfficeMillions, na.rm = TRUE), .groups = "drop") # .groups = "drop" returns an ungrouped table

# Scatterplot of the averages, with a line connecting each genre's points
Plot_Avg_BoxOffice <- ggplot(avg_box_office, aes(x = Explosions, y = AverageRevenue, color = Genre)) +
  geom_point(size = 3, alpha = 0.7) +
  geom_line(aes(group = Genre), linewidth = 1) + # linewidth replaces the deprecated size argument for lines
  labs(
    title = "Do Explosions Lead to Higher Box Office?",
    subtitle = "Average revenue by number of explosions (Top 3 genres)",
    x = "Explosion Count",
    y = "Avg. Box Office (Millions USD)",
    caption = "Figure 2. No consistent upward trend: explosions alone do not guarantee revenue."
  ) +
  # theme_minimal() is not re-applied here; it would reset the global theme set with theme_set()/theme_update()
  scale_color_brewer(palette = "Set2")

Plot_Avg_BoxOffice

Figure 2. Each dot shows the average box office revenue of the movies in one genre that share the same explosion count. If explosions drove revenue, the dots would climb as the explosion count increases. Instead, the pattern jumps around: some low-explosion groups out-earned high-explosion groups, and groups with many explosions did not consistently earn more. This suggests that explosions alone do not determine box office success.
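
The "no consistent upward trend" reading can also be checked with a rank correlation per genre. This is a minimal sketch under stated assumptions: `toy_movies` is hypothetical data with invented values, and it works on individual movies rather than the per-count averages plotted in Figure 2. Spearman's rho is used here as one reasonable choice because it only asks whether revenue tends to rise with explosions, not whether the relationship is a straight line.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration only.
# "Action" is built to rise steadily, "Sci-Fi" to jump around.
toy_movies <- data.frame(
  Genre             = rep(c("Action", "Sci-Fi"), each = 5),
  Explosions        = c(2, 6, 10, 14, 18,   2, 6, 10, 14, 18),
  BoxOfficeMillions = c(100, 150, 200, 250, 300,   300, 120, 260, 90, 280)
)

# Spearman rank correlation between explosions and revenue, per genre.
# Values near +1 mean revenue rises with explosions; values near 0 mean no clear trend.
explosion_revenue_cor <- toy_movies %>%
  group_by(Genre) %>%
  summarize(
    n            = n(),
    spearman_rho = cor(Explosions, BoxOfficeMillions,
                       use = "complete.obs", method = "spearman"),
    .groups = "drop"
  )

explosion_revenue_cor
```

Applied to the real data, a rho close to zero in every genre would agree with the jumpy pattern in the plot, while a clearly positive rho would call that reading into question.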

Plot 3 - Explosions vs Stunts (Facets by Genre)

# Rename columns for easier typing
movies_action <- movies %>%
  rename(StuntCount = `Stunt Count`, CarChases = `Car Chase Scenes`)

# Filter the same top genres
movies_action_top_genres <- movies_action %>%
  filter(Genre %in% top_genres)

# Scatter plot showing explosions vs stunt count across genres
Plot_Explosions_vs_Stunts <- ggplot(movies_action_top_genres, aes(x = StuntCount, y = Explosions)) +
  geom_point(aes(color = Genre), size = 3, alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "grey30", linetype = "dotted") + # trend line
  labs(
    title = "Explosions vs Stunts Across Top Genres",
    subtitle = "Each point represents a movie; trend lines show direction",
    x = "Stunt Count",
    y = "Explosion Count",
    caption = "Figure 3. Genres show different patterns in how explosions are used with stunts."
  ) +
  scale_color_brewer(palette = "Set2") +
  # theme_minimal() is not re-applied here; it would reset the global theme set with theme_set()/theme_update()
  facet_wrap(~ Genre)

Plot_Explosions_vs_Stunts

Figure 3. Each point is a movie, showing how stunts and explosions relate. Some genres show a rising trend, meaning that movies with more stunts also tended to have more explosions. Other genres are more scattered, suggesting different approaches to action scenes. This means that explosions are part of the action formula, but how they are used depends on the genre.
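
The dotted trend lines in Figure 3 come from a linear fit in each facet, so their direction can be read off as a slope per genre. The sketch below fits the same kind of model with `lm()` on hypothetical data; `toy_movies` and its values are invented for illustration, and base R's `split()` and `sapply()` are used only to keep the example short.

```r
# Hypothetical toy data; values are invented for illustration only.
toy_movies <- data.frame(
  Genre      = rep(c("Action", "Horror"), each = 4),
  StuntCount = c(5, 10, 15, 20,   5, 10, 15, 20),
  Explosions = c(4, 8, 12, 16,    9, 3, 10, 4)
)

# Fit Explosions ~ StuntCount separately for each genre and keep the slope.
# A positive slope means movies with more stunts tend to have more explosions.
stunt_slopes <- sapply(
  split(toy_movies, toy_movies$Genre),
  function(genre_data) {
    fit <- lm(Explosions ~ StuntCount, data = genre_data)
    coef(fit)[["StuntCount"]]
  }
)

stunt_slopes
```

A slope near zero matches a facet where the points look scattered, while a clearly positive slope matches a rising trend line.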

Conclusion

Based on the visualizations, explosions are common in some genres, but they do not guarantee higher box office performance. Some genres consistently use explosions, but when we compare explosions to revenue, there is no clear upward pattern. Movies with fewer explosions sometimes make more money. Explosions contribute to action and excitement, but they are not a replacement for a strong story or other elements that make a movie successful.

Export Plots

# Save figures under the project root, matching how the data path is built with here()
if (!dir.exists(here("figs"))) dir.create(here("figs"))
ggsave(here("figs", "plot1_density.png"), Plot_Explosions_Distribution, width = 7.5, height = 5, dpi = 300)
ggsave(here("figs", "plot2_dotplot.png"), Plot_Avg_BoxOffice, width = 7.5, height = 5, dpi = 300)
ggsave(here("figs", "plot3_faceted_scatter.png"), Plot_Explosions_vs_Stunts, width = 7.5, height = 5, dpi = 300)

References

Healy, K. (2018). Data Visualization: A Practical Introduction. Princeton University Press.

Wilke, C. (2019). Section 29: Telling a story and making a point. In Fundamentals of Data Visualization. O’Reilly Media, Inc.

Wilke, C. (2019). Part I: From Data to Visualization & Part II: Principles of Figure Design. In Fundamentals of Data Visualization. O’Reilly Media, Inc.