Data import

I imported the Movie Attendance in Iceland data set and took a quick look at its contents. It covers the years 2017 through 2021, and each weekly record includes a movie attendance figure (adm.to.date). I was curious whether movie attendance in Iceland is seasonal, so I averaged adm.to.date for every month and compared each year’s trend line in a line graph.

Reorganizing the data

The next step was to clean and organize the data into data frames that I could plot by month and year. I used mutate() to derive month and year columns from blog.date and then used group_by() and summarize() to calculate the mean of adm.to.date for each month of each year. I plotted the resulting data frame, ice_movies_2, with ggplot() and saw some interesting trends, but I wanted to add more information. Specifically, are box-office-topping movies influencing these trends?
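As a rough illustration of that month-and-year averaging, here is a minimal sketch on made-up data. The toy_weeks and toy_monthly names and every value in them are hypothetical; only the blog.date and adm.to.date column names come from the data set.

```r
library(dplyr)
library(lubridate)

# Hypothetical weekly records standing in for the real data set
toy_weeks <- tibble(
  blog.date   = as.Date(c("2018-01-07", "2018-01-14", "2018-02-04", "2019-01-06")),
  adm.to.date = c(1000, 3000, 500, 2000)
)

toy_monthly <- toy_weeks %>%
  mutate(
    month = month(blog.date, label = TRUE),  # ordered factor of month names
    year  = year(blog.date)
  ) %>%
  group_by(month, year) %>%
  # One row per month-year pair, holding the mean of that month's weekly values
  summarize(avg.adm.to.date = mean(adm.to.date), .groups = "drop")

toy_monthly
```

The two January 2018 weeks collapse into a single row, while January 2019 stays separate because year is part of the grouping.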

Top box office films

I created a data frame that summarized films by month and year based on their average box office earnings. I used slice_max() to keep the 20 film-and-month combinations with the highest average earnings and then joined my data frames with inner_join() so that I could overlay the top box office films on my admissions trend lines.
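The slice_max() and inner_join() combination can be sketched on toy data as follows. The data frames, film titles, and numbers are all hypothetical, and months are plain numbers here rather than labeled factors to keep the example short.

```r
library(dplyr)

# Hypothetical monthly admissions averages
toy_adm <- tibble(
  month = c(1, 2, 3),
  year = c(2018, 2018, 2018),
  avg.adm.to.date = c(2000, 500, 1200)
)

# Hypothetical average box office earnings per film and month
toy_films <- tibble(
  film.title = c("Film A", "Film B", "Film C", "Film D"),
  month = c(1, 1, 2, 3),
  year = 2018,
  top.box.office = c(900, 300, 150, 700)
)

# Keep the two highest-earning rows across the whole table
toy_top <- toy_films %>%
  slice_max(order_by = top.box.office, n = 2, with_ties = FALSE)

# Keep only the months that have a top film, with admissions alongside the title
toy_labels <- inner_join(toy_adm, toy_top, by = c("year", "month"))

toy_labels
```

Because inner_join() keeps only matching rows, months without a top film drop out, which is why the result suits a label layer rather than the trend lines themselves.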

Note: I decided to filter out the 2017 and 2021 data because both years are missing several months of data. I also considered adding a gradient scale to the film labels to indicate which films had the highest average box office earnings, but it complicated the graph too much and made it difficult to read.
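One way to check which years are incomplete is to count the distinct months that appear in each year. This is only a sketch on made-up dates: toy_dates, months_per_year, and complete_years are hypothetical names, and the 12-month threshold is an assumption rather than the rule used for the note above.

```r
library(dplyr)
library(lubridate)

# Hypothetical dates: 2018 has all twelve months, 2021 only the first three
toy_dates <- tibble(
  blog.date = c(
    seq(as.Date("2018-01-07"), by = "month", length.out = 12),
    seq(as.Date("2021-01-03"), by = "month", length.out = 3)
  )
)

# Count how many distinct months have at least one record in each year
months_per_year <- toy_dates %>%
  mutate(year = year(blog.date), month = month(blog.date)) %>%
  group_by(year) %>%
  summarize(n_months = n_distinct(month), .groups = "drop")

# Assumed rule: a year is complete only if all 12 months are present
complete_years <- months_per_year %>%
  filter(n_months == 12) %>%
  pull(year)

complete_years
```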

library(tidyverse)
library(lubridate)
library(ggrepel)
# Read the semicolon-delimited file, which is encoded as ISO-8859-2
ice_movies <- read_delim("https://query.data.world/s/pmbfldxflx7ttdyfs23cx3abehcl5c", delim = ";",
    locale = locale(encoding = "ISO-8859-2", asciify = TRUE))

# Average adm.to.date for each month of each year
ice_movies_2 <- ice_movies %>%
  select(blog.date, film.title, adm.to.date, total.box.o.to.date) %>%
  mutate(month = month(blog.date, label = TRUE), year = year(blog.date)) %>%
  group_by(month, year) %>%
  drop_na() %>%
  summarize(avg.adm.to.date = mean(adm.to.date), .groups = "drop")

# Average box office per film and month, then keep the 20 highest film-months
top_movies <- ice_movies %>%
  mutate(month = month(blog.date, label = TRUE), year = year(blog.date)) %>%
  group_by(month, year, film.title) %>%
  drop_na() %>%
  summarize(top.box.office = mean(total.box.o.to.date), .groups = "drop") %>%
  select(film.title, top.box.office, month, year) %>%
  slice_max(order_by = top.box.office, n = 20, with_ties = FALSE)

# For films that appear in several months, keep only their best month
top_movies_year <- top_movies %>%
  group_by(film.title) %>%
  filter(top.box.office == max(top.box.office)) %>%
  ungroup()

# Attach each top film to the admissions average for its month, 2018 through 2020
top_movies_year_2 <- inner_join(ice_movies_2, top_movies_year, by = c("year", "month")) %>%
  filter(year >= 2018 & year < 2021) %>%
  select(film.title, top.box.office, month, year, avg.adm.to.date)

ice_movies_2 %>%
  filter(year >= 2018 & year < 2021) %>%
  ggplot(aes(x = month, y = avg.adm.to.date, group = factor(year), fill = factor(year), color = factor(year))) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines as of ggplot2 3.4.0
  scale_color_brewer(palette = "Set2") +
  scale_fill_brewer(palette = "Set2") +
  geom_point(data = top_movies_year_2, mapping = aes(month, avg.adm.to.date), size = 3) +
  geom_label_repel(data = top_movies_year_2, mapping = aes(label = film.title), force_pull = 0, color = "white",
    ylim = c(1800, 4600), xlim = c(1, 12), segment.linetype = 5, min.segment.length = 0,
    segment.color = "grey", size = 2.8, fontface = "bold") +
  guides(fill = guide_legend(override.aes = aes(fill = NA, label = ""))) +
  ylim(0, 15000) +
  ylab("Average movie admissions") +
  xlab("Month") +
  labs(title = "Average movie admission numbers in Iceland by month for 2018, 2019, and 2020", subtitle = "with top box office films by month", caption = "Data Source: Movie attendance in Iceland", color = "Year", fill = "Year") +
  theme_minimal() + theme(legend.position = "bottom")

Marvel and Mamma Mia!

Unsurprisingly, there appears to be a relationship between monthly movie admissions and the top box office earning films. Marvel’s Avengers movies drove significant traffic to Icelandic movie theaters in the summers of 2018 and 2019. I was also surprised to see Mamma Mia! topping the list, but I wonder if ABBA’s music is more prominent in Nordic countries like Iceland. It appears that the summer months (Jun. - Sep.) see the highest admission numbers, followed by the holiday months (Nov. - Jan.). Additionally, the graph highlights how Covid-19 impacted movie theater admissions in Iceland: admissions have decreased steadily since May 2020.