Assignment 2

Author

Zachary Bechard

Introduction

Code
library(tidyverse)
ice_movies <- read_delim(
  "https://query.data.world/s/pmbfldxflx7ttdyfs23cx3abehcl5c",
  delim = ";",
  escape_double = FALSE,
  trim_ws = TRUE,
  locale = locale(encoding = "ISO-8859-1")
)

Distribution Analysis

Visualization

Code
ice_movies |>
  group_by(weekend.start) |>
  summarize(total_adm = sum(adm.weekend, na.rm = TRUE)) |>
  ggplot(aes(x = total_adm, fill = after_stat(count))) +
  geom_histogram(bins = 33, color = "black", alpha = 0.7) +
  scale_fill_gradient(low = "brown", high = "darkblue") +
  labs(title = "Distribution of Weekend Movie Admissions", x = "Total Admissions", y = "Frequency", fill = "Count") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5), legend.position = "right")

Histogram of Total Movie Admissions per Weekend in Iceland

Analysis & Reflection

Key Distribution Patterns:

The histogram above shows the total movie admissions per weekend in Iceland. The distribution is right-skewed: most weekends have lower admission counts, and only a few reach very high totals. This pattern suggests that while most weekends attract a fairly consistent number of viewers, occasional weekends, likely tied to major film releases, draw significantly larger crowds. Using 33 bins shows the variation in weekend admissions in fine detail across the range from low to high.
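The skew described above can be checked numerically by comparing the mean with the median. The following is a minimal sketch: the `total_adm` vector holds made-up values that stand in for the summarized weekly totals, and the mean-versus-median comparison is only a heuristic, not a formal test of skewness.

```r
# Hypothetical weekly totals standing in for the summarized `total_adm` column.
total_adm <- c(1200, 1500, 1350, 1800, 1600, 1400, 9500, 1700, 1300, 7200)

# In a right-skewed variable, the long upper tail pulls the mean above the median.
skew_summary <- c(
  mean   = mean(total_adm),
  median = median(total_adm),
  q90    = unname(quantile(total_adm, 0.90))
)
print(skew_summary)

# TRUE is consistent with right skew (heuristic only).
mean_exceeds_median <- skew_summary[["mean"]] > skew_summary[["median"]]
print(mean_exceeds_median)
```

With the real data, the same three statistics could be computed on the `total_adm` column produced by the `summarize()` step.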

Visualization Choices:

The histogram was chosen because it effectively represents the distribution of numerical data, a key principle emphasized in Chapter 6. This chapter highlights how histograms help reveal patterns in data frequency, particularly in large datasets where raw numbers might obscure underlying trends. The use of a color gradient from brown to dark blue aligns with the chapter’s discussion on effective color schemes in visualizations, where contrasting shades help guide the viewer’s focus toward denser data regions. Furthermore, the selection of 33 bins follows the guidance provided on binning strategies, ensuring that the visualization is neither too generalized nor overly detailed. This preserves clarity while accurately depicting movie attendance variations across weekends.
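One way to sanity-check a bin count such as 33 is to compare it with the standard rules of thumb that ship with R. The sketch below assumes simulated right-skewed values in place of the real weekly totals. The report does not state which rule, if any, produced 33 bins, so this is a cross-check rather than a reconstruction of that choice.

```r
set.seed(1)
# Simulated right-skewed weekly totals (an assumption); swap in the real `total_adm` values.
total_adm <- round(rlnorm(200, meanlog = 8, sdlog = 0.6))

# Three common bin-count rules from the grDevices package.
bin_suggestions <- c(
  sturges           = grDevices::nclass.Sturges(total_adm),
  scott             = grDevices::nclass.scott(total_adm),
  freedman_diaconis = grDevices::nclass.FD(total_adm)
)
print(bin_suggestions)
```

If the chosen bin count is far above every suggestion, the histogram may be showing noise; if it is far below, it may be hiding structure.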

Critical Evaluation:

The histogram effectively illustrates the distribution of weekend movie admissions, making it easy to identify common attendance levels and outliers. The chosen number of bins provides a balanced view of the data. The color gradient helps differentiate between lower and higher frequencies, making the visualization more intuitive. However, one limitation is that while the histogram shows the spread of attendance, it does not explain why certain weekends had higher or lower admissions. A potential improvement would be to add labeled reference points, such as a line marking the median weekend or annotations on the most extreme weekends.
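A labeled reference point could be added along the following lines. This is a minimal sketch with made-up weekly totals in a hypothetical `weekly` data frame. The dashed median line and its label are one possible choice of reference point, not the only one.

```r
library(ggplot2)

# Hypothetical weekly totals; replace with the summarized data from the report.
weekly <- data.frame(
  total_adm = c(1200, 1500, 1350, 1800, 1600, 1400, 9500, 1700, 1300, 7200)
)
med <- median(weekly$total_adm)

p <- ggplot(weekly, aes(x = total_adm)) +
  geom_histogram(bins = 10, fill = "grey70", color = "black") +
  # Dashed vertical line at the median weekend.
  geom_vline(xintercept = med, linetype = "dashed") +
  # Text label placed at the top of the panel, just right of the line.
  annotate("text", x = med, y = Inf, label = paste("Median:", med),
           hjust = -0.1, vjust = 1.5) +
  labs(x = "Total Admissions", y = "Frequency") +
  theme_minimal()

print(p)
```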

Temporal Analysis

Visualization

Code
ice_movies |>
  mutate(weekend.start = as.Date(weekend.start)) |>
  group_by(weekend.start) |>
  summarize(total_adm = sum(adm.weekend, na.rm = TRUE)) |>
  ggplot(aes(x = weekend.start, y = total_adm)) +
  geom_area(fill = "coral3", alpha = 0.5) +
  geom_line(color = "coral4", linewidth = 1) +
  labs(title = "Trend of Total Weekend Movie Admissions in Iceland",
       x = "Weekend Start Date", 
       y = "Total Admissions") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))

Trend of Total Weekend Movie Admissions in Iceland

Analysis & Reflection

Key Temporal Patterns:

This visualization reveals the fluctuations in weekend movie attendance over time. Several peaks indicate weekends with significantly higher admissions, which could correspond to the major releases mentioned earlier, to holidays, or to promotions. Conversely, some weekends show noticeable declines, suggesting periods of lower viewing activity. Admissions also appear to rise during certain months, which hints at a seasonal pattern.
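Whether admissions rise in particular months can be checked by averaging the weekly totals by calendar month. The sketch below uses a hypothetical `weekly` data frame with invented dates and totals. The column names mirror the report's `weekend.start` and `total_adm`, but the values are placeholders.

```r
library(dplyr)

# Hypothetical weekly series: 52 weekends with placeholder totals.
weekly <- data.frame(
  weekend.start = seq(as.Date("2020-01-03"), by = "week", length.out = 52),
  total_adm     = 1000 + 50 * (seq_len(52) %% 7)
)

monthly <- weekly |>
  # Extract the calendar month as an integer (1 = January).
  mutate(month = as.integer(format(weekend.start, "%m"))) |>
  group_by(month) |>
  summarize(mean_adm = mean(total_adm), n_weekends = n()) |>
  # Highest-attendance months first.
  arrange(desc(mean_adm))

print(monthly)
```

Months that sit consistently at the top of this table across several years would support a seasonal reading of the peaks.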

Visualization Choices:

An area chart was selected to best capture the changes in admissions over time, because it shows the overall shape of attendance while conveying the total volume of viewers. Two alternatives offer different strengths: a bar chart would separate individual weekends clearly and make weekend-to-weekend comparisons more distinct, and a heatmap would help uncover broader seasonal trends by displaying data across multiple time periods in a structured grid.

Critical Evaluation:

The visualization effectively highlights variations in weekend movie attendance, making it easy to identify trends. However, it does not provide explanations for the observed fluctuations. A possible improvement would be to include annotations for major film releases or industry events to provide context for sudden spikes or declines. Additionally, incorporating a moving average could help smooth out short-term volatility and reveal underlying long-term trends more clearly.
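The moving-average idea can be sketched as follows. The `weekly` data frame and its values are hypothetical, and the 5-week centered window is an arbitrary choice for illustration. `stats::filter()` is called with its namespace because `dplyr::filter()` masks it when the tidyverse is loaded.

```r
library(ggplot2)

# Hypothetical weekly series with one spike; replace with the real summarized data.
weekly <- data.frame(
  weekend.start = seq(as.Date("2020-01-03"), by = "week", length.out = 20),
  total_adm     = c(rep(c(1200, 1500, 1300, 1700), 4), 6000, 1400, 1600, 1250)
)

# Centered 5-week moving average; the first two and last two weeks are NA.
weekly$ma5 <- as.numeric(stats::filter(weekly$total_adm, rep(1 / 5, 5), sides = 2))

p <- ggplot(weekly, aes(x = weekend.start)) +
  # Raw weekly totals in the background.
  geom_line(aes(y = total_adm), color = "grey60") +
  # Smoothed trend on top; NA endpoints are dropped silently.
  geom_line(aes(y = ma5), color = "coral4", linewidth = 1, na.rm = TRUE) +
  labs(x = "Weekend Start Date", y = "Total Admissions") +
  theme_minimal()

print(p)
```

A wider window smooths more aggressively but also blurs short-lived spikes, so the window length is a trade-off worth tuning against the real series.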

Brief Conclusion

The two visualizations work together to provide a comprehensive analysis of movie attendance patterns in Iceland. The distribution visualization highlights how admissions are spread across different weekends, revealing common attendance levels and outliers, while the temporal visualization illustrates how movie attendance fluctuates over time and uncovers trends. Together, they tell a cohesive story about audience behavior, showing both the overall frequency of attendance and how it changes week to week. This analysis raises additional questions, such as how specific major movie releases, promotions, or external factors like holidays influence attendance patterns.