While working on my semester project on NFL attendance data, I wanted to look at the average total attendance per week from 2000 through 2019. First, I needed to read in the data and reshape it to get what I wanted.
library(tidyverse)
# Reading in data
attendance <- read.csv("/Users/drewgreiner/Documents/nfl/attendance.csv")
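# Grouping data by week and totaling attendance across all seasons
# na.rm = TRUE skips missing values (such as bye weeks, if recorded as NA);
# dividing by 2 assumes each game is listed twice, once for each team
weekly_attendance <- attendance %>%
  group_by(week) %>%
  summarise(sumweekly = sum(weekly_attendance, na.rm = TRUE) / 2) %>%
  arrange(week)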
Looking at the above visualization, we can see the average number of people attending per week throughout the NFL season. A confidence interval is included, which shows where the weekly totals vary more and the range is wider. While attendance in the early weeks appears high, the middle weeks of the season see a noticeable decrease in total attendance. This could be due in part to colder temperatures, poor team performance, or other variables unaccounted for in this dataset, but a major reason is the bye week in the NFL: teams on a bye do not play, so fewer games contribute to that week's total. In the early years of the dataset, the bye could come as early as week 3, which is why the totals start to dip at week 3. The middle weeks are where byes have always taken place, which is why they have the lowest attendance totals.
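To make the bye-week effect concrete, here is a minimal sketch that counts how many teams actually played in each week alongside the weekly total. It uses a small made-up data frame (`toy_attendance`, with hypothetical teams A through D) rather than the real file, and it assumes that a bye shows up as an `NA` in `weekly_attendance` and that each game is listed once per team.

```r
library(dplyr)

# Toy data (made-up values, not real NFL figures): four teams, three weeks.
# Each game appears twice, once for each team that played in it.
# Assumption: a bye week is recorded as NA.
toy_attendance <- data.frame(
  team = rep(c("A", "B", "C", "D"), times = 3),
  week = rep(1:3, each = 4),
  weekly_attendance = c(
    60000, 60000, 70000, 70000,  # week 1: two games
    65000, 65000, NA, NA,        # week 2: one game, C and D on bye
    62000, 62000, 68000, 68000   # week 3: two games
  )
)

teams_per_week <- toy_attendance %>%
  group_by(week) %>%
  summarise(
    # Number of teams with a recorded game that week
    teams_playing = sum(!is.na(weekly_attendance)),
    # Halve the sum because every game is counted for both teams
    total_attendance = sum(weekly_attendance, na.rm = TRUE) / 2
  ) %>%
  arrange(week)

print(teams_per_week)
```

In this toy example, the week with two teams on a bye has half the total of the other weeks even though the one game played drew a normal crowd, which is the same pattern described above.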
Overall, we can see that attendance drops from the first few weeks of the season into the middle, before rebounding late in the season once all teams are playing again. To improve the visualization, I added axis titles and an overall title to help people better understand what the data shows.
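The plotting code itself is not shown in this post, so the following is only a sketch of how a chart like this could be built with ggplot2. It assumes one total per season and week, uses `geom_smooth()` as one possible way to draw a trend line with a confidence band, and adds the titles with `labs()`. The data frame `toy_totals`, its values, and the label text are all placeholders for illustration.

```r
library(ggplot2)

# Placeholder data (made-up values, not real NFL figures):
# one total per season and week, with a dip in the middle weeks.
toy_totals <- expand.grid(year = 2000:2004, week = 1:17)
toy_totals$total_attendance <- 1000000 -
  15000 * (8 - abs(toy_totals$week - 9)) +
  2000 * (toy_totals$year - 2002)

attendance_plot <- ggplot(toy_totals, aes(x = week, y = total_attendance)) +
  # Smoothed trend across seasons; se = TRUE draws the confidence band
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE) +
  # Axis titles and an overall title (placeholder wording)
  labs(
    title = "Average Total NFL Attendance by Week",
    x = "Week of season",
    y = "Total attendance"
  )

print(attendance_plot)
```

With the real data, `toy_totals` would be replaced by totals grouped by both `year` and `week`, so that the band reflects how much a given week varies from season to season.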