Predicting Super Bowl Average Viewership

What is Average Viewership: Average viewership is calculated by measuring the number of concurrent viewers at regular intervals throughout a broadcast (e.g., every minute or 5 minutes), summing those totals, and dividing by the number of intervals.

Loading Libraries & Data

# Libraries
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.5.2
library(zoo)
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
# Reading in Data
sb_data <- read.csv("C:/Users/james/OneDrive/Desktop/super-bowl-ratings.csv")
# Viewing first few rows of data
head(sb_data)
##   super_bowl super_bowl_number      date network average_viewers total_viewers
## 1          I                 1 1/15/1967     NBC        24430000      35600000
## 2         II                 2 1/14/1968     CBS        39120000      51300000
## 3        III                 3 1/12/1969     NBC        41660000      54500000
## 4         IV                 4 1/11/1970     CBS        44270000      59200000
## 5          V                 5 1/17/1971     NBC        46040000      58500000
## 6         VI                 6 1/16/1972     CBS        56640000      67300000
##   household_rating household_share cost_of_30_second_ad_usd
## 1             18.5              36                    37500
## 2             36.8              68                    54500
## 3             36.0              70                    55000
## 4             39.4              69                    78200
## 5             39.9              75                    72500
## 6             44.2              74                    86100
# NA values per column
colSums(is.na(sb_data))
##               super_bowl        super_bowl_number                     date 
##                        0                        0                        0 
##                  network          average_viewers            total_viewers 
##                        0                        0                        8 
##         household_rating          household_share cost_of_30_second_ad_usd 
##                        0                        1                        0

Data Visualization

# Density plot of average viewers

ggplot(sb_data, aes(x = average_viewers))+
  geom_density(fill = "skyblue")+
  theme_bw()+
  theme(panel.grid.major = element_blank(), # Turn of the background grid
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank()) +
   scale_x_continuous(breaks = scales::pretty_breaks(),
                     labels = scales::comma_format())+
  labs(x="Average Viewership", y= "Density", title="Density Plot of Super Bowl Average Viewership (1967-2021)")

# Finding mean average viewership per super bowl by network

networks <- sb_data %>%
  group_by(network) %>%
  summarise(avg = mean(average_viewers)) %>%
  ungroup() %>%
  arrange(desc(avg))

# Plotting Average Viewership per Super Bowl by Network

ggplot(networks, aes(x=network, y=avg, fill=network))+
  geom_bar(stat = "identity")+
  theme_bw()+
  theme(panel.grid.major = element_blank(), # Turn of the background grid
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank()) +
  labs(x="Network", y="Avg. Viewership Per Super Bowl (Millions)", title="Avg. Viewership Per Super Bowl Airing By Network")

# Filtering to get 50 years of data
sb_data30 <- sb_data %>%
  filter(super_bowl_number >= 11)

# Converting to time series object
sb.ts <- ts(sb_data30$average_viewers, start = c(1977, 1), end = c(2026, 1), freq = 1)

# Plotting as Time Series Object
plot(sb.ts, main="Super Bowl Avg. Viewers (1977-2026)", 
     xlab="Year",
     ylab="Avg. Viewers (Millions)",
     col = "royalblue",
     lwd = 2)

Setting Test & Validation Data

# Setting validation period length
nValid <- 10

# Setting training period length
ntrain <- length(sb.ts) - nValid

# Creating training data
train.ts <- window(sb.ts, end = c(2016, 1))

# Creating Validation data
valid.ts <- window(sb.ts, start = c(2017, 1), end = c(2026, 1))

ACF Plot of Differences

# Finding lagged differences
diff.sb.ts <- diff(sb.ts)

# Plotting lagged differences
Acf(diff.sb.ts)

The ACF plot communicates that there is constant auto correlation structure meaning no trends or seasonality. This also signifies constant variance and a constant mean. The data possesses stationarity due to constant variance, mean, adn auto correlation.

Difference Plot

# Difference Plot
plot.ts(diff(sb_data30$average_viewers))

This plot conveys the yearly change in super bowl average viewership from one year to the next. We can see a constant yearly fluctutation in the differences.

Linear Trend Model

# Fitting Linear Trend Model
linear_mod <- tslm(train.ts ~ trend)

# Results
summary(linear_mod)
## 
## Call:
## tslm(formula = train.ts ~ trend)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -10853979  -5735079   -584860   5153420  13066739 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 70334462    2095995   33.56  < 2e-16 ***
## trend         916880      89091   10.29 1.53e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6504000 on 38 degrees of freedom
## Multiple R-squared:  0.736,  Adjusted R-squared:  0.729 
## F-statistic: 105.9 on 1 and 38 DF,  p-value: 1.527e-12
# Predicting on validation set
linear_pred <- forecast(linear_mod, h = nValid, level = 90)


autoplot(sb.ts)+
  autolayer(linear_pred, alpha = 0.3)+
  autolayer(linear_pred$mean, series="Linear Forecast")+
  autolayer(linear_mod$fitted.values, series="Linear Fit")+
  autolayer(valid.ts, series="Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="Forecasts From Linear Model")+
  guides(color = guide_legend(title = ""))

ACF of Residuals

linear.res <- linear_mod$residuals
acf(linear.res)

The ACF Plot on the residuals will indicate whether the residuals are autocorrelated. If so the model has missed a pattern in the data. We can see significant autocorrelation at lags 0, 1, and 2. With the significant spike at low lags the model is underfitting a small bit. Ideally all residuals should be in the blue bounds (95% Confidience Interval) and their behavior is random. Auto correaltion is represented by points outside of the bounds.

Exponential Trend Model

# Fitting Exp Trend model
exp_mod <- tslm(train.ts ~ trend, lambda = 0) #lambda = 0 indicates exponential trend
# Results
summary(exp_mod)
## 
## Call:
## tslm(formula = train.ts ~ trend, lambda = 0)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.154006 -0.058298 -0.001982  0.062365  0.154312 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.809e+01  2.347e-02  770.76  < 2e-16 ***
## trend       1.019e-02  9.975e-04   10.22 1.88e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.07282 on 38 degrees of freedom
## Multiple R-squared:  0.7331, Adjusted R-squared:  0.726 
## F-statistic: 104.4 on 1 and 38 DF,  p-value: 1.881e-12
# Forecasting on validation data
exp_pred <- forecast(exp_mod, h= nValid, level = 90)

# Ploting Forecast
autoplot(sb.ts) +
  autolayer(exp_pred, alpha = 0.3)+
  autolayer(exp_pred$mean, series="Exponential Forecast") + 
  autolayer(exp_mod$fitted.values, series = 'Exponential Fit') +
  autolayer(valid.ts, series = "Observed") + 
  labs(x="Year", y="Avg. Viewership (Millions)", title="Exponential Forecast") + 
  guides(color = guide_legend(title = ""))

exp.res <- exp_mod$residuals
acf(exp.res)

We can see the spikes at early lags similar to the linear trend model meaning that this model is slightly underfitting.

Quadratic Trend Model

# Fitting Quadratic Trend Model
quad_mod <- tslm(train.ts ~ trend + I(trend^2))

# Results
summary(quad_mod)
## 
## Call:
## tslm(formula = train.ts ~ trend + I(trend^2))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -14625610  -4467887   -564261   3906991  13571833 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 76637154    2996871  25.572  < 2e-16 ***
## trend          16495     337107   0.049  0.96124    
## I(trend^2)     21961       7974   2.754  0.00907 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6005000 on 37 degrees of freedom
## Multiple R-squared:  0.7809, Adjusted R-squared:  0.769 
## F-statistic: 65.93 on 2 and 37 DF,  p-value: 6.348e-13
# Forecasting on Validation Data
quad_pred <- forecast(quad_mod, h= nValid, level=90)

# Plotting Forecast
autoplot(sb.ts) +
  autolayer(quad_pred, alpha = 0.3)+
  autolayer(quad_pred$mean, series="Quadratic Forecast") + 
  autolayer(quad_mod$fitted.values, series = "Quadratic Fit") +
  autolayer(valid.ts, series = "Observed") + 
  labs(x="Year", y="Avg. Viewership (Millions)", title="Quadratic Forecast") + 
  guides(color = guide_legend(title = ""))

Residuals of Quadratic Trend Model

# Plotting Residuals of Quadratic Trend Model
quad.res <- quad_mod$residuals
acf(quad.res)

Holt Winters Model

# Holt Winters Model

ets_mod <- ets(train.ts, model = "ZZZ")

# Results
summary(ets_mod)
## ETS(A,N,N) 
## 
## Call:
## ets(y = train.ts, model = "ZZZ")
## 
##   Smoothing parameters:
##     alpha = 0.5884 
## 
##   Initial states:
##     l = 78298996.0723 
## 
##   sigma:  6307425
## 
##      AIC     AICc      BIC 
## 1404.082 1404.749 1409.149 
## 
## Training set error measures:
##                   ME    RMSE     MAE      MPE     MAPE      MASE       ACF1
## Training set 1445933 6147718 4830112 1.128857 5.666198 0.9610446 -0.1290618
# Forecasting on Validation Period
ets_pred <- forecast(ets_mod, h = nValid, level = 90)

autoplot(sb.ts) +
  autolayer(ets_pred, alpha = 0.3)+
  autolayer(ets_pred$mean, series = "ETS Forecast")+
  autolayer(ets_pred$fitted, series = "ETS Fit")+
  autolayer(valid.ts, series = "Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="ETS Forecast")+ 
  guides(color = guide_legend(title = ""))

Residual Plot on ETS Model

# Plotting ACF of Residuals
ets.res <- ets_mod$residuals
acf(ets.res)

ets.2.mod <- ets(train.ts, model = "AAN")
summary(ets.2.mod)
## ETS(A,A,N) 
## 
## Call:
## ets(y = train.ts, model = "AAN")
## 
##   Smoothing parameters:
##     alpha = 0.5433 
##     beta  = 0.0264 
## 
##   Initial states:
##     l = 71627417.2278 
##     b = 2234848.6442 
## 
##   sigma:  6145168
## 
##      AIC     AICc      BIC 
## 1403.835 1405.600 1412.279 
## 
## Training set error measures:
##                     ME    RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set -791350.3 5829819 4619342 -1.390291 5.518434 0.9191078 -0.0433885
ets.2_pred <- forecast(ets.2.mod, h = nValid, level = 90)

autoplot(sb.ts) +
  autolayer(ets.2_pred, alpha = 0.3)+
  autolayer(ets.2_pred$fitted, series = "ETS (AAN) Fit")+
  autolayer(ets.2_pred$mean, series = "ETS (AAN) Forecast")+
  autolayer(valid.ts, series = "Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="ETS (AAN) Forecast")+ 
  guides(color = guide_legend(title = ""))

ets2.res <- ets.2.mod$residuals
acf(ets2.res)

Neural Network Model

p <- 5 # Number of previous time steps used for forecast
  
P <- 0 # Number of previous seasonal values to use
  
size <- 9 # Number of hidden nodes

# Fitting Model
nn_mod <- nnetar(train.ts, repeats = 20, p=p, P=P, size = size)
# Forecasting on validation data
nn_pred <- forecast(nn_mod, h = nValid, level = 90)

#Plotting Forecast
autoplot(sb.ts) +
  autolayer(nn_pred, alpha = 0.3)+
  autolayer(nn_pred$mean, series = "NN Forecast")+
  autolayer(valid.ts, series = "Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="NN Model Forecast")+ 
  guides(color = guide_legend(title = ""))

# Finding Neural Net Residuals
nn.res <- nn_mod$residuals

nn_df <- data.frame(nn.res)
nn_df <- na.omit(nn_df)

# ACF Plot
acf(nn_df)

The neural network residuals behave essentially like white noise, which is a good sign. It suggests the model has successfully captured the underlying structure in the data, leaving no exploitable autocorrelation behind in the residuals.

Auto Arima Model

#Fitting Auto Arima
arima <- auto.arima(train.ts)

#Results
summary(arima)
## Series: train.ts 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##           ma1      drift
##       -0.4790  1154542.2
## s.e.   0.1525   482308.6
## 
## sigma^2 = 3.34e+13:  log likelihood = -661.66
## AIC=1329.33   AICc=1330.01   BIC=1334.32
## 
## Training set error measures:
##                    ME    RMSE     MAE        MPE     MAPE     MASE       ACF1
## Training set 192013.8 5558220 4212230 -0.0699361 4.883411 0.838105 0.04521311
# Forecasting on Validation Data
arima_pred <- forecast(arima, h = nValid, level = 90)

#Plotting Forecast
autoplot(sb.ts) +
  autolayer(arima_pred, alpha = 0.3)+
  autolayer(arima_pred$fitted, series = "Arima Fitted")+
  autolayer(arima_pred$mean, series = "Arima Forecast")+
  autolayer(valid.ts, series = "Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="Arima Model Forecast")+ 
  guides(color = guide_legend(title = ""))

# Finding Residuals of ARIMA
arima.res <- arima$residuals

# Finding differences
diff.res <- diff(arima.res)

# Plotting ACF of Differences
acf(diff.res)

Moving Average

# Moving Average with window size of two to predict viewership for next Super Bowl 
ma.trailing <- rollmean(train.ts, k = 2, align = "right")

# Taking last 3 moving averages 
last.ma <- tail(ma.trailing, 3)

# Using the Prior 3 Super Bowls moving averagea to make predictions
ma.trailing.pred <- ts(rep(last.ma, nValid), start = c(1977, ntrain + 1), end = c(1977, ntrain + nValid), freq = 1)

# Plotting Moving Average
autoplot(sb.ts)+
  autolayer(ma.trailing, series = "Moving Average") +
  autolayer(ma.trailing.pred, series = "MA Forecast")+
  autolayer(valid.ts, series = "Observed")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="Moving Average Forecast")+ 
  guides(color = guide_legend(title = ""))

Model Performance

# List of model names
model_names <- c("Linear Trend", "Exponential Trend", "Quadratic Trend", "Holt ZZZ", "Holt AAN", "Neural Net", "Auto Arima", "Moving Average")

# Linear Trend Accuracy

lr_acc <- accuracy(linear_pred$mean, valid.ts)

# Exponential Trend Accuracy

exp_acc <- accuracy(exp_pred$mean, valid.ts)

# Quadratic Trend Accuracy

quad_acc <- accuracy(quad_pred$mean, valid.ts)

# Holt ZZZ

holt_ZZZ_acc <- accuracy(ets_pred$mean, valid.ts)

# Holt AAN

holt_AAN_acc <- accuracy(ets.2_pred$mean, valid.ts)

# Neural Network accuracy

nn_acc <- accuracy(nn_pred$mean, valid.ts)

# Auto Arima Accuracy

arima_acc <- accuracy(arima_pred$mean, valid.ts)


# Moving Average Accuracy

ma_acc <- accuracy(ma.trailing.pred, valid.ts)


# Row Binding Error Measures for Display

errors_df <- rbind(lr_acc, exp_acc, quad_acc, holt_ZZZ_acc, holt_AAN_acc, nn_acc, arima_acc, ma_acc)

# Assigning Row Names

rownames(errors_df) <- model_names

# Displaying Error Measures for each model
errors_df
##                          ME     RMSE      MAE        MPE      MAPE      ACF1
## Linear Trend       -2149498  9373206  8397680  -2.759635  7.833089 0.6620626
## Exponential Trend  -4053224  9621524  8298442  -4.464157  7.899401 0.6408273
## Quadratic Trend   -13129798 15227857 13129798 -12.648481 12.648481 0.5442585
## Holt ZZZ           -2425612 11070578  9974722  -3.170302  9.237705 0.7180497
## Holt AAN          -11294425 14099564 11616135 -11.076495 11.328422 0.6168719
## Neural Net         -2138868 11425551 10152979  -2.927622  9.390412 0.6716574
## Auto Arima         -9723814 13091436 10545821  -9.675694 10.319668 0.6411429
## Moving Average     -2212500 11414057 10331500  -3.006025  9.571976 0.6504315
##                   Theil's U
## Linear Trend       1.402724
## Exponential Trend  1.483165
## Quadratic Trend    2.387743
## Holt ZZZ           1.637872
## Holt AAN           2.216295
## Neural Net         1.686763
## Auto Arima         2.057607
## Moving Average     1.685917

Exponential Model Improved

train.res.ar.1 <- Arima(linear.res, order = c(1, 0, 0))


summary(train.res.ar.1)
## Series: linear.res 
## ARIMA(1,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1       mean
##       0.5201  -112485.1
## s.e.  0.1371  1744510.0
## 
## sigma^2 = 3.106e+13:  log likelihood = -677.23
## AIC=1360.45   AICc=1361.12   BIC=1365.52
## 
## Training set error measures:
##                    ME    RMSE     MAE      MPE     MAPE      MASE       ACF1
## Training set 151659.8 5431810 4182438 196.4112 208.5965 0.8555932 -0.1219622
lr.res.for <- forecast(linear.res, h =nValid, level = 90)

autoplot(linear.res)+
  autolayer(lr.res.for$residuals)

autoplot(sb.ts) +
  autolayer(linear_pred$mean + lr.res.for$mean, series=" Linear Forecast + Residuals") + 
  autolayer(valid.ts, series = "Observed") + 
  labs(x="Year", y="Avg. Viewership (Millions)", title="Linear Forecast") + 
  guides(color = guide_legend(title = ""))

# Model Errors for improved neural network
nn_mod_imp <- accuracy(lr.res.for$mean + linear_pred$mean, valid.ts)
print(nn_mod_imp)
##                ME     RMSE      MAE      MPE     MAPE      ACF1 Theil's U
## Test set -8222950 12282245 10025562 -8.33791 9.758806 0.6620626  1.923796

As we can see running an arima model on the linear trend models residuals and plugging the residuals back into the model does not improve the forecast. Error measures of this forecast are higher then those of the first linear model across the board.

Forecasting Next Six Years of Avg. Super Bowl Viewership

# Fitting Model

linear.final <- tslm(sb.ts ~ trend)

# Forecasting Next 10 Years of viewership

final_pred <- forecast(linear.final, h=nValid, level=90)

# Forecast Graph

autoplot(sb.ts)+
  autolayer(final_pred, alpha = 0.3)+
  autolayer(final_pred$mean, series = "Linear Forecast")+
  autolayer(linear.final$fitted.values, series="Linear Fit")+
  labs(x="Year", y="Avg. Viewership (Millions)", title="Linear Trend Model 10 Year Forecast")+ 
  guides(color = guide_legend(title = ""))

# 90% Confidence Intervals

upper_bound <- final_pred$upper

lower_bound <- final_pred$lower

intervals <- data.frame(lower_bound, upper_bound)

rownames(intervals) <- c("2027", "2028", "2029", "2030", "2031", "2032", "2033", "2034", "2035", "2036")

intervals
##      lower_bound upper_bound
## 2027   103451210   128481901
## 2028   104311845   129400211
## 2029   105171441   130319560
## 2030   106030006   131239941
## 2031   106887547   132161346
## 2032   107744071   133083767
## 2033   108599586   134007197
## 2034   109454101   134931627
## 2035   110307624   135857050
## 2036   111160163   136783457

For next years super bowl the model predicts that the average viewership will be approximately 116,000,000. Our lower bound of our confidence interval allows us to state that with 90% certainty average viewership will not fall lower that 103,000,000. Based off the recent trend of viewership skyrocketing we can expect that viewership will approach the the upper bound of our 90% confidence interval.