Introduction

This notebook walks through the process of creating time series bar graphs for the NFL Super Bowl games from Super Bowl I to Super Bowl LIII (53).

#Libraries

library(ggplot2)
library(ggpubr)
library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(gghighlight)
# Read in Data

NFLO <- read.csv("superbowldata.csv")
head(NFLO)
# Convert Date from a factor to the Date class

NFL <- NFLO %>%
  mutate(Date = as.Date(Date, format = "%B %d %Y"))
head(NFL)
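
As a minimal sketch of the date conversion above, the snippet below parses a single date string with the same format specification. The sample string is an assumption about how dates are written in superbowldata.csv, not a value read from the file. Note that %B matches full month names in the current locale, so the sketch assumes an English locale.

```r
# Hypothetical sample value; the real file's date strings may differ.
sample_date <- "January 15 1967"

# %B = full month name, %d = day of the month, %Y = four-digit year
parsed <- as.Date(sample_date, format = "%B %d %Y")

class(parsed)
format(parsed, "%Y-%m-%d")
```

If the format string does not match the input, as.Date() returns NA rather than raising an error, so checking for NA values after the conversion is a cheap safeguard.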
# Time Series barplot() 
wp <- barplot(NFL$Pts, names.arg=NFL$Date, col=666,las=2, cex.names=0.75, ylim=c(0,60), ylab="Points Scored", main="NFL Superbowl Winner Final Scores")

Graph of the Past Winners of the NFL Super Bowl Championship

# Graph 2.A
# Time Series ggplot 

wpgg <- ggplot(NFL, aes(Date,Pts, fill = Winner)) +
  geom_bar(stat = "identity") + ggtitle("NFL Superbowl Winning Team")+
  theme_pubclean()
wpgg

# Graph 2.B
# Legend

wpgg <- wpgg + theme(legend.key.size = unit(2,"mm"), legend.key.width = unit(2,"mm"), legend.text = element_text(size = 7), legend.title = element_text(size = 7))
wpgg

# Graph 2.C
# Axis Adjustment

wpgg <- wpgg + ylab("Winning Team Final Score") + xlab("") + scale_x_date(breaks = NFL$Date) + theme(axis.text.x=element_text(angle=90, size= 7, vjust=0.5))
wpgg

# Graph 2.D
# Adding Values on top of each bar

# angle is a constant, so it is set outside aes()
wpgg2 <- wpgg + geom_text(aes(label = Pts), angle = 90)
wpgg2

  • To review the past winners of each Super Bowl, a plot was created with the date of each game on the x-axis, the winning team’s final score on the y-axis, and the winning team mapped to the fill color of each bar. In the first version (Graph 2.A), the legend was much larger than the plot itself, so its keys and text were shrunk in the next version (Graph 2.B). In Graph 2.C, the y-axis was given a more descriptive label, the x-axis label was removed, and the axis breaks were set to the date of each game. To keep the dates readable, they were rotated 90 degrees and set in a smaller font. Lastly, the winning team’s final score was added as a text label at the top of each bar (Graph 2.D).
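
The label placement described above can be sketched on a small toy data set. The data frame `toy` and its values are made up for illustration and are not taken from superbowldata.csv. With rotated text, hjust moves the label along the vertical direction, so a small negative value lifts it clear of the bar top.

```r
library(ggplot2)

# Toy data, purely illustrative (not from superbowldata.csv)
toy <- data.frame(
  Date = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01")),
  Pts  = c(30, 24, 41)
)

p <- ggplot(toy, aes(Date, Pts)) +
  geom_col() +                      # equivalent to geom_bar(stat = "identity")
  geom_text(aes(label = Pts),
            angle = 90,             # constant rotation, set outside aes()
            hjust = -0.2) +         # shift the rotated text above the bar top
  scale_x_date(breaks = toy$Date) + # one axis break per game date
  ylim(0, 50)                       # headroom so the labels are not clipped
p
```

Leaving some headroom with ylim() matters here: bars or labels that fall outside the y limits are dropped from the plot with a warning.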

Graphing the Difference Between Winner and Loser Scores

# Adding the Pts.diff column
NFL$Pts.diff <- NFL$Pts - NFL$Pts.1
head(NFL)
# Graph 3.A
# Graphing Pts.diff

dpgg <- ggplot(NFL, aes(Date, Pts.diff)) +
  geom_bar(stat = "identity", fill = "mediumorchid4") +
  ggtitle("NFL Superbowl Winner Vs. Loser Score Difference") +
  scale_x_date(breaks = NFL$Date) +
  ylab("Score Difference") + xlab("") +
  theme_pubclean() +
  theme(axis.text.x = element_text(angle = 90, size = 5, vjust = 0.5)) +
  geom_text(aes(label = SB), angle = 90, hjust = -0.1, vjust = 0.3, size = 3, color = "darkgoldenrod2") +
  ylim(0, 55)
dpgg

# Graph 3.B
# Highlight games decided by 7 points or fewer

dpgg2 <- dpgg + gghighlight(Pts.diff <= 7) + ggtitle("NFL Superbowl Winner Vs. Loser Score Difference", subtitle = "Highlighting Games With Score Difference of 7 or less")

dpgg2

  • To highlight Super Bowl games decided by 7 points or fewer, a new variable called Pts.diff was created to hold the winning team’s score minus the losing team’s score. The previous graph (Graph 2.D) was then recreated with the value of the SB column printed above each bar, in place of the winning team’s fill color, so that each bar represents both the winning and the losing team (Graph 3.A). Finally, gghighlight() was used to keep the games won by 7 points or fewer in purple and to gray out all other games (Graph 3.B).
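
A minimal sketch of the margin calculation and the 7-point threshold, using a hypothetical three-row data frame. The column names follow the notebook (Pts for the winner, Pts.1 for the loser), but the scores and the `close` flag are made up for illustration.

```r
library(dplyr)

# Hypothetical scores for illustration only
games <- data.frame(
  SB    = c("A", "B", "C"),
  Pts   = c(35, 20, 27),   # winning team's score
  Pts.1 = c(10, 19, 17)    # losing team's score
)

games <- games %>%
  mutate(Pts.diff = Pts - Pts.1,    # margin of victory
         close    = Pts.diff <= 7)  # TRUE when decided by 7 points or fewer

games
```

The condition passed to gghighlight() in the notebook is the same comparison, evaluated against the plot's data instead of being stored in a column.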

Highlighting All the Years the Super Bowl Was Won by a Team That Has Won Multiple Times

# Creating frequency table

NFL2 <- NFL %>%
  group_by(Winner) %>%
  summarise(count = n())
NFL2
# Graph 3.C
# Frequency Plot of Number of Superbowls Won by Each Team

mpgg <- ggplot(NFL2, aes(Winner, count)) +
  geom_bar(position = "stack", stat = "identity", fill = "cyan3") +
  ggtitle("Teams who've Won the Superbowl Multiple Times") +
  ylab("Number of Superbowl Games Won") + xlab("Team Name") +
  theme_pubclean() +
  theme(axis.text.x = element_text(angle = 90, size = 5)) +
  geom_text(aes(label = count), vjust = 1.5, hjust = 0.5, size = 5, color = "orangered") +
  gghighlight(count >= 2)
## label_key: Winner
mpgg

  • To give a general idea of how many Super Bowl games each NFL team has won, a frequency graph was created showing each team’s name and the number of times it has won. The teams that have won more than once were then highlighted (Graph 3.C).
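
The frequency table can also be built with dplyr's count(), which is shorthand for group_by() followed by summarise(n()). The sketch below uses a hypothetical list of winners; the team names are placeholders, not values from the data set.

```r
library(dplyr)

# Hypothetical winners for illustration only
toy <- data.frame(
  Winner = c("Team X", "Team Y", "Team X", "Team Z", "Team X")
)

# One row per team, with the number of wins in a column named "count"
wins <- toy %>% count(Winner, name = "count")

# Teams with two or more wins, the same condition used for highlighting
multi <- wins %>% filter(count >= 2)
multi
```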
# Graph 3.D
# Winning Team for each Superbowl Year

mpgg2 <- ggplot(NFL, aes(Date, Pts)) +
  geom_bar(stat = "identity", fill = "midnightblue") +
  ggtitle("Superbowl Winners by Year") +
  scale_x_date(breaks = NFL$Date) +
  ylab("Winning Team Score") + xlab("") +
  theme_pubclean() +
  theme(axis.text.x = element_text(angle = 90, size = 7, vjust = 0.5)) +
  geom_text(aes(label = Winner), angle = 90, vjust = 0.3, hjust = 0.45, size = 3, color = "gray55") +
  ylim(0, 70)
mpgg2

# Graph 3.E
# Highlighting Years Where the Winner Has Won Multiple Times

mpgg3 <- mpgg2 + gghighlight(Win.1 == 1) + ggtitle("Superbowl Winners by Year", subtitle = "Highlighting Years Where the Winner Has Won Multiple Times")
## Warning: Tried to calculate with group_by(), but the calculation failed.
## Falling back to ungrouped filter operation...
mpgg3

  • To highlight the years in which the Super Bowl was won by a team that has won multiple times, Graph 2.D was recreated with the name of the winning team printed on each bar instead of the winning score (Graph 3.D). In a separate graph, the years won by a multiple-time winner were highlighted in blue, and the years won by a one-time winner were colored gray (Graph 3.E).
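
Graph 3.E relies on a Win.1 column that already exists in the data. As an alternative sketch, assuming no such column is available, a per-game flag can be derived from the Winner column with dplyr's add_count(). The data frame, team names, and the `wins` and `multi` columns below are hypothetical.

```r
library(dplyr)

# Hypothetical games for illustration only
toy <- data.frame(
  SB     = c("A", "B", "C", "D"),
  Winner = c("Team X", "Team Y", "Team X", "Team Z")
)

flagged <- toy %>%
  add_count(Winner, name = "wins") %>%  # total wins per team, repeated on each row
  mutate(multi = wins >= 2)             # TRUE when the winner has won more than once

flagged
```

A logical column like `multi` could then serve as the highlighting condition, for example gghighlight(multi), in place of Win.1 == 1.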