Problem Statement

Like many other sports fans, I have found myself at both extremes: happy with a team’s performance, or flat-out frustrated that it did not do as well as expected. I am a big fan of football, specifically the NFL, and I am interested in exploring how to quantify performance and ease the frustration sports fans feel when their team’s quarterback does not perform as expected.

The National Football League is one of the most popular sports leagues in modern-day America and, thanks to its dedicated fans, produces billions of dollars in revenue each year. For each of its 32 franchises, the yearly goal of winning the Super Bowl relies heavily on the efficiency and consistency of the quarterback. Though NFL teams with top-ranked defenses have been known to win championships, quarterback performance is still important for a team that wants to beat the best of the best, and quarterback is unsurprisingly one of the highest-paid positions on the team. So how can a franchise manager or coach evaluate the consistency of a quarterback in order to win a Super Bowl? This project explores how consistent the current starting quarterbacks in the NFL are, based on the time of year in which a game was played, home vs. away games, regular-season vs. postseason games, and other factors.

Credit: USA Today

Data

The data used in this project was retrieved from Kaggle and was originally derived from the official NFL website. It consists of 29 variables and 40,247 observations. Because the data set contains records for quarterbacks from 1970 to 2016, its size will shrink after subsetting to current starting quarterbacks who have played for at least 3 years. There are 8 character variables and 21 numeric variables in total. All can be seen below:

##  [1] "Player.Id"                 "Name"                     
##  [3] "Position"                  "Year"                     
##  [5] "Season"                    "Week"                     
##  [7] "Game.Date"                 "Home.or.Away"             
##  [9] "Opponent"                  "Outcome"                  
## [11] "Score"                     "Games.Played"             
## [13] "Games.Started"             "Passes.Completed"         
## [15] "Passes.Attempted"          "Completion.Percentage"    
## [17] "Passing.Yards"             "Passing.Yards.Per.Attempt"
## [19] "TD.Passes"                 "Ints"                     
## [21] "Sacks"                     "Sacked.Yards.Lost"        
## [23] "Passer.Rating"             "Rushing.Attempts"         
## [25] "Rushing.Yards"             "Yards.Per.Carry"          
## [27] "Rushing.TDs"               "Fumbles"                  
## [29] "Fumbles.Lost"
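The subsetting step described above (current starters with at least 3 years of play) could be sketched with dplyr roughly as follows. This is only an illustration: `qb_logs` is a toy stand-in for the Kaggle data with invented values, and `current_starters` is a hypothetical, hand-compiled vector of names, since the data set itself does not flag who currently starts.

```r
library(dplyr)

# Toy stand-in for the Kaggle game logs; all values are invented for illustration.
qb_logs <- data.frame(
  Name = c("QB A", "QB A", "QB A", "QB B", "QB B", "QB C"),
  Year = c(2014, 2015, 2016, 2015, 2016, 1985),
  Passer.Rating = c(95.1, 88.4, 101.2, 79.9, 84.3, 70.5),
  stringsAsFactors = FALSE
)

# Hypothetical list of current starters; the real list would be compiled by hand.
current_starters <- c("QB A", "QB B")

qb_subset <- qb_logs %>%
  filter(Name %in% current_starters) %>%  # keep current starters only
  group_by(Name) %>%
  filter(n_distinct(Year) >= 3) %>%       # keep players with 3 or more seasons
  ungroup()
```

In the toy data only "QB A" appears in three distinct seasons, so only those rows remain. Counting distinct values of `Year` is one possible reading of "played for at least 3 years"; counting games started would be another.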

Particular variables of interest for this project include Passer Rating, Week, Touchdown (TD) Passes, Game Date, and Passing Yards Per Attempt. However, all variables will be used in this project to determine the efficiency of the quarterbacks.
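For readers unfamiliar with Passer Rating, the sketch below shows how the standard NFL formula combines four per-attempt components, each capped between 0 and 2.375. The function name `passer_rating` and its argument names are my own; the inputs correspond to the `Passes.Completed`, `Passes.Attempted`, `Passing.Yards`, `TD.Passes`, and `Ints` columns listed above. I assume the data set's `Passer.Rating` column follows this formula, but that has not been checked here.

```r
# NFL passer rating from raw passing totals (vectorized over games).
# Note: zero attempts produces NaN, so such rows should be filtered out first.
passer_rating <- function(completions, attempts, yards, touchdowns, interceptions) {
  # Each component is limited to the range [0, 2.375].
  clamp <- function(x) pmin(pmax(x, 0), 2.375)

  comp_part <- clamp((completions / attempts - 0.3) * 5)        # completion rate
  yard_part <- clamp((yards / attempts - 3) * 0.25)             # yards per attempt
  td_part   <- clamp((touchdowns / attempts) * 20)              # touchdown rate
  int_part  <- clamp(2.375 - (interceptions / attempts) * 25)   # interception rate

  (comp_part + yard_part + td_part + int_part) / 6 * 100
}

# A game that maxes out every component reaches the ceiling of about 158.3.
perfect_game <- passer_rating(10, 10, 200, 5, 0)
```

Because the rating folds completions, yards, touchdowns, and interceptions into one number, it is a convenient single response variable for the comparisons proposed in this project.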

Proposed Methodology

I plan to evaluate quarterbacks by comparing TD Passes, Completed Passes, and Passing Yards against the Week and Date of each game. That is, are quarterbacks efficient through all 16 games of the NFL regular season? I will compare these week-to-week statistics and use data visualization tools such as line charts and bar charts to present my findings. I also plan to use regression to see whether there is a relationship between home/away games and quarterback Passer Rating. I hope to organize my data further and develop a model to predict whether a quarterback is deemed “consistent” or not, as well as a ranking of all quarterbacks of interest in this study. I will be using the ggplot2 and dplyr packages in R to assist me.
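A minimal sketch of how these steps might look in R is given below. It runs on randomly generated toy data, not the real game logs, so the numbers mean nothing. Two things are assumptions on my part: that `Home.or.Away` is coded as "Home"/"Away", and that the standard deviation of Passer Rating is a reasonable first measure of "consistency" (the final definition is still to be decided).

```r
library(dplyr)
library(ggplot2)

# Toy game logs with randomly generated ratings, for illustration only.
set.seed(1)
games <- data.frame(
  Name = rep(c("QB A", "QB B"), each = 16),
  Week = rep(1:16, times = 2),
  Home.or.Away = rep(c("Home", "Away"), times = 16),
  Passer.Rating = round(runif(32, 60, 120), 1),
  stringsAsFactors = FALSE
)

# Week-to-week comparison: mean passer rating for each week of the season.
weekly <- games %>%
  group_by(Week) %>%
  summarise(mean_rating = mean(Passer.Rating), .groups = "drop")

# Line chart of the weekly averages.
rating_plot <- ggplot(weekly, aes(x = Week, y = mean_rating)) +
  geom_line() +
  labs(x = "Week", y = "Mean passer rating")

# Candidate consistency ranking: a lower standard deviation ranks first.
consistency <- games %>%
  group_by(Name) %>%
  summarise(
    mean_rating = mean(Passer.Rating),
    sd_rating = sd(Passer.Rating),
    .groups = "drop"
  ) %>%
  arrange(sd_rating)

# Simple linear regression of passer rating on home/away status.
home_away_fit <- lm(Passer.Rating ~ Home.or.Away, data = games)
summary(home_away_fit)
```

The single coefficient on `Home.or.Away` estimates the average difference in rating between home and away games. A fuller model would likely need to account for repeated games by the same quarterback, for example by adding `Name` as a predictor.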

Why This Matters

Clearer insight into the consistency of the quarterback position is something many managers and coaches need in order to evaluate their own players and, ultimately, win the Super Bowl. While football is just a sport, teams need Super Bowl wins to grow their fan base and build a popular, revenue-producing franchise. A model that can tell teams whether to stay with their current quarterback or consider other options could have useful applications in the sports industry.