Assignment #6 Matthew Stewart
Hello! For this assignment, I will explore NFL passing stats from 2017 onward, using data scraped from Pro Football Reference.
Questions
I have two questions for this data: who are the best passers, and do more pass attempts lead to higher stat totals? I think the results will be interesting because the game of football has changed even in the last 6 years, and the passing stats should illustrate that. First, we have to load in the data.
Cleaning and Wrangling
No data cleaning is required for this data set. The only wrangling I performed was to keep Player Name, Age, Team, Games Started, Completions, Attempts, Touchdowns, Interceptions, Passer Rating, and Quarterback Rating.
The next thing I want to do is filter for quarterbacks with at least 200 passing attempts. This keeps small samples from producing stats that are unusually low or high. Having 200 passing attempts means that the quarterback has played around 8 games or so, which is a sufficient sample size for what I’m trying to find.
passing_stats <- passing_stats %>%
  filter(Att >= 200)
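The column selection described above is not shown in the write-up, so here is a minimal sketch of how the selection and the attempts filter could be chained together. The names `Player`, `Att`, `TD`, `Int`, and `Rate` come from the plotting code in this report; `Age`, `Tm`, `GS`, `Cmp`, `QBR`, and `Yds` are assumed Pro Football Reference style abbreviations, `raw_stats` is a hypothetical name, and every value in the toy table is made up for illustration.

```r
library(dplyr)

# Toy stand-in for the scraped table; all rows are invented for illustration.
raw_stats <- data.frame(
  Player = c("QB A", "QB B", "QB C"),
  Age    = c(25, 31, 28),
  Tm     = c("AAA", "BBB", "CCC"),   # assumed column name for Team
  GS     = c(16, 4, 12),             # assumed column name for Games Started
  Cmp    = c(350, 60, 240),          # assumed column name for Completions
  Att    = c(540, 110, 400),
  Yds    = c(4000, 700, 2900),       # extra column that the selection drops
  TD     = c(30, 4, 18),
  Int    = c(10, 5, 9),
  Rate   = c(98.5, 70.2, 88.1),
  QBR    = c(65.0, 35.5, 52.3)       # assumed column name for Quarterback Rating
)

passing_stats <- raw_stats %>%
  # Keep only the ten columns named in the text
  select(Player, Age, Tm, GS, Cmp, Att, TD, Int, Rate, QBR) %>%
  # Keep quarterbacks with at least 200 attempts
  filter(Att >= 200)
```

With the toy rows above, the quarterback with 110 attempts is dropped and the `Yds` column no longer appears in `passing_stats`.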
Visualizations
1) Distribution of Touchdowns
ggplot(passing_stats, aes(x = TD)) +
  geom_histogram(bins = 20, fill = "forestgreen", color = "white") +
  labs(title = "Distribution of Touchdowns",
       x = "Touchdowns",
       y = "Frequency")

Touchdown totals fall mainly in the 18-30 range, which is typical for a full season of quarterback play.
2) Touchdowns vs. Passing Attempts
ggplot(passing_stats, aes(x = Att, y = TD)) +
  geom_point(color = "brown") +
  labs(title = "Touchdowns vs. Passing Attempts",
       x = "Passing Attempts",
       y = "Touchdowns")

There is a clear positive correlation between passing attempts and touchdowns. These results do not surprise me: the goal is to throw touchdowns, so having more chances to do so naturally increases the number thrown.
3) Interceptions vs. Passing Attempts
ggplot(passing_stats, aes(x = Att, y = Int)) +
  geom_point(color = "forestgreen") +
  labs(title = "Interceptions vs. Passing Attempts",
       x = "Passing Attempts",
       y = "Interceptions")

The correlation between passing attempts and interceptions is weaker than the one for touchdowns, although interceptions still increase for players who throw more passes.
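The two scatterplots are compared by eye, so one way to put a number on "clear" versus "not as much" is the Pearson correlation from base R's `cor()`. This is only a sketch: `attempt_correlations` is a hypothetical helper, and the `toy` data frame is made up, so its values say nothing about the real data.

```r
# Hypothetical helper: correlation of attempts with touchdowns and interceptions.
attempt_correlations <- function(df) {
  c(TD  = cor(df$Att, df$TD),
    Int = cor(df$Att, df$Int))
}

# Made-up rows, used only to show the call.
toy <- data.frame(
  Att = c(250, 320, 410, 480, 560, 610),
  TD  = c(10, 16, 19, 27, 30, 36),
  Int = c(9, 6, 12, 8, 13, 10)
)

toy_cor <- attempt_correlations(toy)
print(round(toy_cor, 2))
```

Calling `attempt_correlations(passing_stats)` on the filtered data would give the two values to compare.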
4) Top QBs by Touchdown / Interception Ratio
# TD/INT ratio; rows with zero interceptions get NA to avoid dividing by zero
passing_stats <- passing_stats %>%
  mutate(TD_by_INT = ifelse(Int > 0, TD / Int, NA_real_)) %>%
  arrange(desc(TD_by_INT))

# Ten highest ratios (arrange() places NA values last)
top_qbs <- passing_stats %>%
  head(10)

ggplot(top_qbs, aes(x = reorder(Player, TD_by_INT), y = TD_by_INT)) +
  geom_col(fill = "brown") +
  coord_flip() +
  labs(title = "Top Quarterbacks by TD/INT Ratio", x = "Player", y = "TD/INT Ratio")

These results do not surprise me, as these quarterbacks are among the greats of this time period and some are among the greatest of all time.
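Because the data starts in 2017 and runs across several seasons, the table may hold one row per player per season; that is an assumption, not something the write-up states. If so, the same quarterback could appear more than once in the top 10. A sketch of one alternative, summing touchdowns and interceptions per player before taking the ratio, is below. `season_stats` and `career_ratio` are hypothetical names and the rows are made up.

```r
library(dplyr)

# Made-up player-season rows, used only to show the aggregation.
season_stats <- data.frame(
  Player = c("QB A", "QB A", "QB B", "QB B", "QB C"),
  TD     = c(30, 26, 20, 24, 15),
  Int    = c(5, 9, 10, 12, 0)
)

career_ratio <- season_stats %>%
  group_by(Player) %>%
  # Total touchdowns and interceptions across all of a player's rows
  summarise(TD = sum(TD), Int = sum(Int), .groups = "drop") %>%
  # Same guard as the per-row version: no ratio when there are no interceptions
  mutate(TD_by_INT = ifelse(Int > 0, TD / Int, NA_real_)) %>%
  arrange(desc(TD_by_INT))
```

Summing first weights each season by its volume, which is a different choice from averaging the per-season ratios.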
5) Distribution of Passer Rating
ggplot(passing_stats, aes(x = Rate)) +
  geom_density(fill = "forestgreen", alpha = 0.5) +
  labs(
    title = "Distribution of Passer Ratings",
    x = "Passer Rating",
    y = "Density"
  ) +
  theme_minimal()

Most of the passer ratings in this time period are around 90-95.
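The 90-95 reading comes from looking at the density curve. A rough way to check it numerically is to take the median and the location of the density peak using base R's `density()`. The `toy_rate` vector below is made up for illustration; with the real data it would be `passing_stats$Rate`.

```r
# Made-up passer ratings, used only to show the calculation.
toy_rate <- c(78.4, 85.1, 88.9, 90.3, 91.7, 92.5, 93.0, 94.2, 96.8, 104.6)

# Median of the ratings
median_rating <- median(toy_rate)

# Kernel density estimate with default settings; the peak is the x value
# where the estimated density is highest
rate_density <- density(toy_rate)
peak_rating  <- rate_density$x[which.max(rate_density$y)]
```

The peak location depends on the bandwidth that `density()` picks, so it should be treated as approximate.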
Conclusion
After creating the visualizations for my analysis, I found that passing attempts are strongly correlated with touchdowns, but less so with interceptions. I also found that the best quarterbacks by touchdown-to-interception ratio include some of the biggest names in the game, like Aaron Rodgers, Drew Brees, and Patrick Mahomes. I hope you enjoyed this analysis of NFL passing data!