# Assumes dplyr and ggplot2 are already loaded, e.g. via library(tidyverse)
browns_passers %>%
  filter(Starts >= 3) %>%                        # at least three starts
  slice_max(Win_pct, n = 10) %>%                 # ten highest win percentages
  mutate(Player = reorder(Player, Win_pct)) %>%  # order bars by win percentage
  ggplot(aes(y = Player, x = Win_pct)) +
  geom_col(fill = "orange") +
  # Print each win percentage just inside the end of its bar
  geom_text(aes(label = scales::number(Win_pct, accuracy = 0.001)),
            hjust = 1.1,
            size = 3.5,
            color = "white") +
  labs(title = "Top 10 Winningest Browns QBs",
       x = "Win Percentage",
       y = "Name") +
  scale_x_continuous(limits = c(0, 1),
                     labels = scales::number_format(accuracy = 0.001))
Assignment 7: Ethical Web Scraping
Introduction
For this assignment, I wanted to explore some rather depressing statistics relating to the Cleveland Browns. My goal was to come to a better understanding of the misery of the Browns.
My data includes statistics on all-time Browns passers, receivers, and rushers. All data comes from Pro Football Reference.
A data dictionary explaining the fields in detail is available for download.
Data Wrangling/Cleaning
Passer Data
For the passer data, I performed the following tasks:
- Remove Rank and QBR columns
- Remove extraneous header rows
- Convert all columns to numeric, except for player name, position, and record
- Filter to include only players who have attempted at least 20 passes
- Rename some columns for better readability
- Convert QBrec column into separate W, L, and T columns
- Create additional columns for number of starts and win percentage
- Reorder columns
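The passer steps above could be sketched roughly as follows. This is not the original wrangling code, which is not shown in this report. The toy table, its raw column names (`Rk`, `QBR`, `Att`, `G`), and all values are assumptions for illustration. Only `QBrec`, `Starts`, `Win_pct`, `Games`, and `Rate` correspond to names used elsewhere in this report. Counting a tie as half a win is also an assumption.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the scraped passer table (all values are made up).
# Scraped tables arrive as character columns and repeat the header row.
raw_passers <- tibble(
  Rk     = c("1", "2", "Rk", "3"),
  Player = c("Passer A", "Passer B", "Player", "Passer C"),
  Pos    = c("QB", "QB", "Pos", "WR"),
  QBrec  = c("10-5-1", "3-9-0", "QBrec", NA),
  G      = c("20", "14", "G", "30"),
  Att    = c("400", "250", "Att", "2"),
  Rate   = c("88.1", "71.4", "Rate", "39.6"),
  QBR    = c(NA, "41.2", "QBR", NA)
)

passers <- raw_passers %>%
  select(-Rk, -QBR) %>%                                   # remove Rank and QBR
  filter(Player != "Player") %>%                          # remove repeated header rows
  mutate(across(-c(Player, Pos, QBrec), as.numeric)) %>%  # numeric except name, position, record
  filter(Att >= 20) %>%                                   # at least 20 pass attempts
  rename(Games = G, Attempts = Att) %>%                   # more readable names
  separate(QBrec, into = c("W", "L", "T"), sep = "-", convert = TRUE) %>%
  mutate(
    Starts  = W + L + T,
    Win_pct = (W + 0.5 * T) / Starts                      # assumption: a tie counts as half a win
  ) %>%
  select(Player, Pos, Games, Starts, W, L, T, Win_pct, everything())

print(passers)
```

Filtering on attempts before splitting `QBrec` means non-quarterbacks with no record never reach `separate()`, so no missing-value handling is needed in this toy case.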
Receiver Data
For the receiver data, I performed the following tasks:
- Remove Rank and Ctch columns
- Remove extraneous header rows
- Convert all columns to numeric, except for player name and position
- Create additional columns for catch percentage
- Filter to include only players who have at least 20 receptions
- Reorder columns
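A minimal sketch of the receiver steps, again on a made-up table. The raw column names (`Rk`, `Tgt`, `Rec`, `Ctch`) and the new `Catch_pct` column are assumptions. Catch percentage is taken here to be receptions divided by targets.

```r
library(dplyr)

# Toy stand-in for the scraped receiver table (all values are made up)
raw_receivers <- tibble(
  Rk     = c("1", "2", "Rk", "3"),
  Player = c("Receiver A", "Receiver B", "Player", "Receiver C"),
  Pos    = c("WR", "TE", "Pos", "RB"),
  Tgt    = c("150", "40", "Tgt", "10"),
  Rec    = c("90", "25", "Rec", "6"),
  Ctch   = c("60.0%", "62.5%", "Ctch%", "60.0%")
)

receivers <- raw_receivers %>%
  select(-Rk, -Ctch) %>%                           # remove Rank and Ctch
  filter(Player != "Player") %>%                   # remove repeated header rows
  mutate(across(-c(Player, Pos), as.numeric)) %>%  # numeric except name and position
  mutate(Catch_pct = Rec / Tgt) %>%                # recompute catch percentage as a proportion
  filter(Rec >= 20) %>%                            # at least 20 receptions
  select(Player, Pos, Rec, Tgt, Catch_pct, everything())

print(receivers)
```

Recomputing the percentage from numeric columns avoids having to parse the percent sign out of the scraped text column.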
Rusher Data
For the rusher data, I performed the following tasks:
- Rename most columns for better readability
- Remove extraneous header rows
- Remove Rank column
- Convert all columns to numeric, except for player name and position
- Filter to include only players who have at least 20 rushing attempts
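Since all three tables share the header-row, rank-column, and numeric-conversion steps, one way to avoid repeating them is a small helper. The function `clean_stats_table()` below is hypothetical and is not part of the original analysis. The raw column names and the renamed `Rush_att` and `Rush_yds` columns are also assumptions.

```r
library(dplyr)

# Hypothetical helper: cleanup steps common to all three scraped tables
clean_stats_table <- function(df, id_cols = c("Player", "Pos"), drop_cols = "Rk") {
  df %>%
    select(-any_of(drop_cols)) %>%               # remove Rank (and any other unwanted columns)
    filter(Player != "Player") %>%               # remove repeated header rows
    mutate(across(-any_of(id_cols), as.numeric)) # numeric except the identifier columns
}

# Toy stand-in for the scraped rusher table (all values are made up)
raw_rushers <- tibble(
  Rk     = c("1", "Rk", "2"),
  Player = c("Rusher A", "Player", "Rusher B"),
  Pos    = c("RB", "Pos", "QB"),
  Att    = c("310", "Att", "12"),
  Yds    = c("1400", "Yds", "55")
)

rushers <- raw_rushers %>%
  clean_stats_table() %>%
  rename(Rush_att = Att, Rush_yds = Yds) %>%     # more readable names
  filter(Rush_att >= 20)                         # at least 20 rushing attempts

print(rushers)
```

For the passer table, the same helper could be called with `id_cols = c("Player", "Pos", "QBrec")` so that the record column stays as text until it is split.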
Visualizations & Analysis
Top 10 Winningest Browns QBs
This chart shows the 10 winningest Browns QBs by win percentage. For this visual, I further restricted the data to include only those who have started at least three games.
A few notes of context. NFL analysis often speaks of the “Super Bowl era”, which began with the 1966 season, after which the first Super Bowl was played. This marks the beginning of the “modern era” of NFL football.
Additionally, in 1995, the Browns relocated to Baltimore and became the Ravens. After a three-year hiatus, they returned as an expansion team in 1999. Because of this, Browns football is divided into two distinct periods: pre-1995 and post-1999.
A few observations about the chart below:
- Four of the winningest Browns QBs (Graham, O’Connell, Ryan, and Plum) played at least part of their tenure with the Browns during the pre-Super Bowl era. Further, only two of the winningest Browns QBs (Hoyer and Flacco) played for the post-1999 Browns.
- This illustrates how the most successful Browns QBs played several decades ago, while more recent Browns QBs have experienced comparatively little success. It is also noteworthy that the modern Browns QBs who have experienced success have win percentages which could be described as “good but not great”.
- The tenth winningest Browns QB has a win percentage of only 0.516, which is only slightly above 0.500 (a common mark of mediocrity). It is noteworthy that only a handful of Browns QBs have a winning record all-time.
Games Played by Browns Passers
This chart shows the number of games each Browns passer appeared in for the team (whether or not they started), plotted as a scatter plot against the player’s first year with the Browns.
In the “old Browns” era (1946-1995), relatively few QBs played for the Browns, which suggests that those who did play experienced success. There is wide variation in tenure length, but several QBs had very long tenures.
In contrast, dozens of QBs have played in the “new Browns” era (1999-present), and they generally have extremely short tenures with the team. Nearly all of them played fewer than 25 games (the equivalent of roughly one and a half seasons) with the Browns. Clearly, very few Browns QBs have experienced success over the past two and a half decades.
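The era comparison described above can also be put in numbers. The sketch below uses a made-up stand-in for `browns_passers`, so the values mean nothing. The `Era` label and the summary column names are assumptions, while `From` and `Games` match the columns used in the chart code.

```r
library(dplyr)

# Toy stand-in for browns_passers (all values are made up)
toy_passers <- tibble(
  Player = paste("Passer", LETTERS[1:6]),
  From   = c(1946, 1962, 1985, 1999, 2012, 2020),
  Games  = c(105, 60, 110, 19, 8, 30)
)

era_summary <- toy_passers %>%
  mutate(Era = if_else(From <= 1995, "old Browns", "new Browns")) %>%
  group_by(Era) %>%
  summarise(
    n_passers      = n(),               # how many passers debuted in each era
    median_games   = median(Games),     # typical tenure length
    share_under_25 = mean(Games < 25)   # proportion with fewer than 25 games
  )

print(era_summary)
```

Dividing `n_passers` by the number of seasons in each era would give a fairer comparison, since the two eras differ in length.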
browns_passers %>%
  ggplot(aes(x = From, y = Games)) +
  geom_point(position = "jitter", color = "red") +
  labs(title = "Games Played By Browns Passers",
       x = "First Year with the Browns",
       y = "Games Played with the Browns") +
  scale_x_continuous(breaks = c(1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)) +
  scale_y_continuous(limits = c(0, 140), breaks = c(0, 20, 40, 60, 80, 100, 120, 140))
Games Played by Browns Receivers
This chart shows the same metric as the above chart, but for Browns receivers.
There is still a negative correlation between starting year and games played, but it is substantially less pronounced. The same can be said of the cluster of low tenures in the “new Browns” era.
One would expect trends observed in QB data to be significantly muted in data for other positions. The QB is the most scrutinized position and often bears most of the blame for a team’s success or failure.
With that said, teams like the Browns that make poor decisions regarding the QB position tend to also make poor decisions at other positions, so it isn’t surprising that similar trends are observed here.
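The strength of the correlation mentioned above can be checked directly rather than read off the scatter plot. This is a minimal sketch on made-up data, with `From` and `G` named as in the receiver chart code. The Spearman (rank-based) version is included because a few very long tenures can dominate the Pearson coefficient.

```r
# Toy stand-in for browns_receivers (all values are made up)
toy_receivers <- data.frame(
  From = c(1950, 1965, 1980, 1999, 2010, 2020),
  G    = c(90, 120, 60, 70, 20, 35)
)

# Linear (Pearson) and rank-based (Spearman) correlation
pearson  <- cor(toy_receivers$From, toy_receivers$G)
spearman <- cor(toy_receivers$From, toy_receivers$G, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

Running the same two lines on the passer table would show how much weaker the receiver trend is.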
browns_receivers %>%
  ggplot(aes(x = From, y = G)) +
  geom_point(position = "jitter", color = "blue") +
  labs(title = "Games Played By Browns Receivers",
       x = "First Year with the Browns",
       y = "Games Played with the Browns") +
  scale_x_continuous(breaks = c(1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)) +
  scale_y_continuous(limits = c(0, 140), breaks = c(0, 20, 40, 60, 80, 100, 120, 140))
Games Played by Browns Rushers
This chart shows the same metric as the above charts, but for Browns rushers.
Unlike the previous charts, there is no discernible correlation between starting year and games played for Browns rushers. There is a cluster of low-tenure rushers in the “new Browns” era, but a similar cluster is observed in the very earliest years of the Browns.
I suspect that these differences are due more to changes in the role of rushers vs. passers/receivers in the sport over time, as well as injury-related factors, than to anything intrinsic to the Browns. Running backs (the primary rushing position) are in most cases far less high-profile than QBs and WRs. Furthermore, the success of a particular running back is only loosely tied to a team’s success, except in the case of superstar RBs.
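Whether a correlation is "discernible" can be checked with a significance test instead of by eye. The sketch below runs base R's `cor.test()` with the Spearman method on a made-up stand-in for `browns_rushers`. The data and the conventional 0.05 threshold are assumptions, not results from the real table.

```r
# Toy stand-in for browns_rushers (all values are made up)
toy_rushers <- data.frame(
  From = c(1950, 1970, 1990, 2005, 2015, 2022),
  G    = c(12, 70, 30, 85, 20, 40)
)

# Test whether the rank correlation differs from zero
result <- cor.test(toy_rushers$From, toy_rushers$G, method = "spearman")

print(result$estimate)  # Spearman's rho
print(result$p.value)   # a large p-value means no detectable monotonic trend
```

On the real data, ties in `G` would make `cor.test()` fall back to an approximate p-value and issue a warning, which is harmless for this purpose.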
browns_rushers %>%
  ggplot(aes(x = From, y = G)) +
  geom_point(position = "jitter", color = "purple") +
  labs(title = "Games Played By Browns Rushers",
       x = "First Year with the Browns",
       y = "Games Played with the Browns") +
  scale_x_continuous(breaks = c(1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)) +
  scale_y_continuous(limits = c(0, 140), breaks = c(0, 20, 40, 60, 80, 100, 120, 140))
Passer Rating
This chart shows the passer rating of every Browns player who has thrown at least 20 passes for the team since 1999. The passer rating scale ranges from 0 to 158.3, but for the purposes of this analysis, the upper limit of the chart is set to 120 for reasonableness. The highest recorded single-season passer rating is 122.5, while the highest recorded career passer rating is approximately 103.
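For readers wondering where the 0 to 158.3 range comes from, the standard NFL passer rating formula can be written as a short function. The formula is not part of the original analysis, since the `Rate` column comes straight from the source data. The function name and argument names below are made up for illustration. Each of the four components is capped between 0 and 2.375, which is what produces the floor and the ceiling.

```r
# NFL passer rating from raw counting stats (illustrative helper)
passer_rating <- function(att, cmp, yds, td, int) {
  clamp <- function(x) pmin(pmax(x, 0), 2.375)   # each component is limited to [0, 2.375]
  comp_part <- clamp((cmp / att - 0.3) * 5)      # completion percentage
  yds_part  <- clamp((yds / att - 3) * 0.25)     # yards per attempt
  td_part   <- clamp((td / att) * 20)            # touchdown rate
  int_part  <- clamp(2.375 - (int / att) * 25)   # interception rate
  (comp_part + yds_part + td_part + int_part) / 6 * 100
}

# A stat line that maxes out every component hits the ceiling
best  <- passer_rating(att = 20, cmp = 20, yds = 400, td = 5, int = 0)
# A stat line that zeroes every component hits the floor
worst <- passer_rating(att = 20, cmp = 0, yds = 0, td = 0, int = 20)

print(round(best, 1))   # 158.3
print(worst)            # 0
```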
To interpret the data, it is helpful to know that a passer rating of 80 is generally considered “average”. However, this is a subject of debate, as passer ratings have inflated significantly in recent years. The true average passer rating today sits in the range of 90-95 due to league-wide changes to rules, officiating, and play-calling.
Regardless of one’s perspective on the passer rating system, the Browns QBs since 1999 have very poor passer ratings. Not a single one has a rating above 90. Only about a dozen have ratings above 80. Roughly a third have a rating below 70. The fact that one team managed to play so many bad QBs in such a short time is astonishing.
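The counts quoted above can be reproduced with a short summary. The sketch below runs on a made-up stand-in for `browns_passers`, so its numbers are not the real ones. `From` and `Rate` match the chart code, and the summary column names are assumptions.

```r
library(dplyr)

# Toy stand-in for browns_passers (all values are made up)
toy_passers <- tibble(
  From = c(1985, 1999, 2005, 2012, 2018, 2020, 2023),
  Rate = c(95.0, 72.3, 66.1, 81.5, 85.0, 58.9, 78.2)
)

rating_summary <- toy_passers %>%
  filter(From >= 1999) %>%              # post-1999 passers only
  summarise(
    n_passers      = n(),
    above_90       = sum(Rate > 90),    # count with a rating above 90
    above_80       = sum(Rate > 80),    # count with a rating above 80
    share_below_70 = mean(Rate < 70)    # proportion with a rating below 70
  )

print(rating_summary)
```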
browns_passers %>%
  filter(From >= 1999) %>%
  mutate(Player = reorder(Player, Rate)) %>%
  ggplot(aes(y = Player, x = Rate)) +
  geom_col(fill = "brown") +
  labs(title = "Post-1999 Browns QB Ratings",
       x = "Passer Rating",
       y = "Name") +
  scale_x_continuous(limits = c(0, 120), breaks = c(0, 20, 40, 60, 80, 100, 120))