I used the statistics published at https://www.pro-football-reference.com/years/2024/draft.htm, which cover the 2024 NFL Draft. The data is organized into 28 columns, ranging from the draft pick number to the college that each player attended. I will use several of these columns to analyze the 2024 draft and look for patterns: which colleges had the most picks this year, which teams ended up with the most picks, and whether there is a pattern between the draft round and the number of picks.
Data Wrangling
Columns Used
Rnd = The round in which the player was drafted, from 1 to 7.
Pick = The overall pick number given to each player.
Player = The name of the player.
Pos = The position that each player plays.
Age = The age of the player.
The collected data needed a series of manipulations because of how it was laid out on the website. For starters, the column headers were not clean: there were several blank rows at the top, along with other discrepancies. To fix this, I deleted some of the rows at the top and set the column headers to the second row of data. Additionally, there were breaks in the data separating the seven rounds of the NFL draft, so every few rows the headers were restated. To fix this, I removed the duplicate header rows found throughout the data. I also removed blank rows.
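The cleaning steps described above can be sketched in base R roughly as follows. This is a minimal illustration, not the exact code used for this report: the tiny `raw` data frame, the `clean_draft()` helper, and its `header_row` argument are hypothetical stand-ins for the real export from the website.

```r
# Hypothetical miniature of the raw export: a blank row on top, the real
# header in row 2, a repeated header between rounds, and a blank row at the end.
raw <- data.frame(
  V1 = c("", "Rnd", "1", "1", "Rnd", "2", ""),
  V2 = c("", "Pick", "1", "2", "Pick", "33", ""),
  V3 = c("", "Player", "Player A", "Player B", "Player", "Player C", ""),
  stringsAsFactors = FALSE
)

clean_draft <- function(raw, header_row = 2) {
  # Use the chosen row as the column names and drop it plus everything above it.
  names(raw) <- as.character(unlist(raw[header_row, ]))
  body <- raw[-seq_len(header_row), , drop = FALSE]

  # Drop the header rows that are restated between rounds.
  body <- body[body$Rnd != "Rnd", , drop = FALSE]

  # Drop rows where every cell is empty or NA.
  blank <- apply(body, 1, function(r) all(is.na(r) | trimws(r) == ""))
  body <- body[!blank, , drop = FALSE]

  rownames(body) <- NULL
  body
}

cleaned <- clean_draft(raw)
print(cleaned)
```

After this step the columns are still character strings, which is why `Pick` and `Age` are converted with `as.numeric()` before the correlation later in the analysis.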
Number of Players Drafted by Round
nfl_draft %>%
  count(Rnd) %>%
  ggplot(aes(x = Rnd, y = n, fill = as.factor(Rnd))) +
  geom_bar(stat = "identity") +
  labs(
    title = "Number of Players Drafted by Round",
    x = "Draft Round",
    y = "Number of Players",
    fill = "Draft Round"
  ) +
  theme_minimal()
The graph above shows how many players were drafted in each round. The rounds range from one to seven. The number of picks rises fairly steadily from round one through round six and then drops back down in round seven. The first two rounds each have 32 picks because there are 32 teams and each team gets one pick per round. Later rounds are larger because they also include compensatory picks, which are awarded to teams that lost a lot of free agents to help them replenish their rosters.
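To make the 32-pick baseline concrete, the sketch below counts picks per round and subtracts the 32 standard picks to show how many extra picks a round contains. The `toy` data frame holds placeholder values for illustration only, not the real 2024 counts, and the `extra` column name is my own.

```r
# Placeholder round labels for illustration only (not the real draft data).
toy <- data.frame(Rnd = c(rep(1, 32), rep(2, 32), rep(3, 35)))

teams <- 32  # one standard pick per team in each round

# Count the picks in each round, then compare each count to the baseline.
per_round <- as.data.frame(table(Rnd = toy$Rnd), responseName = "n")
per_round$extra <- per_round$n - teams

print(per_round)
```

Swapping `toy$Rnd` for the real `nfl_draft$Rnd` column would give the actual surplus in each round.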
Correlation Between Draft Pick and Age
# Convert the text columns to numbers and drop rows that fail to convert
nfl_draft <- nfl_draft %>%
  mutate(Pick = as.numeric(Pick), Age = as.numeric(Age)) %>%
  filter(!is.na(Pick) & !is.na(Age))

# Pearson correlation between overall pick number and age
correlation <- cor(nfl_draft$Pick, nfl_draft$Age, use = "complete.obs")
print(paste("Correlation Between Pick and Age:", correlation))
[1] "Correlation Between Pick and Age: 0.407236826722107"
ggplot(nfl_draft, aes(x = Pick, y = Age)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", color = "red") +
  labs(
    title = "Correlation Between Draft Pick and Age",
    x = "Draft Pick",
    y = "Age"
  ) +
  theme_minimal()
The graph above plots each player's age against their draft pick. The correlation is 0.407, indicating a moderate positive correlation between pick number and age: older players tend to be chosen later in the draft. Because the data only includes players who were drafted, it shows when older players are picked, not whether they are less likely to be picked at all.
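One way to put the 0.407 figure in context is to square it: 0.407² ≈ 0.166, so pick number accounts for roughly 17% of the variation in age under a linear fit. The sketch below shows how `cor.test()` from base R reports the same Pearson coefficient as `cor()`, along with a confidence interval and a p-value. It runs on simulated data (the `pick` and `age` vectors are made up), so its numbers will not match the ones in this report.

```r
set.seed(462)  # arbitrary seed so the simulated example is reproducible

# Simulated stand-ins for the Pick and Age columns (not real draft data).
n <- 200
pick <- seq_len(n)
age <- 21 + 0.005 * pick + rnorm(n, sd = 0.8)

# Pearson correlation test: estimate, 95% confidence interval, and p-value.
result <- cor.test(pick, age, method = "pearson")

r <- unname(result$estimate)  # same value that cor(pick, age) returns
r_squared <- r^2              # share of variance explained by a linear fit

print(result)
print(r_squared)
```

Applied to the real data, the call would be `cor.test(nfl_draft$Pick, nfl_draft$Age)` after the numeric conversion shown earlier.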
Top 10 Colleges Producing NFL Draft Picks
nfl_draft %>%
  count(College.Univ) %>%
  top_n(10, n) %>%
  ggplot(aes(x = reorder(College.Univ, n), y = n, fill = College.Univ)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(
    title = "Top 10 Colleges Producing NFL Players",
    x = "College",
    y = "Number of Players"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
The graph above shows the top 10 colleges by number of players drafted. The University of Michigan had the most players selected. This is likely because Michigan won the national championship, which suggests it was the best college team and had many great players. Texas also ranked very high, which makes sense because the team made the playoffs this year.
Count of Drafted Players and Their Positions
nfl_draft %>%
  count(Pos) %>%
  ggplot(aes(x = reorder(Pos, n), y = n, fill = Pos)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(
    title = "Count of Drafted Players and Their Positions",
    x = "Position",
    y = "Number of Players"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
The graph above shows the number of drafted players at each position. The most-drafted position is OL, or offensive lineman. This is likely because OL is an important position and there are five starting offensive linemen per team, so the two teams in a game start 10 between them.
Number of Draft Picks by Team
nfl_draft %>%
  # Order teams by how many picks they made (the length of each team's Pick values)
  ggplot(aes(x = reorder(Tm, Pick, length))) +
  geom_bar(fill = "hotpink") +
  labs(
    title = "Number of Draft Picks by Team",
    x = "Team",
    y = "Number of Picks"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
The graph above shows the number of draft picks made by each NFL team, with Arizona having the most and the Chicago Bears having the fewest. I am originally from Chicago and I know how bad the Bears are, so this does not surprise me at all. Arizona acquired a lot of picks through trades this year, so it makes sense that they had the most picks.
Source Code
---
title: "NFL 2024 Draft Pick Analysis"
subtitle: "BAIS-462 Assignment 6 - Ethical Web Scraping"
author: "Sadie Liptak"
editor: visual
toc: TRUE
format:
  html:
    code-tools: TRUE
    embed-resources: TRUE
execute:
  message: FALSE
  echo: TRUE
  warning: FALSE
---
```{r}
library(dplyr)
library(tidyverse)
nfl_draft <- read.csv("nfl_draft_2024.csv")
head(nfl_draft)
colnames(nfl_draft)
```