Data:
https://www.espn.com/nfl/stats/player/_/stat/rushing/table/rushing/sort/rushingYards/dir/desc
Purpose
This document compares and analyzes the top 100 NFL leaders in rushing yards for the 2020 season. Several analyses are run to understand how different variables relate to each player's total rushing yards.
Collection
The data is suitable for scraping because it comes from ESPN's website, which is a trusted reporter of sports statistics and provides up-to-date values. Visuals are made to show the aspects of the data that help answer the question.
Summary Table:
| Variable Name | Variable Explained |
|---|---|
| RK | Player Rank by Yards |
| Name | Player Name and Team |
| POS | Player Position |
| GP | Number of Games Played |
| ATT | Number of Rushing Attempts |
| YDS | Total Rushing Yards |
| AVG | Average Number of Yards per Attempt |
| LNG | Longest Rush |
| BIG | Number of Rushes of 20 or more yards |
| TD | Total Number of Touchdowns |
| YDS/G | Average Number of Yards per Game |
| FUM | Total Number of Fumbles |
| LST | Total Number of Fumbles Lost |
| FD | Total Number of First Downs |
Data Table:
This is a table created to easily view, search, sort, and filter the data.
Descriptive Info:
The statistics below show averages across the players as a first step toward understanding what drives higher total yards per player. They suggest that the top runners take care of the football and limit fumbles while reliably producing a high number of yards on average. Interestingly, big plays are not strictly necessary for a player's success in totaling a high number of yards, although they clearly help.
| Average of Average Yards per Attempt | Average Number of 20+ Yard Runs | Average Number of Touchdowns | Average Number of Fumbles |
|---|---|---|---|
| 4.716 | 4.46 | 6.44 | 1.66 |
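Averages like these can be reproduced with a few lines of code. The following is a minimal Python sketch, separate from the code used to build this report. The rows in `players` are hypothetical values made up for illustration (not the scraped ESPN data), and `column_means` is a helper name introduced here.

```python
from statistics import mean

# Hypothetical rows for illustration only; NOT the scraped ESPN values.
players = [
    {"AVG": 5.0, "BIG": 6, "TD": 10, "FUM": 1},
    {"AVG": 4.0, "BIG": 2, "TD": 4, "FUM": 3},
    {"AVG": 4.5, "BIG": 4, "TD": 7, "FUM": 2},
]


def column_means(rows, columns):
    """Return the mean of each requested column across all rows."""
    return {col: mean(row[col] for row in rows) for col in columns}


summary = column_means(players, ["AVG", "BIG", "TD", "FUM"])
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the real data, each dictionary would hold one player's row from the table described in the Summary Table section.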
Total Yards vs. Games Played:
This graph shows the correlation between the number of games played and total yards. In the majority of cases the correlation is positive: players who have played in more games have higher rushing yards. By plotting the two variables against each other with ggplot, I was able to see the relationship between them as yards increased.
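The strength of a relationship like this can also be summarized with a Pearson correlation coefficient. The sketch below is a minimal Python illustration, not the method used to produce the graph. The `games_played` and `rush_yards` lists are hypothetical values, and `pearson_r` is a helper name introduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT the scraped ESPN data.
games_played = [16, 15, 14, 12, 10]
rush_yards = [1500, 1200, 1100, 800, 600]

r = pearson_r(games_played, rush_yards)
print(f"r = {r:.3f}")
```

A value of `r` close to 1 indicates that yards rise with games played, while a value near 0 indicates little linear relationship.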
Total Yards vs. First Downs by Position:
A ggplot with a facet wrap by position was used so that quarterbacks and running backs could be distinguished from one another. The plot shows a strong relationship between first downs and yards for running backs, which makes sense: the more first downs a player gains, the more likely the team is to rely on him to do it again. Quarterbacks, on the other hand, had far fewer rushing yards but a similar number of first downs, likely because they run when forced to and regularly pick up only the yards needed to achieve the first down.
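A numeric companion to the faceted plot is to group players by position and compare yards gained per first down. The sketch below is a minimal Python illustration under stated assumptions: the `rows` values are hypothetical (not the scraped ESPN data), and `yards_per_first_down` is a helper name introduced here.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; NOT the scraped ESPN values.
rows = [
    {"POS": "RB", "YDS": 1200, "FD": 60},
    {"POS": "RB", "YDS": 800, "FD": 40},
    {"POS": "QB", "YDS": 400, "FD": 40},
    {"POS": "QB", "YDS": 300, "FD": 30},
]


def yards_per_first_down(rows):
    """Total rushing yards divided by total first downs, per position."""
    totals = defaultdict(lambda: {"YDS": 0, "FD": 0})
    for row in rows:
        totals[row["POS"]]["YDS"] += row["YDS"]
        totals[row["POS"]]["FD"] += row["FD"]
    # Skip positions with no first downs to avoid dividing by zero.
    return {
        pos: t["YDS"] / t["FD"] for pos, t in totals.items() if t["FD"] > 0
    }


ratios = yards_per_first_down(rows)
for pos, ratio in ratios.items():
    print(f"{pos}: {ratio:.1f} yards per first down")
```

A lower ratio for quarterbacks than for running backs would be consistent with quarterbacks gaining just enough yards to move the chains.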
20+ Yard Runs vs. Attempts by Total Yards:
For this visualization, a ggplot with colour was used to show the relationship between the number of runs of 20+ yards and the number of attempts by each runner, while taking into account the number of yards they totaled on the season. Notably, some runners accumulated their total yards through a large number of attempts, while others did so through a higher number of 20+ yard runs. The league leader, however, compiled his yards by being proficient in both areas, meaning he was utilized more often and was more effective at creating big plays.
Total Yards vs. Attempts by Number of Fumbles:
A ggplot with a facet wrap was used to compare runners by number of fumbles and to see the impact on their attempts and yards. As was shown earlier, some runners rely on a high number of attempts to increase their yards. However, if a player fumbles the ball often enough, teams are less likely to give him the ball, as can be seen in the graphs below. This in turn leads to lower total yards for the player.
Furthering the Analysis:
To further this analysis, I would run a regression of yards on several of the variables. I would expect average yards per carry to be one of the variables most closely correlated with total yards, with the number of attempts strongly correlated as well. This would support the findings above and could be useful in the real world when an NFL team selects a running back in the draft or in free agency based on his past seasons' stats.
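One possible shape for such a regression is sketched below in Python: an ordinary least squares fit of total yards on attempts and average yards per carry, solved through the normal equations. This is only an illustration of the proposed next step, not an analysis that was run. The `features` and `yards` values are hypothetical, and `fit_ols` is a helper name introduced here.

```python
def fit_ols(features, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coef_1, ..., coef_k] for k predictors.
    """
    # Prepend 1.0 to every row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in features]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot solve")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical (ATT, AVG) pairs and YDS totals; NOT the scraped ESPN data.
features = [(300, 5.0), (250, 4.0), (200, 4.5), (150, 5.0), (100, 4.0)]
yards = [1500, 1000, 900, 750, 400]

model = fit_ols(features, yards)
print("intercept, ATT coefficient, AVG coefficient:", model)
```

Because AVG is defined as yards per attempt, total yards equals attempts times average by construction, so a linear fit on these two predictors can only approximate the relationship. Adding predictors such as BIG, FUM, or GP would show what they explain beyond that.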