library()

install.packages("rmarkdown")
library(rmarkdown)

Introduction

I analyze the percentage of successful kickoffs and the distance the kicks travel (kickLength) to determine the best kicker. A kicker is considered the best if they have the highest overall kick success rate, the highest success rates for both close and long-distance kicks, and kicks that travel in the intended direction (kickDirectionIntended matches kickDirectionActual).

Description of Project

To start, I read all the data (plays, players, games, and PFF scouting data) into four data frames, my_df1 through my_df4, and merged them into one data frame called df. From df, I gathered the variables relevant to my research into a new data frame called kickoff. The kickoff data frame consists of kickerId, kickLength, specialTeamsPlayType, specialTeamsResult, displayName, kickDirectionIntended, kickDirectionActual, kickType, and kickContactType.
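The merge step can be sketched on toy data as follows. This is a minimal illustration, not the project's actual data: the two small data frames and their values are invented, and only the column names (kickerId, nflId, gameId, playId, displayName) come from the project. It shows how left_join keeps every play and attaches the kicker's name where kickerId matches nflId.

```r
library(dplyr)

# Toy stand-ins for plays.csv and players.csv (all values are invented)
plays <- data.frame(
  gameId   = c(1, 1, 2),
  playId   = c(10, 20, 10),
  kickerId = c(101, 102, NA)
)
players <- data.frame(
  nflId       = c(101, 102),
  displayName = c("Kicker A", "Kicker B")
)

# The key columns have different names in the two tables, so map them explicitly
merged <- left_join(plays, players, by = c("kickerId" = "nflId"))

# A left join keeps every row of the left table; plays without a matching
# player get NA in the columns that come from players
print(merged)
```

Because the join is a left join on plays, a play with no recorded kicker still appears in the result, with displayName set to NA.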

Since I am only looking at kickoffs, I filtered the data to include only actions related to kickoffs. I used an if-else statement to calculate success and failure for each kickoff: a success if the result was a Touchback and a failure if it was a Return.

I grouped the data by the kicker’s display name, summing the number of successful kicks and counting the total attempted kicks. I then created a formula to find the percentage of success (total_success / count). This data was sorted to identify the top eight kickers by success rate.
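The grouping and success-rate calculation described above can be sketched on a handful of invented rows. The kicker names and results below are made up; the column names follow the project. The sketch uses summarise and keeps the attempt count in the output, which is a suggestion rather than what the project tables show, since a 100% rate means little without knowing how many kicks it is based on.

```r
library(dplyr)

# Invented kickoff results for two hypothetical kickers
kicks <- data.frame(
  displayName        = c("Kicker A", "Kicker A", "Kicker A", "Kicker B", "Kicker B"),
  specialTeamsResult = c("Touchback", "Return", "Touchback", "Return", "Touchback")
)

rates <- kicks %>%
  # 1 for a touchback, 0 for anything else
  mutate(success = ifelse(specialTeamsResult == "Touchback", 1, 0)) %>%
  group_by(displayName) %>%
  summarise(
    total_success   = sum(success),          # successful kicks per kicker
    count           = n(),                   # attempted kicks per kicker
    results_success = total_success / count  # success rate
  ) %>%
  arrange(desc(results_success))

print(rates)
```

Here Kicker A has 2 touchbacks in 3 attempts and Kicker B has 1 in 2, so Kicker A is ranked first.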

For direction, I used another if-else statement: if kickDirectionIntended and kickDirectionActual were the same (left, right, or center), the kick direction was deemed successful. I then repeated the process of summing, counting, and calculating the success rate, this time for directional accuracy, and extracted the top eight performers.
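One thing to watch for in the direction calculation is missing values. The sketch below uses invented data and a simplified comparison (intended == actual, which is an assumption and not the project's nested if-else) to show that a single kick with no recorded direction turns a kicker's summed rate into NA. This is one plausible reason for NA values in a per-kicker direction table. The column names rate_with_na and rate_known are hypothetical.

```r
library(dplyr)

# Invented direction data; NA stands for a kick with no recorded direction
kicks <- data.frame(
  displayName           = c("Kicker A", "Kicker A", "Kicker B", "Kicker B", "Kicker B"),
  kickDirectionIntended = c("L", "R", "C", NA, "R"),
  kickDirectionActual   = c("L", "C", "C", "L", "R")
)

direction <- kicks %>%
  # Comparing with NA yields NA, so the flag is NA for the fourth kick
  mutate(success_Direction = ifelse(kickDirectionIntended == kickDirectionActual, 1, 0)) %>%
  group_by(displayName) %>%
  summarise(
    rate_with_na = sum(success_Direction) / n(),          # NA if any flag is NA
    rate_known   = mean(success_Direction, na.rm = TRUE)  # only kicks with both directions
  )

print(direction)
```

Whether to drop kicks with an unknown direction or to count them as misses is a judgment call; the sketch only makes the difference visible.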

For kick length, I used another if-else statement. If the kick was 50 yards or more, it was considered a success. I summed the successes and counted the attempts, then calculated success rates for long kicks. Long kicks do not always determine whether a kicker is the best; sometimes, a kicker may not want to kick the ball far. However, being able to kick the ball far on command can demonstrate versatility.
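As a small worked example of the long-kick rate, the sketch below uses invented kick lengths for one hypothetical kicker and the 50-yard cutoff from the text. It relies on the fact that the mean of a logical vector equals the share of TRUE values, which gives the same number as summing 1/0 flags and dividing by the count.

```r
# Invented kick lengths in yards for one hypothetical kicker
kickLength <- c(65, 48, 70, 50, 32)

# TRUE/FALSE flag for long kicks, using the 50-yard cutoff from the text
is_long <- kickLength >= 50

# The mean of a logical vector is the share of TRUE values
long_rate <- mean(is_long)
print(long_rate)
```

Three of the five kicks reach 50 yards or more, so the long-kick rate is 0.6.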

Data Visualization

For this project, I created three graphs: one vertical bar graph and two horizontal bar graphs. I chose bar graphs because they suit a comparison of one rate across named players. The first graph is a vertical bar graph showing the top eight kickers by kickoff success rate. It shows that Kaare Vedvik and Tristan Vizcaino had the highest success rate, at 100%. The second graph is a horizontal bar graph of directional accuracy. This graph proved less useful, because most players kicked the ball in the intended direction almost every time; 61 of the 91 players had a 100% directional success rate. The last graph examines how consistently players could kick the ball 50 yards or more, showing the top 20 performers; 17 players did so on every kickoff.

library(scales)
library(ggplot2)
library(tidytext)
library(RColorBrewer)
library(kableExtra)
library(ggalt)
library(ggforce)
library(hms)
library(gganimate)
library(data.table)
library(nflfastR)
library(ggimage)
library(gifski)
library(png)
library(dplyr)
setwd("C:/Users/genny/OneDrive/Documents/IS470 Sports Analysis")


my_df1 <- fread("Data/NFLBDB2022/plays.csv")

my_df2 <- fread("Data/NFLBDB2022/players.csv")

my_df3 <- fread("Data/NFLBDB2022/games.csv")

my_df4 <- fread("Data/NFLBDB2022/PFFScoutingData.csv")



#combining the data frames



df <- left_join(my_df1, my_df2, by = c("kickerId" = "nflId"))

df <- left_join(df,     my_df3, by = c("gameId"))

df <- left_join(df,     my_df4, by = c("gameId", "playId"))

# Filter to kickoffs only; touchback = success, return = fail

kickoff <- df %>%
  filter(specialTeamsPlayType == "Kickoff") %>%
  select(kickerId, kickLength, specialTeamsPlayType, specialTeamsResult, displayName, 
         kickDirectionIntended, kickDirectionActual, kickType, kickContactType) %>%
  mutate(
    success = ifelse(specialTeamsResult == "Touchback", 1, 0),
    fail = ifelse(specialTeamsResult == "Return", 1, 0)
  ) %>%
  arrange(success) %>%
  data.frame()
# Aggregate per kicker so each name appears once:
# count successful kicks and total attempts, then compute the success rate.

df1 <- kickoff %>%
  group_by(displayName) %>%
  reframe(total_success = sum(success), 
          count = n(), 
          results_success = total_success / count) %>%
  select(displayName, results_success) %>%
  arrange(-results_success) %>%
  data.frame()
df1
##             displayName results_success
## 1          Kaare Vedvik      1.00000000
## 2      Tristan Vizcaino      1.00000000
## 3             Joey Slye      0.90728477
## 4        Bradley Pinion      0.81954887
## 5    Chandler Catanzaro      0.80000000
## 6              Matt Gay      0.77777778
## 7        Dustin Hopkins      0.77511962
## 8   Sterling Hofrichter      0.72727273
## 9         Jason Sanders      0.72687225
## 10             Wil Lutz      0.72377622
## 11        Greg Zuerlein      0.72083333
## 12           Tyler Bass      0.70000000
## 13      Brandon McManus      0.69668246
## 14      Harrison Butker      0.68275862
## 15          Jason Myers      0.66274510
## 16          Greg Joseph      0.66250000
## 17        Aldrick Rosas      0.65714286
## 18        Randy Bullock      0.65365854
## 19         Jake Elliott      0.65217391
## 20          Cody Parkey      0.65000000
## 21  Rodrigo Blankenship      0.65000000
## 22           Dan Bailey      0.64502165
## 23            Matt Wile      0.63636364
## 24           Josh Lambo      0.61616162
## 25    Rigoberto Sanchez      0.61603376
## 26        Justin Tucker      0.61433447
## 27              Ty Long      0.60156250
## 28        Kasey Redfern      0.60000000
## 29          Brett Maher      0.59440559
## 30          Kai Forbath      0.59375000
## 31          Braden Mann      0.59259259
## 32   Stephen Gostkowski      0.59203980
## 33     Ka'imi Fairbairn      0.58984375
## 34          Jake Bailey      0.58394161
## 35          Graham Gano      0.57575758
## 36          Matt Bosher      0.57446809
## 37     Mitch Wishnowsky      0.57419355
## 38       Daniel Carlson      0.57209302
## 39         Mason Crosby      0.56378601
## 40        Zane Gonzalez      0.56069364
## 41        Chris Boswell      0.54954955
## 42           Sam Ficken      0.54807692
## 43        Pat O'Donnell      0.54545455
## 44           Sam Sloman      0.54545455
## 45          Ryan Succop      0.54022989
## 46         Ryan Santoso      0.52941176
## 47 Sebastian Janikowski      0.52941176
## 48     Stephen Hauschka      0.51798561
## 49        Caleb Sturgis      0.50000000
## 50          Elliott Fry      0.50000000
## 51          Logan Cooke      0.49450549
## 52         Eddy Pineiro      0.49019608
## 53         Younghoe Koo      0.47747748
## 54         Cairo Santos      0.47096774
## 55       Austin Seibert      0.45263158
## 56          Phil Dawson      0.43750000
## 57          Matt Prater      0.42857143
## 58        Johnny Hekker      0.37500000
## 59           Sam Martin      0.37500000
## 60     Chase McLaughlin      0.33333333
## 61             Jack Fox      0.32203390
## 62          Mike Nugent      0.27272727
## 63           J.K. Scott      0.25000000
## 64      Matthew McCrane      0.23809524
## 65      Michael Badgley      0.14705882
## 66       Matthew Wright      0.06666667
## 67         Robbie Gould      0.05882353
## 68         Aaron Brewer      0.00000000
## 69             Andy Lee      0.00000000
## 70           Brett Kern      0.00000000
## 71     Britton Colquitt      0.00000000
## 72          Bryan Anger      0.00000000
## 73     Cameron Johnston      0.00000000
## 74          Chris Jones      0.00000000
## 75      Corey Bojorquez      0.00000000
## 76          Daren Bates      0.00000000
## 77      Dustin Colquitt      0.00000000
## 78         Jamie Gillan      0.00000000
## 79            Jon Brown      0.00000000
## 80         Jordan Berry      0.00000000
## 81          Keelan Cole      0.00000000
## 82      Lachlan Edwards      0.00000000
## 83          Matt Bryant      0.00000000
## 84           Matt Haack      0.00000000
## 85      Michael Dickson      0.00000000
## 86      Michael Palardy      0.00000000
## 87            Nick Folk      0.00000000
## 88          Riley Dixon      0.00000000
## 89     Taylor Russolino      0.00000000
## 90      Thomas Morstead      0.00000000
## 91       Tommy Townsend      0.00000000
# Extract the top 8 success rates
top_success <- df1 %>%
  arrange(desc(results_success)) %>%
  head(8) %>%
  select(displayName, results_success) %>%
  data.frame()
top_success
##           displayName results_success
## 1        Kaare Vedvik       1.0000000
## 2    Tristan Vizcaino       1.0000000
## 3           Joey Slye       0.9072848
## 4      Bradley Pinion       0.8195489
## 5  Chandler Catanzaro       0.8000000
## 6            Matt Gay       0.7777778
## 7      Dustin Hopkins       0.7751196
## 8 Sterling Hofrichter       0.7272727
#1: flag kickoffs whose actual direction matches the intended direction (L = left, R = right, C = center)
kickoff2 <- df %>%
  filter(specialTeamsPlayType == "Kickoff") %>%
  select(kickerId, kickLength, specialTeamsPlayType, specialTeamsResult, displayName, 
         kickDirectionIntended, kickDirectionActual, kickType, kickContactType) %>%
  mutate(
    success_Direction = ifelse(kickDirectionIntended == "L" & kickDirectionActual == "L", 1,
                               ifelse(kickDirectionIntended == "R" & kickDirectionActual == "R", 1,
                                      ifelse(kickDirectionIntended == "C" & kickDirectionActual == "C", 1, 0)))
  ) %>%
  data.frame()



#2: per kicker, share of kickoffs whose actual direction matched the intended direction
df2 <- kickoff2 %>%
  group_by(displayName) %>%
  reframe(total_success_Direction = sum(success_Direction), 
          count1 = n(), 
          results_success_direction = total_success_Direction / count1) %>%
  select(displayName, results_success_direction) %>%
  arrange(-results_success_direction) %>%
  data.frame()
df2
##             displayName results_success_direction
## 1          Aaron Brewer                 1.0000000
## 2              Andy Lee                 1.0000000
## 3           Braden Mann                 1.0000000
## 4        Bradley Pinion                 1.0000000
## 5       Brandon McManus                 1.0000000
## 6            Brett Kern                 1.0000000
## 7           Brett Maher                 1.0000000
## 8           Bryan Anger                 1.0000000
## 9         Caleb Sturgis                 1.0000000
## 10     Cameron Johnston                 1.0000000
## 11     Chase McLaughlin                 1.0000000
## 12        Chris Boswell                 1.0000000
## 13          Chris Jones                 1.0000000
## 14      Corey Bojorquez                 1.0000000
## 15           Dan Bailey                 1.0000000
## 16          Daren Bates                 1.0000000
## 17      Dustin Colquitt                 1.0000000
## 18       Dustin Hopkins                 1.0000000
## 19         Eddy Pineiro                 1.0000000
## 20          Elliott Fry                 1.0000000
## 21          Graham Gano                 1.0000000
## 22        Greg Zuerlein                 1.0000000
## 23      Harrison Butker                 1.0000000
## 24         Jamie Gillan                 1.0000000
## 25            Joey Slye                 1.0000000
## 26        Johnny Hekker                 1.0000000
## 27            Jon Brown                 1.0000000
## 28         Jordan Berry                 1.0000000
## 29           Josh Lambo                 1.0000000
## 30         Kaare Vedvik                 1.0000000
## 31          Kai Forbath                 1.0000000
## 32        Kasey Redfern                 1.0000000
## 33          Keelan Cole                 1.0000000
## 34      Lachlan Edwards                 1.0000000
## 35          Matt Bryant                 1.0000000
## 36             Matt Gay                 1.0000000
## 37           Matt Haack                 1.0000000
## 38            Matt Wile                 1.0000000
## 39      Matthew McCrane                 1.0000000
## 40       Matthew Wright                 1.0000000
## 41      Michael Badgley                 1.0000000
## 42      Michael Dickson                 1.0000000
## 43      Michael Palardy                 1.0000000
## 44          Mike Nugent                 1.0000000
## 45     Mitch Wishnowsky                 1.0000000
## 46            Nick Folk                 1.0000000
## 47        Pat O'Donnell                 1.0000000
## 48          Phil Dawson                 1.0000000
## 49    Rigoberto Sanchez                 1.0000000
## 50          Riley Dixon                 1.0000000
## 51  Rodrigo Blankenship                 1.0000000
## 52         Ryan Santoso                 1.0000000
## 53          Ryan Succop                 1.0000000
## 54           Sam Ficken                 1.0000000
## 55     Stephen Hauschka                 1.0000000
## 56  Sterling Hofrichter                 1.0000000
## 57     Taylor Russolino                 1.0000000
## 58      Thomas Morstead                 1.0000000
## 59       Tommy Townsend                 1.0000000
## 60     Tristan Vizcaino                 1.0000000
## 61              Ty Long                 1.0000000
## 62     Ka'imi Fairbairn                 0.9960938
## 63          Jason Myers                 0.9921569
## 64         Younghoe Koo                 0.9909910
## 65       Daniel Carlson                 0.9906977
## 66       Austin Seibert                 0.9894737
## 67          Matt Bosher                 0.9893617
## 68          Logan Cooke                 0.9890110
## 69        Zane Gonzalez                 0.9884393
## 70         Cairo Santos                 0.9870968
## 71         Jake Elliott                 0.9869565
## 72             Wil Lutz                 0.9860140
## 73          Jake Bailey                 0.9854015
## 74 Sebastian Janikowski                 0.9852941
## 75   Stephen Gostkowski                 0.9850746
## 76             Jack Fox                 0.9830508
## 77           Sam Martin                 0.9779412
## 78           Tyler Bass                 0.9700000
## 79   Chandler Catanzaro                 0.9538462
## 80          Matt Prater                 0.9523810
## 81         Robbie Gould                 0.9411765
## 82           J.K. Scott                 0.8750000
## 83     Britton Colquitt                 0.7500000
## 84        Aldrick Rosas                        NA
## 85          Cody Parkey                        NA
## 86          Greg Joseph                        NA
## 87        Jason Sanders                        NA
## 88        Justin Tucker                        NA
## 89         Mason Crosby                        NA
## 90        Randy Bullock                        NA
## 91           Sam Sloman                        NA
#2.5: top 8 direction
top_success_direction <- df2 %>%
  arrange(desc(results_success_direction)) %>%
  head(8) %>%
  select(displayName, results_success_direction) %>%
  data.frame()
top_success_direction
##       displayName results_success_direction
## 1    Aaron Brewer                         1
## 2        Andy Lee                         1
## 3     Braden Mann                         1
## 4  Bradley Pinion                         1
## 5 Brandon McManus                         1
## 6      Brett Kern                         1
## 7     Brett Maher                         1
## 8     Bryan Anger                         1
#4: flag long kicks: a kick of 50 yards or more counts as a success

kickoff3 <- df %>%
  filter(specialTeamsPlayType == "Kickoff") %>%
  select(kickerId, kickLength, specialTeamsPlayType, specialTeamsResult, displayName, 
         kickDirectionIntended, kickDirectionActual, kickType, kickContactType) %>%
  mutate(
    # Success length: if kickLength is 50 or greater, it's a success
    success_length = ifelse(kickLength >= 50, 1, 0)
  ) %>%
  arrange(desc(success_length)) %>%  # Optionally arrange in descending order of success_length
  data.frame()


#5: per kicker, share of kickoffs that traveled 50 yards or more

df3 <- kickoff3 %>%
  group_by(displayName) %>%
  reframe(total_success_length = sum(success_length), 
          count2 = n(), 
          results_success_length = total_success_length / count2) %>%
  select(displayName, results_success_length) %>%
  arrange(-results_success_length) %>%
  data.frame()
df3
##             displayName results_success_length
## 1              Andy Lee              1.0000000
## 2      Britton Colquitt              1.0000000
## 3           Bryan Anger              1.0000000
## 4         Caleb Sturgis              1.0000000
## 5      Cameron Johnston              1.0000000
## 6           Chris Jones              1.0000000
## 7       Corey Bojorquez              1.0000000
## 8       Dustin Colquitt              1.0000000
## 9           Elliott Fry              1.0000000
## 10         Jordan Berry              1.0000000
## 11         Kaare Vedvik              1.0000000
## 12          Matt Bryant              1.0000000
## 13        Pat O'Donnell              1.0000000
## 14          Riley Dixon              1.0000000
## 15  Rodrigo Blankenship              1.0000000
## 16     Tristan Vizcaino              1.0000000
## 17           Tyler Bass              1.0000000
## 18       Bradley Pinion              0.9924812
## 19       Austin Seibert              0.9894737
## 20        Chris Boswell              0.9864865
## 21              Ty Long              0.9843750
## 22     Mitch Wishnowsky              0.9806452
## 23     Ka'imi Fairbairn              0.9804688
## 24           Sam Sloman              0.9772727
## 25        Zane Gonzalez              0.9768786
## 26             Wil Lutz              0.9755245
## 27            Joey Slye              0.9735099
## 28          Cody Parkey              0.9722222
## 29        Greg Zuerlein              0.9708333
## 30      Michael Badgley              0.9705882
## 31 Sebastian Janikowski              0.9705882
## 32          Mike Nugent              0.9696970
## 33   Chandler Catanzaro              0.9692308
## 34          Phil Dawson              0.9687500
## 35          Jason Myers              0.9686275
## 36         Mason Crosby              0.9670782
## 37      Harrison Butker              0.9655172
## 38          Jake Bailey              0.9635036
## 39          Braden Mann              0.9629630
## 40       Daniel Carlson              0.9627907
## 41    Rigoberto Sanchez              0.9620253
## 42        Justin Tucker              0.9590444
## 43       Dustin Hopkins              0.9569378
## 44         Jake Elliott              0.9565217
## 45  Sterling Hofrichter              0.9545455
## 46          Ryan Succop              0.9540230
## 47      Matthew McCrane              0.9523810
## 48          Greg Joseph              0.9500000
## 49      Brandon McManus              0.9478673
## 50          Matt Bosher              0.9468085
## 51          Logan Cooke              0.9450549
## 52             Matt Gay              0.9444444
## 53          Brett Maher              0.9440559
## 54        Randy Bullock              0.9414634
## 55         Ryan Santoso              0.9411765
## 56   Stephen Gostkowski              0.9402985
## 57        Aldrick Rosas              0.9371429
## 58         Younghoe Koo              0.9369369
## 59       Matthew Wright              0.9333333
## 60         Cairo Santos              0.9290323
## 61           Sam Martin              0.9264706
## 62          Graham Gano              0.9242424
## 63           Sam Ficken              0.9230769
## 64           Dan Bailey              0.9220779
## 65         Eddy Pineiro              0.9215686
## 66        Jason Sanders              0.9207048
## 67             Jack Fox              0.9152542
## 68     Stephen Hauschka              0.9064748
## 69          Kai Forbath              0.9062500
## 70           J.K. Scott              0.8750000
## 71        Johnny Hekker              0.8750000
## 72           Josh Lambo              0.8686869
## 73     Chase McLaughlin              0.8571429
## 74          Matt Prater              0.8333333
## 75            Matt Wile              0.8181818
## 76        Kasey Redfern              0.8000000
## 77            Nick Folk              0.7500000
## 78     Taylor Russolino              0.7500000
## 79           Matt Haack              0.6666667
## 80         Jamie Gillan              0.5000000
## 81      Lachlan Edwards              0.5000000
## 82       Tommy Townsend              0.5000000
## 83         Robbie Gould              0.4117647
## 84      Michael Dickson              0.3636364
## 85      Thomas Morstead              0.2500000
## 86         Aaron Brewer              0.0000000
## 87           Brett Kern              0.0000000
## 88          Daren Bates              0.0000000
## 89            Jon Brown              0.0000000
## 90          Keelan Cole              0.0000000
## 91      Michael Palardy              0.0000000
#5: top 20 length
top_success_length <- df3 %>%
  arrange(desc(results_success_length)) %>%
  head(20) %>%
  select(displayName, results_success_length) %>%
  data.frame()
top_success_length
##            displayName results_success_length
## 1             Andy Lee              1.0000000
## 2     Britton Colquitt              1.0000000
## 3          Bryan Anger              1.0000000
## 4        Caleb Sturgis              1.0000000
## 5     Cameron Johnston              1.0000000
## 6          Chris Jones              1.0000000
## 7      Corey Bojorquez              1.0000000
## 8      Dustin Colquitt              1.0000000
## 9          Elliott Fry              1.0000000
## 10        Jordan Berry              1.0000000
## 11        Kaare Vedvik              1.0000000
## 12         Matt Bryant              1.0000000
## 13       Pat O'Donnell              1.0000000
## 14         Riley Dixon              1.0000000
## 15 Rodrigo Blankenship              1.0000000
## 16    Tristan Vizcaino              1.0000000
## 17          Tyler Bass              1.0000000
## 18      Bradley Pinion              0.9924812
## 19      Austin Seibert              0.9894737
## 20       Chris Boswell              0.9864865

Visualization 1: A graph that shows …

  # vertical bar chart
   #top 8 success kickers in a graph, bar graph
  
  ggplot(data = top_success, aes(x = reorder(displayName, -results_success), y = results_success)) +
  geom_bar(aes(fill = results_success), stat = "identity") + 
  labs(x = "Player Name", y = "Percentage of Accurate Kicks", title = "Top 8 Kickoffs by Player", fill = "Accuracy %") +
  geom_text(aes(label = percent(results_success)), vjust = -0.5) + 
  scale_fill_continuous(low = "lightblue", high = "blue", 
                        breaks = seq(0, 1, 0.2),
                        labels = scales::percent) + 
  theme(plot.title = element_text(hjust = 0.5))


Visualization 2: A graph that shows …

  # horizontal bar chart
  #graph top 8 direction
  ggplot(data = top_success_direction, aes(x = reorder(displayName, -results_success_direction), y = results_success_direction)) +
  geom_bar(aes(fill = results_success_direction), stat = "identity") + 
  coord_flip()+
  labs(x = "Player Name", y = "Percentage of Accurate Kicks by Direction", title = "Top 8 Kickoffs by Player", fill = "Accuracy %") +
  geom_text(aes(label = percent(results_success_direction)),  hjust = -0.2) + 
  scale_fill_continuous(low = "lightgreen", high = "darkgreen", 
                        breaks = seq(0, 1, 0.2),
                        labels = scales::percent) + 
  theme(plot.title = element_text(hjust = 0.50))


Visualization 3: A graph that shows …

  #7: graph top 20 length
  ggplot(data = top_success_length, aes(x = reorder(displayName, -results_success_length), y = results_success_length)) +
  geom_bar(aes(fill = results_success_length), stat = "identity") + 
  coord_flip()+
  labs(x = "Player Name", y = "Percentage of Accurate Kicks by Length", title = "Top 20 Kickoffs by Player", fill = "Accuracy %") +
  geom_text(aes(label = percent(results_success_length)),  hjust = -0.2) + 
  scale_fill_continuous(low = "magenta", high = "purple", 
                        breaks = seq(0, 1, 0.2),
                        labels = scales::percent) + 
  theme(plot.title = element_text(hjust = 0.50))

Conclusion

Kaare Vedvik and Tristan Vizcaino tied for first place, each with a 100% kickoff success rate. Neither appears in the top 8 for directional accuracy, yet both also had a 100% rate for direction and for long kicks. They are missing from the direction chart only because 61 kickers had a 100% directional rate and tied kickers are listed alphabetically; in the long-kick list, 17 kickers (including these two) had a 100% rate. A long kick was defined as one that traveled 50 yards or more; in college football, a 50-yard kick is seen as a strong performance. Directional accuracy was determined by whether the player kicked the ball in the direction they intended.

In second place was Joey Slye, with a success rate of 90.73%. His long-kick rate was 97.35%, and his directional accuracy was 100%.
