This is my take on the 2018 August 21 dataset provided by rfordatascience/tidytuesday.
The data for this study can be found here:
https://github.com/rfordatascience/tidytuesday/tree/master/data/week21
All of the code for this exercise can be found in my GitHub repo here:
https://github.com/jasonmstevensphd/tidytuesday/tree/2018_08_21
Lastly, the corresponding article from BuzzFeed News can be found here:
https://www.buzzfeednews.com/article/peteraldhous/california-wildfires-people-climate
Here we go!
To start, I imported calfires_week21_frap.csv and used the same case_when call the original author used to assign cause_2, since it's not explicitly clear what the cause numbers in the dataset correspond to. This was a nice example of case_when that I'll definitely add to my repertoire.
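For anyone who hasn't used it, here is a minimal sketch of the case_when pattern on a toy data frame. The `cause` codes and the labels they map to are illustrative assumptions on my part, not the dataset's codebook or the original author's exact mapping; `fire_name` is also just a placeholder column.

```r
library(dplyr)

# Toy stand-in for the fire data. The codes below (1, 14, 7) and what they
# mean are assumptions for illustration only.
fires <- tibble(
  fire_name = c("A", "B", "C", "D"),
  cause     = c(1L, 14L, 7L, NA)
)

fires <- fires %>%
  mutate(
    cause_2 = case_when(
      cause == 1                 ~ "Natural",  # assumed code for a natural cause
      cause == 14 | is.na(cause) ~ "Unknown",  # assumed code for unknown, plus missing
      TRUE                       ~ "Human"     # everything else
    )
  )
```

The conditions are evaluated in order and the first match wins, so the catch-all `TRUE` line goes last.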
First I loaded the libraries and files, then cleaned up the data by converting blanks to NAs. During the same transformation I grouped the data by year and team and removed the columns containing player name, game week, and position. I noticed along the way that several teams were not represented and that a large number of NAs were present. To dig a little deeper, I then grouped all the teams into conferences.
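A rough sketch of that cleaning step is below, using a tiny made-up player table. The column names `name`, `game_week`, and `position` are my guesses at the raw column names, the values are invented, and summing the stats per team is an assumption about the aggregation. It also assumes dplyr 1.0 or later for `across()`.

```r
library(dplyr)

# Made-up player-level rows; one row has a blank team on purpose.
players <- tibble(
  name      = c("P1", "P2", "P3", "P4"),
  team      = c("ARI", "ARI", "", "ATL"),
  game_year = c(2000L, 2000L, 2000L, 2000L),
  game_week = c(1L, 1L, 2L, 1L),
  position  = c("RB", "RB", "WR", "RB"),
  rush_att  = c(10, 12, 3, 20),
  rush_yds  = c(40, 55, 9, 83)
)

team_totals <- players %>%
  mutate(team = na_if(team, "")) %>%          # blanks become NA
  select(-name, -game_week, -position) %>%    # drop player-level columns
  group_by(game_year, team) %>%
  summarise(
    across(where(is.numeric), ~ sum(.x, na.rm = TRUE)),  # total each stat per team-year
    .groups = "drop"
  )
```

The row with the blank team ends up in its own NA group, which is a quick way to see how many stats have no team attached.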
## # A tibble: 6 x 20
## # Groups: game_year [1]
## game_year team rush_att rush_yds rush_avg rush_tds rush_fumbles rec
## <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2000 ARI 322 1174 171. 6 11 302
## 2 2000 ATL 314 1083 114. 6 10 265
## 3 2000 BAL 485 2135 275. 9 8 276
## 4 2000 BUF 438 1709 320. 8 7 292
## 5 2000 CAR 351 1107 163. 7 2 318
## 6 2000 CHI 391 1631 259. 6 10 285
## # ... with 12 more variables: rec_yds <dbl>, rec_avg <dbl>, rec_tds <dbl>,
## # rec_fumbles <dbl>, pass_att <dbl>, pass_yds <dbl>, pass_tds <dbl>,
## # int <dbl>, sck <dbl>, pass_fumbles <dbl>, rate <dbl>, Conference <chr>
The initial plot of this data made it clear that Jacksonville (JAC / JAX, AFC South), San Diego (SD / LAC, AFC West), and the Rams (STL / LA, NFC West) were the teams whose players were missing a team assignment. As such, these teams were excluded from further analysis, which is unfortunate, as an analysis of "The Greatest Show on Turf" would have been interesting. As a side note, I don't have a good way to simply exclude teams that match a character string. Stack Overflow mentioned creating a reverse %in% operator. Nevertheless, I took the end-around approach and got the job done in an effective yet clunky manner.

#### Rushing Efficiency

An interesting aspect of this plot is observing how a team's rushing performance has changed over time, especially for the Atlanta Falcons (can you figure out when Julio Jones was drafted?).
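Circling back to the team exclusion step: the "reverse %in%" idea from Stack Overflow can be sketched roughly like this. The operator name `%notin%` is my own choice, and the table is a toy example with invented yardage, not the real data.

```r
library(dplyr)

# Hypothetical helper: the logical negation of %in%
`%notin%` <- Negate(`%in%`)

# Toy team table; rush_yds values are invented
team_totals <- tibble(
  team     = c("ARI", "JAC", "JAX", "SD", "LAC", "STL", "LA", "ATL"),
  rush_yds = c(1174, 100, 100, 100, 100, 100, 100, 1083)
)

# Abbreviations for the three franchises dropped from the analysis
excluded <- c("JAC", "JAX", "SD", "LAC", "STL", "LA")

kept <- team_totals %>%
  filter(team %notin% excluded)
```

Without the helper, `filter(!team %in% excluded)` gives the same result.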
The same analysis was then repeated for passing. Interestingly, most teams have increased the amount that they pass. Atlanta was again an interesting case, effectively doubling its yearly passing yardage over a period of 10 years.