IS 607 - Project 2 The goal of this assignment is to give you practice in preparing different datasets for downstream analysis work. Your task is to:
Create a .CSV file (or optionally, a MySQL database!) that includes all of the information included in the dataset. You’re encouraged to use a “wide” structure similar to how the information appears in the discussion item, so that you can practice tidying and transformations as described below. Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data. [Most of your grade will be based on this step!] Perform the analysis requested in the discussion item. Your code should be in an R Markdown file, posted to rpubs.com, and should include narrative descriptions of your data cleanup work, analysis, and conclusions.
Data Acquired from Wikipedia:https://en.wikipedia.org/wiki/Aaron_Rodgers
For part A I’m looking at the career numbers for Aaron Rodgers. I’d like to compare his 2015, and 2016 number in a tabular model to see how they trajected in some of his best years.
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Rawwide_data <- read.csv(paste0("https://raw.githubusercontent.com/Fyoun123/Data607/master/Project%202/ARP2.csv"),stringsAsFactors = F); Rawwide_data
## Year Team Games...GP Games...GS Passing...Comp Passing...Att
## 1 2005 GB 3 0 9 16
## 2 2006 GB 2 0 6 15
## 3 2007 GB 2 0 20 28
## 4 2008 GB 16 16 341 536
## 5 2009 GB 16 16 350 541
## 6 2010 GB 15 15 312 475
## 7 2011 GB 15 15 343 502
## 8 2012 GB 16 16 371 552
## 9 2013 GB 9 9 193 290
## 10 2014 GB 16 16 341 520
## 11 2015 GB 16 16 347 572
## 12 2016 GB 16 16 401 610
## 13 2017 GB 7 7 154 238
## 14 2018 GB 4 4 99 156
## Passing...Pct Passing...Yds Passing...Avg Passing...TD Passing...Int
## 1 56.3 65 4.1 0 1
## 2 40.0 46 3.1 0 0
## 3 71.4 218 7.8 1 0
## 4 63.6 4038 7.5 28 13
## 5 64.7 4434 8.2 30 7
## 6 65.7 3922 8.3 28 11
## 7 68.3 4643 9.2 45 6
## 8 67.2 4295 7.8 39 8
## 9 66.6 2536 8.7 17 6
## 10 65.6 4381 8.4 38 5
## 11 60.7 3821 6.7 31 8
## 12 65.7 4428 7.3 40 7
## 13 64.7 1675 7.0 16 6
## 14 63.5 1130 7.2 7 1
## Passing...Rate Passing...TD. Passing...Int. Rushing...Att Rushing...Yds
## 1 39.8 0.0 6.3 2 7
## 2 48.2 0.0 0.0 2 11
## 3 106.0 3.6 0.0 7 29
## 4 93.8 5.2 2.4 56 207
## 5 103.2 5.5 1.3 58 316
## 6 101.2 5.9 2.3 64 356
## 7 122.5 9.0 1.2 60 257
## 8 108.0 7.1 1.4 54 259
## 9 104.9 5.9 2.1 30 120
## 10 112.2 7.3 1.0 43 269
## 11 92.7 5.4 1.4 58 344
## 12 104.2 6.6 1.1 67 369
## 13 97.2 6.7 2.5 24 126
## 14 97.4 4.5 0.6 11 67
## Rushing...Avg Rushing...TD
## 1 3.5 0
## 2 5.5 0
## 3 4.1 0
## 4 3.7 4
## 5 5.4 5
## 6 5.6 4
## 7 4.3 3
## 8 4.8 2
## 9 3.3 0
## 10 6.3 2
## 11 5.9 1
## 12 5.5 4
## 13 5.2 0
## 14 6.1 0
DF2 <- Rawwide_data %>%
select(Year,Passing...Comp,Passing...Att,Passing...Pct,Passing...Avg,Passing...TD,Passing...Int,Passing...Rate)
DF2
## Year Passing...Comp Passing...Att Passing...Pct Passing...Avg
## 1 2005 9 16 56.3 4.1
## 2 2006 6 15 40.0 3.1
## 3 2007 20 28 71.4 7.8
## 4 2008 341 536 63.6 7.5
## 5 2009 350 541 64.7 8.2
## 6 2010 312 475 65.7 8.3
## 7 2011 343 502 68.3 9.2
## 8 2012 371 552 67.2 7.8
## 9 2013 193 290 66.6 8.7
## 10 2014 341 520 65.6 8.4
## 11 2015 347 572 60.7 6.7
## 12 2016 401 610 65.7 7.3
## 13 2017 154 238 64.7 7.0
## 14 2018 99 156 63.5 7.2
## Passing...TD Passing...Int Passing...Rate
## 1 0 1 39.8
## 2 0 0 48.2
## 3 1 0 106.0
## 4 28 13 93.8
## 5 30 7 103.2
## 6 28 11 101.2
## 7 45 6 122.5
## 8 39 8 108.0
## 9 17 6 104.9
## 10 38 5 112.2
## 11 31 8 92.7
## 12 40 7 104.2
## 13 16 6 97.2
## 14 7 1 97.4
DF3<- DF2 %>% gather(Type,Number,Passing...Comp:Passing...Rate)
DF3
## Year Type Number
## 1 2005 Passing...Comp 9.0
## 2 2006 Passing...Comp 6.0
## 3 2007 Passing...Comp 20.0
## 4 2008 Passing...Comp 341.0
## 5 2009 Passing...Comp 350.0
## 6 2010 Passing...Comp 312.0
## 7 2011 Passing...Comp 343.0
## 8 2012 Passing...Comp 371.0
## 9 2013 Passing...Comp 193.0
## 10 2014 Passing...Comp 341.0
## 11 2015 Passing...Comp 347.0
## 12 2016 Passing...Comp 401.0
## 13 2017 Passing...Comp 154.0
## 14 2018 Passing...Comp 99.0
## 15 2005 Passing...Att 16.0
## 16 2006 Passing...Att 15.0
## 17 2007 Passing...Att 28.0
## 18 2008 Passing...Att 536.0
## 19 2009 Passing...Att 541.0
## 20 2010 Passing...Att 475.0
## 21 2011 Passing...Att 502.0
## 22 2012 Passing...Att 552.0
## 23 2013 Passing...Att 290.0
## 24 2014 Passing...Att 520.0
## 25 2015 Passing...Att 572.0
## 26 2016 Passing...Att 610.0
## 27 2017 Passing...Att 238.0
## 28 2018 Passing...Att 156.0
## 29 2005 Passing...Pct 56.3
## 30 2006 Passing...Pct 40.0
## 31 2007 Passing...Pct 71.4
## 32 2008 Passing...Pct 63.6
## 33 2009 Passing...Pct 64.7
## 34 2010 Passing...Pct 65.7
## 35 2011 Passing...Pct 68.3
## 36 2012 Passing...Pct 67.2
## 37 2013 Passing...Pct 66.6
## 38 2014 Passing...Pct 65.6
## 39 2015 Passing...Pct 60.7
## 40 2016 Passing...Pct 65.7
## 41 2017 Passing...Pct 64.7
## 42 2018 Passing...Pct 63.5
## 43 2005 Passing...Avg 4.1
## 44 2006 Passing...Avg 3.1
## 45 2007 Passing...Avg 7.8
## 46 2008 Passing...Avg 7.5
## 47 2009 Passing...Avg 8.2
## 48 2010 Passing...Avg 8.3
## 49 2011 Passing...Avg 9.2
## 50 2012 Passing...Avg 7.8
## 51 2013 Passing...Avg 8.7
## 52 2014 Passing...Avg 8.4
## 53 2015 Passing...Avg 6.7
## 54 2016 Passing...Avg 7.3
## 55 2017 Passing...Avg 7.0
## 56 2018 Passing...Avg 7.2
## 57 2005 Passing...TD 0.0
## 58 2006 Passing...TD 0.0
## 59 2007 Passing...TD 1.0
## 60 2008 Passing...TD 28.0
## 61 2009 Passing...TD 30.0
## 62 2010 Passing...TD 28.0
## 63 2011 Passing...TD 45.0
## 64 2012 Passing...TD 39.0
## 65 2013 Passing...TD 17.0
## 66 2014 Passing...TD 38.0
## 67 2015 Passing...TD 31.0
## 68 2016 Passing...TD 40.0
## 69 2017 Passing...TD 16.0
## 70 2018 Passing...TD 7.0
## 71 2005 Passing...Int 1.0
## 72 2006 Passing...Int 0.0
## 73 2007 Passing...Int 0.0
## 74 2008 Passing...Int 13.0
## 75 2009 Passing...Int 7.0
## 76 2010 Passing...Int 11.0
## 77 2011 Passing...Int 6.0
## 78 2012 Passing...Int 8.0
## 79 2013 Passing...Int 6.0
## 80 2014 Passing...Int 5.0
## 81 2015 Passing...Int 8.0
## 82 2016 Passing...Int 7.0
## 83 2017 Passing...Int 6.0
## 84 2018 Passing...Int 1.0
## 85 2005 Passing...Rate 39.8
## 86 2006 Passing...Rate 48.2
## 87 2007 Passing...Rate 106.0
## 88 2008 Passing...Rate 93.8
## 89 2009 Passing...Rate 103.2
## 90 2010 Passing...Rate 101.2
## 91 2011 Passing...Rate 122.5
## 92 2012 Passing...Rate 108.0
## 93 2013 Passing...Rate 104.9
## 94 2014 Passing...Rate 112.2
## 95 2015 Passing...Rate 92.7
## 96 2016 Passing...Rate 104.2
## 97 2017 Passing...Rate 97.2
## 98 2018 Passing...Rate 97.4
DF4 <- DF3 %>% filter(Year==2015|Year==2016)
DF4
## Year Type Number
## 1 2015 Passing...Comp 347.0
## 2 2016 Passing...Comp 401.0
## 3 2015 Passing...Att 572.0
## 4 2016 Passing...Att 610.0
## 5 2015 Passing...Pct 60.7
## 6 2016 Passing...Pct 65.7
## 7 2015 Passing...Avg 6.7
## 8 2016 Passing...Avg 7.3
## 9 2015 Passing...TD 31.0
## 10 2016 Passing...TD 40.0
## 11 2015 Passing...Int 8.0
## 12 2016 Passing...Int 7.0
## 13 2015 Passing...Rate 92.7
## 14 2016 Passing...Rate 104.2
colnames(DF4)[1] <- "2015/2016"
colnames(DF4)[2] <- "Type"
DF4$Type <- factor(DF4$Type)
DF5 <- spread(DF4,"2015/2016","Number")
DF5
## Type 2015 2016
## 1 Passing...Att 572.0 610.0
## 2 Passing...Avg 6.7 7.3
## 3 Passing...Comp 347.0 401.0
## 4 Passing...Int 8.0 7.0
## 5 Passing...Pct 60.7 65.7
## 6 Passing...Rate 92.7 104.2
## 7 Passing...TD 31.0 40.0