The problem statement is as follows:

In this assignment, you’ll practice collaborating around a code project with GitHub. You could consider our collective work as building out a book of examples on how to use TidyVerse functions.

GitHub repository: https://github.com/acatlin/FALL2019TIDYVERSE

FiveThirtyEight.com datasets.

Kaggle datasets.

You have two tasks:

Create an Example. Using one or more TidyVerse packages, and any dataset from fivethirtyeight.com or Kaggle, create a programming sample “vignette” that demonstrates how to use one or more of the capabilities of the selected TidyVerse package with your selected dataset. (25 points)

Extend an Existing Example. Using one of your classmates’ examples (as created above), extend his or her example with additional annotated code. (15 points)

You should clone the provided repository. Once you have code to submit, you should make a pull request on the shared repository. Minimally, you should submit .Rmd files; ideally, you should also submit an .md file and update the README.md file with your example.

After you’ve completed both parts of the assignment, please submit your GitHub handle name in the submission link provided in the week 1 folder! This will let your instructor know that your work is ready to be graded.

You should complete both parts of the assignment and make your submission no later than the end of day on Sunday, December 1st.

Load Data

## 
## -- Column specification --------------------------------------------------------
## cols(
##   age = col_double(),
##   sex = col_double(),
##   cp = col_double(),
##   trestbps = col_double(),
##   chol = col_double(),
##   fbs = col_double(),
##   restecg = col_double(),
##   thalach = col_double(),
##   exang = col_double(),
##   oldpeak = col_double(),
##   slope = col_double(),
##   ca = col_double(),
##   thal = col_double(),
##   target = col_double()
## )
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
63  1   3  145      233  1   0       150     0     2.3     0     0  1    1
37  1   2  130      250  0   1       187     0     3.5     0     0  2    1
41  0   1  130      204  0   0       172     0     1.4     2     0  2    1
56  1   1  120      236  0   1       178     0     0.8     2     0  2    1
57  0   0  120      354  0   1       163     1     0.6     2     0  2    1
57  1   0  140      192  0   1       148     0     0.4     1     0  1    1
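
The code that produced the output above is not shown, so the following is only a sketch of how the data might be loaded with readr's read_csv(). The file location is an assumption: to keep the example runnable anywhere, it writes the six rows printed above to a temporary CSV (the hypothetical `csv_path`) and reads that back. In practice you would pass the path to your own copy of the dataset.

```r
library(readr)

# Assumption: a stand-in file holding only the six rows printed above,
# written to a temporary location so the example runs without the real dataset.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target",
  "63,1,3,145,233,1,0,150,0,2.3,0,0,1,1",
  "37,1,2,130,250,0,1,187,0,3.5,0,0,2,1",
  "41,0,1,130,204,0,0,172,0,1.4,2,0,2,1",
  "56,1,1,120,236,0,1,178,0,0.8,2,0,2,1",
  "57,0,0,120,354,0,1,163,1,0.6,2,0,2,1",
  "57,1,0,140,192,0,1,148,0,0.4,1,0,1,1"
), csv_path)

# read_csv() guesses a type for every column and reports the column specification.
heart <- read_csv(csv_path)

# Show the first rows of the resulting tibble.
head(heart)
```

Because every column holds only numbers, read_csv() parses all of them as doubles, which matches the column specification printed above.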

Capability 1.

do capability tutorial (do anything)

Description: Performs arbitrary computations on the data.
Usage: do(data, …)
Example: We can group the data by age and then return the first 5 rows of each age group.

## # A tibble: 171 x 14
## # Groups:   age [41]
##      age   sex    cp trestbps  chol   fbs restecg thalach exang oldpeak slope
##    <dbl> <dbl> <dbl>    <dbl> <dbl> <dbl>   <dbl>   <dbl> <dbl>   <dbl> <dbl>
##  1    29     1     1      130   204     0       0     202     0     0       2
##  2    34     1     3      118   182     0       0     174     0     0       2
##  3    34     0     1      118   210     0       1     192     0     0.7     2
##  4    35     0     0      138   183     0       1     182     0     1.4     2
##  5    35     1     1      122   192     0       1     174     0     0       2
##  6    35     1     0      120   198     0       1     130     1     1.6     1
##  7    35     1     0      126   282     0       0     156     1     0       2
##  8    37     1     2      130   250     0       1     187     0     3.5     0
##  9    37     0     2      120   215     0       1     170     0     0       2
## 10    38     1     2      138   175     0       1     173     0     0       2
## # ... with 161 more rows, and 3 more variables: ca <dbl>, thal <dbl>,
## #   target <dbl>
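
The call behind this output is not shown, so here is a minimal sketch of the pattern, assuming the data is grouped with group_by() before do() is applied. The tibble `heart_sample` is a hypothetical stand-in built from a few columns of ten rows printed in this document, and the example keeps 2 rows per group instead of 5 because the sample is small. Note that do() is marked as superseded from dplyr 1.0.0 onward; for this particular task slice_head() returns the same rows.

```r
library(dplyr)

# Assumption: a small stand-in for the heart data (four columns, ten rows).
heart_sample <- tibble(
  age  = c(63, 37, 41, 56, 57, 57, 56, 44, 52, 57),
  sex  = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 1),
  cp   = c(3, 2, 1, 1, 0, 0, 1, 1, 2, 2),
  chol = c(233, 250, 204, 236, 354, 192, 294, 263, 199, 168)
)

# Group by age, then run head() on each group; inside do(), `.` is the current group.
first_rows <- heart_sample %>%
  group_by(age) %>%
  do(head(., 2))

# Same task without do(): slice_head() keeps the first n rows of each group.
first_rows_slice <- heart_sample %>%
  group_by(age) %>%
  slice_head(n = 2)

print(first_rows)
```

The result stays grouped by age, and groups with fewer rows than requested are returned in full.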

Capability 2.

filter capability tutorial

Description: Using filter we can select the rows of a data frame that match given conditions.
Usage: filter(data, …)
Example: To select the people older than 20 and younger than 65, we can pass the data heart and the conditions age > 20 and age < 65 to the function. It returns the matching rows of the heart data.

## # A tibble: 262 x 14
##      age   sex    cp trestbps  chol   fbs restecg thalach exang oldpeak slope
##    <dbl> <dbl> <dbl>    <dbl> <dbl> <dbl>   <dbl>   <dbl> <dbl>   <dbl> <dbl>
##  1    63     1     3      145   233     1       0     150     0     2.3     0
##  2    37     1     2      130   250     0       1     187     0     3.5     0
##  3    41     0     1      130   204     0       0     172     0     1.4     2
##  4    56     1     1      120   236     0       1     178     0     0.8     2
##  5    57     0     0      120   354     0       1     163     1     0.6     2
##  6    57     1     0      140   192     0       1     148     0     0.4     1
##  7    56     0     1      140   294     0       0     153     0     1.3     1
##  8    44     1     1      120   263     0       1     173     0     0       2
##  9    52     1     2      172   199     1       1     162     0     0.5     2
## 10    57     1     2      150   168     0       1     174     0     1.6     2
## # ... with 252 more rows, and 3 more variables: ca <dbl>, thal <dbl>,
## #   target <dbl>
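
As a sketch of what such a call might look like, the example below applies the conditions from the text to `heart_sample`, a hypothetical stand-in built from a few columns of ten rows printed in this document. Every age in that small sample falls between 20 and 65, so a second, narrower filter (the bounds 40 and 57 are chosen only for illustration) is included to show rows being dropped.

```r
library(dplyr)

# Assumption: a small stand-in for the heart data (three columns, ten rows).
heart_sample <- tibble(
  age = c(63, 37, 41, 56, 57, 57, 56, 44, 52, 57),
  sex = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 1),
  cp  = c(3, 2, 1, 1, 0, 0, 1, 1, 2, 2)
)

# Conditions separated by commas are combined with a logical AND.
adults <- heart_sample %>%
  filter(age > 20, age < 65)

# Illustrative narrower range, to show that non-matching rows are removed.
narrow <- heart_sample %>%
  filter(age > 40, age < 57)

print(narrow)
```

Writing `filter(age > 20 & age < 65)` is equivalent to separating the two conditions with a comma.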

Capability 3.

select capability tutorial

Description: Using select we can keep only the selected variables.
Usage: select(data, …)
Example: To keep only the age, sex, and cp variables, we can pass the data heart and the column names age, sex, cp to the function.

## # A tibble: 6 x 3
##     age   sex    cp
##   <dbl> <dbl> <dbl>
## 1    63     1     3
## 2    37     1     2
## 3    41     0     1
## 4    56     1     1
## 5    57     0     0
## 6    57     1     0
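
A minimal sketch of this step follows, assuming the six-row output above came from piping the selection into head(). The tibble `heart_sample` is a hypothetical stand-in built from a few columns of ten rows printed in this document.

```r
library(dplyr)

# Assumption: a small stand-in for the heart data (four columns, ten rows).
heart_sample <- tibble(
  age  = c(63, 37, 41, 56, 57, 57, 56, 44, 52, 57),
  sex  = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 1),
  cp   = c(3, 2, 1, 1, 0, 0, 1, 1, 2, 2),
  chol = c(233, 250, 204, 236, 354, 192, 294, 263, 199, 168)
)

# Keep only the named columns, in the order given, then show the first six rows.
selected <- heart_sample %>%
  select(age, sex, cp) %>%
  head()

print(selected)
```

Columns not named in select(), here chol, are dropped from the result.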

Capability 4.

arrange capability tutorial

Description: Using arrange we can order the rows by an expression involving variables.
Usage: arrange(data, …)
Example: To arrange the rows by sex and then by age, we can pass the data and the column names sex, age to the function.

## # A tibble: 6 x 3
##     age   sex    cp
##   <dbl> <dbl> <dbl>
## 1    34     0     1
## 2    35     0     0
## 3    37     0     2
## 4    39     0     2
## 5    39     0     2
## 6    41     0     1
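
The ordering can be sketched as follows, again on `heart_sample`, a hypothetical stand-in built from a few columns of ten rows printed in this document. The column order inside arrange() matters: rows are sorted by sex first, and age only breaks ties within each sex value.

```r
library(dplyr)

# Assumption: a small stand-in for the heart data (three columns, ten rows).
heart_sample <- tibble(
  age = c(63, 37, 41, 56, 57, 57, 56, 44, 52, 57),
  sex = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 1),
  cp  = c(3, 2, 1, 1, 0, 0, 1, 1, 2, 2)
)

# Sort ascending by sex, then ascending by age within each sex value.
sorted <- heart_sample %>%
  arrange(sex, age)

# Wrap a column in desc() to sort it in descending order instead.
sorted_desc <- heart_sample %>%
  arrange(sex, desc(age))

print(sorted)
```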

Marker: 607-13