Import data

## # A tibble: 507 × 10
##    Sample `SNP Score` `Obesity Score` CVDScore DiabetesScore `High BP Score`
##    <chr>        <dbl>           <dbl>    <dbl>         <dbl>           <dbl>
##  1 24               0             2        0             2.5             1  
##  2 29               0             1        0             0               0  
##  3 30               0             2.5      0             0.5             2.5
##  4 46               0             0        0             0               1  
##  5 47               0             0        0             0               0  
##  6 56               0             0        0             0.5             1  
##  7 67               0             1        0.5           0.5             1  
##  8 76               0             0        0             1               0  
##  9 78               0             0        0             0               0  
## 10 95               0             0        0             0.5             0.5
## # ℹ 497 more rows
## # ℹ 4 more variables: `MI/CVA Before 65 Score` <dbl>,
## #   `MI/CVA After 65 Score` <dbl>, TotalCVDScore <dbl>, `Personal Score` <dbl>

Apply the following dplyr verbs to your data

Filter rows

Filter for rows with a DiabetesScore of 1 and a TotalCVDScore of 4.

## # A tibble: 2 × 10
##   Sample `SNP Score` `Obesity Score` CVDScore DiabetesScore `High BP Score`
##   <chr>        <dbl>           <dbl>    <dbl>         <dbl>           <dbl>
## 1 223              0             0        1.5             1             0.5
## 2 317              0             1.5      0               1             1  
## # ℹ 4 more variables: `MI/CVA Before 65 Score` <dbl>,
## #   `MI/CVA After 65 Score` <dbl>, TotalCVDScore <dbl>, `Personal Score` <dbl>
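
The filter step above can be sketched as follows. This is a minimal illustration, not the code that produced the output: the tibble name `scores` and its values are made up, and only the column names come from the data.

```r
library(dplyr)

# Toy stand-in for the imported data (hypothetical values)
scores <- tibble(
  Sample        = c("a", "b", "c", "d"),
  DiabetesScore = c(1, 1, 0.5, NA),
  TotalCVDScore = c(4, 2, 4, 4)
)

# Conditions separated by commas are combined with AND;
# rows where a condition evaluates to NA are dropped
matches <- scores %>%
  filter(DiabetesScore == 1, TotalCVDScore == 4)

print(matches)
```

Only rows that satisfy both conditions are kept, which is why the real data returns just two samples.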

Arrange rows

Arrange the data so that DiabetesScore is in descending order.

## # A tibble: 507 × 10
##    Sample `SNP Score` `Obesity Score` CVDScore DiabetesScore `High BP Score`
##    <chr>        <dbl>           <dbl>    <dbl>         <dbl>           <dbl>
##  1 24               0             2        0             2.5             1  
##  2 401              0             2.5      0.5           2.5             2.5
##  3 463              0             3.5      0             2.5             0  
##  4 171              0             3        1             2               2  
##  5 184              0             0        1             2               2.5
##  6 244              0             0        0             2               0  
##  7 320              0             0        0             2               1  
##  8 335              0             0.5      0.5           2               0.5
##  9 338              0             0        0             2               0  
## 10 397              0             1        2             2               1  
## # ℹ 497 more rows
## # ℹ 4 more variables: `MI/CVA Before 65 Score` <dbl>,
## #   `MI/CVA After 65 Score` <dbl>, TotalCVDScore <dbl>, `Personal Score` <dbl>
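
A minimal sketch of the arrange step, again using a hypothetical toy tibble named `scores` rather than the real data:

```r
library(dplyr)

# Toy stand-in for the imported data (hypothetical values)
scores <- tibble(
  Sample        = c("a", "b", "c", "d"),
  DiabetesScore = c(0.5, 2.5, NA, 1)
)

# desc() reverses the sort order; arrange() always places NA values last
sorted <- scores %>%
  arrange(desc(DiabetesScore))

print(sorted)
```

Because missing values are sorted to the end, the highest scores appear at the top of the table even when the column contains NA.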

Select columns

## # A tibble: 507 × 1
##    DiabetesScore
##            <dbl>
##  1           2.5
##  2           0  
##  3           0.5
##  4           0  
##  5           0  
##  6           0.5
##  7           0.5
##  8           1  
##  9           0  
## 10           0.5
## # ℹ 497 more rows
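
The output above keeps only the DiabetesScore column. One way to get that result is sketched here on a hypothetical toy tibble named `scores`:

```r
library(dplyr)

# Toy stand-in for the imported data (hypothetical values)
scores <- tibble(
  Sample        = c("a", "b", "c"),
  DiabetesScore = c(2.5, 0, 0.5),
  TotalCVDScore = c(5.5, 1, 7)
)

# select() keeps the named columns and drops the rest; the row count is unchanged
selected <- scores %>%
  select(DiabetesScore)

print(selected)
```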

Add columns

The table shows only the new column, DIABETESPLUSTOTALCVD, which I created from two existing columns by combining the DiabetesScore and TotalCVDScore values.

## # A tibble: 507 × 1
##    DIABETESPLUSTOTALCVD
##                   <dbl>
##  1                  8  
##  2                  1  
##  3                  7.5
##  4                  2  
##  5                  0  
##  6                  2.5
##  7                  4  
##  8                  3  
##  9                  0  
## 10                  3  
## # ℹ 497 more rows
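
A sketch of how such a column could be built, assuming the combination is a simple sum of DiabetesScore and TotalCVDScore. The tibble `scores` and its values are hypothetical.

```r
library(dplyr)

# Toy stand-in for the imported data (hypothetical values)
scores <- tibble(
  Sample        = c("a", "b", "c"),
  DiabetesScore = c(2.5, 0, 0.5),
  TotalCVDScore = c(5.5, 1, 7)
)

# mutate() adds the new column (assumed here to be a sum);
# select() then keeps only that column
combined <- scores %>%
  mutate(DIABETESPLUSTOTALCVD = DiabetesScore + TotalCVDScore) %>%
  select(DIABETESPLUSTOTALCVD)

print(combined)
```

Dropping the `select()` call would keep the original columns alongside the new one.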

Summarize by groups

The data are grouped by DiabetesScore, which ranges from 0 to 2.5 in steps of 0.5, and each group is summarized. The count column shows how many samples in the data table have that score. The mean_value column shows the group mean, which matches the row's score because every sample in a group has the same DiabetesScore. The last row collects the 276 samples with a missing (NA) DiabetesScore, so its mean is NaN.

## # A tibble: 7 × 3
##   DiabetesScore count mean_value
##           <dbl> <int>      <dbl>
## 1           0     158        0  
## 2           0.5    34        0.5
## 3           1      21        1  
## 4           1.5     7        1.5
## 5           2       8        2  
## 6           2.5     3        2.5
## 7          NA     276      NaN
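
The grouped summary above can be sketched as follows on a hypothetical toy tibble named `scores`. The use of `na.rm = TRUE` is an assumption: it is consistent with the NaN shown for the NA group, since the mean of an empty set of values is NaN.

```r
library(dplyr)

# Toy stand-in for the imported data (hypothetical values)
scores <- tibble(
  Sample        = c("a", "b", "c", "d", "e"),
  DiabetesScore = c(0, 0, 0.5, NA, NA)
)

# One output row per distinct DiabetesScore; NA forms its own group, listed last
summary_tbl <- scores %>%
  group_by(DiabetesScore) %>%
  summarise(
    count      = n(),                              # samples in the group
    mean_value = mean(DiabetesScore, na.rm = TRUE) # NaN when every value is NA
  )

print(summary_tbl)
```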

Module 6: Apply 5

Austin Boynton
