In this exercise you will learn to clean data using the dplyr package. To do so, you will work through the code in one of our e-texts, Data Visualization with R. The example below is from Chapter 1.2, Cleaning data.

## # A tibble: 87 x 13
##    name  height  mass hair_color skin_color eye_color birth_year gender
##    <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
##  1 Luke…    172    77 blond      fair       blue            19   male  
##  2 C-3PO    167    75 <NA>       gold       yellow         112   <NA>  
##  3 R2-D2     96    32 <NA>       white, bl… red             33   <NA>  
##  4 Dart…    202   136 none       white      yellow          41.9 male  
##  5 Leia…    150    49 brown      light      brown           19   female
##  6 Owen…    178   120 brown, gr… light      blue            52   male  
##  7 Beru…    165    75 brown      light      blue            47   female
##  8 R5-D4     97    32 <NA>       white, red red             NA   <NA>  
##  9 Bigg…    183    84 black      light      brown           24   male  
## 10 Obi-…    182    77 auburn, w… fair       blue-gray       57   male  
## # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
## #   films <list>, vehicles <list>, starships <list>

Q1 select Keep the variables name, eye_color, and films.
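
A minimal sketch of one way to approach this, assuming the data printed above is the `starwars` dataset that ships with dplyr (the exercise does not name it explicitly). The object name `starwars_selected` is hypothetical.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
# select() keeps only the named columns, in the order given.
starwars_selected <- select(starwars, name, eye_color, films)

starwars_selected
```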

## # A tibble: 87 x 3
##    name               eye_color films    
##    <chr>              <chr>     <list>   
##  1 Luke Skywalker     blue      <chr [5]>
##  2 C-3PO              yellow    <chr [6]>
##  3 R2-D2              red       <chr [7]>
##  4 Darth Vader        yellow    <chr [4]>
##  5 Leia Organa        brown     <chr [5]>
##  6 Owen Lars          blue      <chr [3]>
##  7 Beru Whitesun lars blue      <chr [3]>
##  8 R5-D4              red       <chr [1]>
##  9 Biggs Darklighter  brown     <chr [1]>
## 10 Obi-Wan Kenobi     blue-gray <chr [6]>
## # … with 77 more rows

Q2 filter Select the characters with blond hair.
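
A possible sketch, again assuming the data is dplyr's built-in `starwars` tibble. `filter()` keeps the rows where the condition is `TRUE`; rows with a missing `hair_color` are dropped. The name `blonds` is hypothetical.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
# Keep only the rows whose hair_color is exactly "blond".
blonds <- filter(starwars, hair_color == "blond")

blonds
```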

## # A tibble: 3 x 13
##   name  height  mass hair_color skin_color eye_color birth_year gender homeworld
##   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>  <chr>    
## 1 Luke…    172    77 blond      fair       blue            19   male   Tatooine 
## 2 Anak…    188    84 blond      fair       blue            41.9 male   Tatooine 
## 3 Fini…    170    NA blond      fair       blue            91   male   Coruscant
## # … with 4 more variables: species <chr>, films <list>, vehicles <list>,
## #   starships <list>

Q3 filter Select the female characters with blond hair.
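
One way this might look, assuming the built-in `starwars` tibble. Conditions separated by commas inside `filter()` are combined with a logical AND, so a row must satisfy both. The name `female_blonds` is hypothetical. Recent dplyr releases appear to recode the `gender` column (for example to "feminine"), so the labels in your installed version may differ from the output shown here.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
# Both conditions must hold for a row to be kept (comma means AND).
female_blonds <- filter(starwars, hair_color == "blond", gender == "female")

female_blonds
```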

## # A tibble: 0 x 13
## # … with 13 variables: name <chr>, height <int>, mass <dbl>, hair_color <chr>,
## #   skin_color <chr>, eye_color <chr>, birth_year <dbl>, gender <chr>,
## #   homeworld <chr>, species <chr>, films <list>, vehicles <list>,
## #   starships <list>

Q4 mutate Convert height in centimeters to feet.

Hint: Divide the height value by 30.48.
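
A sketch of the conversion, assuming the built-in `starwars` tibble. It overwrites `height` in place, which is consistent with the column changing from `<int>` to `<dbl>` in the output; creating a new column instead would be an equally valid design. The name `starwars_ft` is hypothetical.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
# One foot is 30.48 cm, so dividing centimeters by 30.48 gives feet.
starwars_ft <- mutate(starwars, height = height / 30.48)

starwars_ft
```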

## # A tibble: 87 x 13
##    name  height  mass hair_color skin_color eye_color birth_year gender
##    <chr>  <dbl> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
##  1 Luke…   5.64    77 blond      fair       blue            19   male  
##  2 C-3PO   5.48    75 <NA>       gold       yellow         112   <NA>  
##  3 R2-D2   3.15    32 <NA>       white, bl… red             33   <NA>  
##  4 Dart…   6.63   136 none       white      yellow          41.9 male  
##  5 Leia…   4.92    49 brown      light      brown           19   female
##  6 Owen…   5.84   120 brown, gr… light      blue            52   male  
##  7 Beru…   5.41    75 brown      light      blue            47   female
##  8 R5-D4   3.18    32 <NA>       white, red red             NA   <NA>  
##  9 Bigg…   6.00    84 black      light      brown           24   male  
## 10 Obi-…   5.97    77 auburn, w… fair       blue-gray       57   male  
## # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
## #   films <list>, vehicles <list>, starships <list>

Q5 summarize Calculate the mean height in feet.
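
A minimal sketch, assuming the built-in `starwars` tibble. Some heights are missing, so `na.rm = TRUE` is used here; without it the mean would be `NA`. The column name `mean_ht` follows the output shown, while `mean_ht_tbl` is a hypothetical object name.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
mean_ht_tbl <- starwars %>%
  mutate(height = height / 30.48) %>%              # centimeters to feet
  summarize(mean_ht = mean(height, na.rm = TRUE))  # ignore missing heights

mean_ht_tbl
```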

## # A tibble: 1 x 1
##   mean_ht
##     <dbl>
## 1    5.72

Q6 group_by and summarize Calculate mean height by gender.

Hint: Use %>%, the pipe operator. Save the result under a new name, mean_height.
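
One possible pipeline, assuming the built-in `starwars` tibble. The summary column is called `mheight` to match the output shown; the groups you get depend on how `gender` is coded in your installed dplyr version.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
mean_height <- starwars %>%
  mutate(height = height / 30.48) %>%              # centimeters to feet
  group_by(gender) %>%                             # one group per gender value
  summarize(mheight = mean(height, na.rm = TRUE))  # mean height per group

mean_height
```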

## # A tibble: 5 x 2
##   gender        mheight
##   <chr>           <dbl>
## 1 female           5.43
## 2 hermaphrodite    5.74
## 3 male             5.88
## 4 none             6.56
## 5 <NA>             3.94

Q7 spread Convert the dataset, mean_height, to a wide dataset.
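
A sketch using `spread()` from tidyr, assuming the built-in `starwars` tibble and rebuilding `mean_height` so the example runs on its own. Each value of `gender` becomes a column and the matching `mheight` fills it. The name `mean_height_wide` is hypothetical. tidyr now recommends `pivot_wider()` for new code, but `spread()` is what the question asks for.

```r
library(dplyr)
library(tidyr)

# Assumption: the exercise data is dplyr's built-in starwars tibble.
# Rebuild the long summary table: one row per gender.
mean_height <- starwars %>%
  mutate(height = height / 30.48) %>%
  group_by(gender) %>%
  summarize(mheight = mean(height, na.rm = TRUE))

# Long to wide: gender values become column names, mheight fills the cells.
mean_height_wide <- spread(mean_height, key = gender, value = mheight)

mean_height_wide
```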

## # A tibble: 1 x 5
##   female hermaphrodite  male  none `<NA>`
##    <dbl>         <dbl> <dbl> <dbl>  <dbl>
## 1   5.43          5.74  5.88  6.56   3.94

Q8 Hide the messages and the code, but display the results of the code on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
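
A sketch of one way to set these options, assuming you set them once for the whole document in a setup chunk with knitr's `opts_chunk$set()`. The same three options can instead be written in each chunk header.

```r
library(knitr)

# Assumption: options are set globally; per-chunk headers work as well.
opts_chunk$set(
  message = FALSE,    # hide messages, such as package start-up notes
  echo    = FALSE,    # hide the source code
  results = "markup"  # still print the results (the knitr default)
)
```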

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.