In this exercise you will learn to clean data using the dplyr package. To do so, you will work through the code in one of our e-texts, Data Visualization with R. The example below is from Chapter 1.2, Cleaning data.

## # A tibble: 87 x 14
##    name    height  mass hair_color  skin_color eye_color birth_year sex   gender
##    <chr>    <int> <dbl> <chr>       <chr>      <chr>          <dbl> <chr> <chr> 
##  1 Luke S…    172    77 blond       fair       blue            19   male  mascu…
##  2 C-3PO      167    75 <NA>        gold       yellow         112   none  mascu…
##  3 R2-D2       96    32 <NA>        white, bl… red             33   none  mascu…
##  4 Darth …    202   136 none        white      yellow          41.9 male  mascu…
##  5 Leia O…    150    49 brown       light      brown           19   fema… femin…
##  6 Owen L…    178   120 brown, grey light      blue            52   male  mascu…
##  7 Beru W…    165    75 brown       light      blue            47   fema… femin…
##  8 R5-D4       97    32 <NA>        white, red red             NA   none  mascu…
##  9 Biggs …    183    84 black       light      brown           24   male  mascu…
## 10 Obi-Wa…    182    77 auburn, wh… fair       blue-gray       57   male  mascu…
## # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
## #   films <list>, vehicles <list>, starships <list>

Q1 select: Keep the variables name, eye_color, and films.

## # A tibble: 87 x 3
##    name               eye_color films    
##    <chr>              <chr>     <list>   
##  1 Luke Skywalker     blue      <chr [5]>
##  2 C-3PO              yellow    <chr [6]>
##  3 R2-D2              red       <chr [7]>
##  4 Darth Vader        yellow    <chr [4]>
##  5 Leia Organa        brown     <chr [5]>
##  6 Owen Lars          blue      <chr [3]>
##  7 Beru Whitesun lars blue      <chr [3]>
##  8 R5-D4              red       <chr [1]>
##  9 Biggs Darklighter  brown     <chr [1]>
## 10 Obi-Wan Kenobi     blue-gray <chr [6]>
## # … with 77 more rows
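
A result like the one above could be produced along these lines. This is a minimal sketch that assumes the data shown is the starwars dataset bundled with dplyr; the object name starwars_small is made up for illustration.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
# select() keeps only the named columns, in the order given.
starwars_small <- select(starwars, name, eye_color, films)

starwars_small
```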

Q2 filter: Select the characters with blond hair.

## # A tibble: 3 x 14
##   name      height  mass hair_color skin_color eye_color birth_year sex   gender
##   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> 
## 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…
## 2 Anakin S…    188    84 blond      fair       blue            41.9 male  mascu…
## 3 Finis Va…    170    NA blond      fair       blue            91   male  mascu…
## # … with 5 more variables: homeworld <chr>, species <chr>, films <list>,
## #   vehicles <list>, starships <list>
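
One possible way to get the rows above, again assuming the data is dplyr's built-in starwars dataset. The name blonds is hypothetical.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
# filter() keeps rows where the condition is TRUE; rows where
# hair_color is NA are dropped because the comparison yields NA.
blonds <- filter(starwars, hair_color == "blond")

blonds
```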

Q3 filter: Select the female characters with blond hair.

## # A tibble: 0 x 14
## # … with 14 variables: name <chr>, height <int>, mass <dbl>, hair_color <chr>,
## #   skin_color <chr>, eye_color <chr>, birth_year <dbl>, sex <chr>,
## #   gender <chr>, homeworld <chr>, species <chr>, films <list>,
## #   vehicles <list>, starships <list>
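
The empty result above is what you get when two conditions are combined and no row satisfies both. A sketch, assuming the built-in starwars dataset and assuming that "female" refers to the sex column (the gender column shown earlier holds values such as feminine and masculine); female_blonds is a made-up name.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble,
# and "female" is matched against the `sex` column.
# Conditions separated by commas are combined with a logical AND.
female_blonds <- filter(starwars, hair_color == "blond", sex == "female")

female_blonds
```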

Q4 mutate: Convert height in centimeters to feet.

Hint: Divide the length value by 30.48.

## # A tibble: 87 x 14
##    name    height  mass hair_color  skin_color eye_color birth_year sex   gender
##    <chr>    <dbl> <dbl> <chr>       <chr>      <chr>          <dbl> <chr> <chr> 
##  1 Luke S…   5.64    77 blond       fair       blue            19   male  mascu…
##  2 C-3PO     5.48    75 <NA>        gold       yellow         112   none  mascu…
##  3 R2-D2     3.15    32 <NA>        white, bl… red             33   none  mascu…
##  4 Darth …   6.63   136 none        white      yellow          41.9 male  mascu…
##  5 Leia O…   4.92    49 brown       light      brown           19   fema… femin…
##  6 Owen L…   5.84   120 brown, grey light      blue            52   male  mascu…
##  7 Beru W…   5.41    75 brown       light      blue            47   fema… femin…
##  8 R5-D4     3.18    32 <NA>        white, red red             NA   none  mascu…
##  9 Biggs …   6.00    84 black       light      brown           24   male  mascu…
## 10 Obi-Wa…   5.97    77 auburn, wh… fair       blue-gray       57   male  mascu…
## # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
## #   films <list>, vehicles <list>, starships <list>
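
The conversion in the hint could be sketched as follows, assuming the built-in starwars dataset. The name starwars_ft is hypothetical; overwriting height in place mirrors the output above, where the column keeps its name but changes from integer to double.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
# One foot is 30.48 cm, so dividing centimeters by 30.48 gives feet.
# mutate() replaces the existing `height` column with the new values.
starwars_ft <- mutate(starwars, height = height / 30.48)

starwars_ft
```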

Q5 summarize: Calculate the mean height in feet.

## # A tibble: 1 x 1
##   mean_ht
##     <dbl>
## 1    5.72
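
A one-row summary like the one above might be computed as below. This sketch assumes the built-in starwars dataset and takes the column name mean_ht from the output shown; the object name mean_ht_tbl is made up. Some heights are missing, so na.rm = TRUE is needed to avoid an NA result. The exact value may differ slightly between dplyr versions, since the bundled data has been revised.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
mean_ht_tbl <- starwars %>%
  mutate(height = height / 30.48) %>%              # centimeters to feet
  summarize(mean_ht = mean(height, na.rm = TRUE))  # ignore missing heights

mean_ht_tbl
```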

Q6 group_by and summarize: Calculate mean height by gender.

Hint: Use %>%, the pipe operator. Save the result under a new name, mean_height.

## # A tibble: 3 x 2
##   gender    mean_height
## * <chr>           <dbl>
## 1 feminine         5.40
## 2 masculine        5.79
## 3 <NA>             5.95
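
The grouped summary above could be sketched like this, assuming the built-in starwars dataset and heights converted to feet as in the earlier questions. Both the saved object and its summary column are called mean_height here, following the hint and the output shown.

```r
library(dplyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
mean_height <- starwars %>%
  mutate(height = height / 30.48) %>%                  # centimeters to feet
  group_by(gender) %>%                                 # one group per gender value, including NA
  summarize(mean_height = mean(height, na.rm = TRUE))  # one row per group

mean_height
```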

Q7 spread: Convert the dataset, mean_height, to a wide dataset.

## # A tibble: 1 x 3
##   feminine masculine `<NA>`
##      <dbl>     <dbl>  <dbl>
## 1     5.40      5.79   5.95
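
A sketch of the long-to-wide step, assuming spread() from the tidyr package and the built-in starwars dataset. The summary is recomputed so that the block runs on its own; mean_height_wide is a made-up name. In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), though it still works.

```r
library(dplyr)
library(tidyr)

# Assumption: the exercise data is dplyr's built-in `starwars` tibble.
# Rebuild the long summary: one row per gender.
mean_height <- starwars %>%
  mutate(height = height / 30.48) %>%
  group_by(gender) %>%
  summarize(mean_height = mean(height, na.rm = TRUE))

# spread() turns each value of the key column into its own column,
# filled with the matching entry from the value column.
mean_height_wide <- spread(mean_height, key = gender, value = mean_height)

mean_height_wide
```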

Q8 Hide the messages and the code, but display the results of the code on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
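
The three options named in the hint can be set in each chunk header or once for the whole document. The sketch below takes the second route with knitr::opts_chunk$set(), which would normally sit in a setup chunk near the top of the .Rmd file; placing it there is an assumption about how your document is organized.

```r
library(knitr)

# Set defaults for every later chunk in the document:
#   echo    = FALSE    hides the source code
#   message = FALSE    hides messages (for example, package start-up notes)
#   results = "markup" keeps the printed results in the output
opts_chunk$set(echo = FALSE, message = FALSE, results = "markup")

# The per-chunk equivalent is a header such as:
#   {r, echo=FALSE, message=FALSE, results='markup'}
```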

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.