select Keep the variables name, eye_color, and films.
filter Select blonds.
filter Select female blonds.
mutate Convert height in centimeters to feet.
summarize Calculate mean height in feet.
group_by and summarize Calculate mean height by gender.
spread Convert the dataset, newdata, to a wide dataset.

In this exercise you will learn to clean data using the dplyr package. To do so, you will work through the code in one of our e-texts, Data Visualization with R. The example code below is from Chapter 1.2, Cleaning data.
# Load package
library(tidyverse)
## ── Attaching packages ────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.2.1 ✓ purrr 0.3.3
## ✓ tibble 2.1.3 ✓ dplyr 0.8.4
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.4.0
## ── Conflicts ───────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# Import data
data(starwars)
starwars
select Keep the variables name, eye_color, and films.
select(starwars, name, eye_color, films)
filter Select blonds.
filter(starwars, hair_color == "blond")
filter Select female blonds.
# Keep rows where both conditions hold; the hair color value is "blond", as in the previous step
newdata <- filter(starwars,
                  gender == "female" &
                  hair_color == "blond")
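As a minimal sketch of how filter treats its conditions, the example below uses a small made-up data frame (toy and its values are hypothetical, not taken from starwars). It shows that joining conditions with & gives the same rows as passing them as separate arguments, and that a value that never occurs, such as "blonds", returns zero rows rather than an error.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(
  name = c("A", "B", "C"),
  gender = c("female", "female", "male"),
  hair_color = c("blond", "brown", "blond"),
  stringsAsFactors = FALSE
)

# Two equivalent ways to require both conditions
both_amp <- filter(toy, gender == "female" & hair_color == "blond")
both_comma <- filter(toy, gender == "female", hair_color == "blond")

# A misspelled value matches nothing, so the result is silently empty
no_match <- filter(toy, hair_color == "blonds")

both_amp
nrow(no_match)
```

Checking nrow() after a filter is a quick way to catch a misspelled category value.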
mutate Convert height in centimeters to feet.
Hint: Divide the length value by 30.48.
starwars <- mutate(starwars, height = height / 30.48)
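The conversion can be checked on a small made-up data frame. This is only a sketch: toy and the column name height_ft are hypothetical. Unlike the line above, which overwrites height, it writes the result to a new column, so running the line twice does not divide by 30.48 twice.

```r
library(dplyr)

# Hypothetical toy data; heights are in centimeters
toy <- data.frame(
  name = c("A", "B"),
  height = c(152.4, 182.88),
  stringsAsFactors = FALSE
)

# 1 foot = 30.48 cm, so dividing centimeters by 30.48 gives feet
toy <- mutate(toy, height_ft = height / 30.48)
toy
```

Overwriting height, as the exercise does, keeps the later steps shorter; a separate column is just the safer choice when a chunk may be rerun.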
summarize Calculate mean height in feet.
summarize(starwars, mean_ht = mean(height, na.rm=TRUE))
group_by and summarize Calculate mean height by gender.
Hint: Use %>%, the pipe operator. Save the result under a new name, mean_height.
newdata <- group_by(starwars, gender)
newdata <- summarize(newdata,
mean_ht = mean(height, na.rm=TRUE))
newdata
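The code above calls group_by and summarize in two separate steps. One way to follow the hint is to chain the same two calls with %>% and save the result as mean_height. The sketch below does this on a small made-up data frame (toy is hypothetical and stands in for starwars) so the result is easy to verify by hand.

```r
library(dplyr)

# Hypothetical toy data standing in for starwars; heights already in feet
toy <- data.frame(
  gender = c("female", "female", "male", "male"),
  height = c(5.0, 5.5, 6.0, NA),
  stringsAsFactors = FALSE
)

# The pipe passes the left-hand result as the first argument of the next call
mean_height <- toy %>%
  group_by(gender) %>%
  summarize(mean_ht = mean(height, na.rm = TRUE))

mean_height
```

With na.rm = TRUE the missing male height is dropped before averaging, so the male mean is based on the single remaining value.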
spread Convert the dataset, newdata, to a wide dataset.
wide_data <- spread(newdata, gender, mean_ht)
wide_data
# Convert the wide dataset back to long format
gather(wide_data,
       key="gender",
       value="mean_ht",
       female:`<NA>`)
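To see how spread and gather relate, the sketch below runs both on a small made-up data frame (long_toy and its values are hypothetical). spread turns each value of gender into its own column, and gather collects those columns back into a key column and a value column.

```r
library(tidyr)

# Hypothetical long-format data: one row per gender
long_toy <- data.frame(
  gender = c("female", "male"),
  mean_ht = c(5.25, 6.0),
  stringsAsFactors = FALSE
)

# Long to wide: one column per gender, a single row of means
wide_toy <- spread(long_toy, gender, mean_ht)

# Wide to long: gather the columns female through male back into two columns
back_toy <- gather(wide_toy,
                   key = "gender",
                   value = "mean_ht",
                   female:male)

wide_toy
back_toy
```

The column range female:male plays the same role as the range in the call above; it names the first and last columns to gather.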
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.