Describe how tibbles differ from data frames.
Explain how to convert wide or long data to “tidy” data.
Explain how to merge relational data sets using join functions (next module).
Explain how to use grouped mutates and filters together.
Be familiar with the major dplyr functions for transforming data.
Create a new variable with mutate() and case_when().
Use the pipe operator to shape data for analysis and visualization.
The textbook chapters to cover
Ch3: Data Transformation
Ch5: Data Tidying
Ch13: Numbers
Introduction to Data Wrangling
Loading the packages
To add a code chunk, use Cmd + Option + I.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.0 ✔ readr 2.1.6
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
What is the Tidyverse?
The tidyverse is a collection of R packages that share a common design philosophy and are designed to work together. It includes packages for data manipulation, visualization, and modeling, among other tasks. The core packages, as listed in the startup message above, are dplyr, readr, forcats, stringr, ggplot2, tibble, lubridate, tidyr, and purrr.
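As a small illustration of how these packages fit together, the sketch below builds a tibble, cleans a text column with stringr, and filters and sorts the rows with dplyr, all chained with the pipe. The data and the names `scores`, `student`, `points`, and `cleaned` are invented for this example and are not part of the course materials.

```r
library(tidyverse)

# Hypothetical toy data, made up for this example
scores <- tibble(
  student = c(" ana", "Ben ", "cy"),
  points  = c(82, 95, 67)
)

cleaned <- scores |>
  mutate(student = str_to_title(str_trim(student))) |>  # stringr: trim spaces, fix case
  filter(points >= 70) |>                               # dplyr: keep scores of 70 or more
  arrange(desc(points))                                 # dplyr: highest score first

cleaned
```

Each step takes a tibble and returns a tibble, which is what lets functions from different packages be chained in one pipeline.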
Note: Use Shift + Option + I for multi-cursor activation.
The nycflights13 package provides five data sets:
airlines: Airline names.
airports: Airport metadata.
flights: Flights data.
planes: Plane metadata.
weather: Hourly weather data.
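A quick way to get oriented is to look at the structure of one of these tables and see which column names it shares with another. The sketch below assumes the nycflights13 package is installed; the variable name `shared_cols` is introduced here only for illustration.

```r
library(tidyverse)
library(nycflights13)  # assumption: installed separately; not attached by tidyverse

# One row per column: name, type, and the first few values
glimpse(flights)

# Column names that appear in both flights and planes
shared_cols <- intersect(names(flights), names(planes))
shared_cols
```

Shared columns such as these are what join functions rely on when relational data sets are merged, a topic for the next module.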
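library(nycflights13)  # source of the flights data; this call is added because the original chunk omits it

flights <- flights  # copy the data set into the global environment

flights |>
  count(year, month) |>                     # number of flights per month
  arrange(desc(n)) |>
  mutate(month = as_factor(month)) |>
  mutate(month = fct_reorder(month, n)) |>  # order the bars by flight count
  ggplot(aes(month, n, fill = year)) +
  geom_col(fill = "#c6891f", show.legend = FALSE) +
  coord_flip() +
  labs(
    x = "Month",
    y = "# of Flights",
    title = "# of Flights by Month during Year 2013 at New York"
  )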
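One of the objectives above is to create a new variable with mutate() and case_when(). A minimal sketch on the same flights data follows; the `season` column, its month groupings, and the name `flights_season` are choices made for this example, not something defined in the course materials. It assumes dplyr 1.1.0 or later for the `.default` argument.

```r
library(tidyverse)
library(nycflights13)  # assumption: installed separately

flights_season <- flights |>
  mutate(
    # case_when() checks conditions in order and uses the first match
    season = case_when(
      month %in% c(12, 1, 2) ~ "Winter",
      month %in% 3:5         ~ "Spring",
      month %in% 6:8         ~ "Summer",
      .default = "Fall"      # every remaining month (9 to 11)
    )
  )

# Number of flights in each season, largest first
flights_season |>
  count(season, sort = TRUE)
```

The original flights table is left unchanged; the new column exists only in `flights_season`.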