Learning Outcomes

- Describe the concept of data wrangling.
- Describe how tibbles are different from data frames.
- Explain how to convert wide or long data to "tidy" data.
- Explain how to use grouped mutates and filters together.
- Be familiar with the major dplyr functions for transforming data.
- Create a new variable with mutate() and case_when().
- Use the pipe operator to shape the data in preparation for analysis and visualization.

To use the multi-cursor editing feature, do the following steps:

1. Select the text you want to edit.
2. Press Shift + Alt + I to place a cursor at the end of each selected line.
3. You can now edit all the selected lines simultaneously.

Introduction to Data Wrangling

Loading Packages

Note: To add a code chunk, you can use the keyboard shortcut Ctrl + Alt + I.

# Run these once if the packages are not installed yet:
# install.packages("tidyverse")
# install.packages("nycflights13")

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
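
The Conflicts message printed when the tidyverse is attached means that a bare filter() or lag() call now resolves to the dplyr version. The base versions are still available through the stats:: prefix. A minimal sketch of the difference, using the built-in mtcars data; the object names four_cyl and smoothed are illustrative only and are not used elsewhere in these notes:

```r
library(dplyr)

# With dplyr attached, a bare filter() call resolves to dplyr::filter(),
# which keeps the rows of a data frame that satisfy a condition.
four_cyl <- filter(mtcars, cyl == 4)
nrow(four_cyl)

# The masked base function is still reachable through its namespace.
# stats::filter() applies a linear filter to a numeric series; with
# three equal weights it computes a centered 3-point moving average.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
as.numeric(smoothed)
```

Writing dplyr::filter() or stats::filter() explicitly is a simple way to make clear which function is intended when both packages are in play.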

Tibbles vs Data Frames

class(mtcars)
## [1] "data.frame"
cars.tib <- mtcars |>
  rownames_to_column() |> 
  as_tibble() |> 
  rename(model = rowname)
  
cars.tib
## # A tibble: 32 × 12
##    model         mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
##    <chr>       <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1 Mazda RX4    21       6  160    110  3.9   2.62  16.5     0     1     4     4
##  2 Mazda RX4 …  21       6  160    110  3.9   2.88  17.0     0     1     4     4
##  3 Datsun 710   22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
##  4 Hornet 4 D…  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
##  5 Hornet Spo…  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
##  6 Valiant      18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
##  7 Duster 360   14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
##  8 Merc 240D    24.4     4  147.    62  3.69  3.19  20       1     0     4     2
##  9 Merc 230     22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
## 10 Merc 280     19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
## # ℹ 22 more rows
class(cars.tib)
## [1] "tbl_df"     "tbl"        "data.frame"
typeof(cars.tib)
## [1] "list"
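
The output above shows that a tibble is still a data frame underneath (its class vector ends in "data.frame" and its type is "list"), so the differences are in behavior rather than storage. The sketch below illustrates three commonly cited ones, assuming the tibble package is attached. The names cars_df and cars_tib are illustrative, and as_tibble(rownames = ) is used here as a shorter alternative to the rownames_to_column() and rename() steps above, not as a replacement for them:

```r
library(tibble)

cars_df  <- mtcars                                 # base data frame with row names
cars_tib <- as_tibble(mtcars, rownames = "model")  # row names moved into a column

# 1. Row names: a data frame can carry them, a tibble does not.
#    Converting without the rownames argument silently drops them.
has_rownames(cars_df)
has_rownames(as_tibble(mtcars))

# 2. Subsetting one column with [ : a data frame simplifies the result
#    to a vector, while a tibble always returns a tibble.
class(cars_df[, "mpg"])
class(cars_tib[, "mpg"])

# 3. Partial matching with $ : a data frame matches "mp" to "mpg",
#    while a tibble returns NULL and warns about the unknown column.
head(cars_df$mp)
suppressWarnings(cars_tib$mp)
```

The stricter tibble behavior tends to surface typos in column names early, instead of letting them pass unnoticed.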