Week 9: Apply it to your data 8

Import your data

library(tidyverse)

myData <- read_csv("../00_data/myData.csv")

## Rows: 1222 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): months, state
## dbl (8): year, colony_n, colony_max, colony_lost, colony_lost_pct, colony_ad...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Chapter 13

What are primary keys in your data?

# A set of variables is a primary key if no combination occurs more than once (zero rows returned)
myData %>% count(year, months, state) %>% filter(n > 1)
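
A primary key is a variable, or a set of variables, that uniquely identifies each row. The sketch below runs that check on a small made-up tibble (the `toy` data and its values are hypothetical, not taken from myData): count each candidate key combination and keep any that occur more than once. An empty result suggests the combination can serve as a key.

```r
library(dplyr)

# Hypothetical toy data standing in for myData; all values are made up
toy <- tibble(
  year     = c(2015, 2015, 2015, 2016),
  months   = c("January-March", "January-March", "April-June", "January-March"),
  state    = c("Maine", "Vermont", "Maine", "Maine"),
  colony_n = c(100, 200, 150, 120)
)

# year alone repeats, so it cannot be a key by itself
dup_year <- toy %>% count(year) %>% filter(n > 1)

# year + months + state never repeats in this toy data
dup_key <- toy %>% count(year, months, state) %>% filter(n > 1)

nrow(dup_year)
nrow(dup_key)
```

A measured value such as a percentage is rarely a good key candidate, because two different rows can share the same number by coincidence.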

Can you divide your data into two?

Divide it using dplyr::select so that the two parts share common variables (here year, months, and state), which you can then use to join them.

colony_1sthalf <- myData %>% select(year:colony_lost)
colony_2ndhalf <- myData %>% select(year, months, state, colony_lost_pct:colony_reno_pct)

Can you join the two together?

Use dplyr::left_join or other joining functions.

left_join(colony_1sthalf, colony_2ndhalf, by = c("year", "months", "state"))
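
To see why the shared variables should identify rows, here is a minimal sketch on made-up data (the `toy` tibble and the names `left_half`, `right_half`, and `rejoined` are hypothetical). Both halves keep the identifying columns, so the join matches each row to exactly one partner and the original table comes back.

```r
library(dplyr)

# Hypothetical toy data; values are made up
toy <- tibble(
  year        = c(2015, 2015, 2016),
  state       = c("Maine", "Vermont", "Maine"),
  colony_n    = c(100, 200, 120),
  colony_lost = c(10, 30, 15)
)

# Both halves carry the identifying columns year and state
left_half  <- toy %>% select(year, state, colony_n)
right_half <- toy %>% select(year, state, colony_lost)

# Naming the join columns explicitly avoids relying on the default natural join
rejoined <- left_join(left_half, right_half, by = c("year", "state"))

rejoined
```

If the only shared column were a measurement that repeats across rows, each row could match several partners and the joined table would have more rows than the original.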

Chapter 14

Tools

Detect matches

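# Count the values of colony_lost_pct whose text form ends in 4
myData %>%
    summarise(n_ending_in_4 = sum(str_detect(as.character(colony_lost_pct), "4$"), na.rm = TRUE))

# The same check outside a pipeline: first the logical vector, then its sum
str_detect(as.character(myData$colony_lost_pct), "4$")
sum(str_detect(as.character(myData$colony_lost_pct), "4$"), na.rm = TRUE)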
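
As a small illustration of what the pattern "4$" does, the sketch below applies it to a made-up vector (the `pct` values are hypothetical). The `$` anchors the match to the end of the string, and missing values stay missing, which is why the sum needs `na.rm = TRUE`.

```r
library(stringr)

# Hypothetical percentages, made up for illustration
pct <- c(14, 4, 40, NA, 24)

# Convert to text first: "14", "4", "40", NA, "24"
ends_in_4 <- str_detect(as.character(pct), "4$")

# TRUE counts as 1, so the sum is the number of matches
n_matches <- sum(ends_in_4, na.rm = TRUE)
n_matches
```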

Extract matches

# State names to search for, collapsed into one regular expression of alternatives
states <- c("Connecticut", "Massachusetts", "Pennsylvania", "New York",
            "New Jersey", "New Hampshire", "Vermont", "Maine")
state_new_eng <- str_c(states, collapse = "|")
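
A pattern built with `str_c(..., collapse = "|")` can then be passed to `str_extract`. The sketch below shows this on a shortened, made-up example (the vectors `some_states` and `x` and the name `pattern` are hypothetical): the alternatives are joined with `|`, and `str_extract` returns the first alternative found in each string, or `NA` when none is present.

```r
library(stringr)

# Shortened, hypothetical list of names to look for
some_states <- c("Vermont", "Maine")
pattern <- str_c(some_states, collapse = "|")

# Hypothetical input strings
x <- c("Maine", "Texas", "New Hampshire and Vermont")

# Which strings contain any of the names?
has_match <- str_detect(x, pattern)

# Pull out the matching name itself
matched <- str_extract(x, pattern)
matched
```

On a data frame, the same call could sit inside `mutate()` or `filter()` on the column that holds the state names.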

Replacing matches

# Replace the first capital letter in each state name, then every capital letter
myData %>% mutate(state = state %>% str_replace("[A-Z]", "-"))
myData %>% mutate(state = state %>% str_replace_all("[A-Z]", "-"))
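
The difference between the two functions is easiest to see on a string with more than one match. This is a minimal sketch on made-up input (the vector `x` is hypothetical): `str_replace` changes only the first match in each string, while `str_replace_all` changes every match.

```r
library(stringr)

# Hypothetical input strings
x <- c("New York", "Maine")

# Only the first capital letter in each string is replaced
first_only <- str_replace(x, "[A-Z]", "-")

# Every capital letter is replaced
every_one <- str_replace_all(x, "[A-Z]", "-")

first_only
every_one
```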

Ben Nome

2022-10-27
