Import two related datasets from the TidyTuesday project.
cbp_resp <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-11-26/cbp_resp.csv')
## Rows: 68815 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): month_grouping, month_abbv, component, land_border_region, area_of...
## dbl (2): fiscal_year, encounter_count
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cbp_state <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-11-26/cbp_state.csv')
## Rows: 54939 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): month_grouping, month_abbv, land_border_region, state, demographic,...
## dbl (2): fiscal_year, encounter_count
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
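The readr messages above suggest two ways to quiet the column-specification output. As a minimal sketch, assuming readr 2.0 or later, the snippet below declares the column types up front on a small inline CSV. The inline data and the name `toy` are made up for illustration and are not part of the TidyTuesday files.

```r
library(readr)

# Hypothetical inline CSV; I() marks the string as literal data, not a path
csv_text <- I("fiscal_year,citizenship,encounter_count\n2024,MEXICO,453\n2023,CUBA,3\n")

# Declaring col_types documents the expected schema and suppresses
# the column-specification message
toy <- read_csv(
  csv_text,
  col_types = cols(
    fiscal_year = col_double(),
    citizenship = col_character(),
    encounter_count = col_double()
  )
)
toy
```

Passing `show_col_types = FALSE` instead keeps readr's type guessing and only hides the message.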
Describe the two datasets:
Data 1: cbp_resp
Data 2: cbp_state
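library(dplyr)  # provides %>%, select(), sample_n(), and the join functions
set.seed(1234)  # fix the seed so the random samples are reproducible
cbp_resp_small <- cbp_resp %>% select(fiscal_year, land_border_region, citizenship, encounter_count) %>% sample_n(10)
cbp_state_small <- cbp_state %>% select(fiscal_year, land_border_region, citizenship, state) %>% sample_n(10)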
cbp_resp_small
## # A tibble: 10 × 4
##    fiscal_year land_border_region    citizenship encounter_count
##          <dbl> <chr>                 <chr>                 <dbl>
##  1        2023 Other                 ROMANIA                  14
##  2        2021 Northern Land Border  COLOMBIA                  4
##  3        2022 Southwest Land Border HONDURAS                  4
##  4        2024 Other                 UKRAINE                   6
##  5        2024 Southwest Land Border MEXICO                  453
##  6        2024 Southwest Land Border HAITI                     1
##  7        2021 Southwest Land Border EL SALVADOR              27
##  8        2022 Northern Land Border  EL SALVADOR               1
##  9        2023 Southwest Land Border CUBA                      3
## 10        2021 Northern Land Border  MEXICO                    1
cbp_state_small
## # A tibble: 10 × 4
##    fiscal_year land_border_region    citizenship state
##          <dbl> <chr>                 <chr>       <chr>
##  1        2024 Southwest Land Border MEXICO      TX
##  2        2023 Southwest Land Border NICARAGUA   AZ
##  3        2023 Other                 NICARAGUA   CA
##  4        2023 Other                 UKRAINE     DE
##  5        2023 Other                 MEXICO      GA
##  6        2024 Other                 MEXICO      DC
##  7        2022 Southwest Land Border HAITI       CA
##  8        2023 Other                 MEXICO      MD
##  9        2023 Other                 OTHER       KY
## 10        2024 Other                 VENEZUELA   IL
Describe the resulting data:
How is it different from the original two datasets?
inner_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 1 × 5
##   fiscal_year land_border_region    citizenship state encounter_count
##         <dbl> <chr>                 <chr>       <chr>           <dbl>
## 1        2024 Southwest Land Border MEXICO      TX                453
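The message above shows that dplyr chose the join keys automatically from the column names the two tables share. A minimal sketch of stating those keys explicitly follows, assuming dplyr 1.1.0 or later for `join_by()`. The tibbles `state_toy` and `resp_toy` are hypothetical stand-ins for the sampled tables, not the TidyTuesday data.

```r
library(dplyr)

# Hypothetical toy tables standing in for cbp_state_small and cbp_resp_small
state_toy <- tibble(
  fiscal_year = c(2024, 2023),
  land_border_region = c("Southwest Land Border", "Other"),
  citizenship = c("MEXICO", "UKRAINE"),
  state = c("TX", "DE")
)
resp_toy <- tibble(
  fiscal_year = c(2024, 2024),
  land_border_region = c("Southwest Land Border", "Other"),
  citizenship = c("MEXICO", "UKRAINE"),
  encounter_count = c(453, 6)
)

# Naming the keys makes the match rule visible in the code and
# avoids the "Joining with `by = ...`" message
inner_toy <- inner_join(
  state_toy, resp_toy,
  by = join_by(fiscal_year, land_border_region, citizenship)
)
inner_toy
```

Only the 2024 Southwest Land Border MEXICO row survives, because the two UKRAINE rows differ in `fiscal_year` and a row must agree on all three keys to match.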
Describe the resulting data:
How is it different from the original two datasets?
left_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 10 × 5
##    fiscal_year land_border_region    citizenship state encounter_count
##          <dbl> <chr>                 <chr>       <chr>           <dbl>
##  1        2024 Southwest Land Border MEXICO      TX                453
##  2        2023 Southwest Land Border NICARAGUA   AZ                 NA
##  3        2023 Other                 NICARAGUA   CA                 NA
##  4        2023 Other                 UKRAINE     DE                 NA
##  5        2023 Other                 MEXICO      GA                 NA
##  6        2024 Other                 MEXICO      DC                 NA
##  7        2022 Southwest Land Border HAITI       CA                 NA
##  8        2023 Other                 MEXICO      MD                 NA
##  9        2023 Other                 OTHER       KY                 NA
## 10        2024 Other                 VENEZUELA   IL                 NA
Describe the resulting data:
How is it different from the original two datasets?
right_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 10 × 5
##    fiscal_year land_border_region    citizenship state encounter_count
##          <dbl> <chr>                 <chr>       <chr>           <dbl>
##  1        2024 Southwest Land Border MEXICO      TX                453
##  2        2023 Other                 ROMANIA     <NA>               14
##  3        2021 Northern Land Border  COLOMBIA    <NA>                4
##  4        2022 Southwest Land Border HONDURAS    <NA>                4
##  5        2024 Other                 UKRAINE     <NA>                6
##  6        2024 Southwest Land Border HAITI       <NA>                1
##  7        2021 Southwest Land Border EL SALVADOR <NA>               27
##  8        2022 Northern Land Border  EL SALVADOR <NA>                1
##  9        2023 Southwest Land Border CUBA        <NA>                3
## 10        2021 Northern Land Border  MEXICO      <NA>                1
Describe the resulting data:
How is it different from the original two datasets?
full_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 19 × 5
##    fiscal_year land_border_region    citizenship state encounter_count
##          <dbl> <chr>                 <chr>       <chr>           <dbl>
##  1        2024 Southwest Land Border MEXICO      TX                453
##  2        2023 Southwest Land Border NICARAGUA   AZ                 NA
##  3        2023 Other                 NICARAGUA   CA                 NA
##  4        2023 Other                 UKRAINE     DE                 NA
##  5        2023 Other                 MEXICO      GA                 NA
##  6        2024 Other                 MEXICO      DC                 NA
##  7        2022 Southwest Land Border HAITI       CA                 NA
##  8        2023 Other                 MEXICO      MD                 NA
##  9        2023 Other                 OTHER       KY                 NA
## 10        2024 Other                 VENEZUELA   IL                 NA
## 11        2023 Other                 ROMANIA     <NA>               14
## 12        2021 Northern Land Border  COLOMBIA    <NA>                4
## 13        2022 Southwest Land Border HONDURAS    <NA>                4
## 14        2024 Other                 UKRAINE     <NA>                6
## 15        2024 Southwest Land Border HAITI       <NA>                1
## 16        2021 Southwest Land Border EL SALVADOR <NA>               27
## 17        2022 Northern Land Border  EL SALVADOR <NA>                1
## 18        2023 Southwest Land Border CUBA        <NA>                3
## 19        2021 Northern Land Border  MEXICO      <NA>                1
Describe the resulting data:
How is it different from the original two datasets?
semi_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 1 × 4
##   fiscal_year land_border_region    citizenship state
##         <dbl> <chr>                 <chr>       <chr>
## 1        2024 Southwest Land Border MEXICO      TX
Describe the resulting data:
How is it different from the original two datasets?
anti_join(cbp_state_small, cbp_resp_small)
## Joining with `by = join_by(fiscal_year, land_border_region, citizenship)`
## # A tibble: 9 × 4
##   fiscal_year land_border_region    citizenship state
##         <dbl> <chr>                 <chr>       <chr>
## 1        2023 Southwest Land Border NICARAGUA   AZ
## 2        2023 Other                 NICARAGUA   CA
## 3        2023 Other                 UKRAINE     DE
## 4        2023 Other                 MEXICO      GA
## 5        2024 Other                 MEXICO      DC
## 6        2022 Southwest Land Border HAITI       CA
## 7        2023 Other                 MEXICO      MD
## 8        2023 Other                 OTHER       KY
## 9        2024 Other                 VENEZUELA   IL
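The row counts in the outputs above fit together: the semi join (1 row) and the anti join (9 rows) split the 10 rows of cbp_state_small, and the full join's 19 rows are the 1 matched row plus the 9 unmatched rows from each sample. The sketch below checks the same bookkeeping on two made-up tables; `x`, `y`, and their columns are hypothetical and chosen only to keep the example small.

```r
library(dplyr)

# Hypothetical toy tables that share a single key column
x <- tibble(key = c(1, 2, 3), x_val = c("a", "b", "c"))
y <- tibble(key = c(3, 4), y_val = c("C", "D"))

n_inner  <- nrow(inner_join(x, y, by = "key"))  # matched pairs
n_left   <- nrow(left_join(x, y, by = "key"))   # matches plus unmatched x rows
n_full   <- nrow(full_join(x, y, by = "key"))   # matches plus unmatched rows from both
n_semi   <- nrow(semi_join(x, y, by = "key"))   # x rows that have a match in y
n_anti_x <- nrow(anti_join(x, y, by = "key"))   # x rows without a match in y
n_anti_y <- nrow(anti_join(y, x, by = "key"))   # y rows without a match in x

# Filtering joins partition the left table
stopifnot(n_semi + n_anti_x == nrow(x))
# A left join adds the unmatched x rows to the matches
stopifnot(n_left == n_inner + n_anti_x)
# A full join adds the unmatched rows from both sides
stopifnot(n_full == n_inner + n_anti_x + n_anti_y)
```

Running the same checks on cbp_state_small and cbp_resp_small is a quick way to confirm that a join returned the number of rows you expected.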