Import two related datasets from the TidyTuesday project.
cats_uk <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-31/cats_uk.csv')
## Rows: 18215 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): tag_id, study_name
## dbl (5): event_id, location_long, location_lat, ground_speed, height_above_...
## lgl (3): visible, algorithm_marked_outlier, manually_marked_outlier
## dttm (1): timestamp
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cats_uk_reference <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-31/cats_uk_reference.csv')
## Rows: 101 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): tag_id, animal_id, animal_taxon, animal_reproductive_condition, an...
## dbl (4): prey_p_month, hrs_indoors, n_cats, age_years
## lgl (4): hunt, food_dry, food_wet, food_other
## dttm (2): deploy_on_date, deploy_off_date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
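Describe the two datasets:
Data 1: cats_uk has 18,215 rows and 11 columns, including tag_id, event_id, timestamp, location_long, and location_lat.
Data 2: cats_uk_reference has 101 rows and 16 columns, including tag_id, animal_taxon, hunt, and age_years. The two datasets share the tag_id column, which serves as the join key below.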
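library(dplyr) # provides %>%, select(), sample_n(), and the join functions used below
set.seed(1234) # fix the seed so the random samples are reproducible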
cats_uk_small <- cats_uk %>% select(tag_id, event_id, location_long) %>% sample_n(10)
cats_uk_reference_small <- cats_uk_reference %>% select(tag_id, animal_taxon, hunt) %>% sample_n(10)
cats_uk_small
## # A tibble: 10 × 3
## tag_id event_id location_long
## <chr> <dbl> <dbl>
## 1 Ernie-Tag 3507105313 -4.64
## 2 Bits-Tag 3544803591 -4.92
## 3 Amber-Tag 3507104912 -4.64
## 4 Bits-Tag 3544803661 -4.92
## 5 Smudge-Tag 3637402232 -5.08
## 6 Jago 3396159641 -5.07
## 7 Fairclough-Tag 3766102907 -5.18
## 8 Frank_2-Tag 3672007720 -5.54
## 9 Charlie 3403118802 -5.08
## 10 Tilly-Tag 3716217172 -5.30
cats_uk_reference_small
## # A tibble: 10 × 3
## tag_id animal_taxon hunt
## <chr> <chr> <lgl>
## 1 Lola Felis catus TRUE
## 2 Millie-Tag Felis catus TRUE
## 3 Jim-Tag Felis catus TRUE
## 4 Siberia-Tag Felis catus TRUE
## 5 Reggie-Tag Felis catus TRUE
## 6 Fairclough-Tag Felis catus FALSE
## 7 Dexter2-Tag Felis catus FALSE
## 8 Fonzie-Tag Felis catus TRUE
## 9 Frank-Tag Felis catus TRUE
## 10 Freya-Tag Felis catus TRUE
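Describe the resulting data:
How is it different from the original two datasets? The result has 1 row, compared with 10 rows in each sample, because only one tag_id (Fairclough-Tag) appears in both. It has 5 columns: tag_id plus the remaining columns from both samples.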
cats_uk_small %>% inner_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 1 × 5
## tag_id event_id location_long animal_taxon hunt
## <chr> <dbl> <dbl> <chr> <lgl>
## 1 Fairclough-Tag 3766102907 -5.18 Felis catus FALSE
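One way to see why the inner join returns a single row is to compare the key values directly. This is a minimal sketch: `left_keys` and `right_keys` are hypothetical names, and their values are copied by hand from the two printed samples rather than read from the data.

```r
# tag_id values copied by hand from the printed cats_uk_small sample
left_keys <- c("Ernie-Tag", "Bits-Tag", "Amber-Tag", "Bits-Tag", "Smudge-Tag",
               "Jago", "Fairclough-Tag", "Frank_2-Tag", "Charlie", "Tilly-Tag")

# tag_id values copied by hand from the printed cats_uk_reference_small sample
right_keys <- c("Lola", "Millie-Tag", "Jim-Tag", "Siberia-Tag", "Reggie-Tag",
                "Fairclough-Tag", "Dexter2-Tag", "Fonzie-Tag", "Frank-Tag",
                "Freya-Tag")

# Keys present in both samples: the only rows an inner join can keep
shared_keys <- intersect(left_keys, right_keys)
shared_keys
## [1] "Fairclough-Tag"
```

Running a check like this before joining can help explain an unexpectedly small result.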
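Describe the resulting data:
How is it different from the original two datasets? The result keeps all 10 rows of cats_uk_small and has 5 columns instead of 3. The columns added from cats_uk_reference_small (animal_taxon, hunt) are NA for the 9 rows whose tag_id has no match.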
cats_uk_small %>% left_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 10 × 5
## tag_id event_id location_long animal_taxon hunt
## <chr> <dbl> <dbl> <chr> <lgl>
## 1 Ernie-Tag 3507105313 -4.64 <NA> NA
## 2 Bits-Tag 3544803591 -4.92 <NA> NA
## 3 Amber-Tag 3507104912 -4.64 <NA> NA
## 4 Bits-Tag 3544803661 -4.92 <NA> NA
## 5 Smudge-Tag 3637402232 -5.08 <NA> NA
## 6 Jago 3396159641 -5.07 <NA> NA
## 7 Fairclough-Tag 3766102907 -5.18 Felis catus FALSE
## 8 Frank_2-Tag 3672007720 -5.54 <NA> NA
## 9 Charlie 3403118802 -5.08 <NA> NA
## 10 Tilly-Tag 3716217172 -5.30 <NA> NA
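After a left join, counting the NA values in a column that comes from the right-hand table gives a quick estimate of how many rows found no match. The sketch below uses hypothetical stand-in tibbles (`left_small`, `right_small`) whose values are copied by hand from the printed samples. It assumes the right-hand column has no missing values of its own, which holds for `hunt` in this sample but may not hold in general.

```r
library(dplyr)

# Stand-in for cats_uk_small (key column only), copied from the printed sample
left_small <- tibble(
  tag_id = c("Ernie-Tag", "Bits-Tag", "Amber-Tag", "Bits-Tag", "Smudge-Tag",
             "Jago", "Fairclough-Tag", "Frank_2-Tag", "Charlie", "Tilly-Tag")
)

# Stand-in for cats_uk_reference_small (key and hunt), copied from the printed sample
right_small <- tibble(
  tag_id = c("Lola", "Millie-Tag", "Jim-Tag", "Siberia-Tag", "Reggie-Tag",
             "Fairclough-Tag", "Dexter2-Tag", "Fonzie-Tag", "Frank-Tag",
             "Freya-Tag"),
  hunt = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

joined <- left_join(left_small, right_small, by = "tag_id")

# Left rows without a partner carry NA in the columns from the right table
n_unmatched <- sum(is.na(joined$hunt))
n_unmatched
## [1] 9
```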
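Describe the resulting data:
How is it different from the original two datasets? The result keeps all 10 rows of cats_uk_reference_small and has 5 columns instead of 3. The columns from cats_uk_small (event_id, location_long) are NA for the 9 rows whose tag_id has no match.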
cats_uk_small %>% right_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 10 × 5
## tag_id event_id location_long animal_taxon hunt
## <chr> <dbl> <dbl> <chr> <lgl>
## 1 Fairclough-Tag 3766102907 -5.18 Felis catus FALSE
## 2 Lola NA NA Felis catus TRUE
## 3 Millie-Tag NA NA Felis catus TRUE
## 4 Jim-Tag NA NA Felis catus TRUE
## 5 Siberia-Tag NA NA Felis catus TRUE
## 6 Reggie-Tag NA NA Felis catus TRUE
## 7 Dexter2-Tag NA NA Felis catus FALSE
## 8 Fonzie-Tag NA NA Felis catus TRUE
## 9 Frank-Tag NA NA Felis catus TRUE
## 10 Freya-Tag NA NA Felis catus TRUE
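Describe the resulting data:
How is it different from the original two datasets? The result has 19 rows, compared with 10 rows in each sample: 1 matched row, 9 rows found only in cats_uk_small, and 9 rows found only in cats_uk_reference_small. It has 5 columns, with NA wherever a row has no match in the other sample.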
cats_uk_small %>% full_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 19 × 5
## tag_id event_id location_long animal_taxon hunt
## <chr> <dbl> <dbl> <chr> <lgl>
## 1 Ernie-Tag 3507105313 -4.64 <NA> NA
## 2 Bits-Tag 3544803591 -4.92 <NA> NA
## 3 Amber-Tag 3507104912 -4.64 <NA> NA
## 4 Bits-Tag 3544803661 -4.92 <NA> NA
## 5 Smudge-Tag 3637402232 -5.08 <NA> NA
## 6 Jago 3396159641 -5.07 <NA> NA
## 7 Fairclough-Tag 3766102907 -5.18 Felis catus FALSE
## 8 Frank_2-Tag 3672007720 -5.54 <NA> NA
## 9 Charlie 3403118802 -5.08 <NA> NA
## 10 Tilly-Tag 3716217172 -5.30 <NA> NA
## 11 Lola NA NA Felis catus TRUE
## 12 Millie-Tag NA NA Felis catus TRUE
## 13 Jim-Tag NA NA Felis catus TRUE
## 14 Siberia-Tag NA NA Felis catus TRUE
## 15 Reggie-Tag NA NA Felis catus TRUE
## 16 Dexter2-Tag NA NA Felis catus FALSE
## 17 Fonzie-Tag NA NA Felis catus TRUE
## 18 Frank-Tag NA NA Felis catus TRUE
## 19 Freya-Tag NA NA Felis catus TRUE
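The 19 rows of the full join can be accounted for as matched rows plus the unmatched rows from each side. The sketch below checks that arithmetic on hypothetical stand-in tibbles (`left_small`, `right_small`) containing only the tag_id values copied by hand from the printed samples.

```r
library(dplyr)

# Key columns copied by hand from the two printed samples
left_small <- tibble(
  tag_id = c("Ernie-Tag", "Bits-Tag", "Amber-Tag", "Bits-Tag", "Smudge-Tag",
             "Jago", "Fairclough-Tag", "Frank_2-Tag", "Charlie", "Tilly-Tag")
)
right_small <- tibble(
  tag_id = c("Lola", "Millie-Tag", "Jim-Tag", "Siberia-Tag", "Reggie-Tag",
             "Fairclough-Tag", "Dexter2-Tag", "Fonzie-Tag", "Frank-Tag",
             "Freya-Tag")
)

n_both       <- nrow(inner_join(left_small, right_small, by = "tag_id"))
n_left_only  <- nrow(anti_join(left_small, right_small, by = "tag_id"))
n_right_only <- nrow(anti_join(right_small, left_small, by = "tag_id"))
n_full       <- nrow(full_join(left_small, right_small, by = "tag_id"))

# Matched rows + left-only rows + right-only rows should equal the full join
c(both = n_both, left_only = n_left_only, right_only = n_right_only, full = n_full)
##       both  left_only right_only       full 
##          1          9          9         19 
```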
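Describe the resulting data:
How is it different from the original two datasets? The result has 1 row, compared with 10 rows in cats_uk_small: only the row whose tag_id also appears in cats_uk_reference_small is kept. Only the 3 columns of cats_uk_small are returned; no columns are added from cats_uk_reference_small.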
cats_uk_small %>% semi_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 1 × 3
## tag_id event_id location_long
## <chr> <dbl> <dbl>
## 1 Fairclough-Tag 3766102907 -5.18
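Describe the resulting data:
How is it different from the original two datasets? The result has 9 rows, compared with 10 rows in cats_uk_small: only the rows whose tag_id does not appear in cats_uk_reference_small are kept. As with semi_join, only the 3 columns of cats_uk_small are returned.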
cats_uk_small %>% anti_join(cats_uk_reference_small, by = c("tag_id"))
## # A tibble: 9 × 3
## tag_id event_id location_long
## <chr> <dbl> <dbl>
## 1 Ernie-Tag 3507105313 -4.64
## 2 Bits-Tag 3544803591 -4.92
## 3 Amber-Tag 3507104912 -4.64
## 4 Bits-Tag 3544803661 -4.92
## 5 Smudge-Tag 3637402232 -5.08
## 6 Jago 3396159641 -5.07
## 7 Frank_2-Tag 3672007720 -5.54
## 8 Charlie 3403118802 -5.08
## 9 Tilly-Tag 3716217172 -5.30
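semi_join and anti_join act as filters on the left table, so for a single key column they can be approximated with filter() and %in%. The sketch below illustrates this on a hypothetical stand-in tibble (`left_small`) and key vector (`right_keys`), with values copied by hand from the printed samples.

```r
library(dplyr)

# Stand-in for cats_uk_small (two columns), copied from the printed sample
left_small <- tibble(
  tag_id = c("Ernie-Tag", "Bits-Tag", "Amber-Tag", "Bits-Tag", "Smudge-Tag",
             "Jago", "Fairclough-Tag", "Frank_2-Tag", "Charlie", "Tilly-Tag"),
  event_id = c(3507105313, 3544803591, 3507104912, 3544803661, 3637402232,
               3396159641, 3766102907, 3672007720, 3403118802, 3716217172)
)

# tag_id values from the printed cats_uk_reference_small sample
right_keys <- c("Lola", "Millie-Tag", "Jim-Tag", "Siberia-Tag", "Reggie-Tag",
                "Fairclough-Tag", "Dexter2-Tag", "Fonzie-Tag", "Frank-Tag",
                "Freya-Tag")

# filter() equivalents of semi_join and anti_join on a single key
kept    <- left_small %>% filter(tag_id %in% right_keys)
dropped <- left_small %>% filter(!tag_id %in% right_keys)

# The same split using the filtering joins
semi <- semi_join(left_small, tibble(tag_id = right_keys), by = "tag_id")
anti <- anti_join(left_small, tibble(tag_id = right_keys), by = "tag_id")

c(kept = nrow(kept), dropped = nrow(dropped))
##    kept dropped 
##       1       9 
```

Together the two results partition the left table: every row of the left sample lands in exactly one of them, so their row counts add up to 10.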