Import two related datasets from the TidyTuesday project.
ufo_sightings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/ufo_sightings.csv')
## Rows: 96429 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): city, state, country_code, shape, reported_duration, summary, day_...
## dbl (1): duration_seconds
## lgl (1): has_images
## dttm (2): reported_date_time, reported_date_time_utc
## date (1): posted_date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
places <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/places.csv')
## Rows: 14417 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): city, alternate_city_names, state, country, country_code, timezone
## dbl (4): latitude, longitude, population, elevation_m
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
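The import messages above name two ways to quiet the column specification output. Here is a minimal sketch of both, using a small inline CSV rather than the TidyTuesday files; `toy_csv` and its contents are hypothetical and exist only for illustration.

```r
library(readr)

# Hypothetical inline CSV; I() tells readr to treat the string as literal data
# instead of a file path.
toy_csv <- I("city,state,duration_seconds\nAvon,NY,120\nDallas,TX,30\n")

# Option 1: keep automatic type guessing, but silence the message.
quiet_guess <- read_csv(toy_csv, show_col_types = FALSE)

# Option 2: declare the column types explicitly, so nothing is guessed.
explicit <- read_csv(
  toy_csv,
  col_types = cols(
    city = col_character(),
    state = col_character(),
    duration_seconds = col_double()
  )
)
```

Declaring the types is more typing, but it makes the import fail loudly if a column ever changes type upstream.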
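Describe the two datasets:
Data 1: UFO_Sightings. 96,429 rows and 12 columns (7 character, 1 numeric, 1 logical, 2 date-time, 1 date), including city, state, country_code, shape, duration_seconds, and reported_date_time.
Data 2: Places. 14,417 rows and 10 columns (6 character, 4 numeric), including city, state, country_code, timezone, latitude, longitude, population, and elevation_m.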
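library(dplyr)  # provides %>%, select(), sample_n(), and the join verbs
set.seed(1234)  # make the random samples reproducible
ufo_sightings_small <- ufo_sightings %>% select(city, state, reported_date_time) %>% sample_n(10)
places_small <- places %>% select(city, state, country_code) %>% sample_n(10)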
ufo_sightings_small
## # A tibble: 10 × 3
## city state reported_date_time
## <chr> <chr> <dttm>
## 1 Minneapolis MN 2019-10-25 21:30:00
## 2 Danville CA 2006-07-24 04:30:00
## 3 Avon NY 2001-10-22 02:00:00
## 4 Glendale AZ 2016-03-02 05:00:00
## 5 Mayville NY 2007-12-01 20:00:00
## 6 Dallas TX 2011-04-26 03:00:00
## 7 Mount Horeb WI 1999-02-02 08:00:00
## 8 Scarborough ME 2015-08-07 23:00:00
## 9 Phoenix AZ 1990-08-22 05:19:00
## 10 Slocomb AL 1970-03-31 03:00:00
places_small
## # A tibble: 10 × 3
## city state country_code
## <chr> <chr> <chr>
## 1 Salmon Arm BC CA
## 2 Eaton CO US
## 3 Basin City WA US
## 4 Jacksonville NC US
## 5 New Paris IN US
## 6 Altoona IA US
## 7 Fairmont MN US
## 8 Middletown NJ US
## 9 Blanchardville WI US
## 10 Laguna Hills CA US
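Both small tables share the columns city and state, and the joins in this document match on the two together. One way to see why a city name alone may be a poor key is to count duplicates. This is a minimal sketch on a hypothetical toy table (`places_toy` is not drawn from the TidyTuesday files).

```r
library(dplyr)

# Hypothetical toy table: the same city name appears in two states.
places_toy <- tibble(
  city  = c("Springfield", "Springfield", "Eaton"),
  state = c("IL", "MO", "CO")
)

# Key values that occur more than once when city is used alone.
dup_city <- places_toy %>%
  count(city) %>%
  filter(n > 1)

# Key values that occur more than once when city and state are combined.
dup_city_state <- places_toy %>%
  count(city, state) %>%
  filter(n > 1)

dup_city        # one duplicated key: Springfield
dup_city_state  # no duplicated keys
```

If the second check returns rows on real data, the key is not unique and a join on it can multiply rows.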
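Describe the resulting data:
inner_join() keeps only the rows whose city and state appear in both tables. The two 10-row samples share no (city, state) pair, so the result has 0 rows.
How is it different from the original two datasets?
It has 4 columns (the two key columns, reported_date_time from ufo_sightings_small, and country_code from places_small) but none of the rows from either table.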
ufo_sightings_small %>% inner_join(places_small, by = c("city", "state"))
## # A tibble: 0 × 4
## # ℹ 4 variables: city <chr>, state <chr>, reported_date_time <dttm>,
## # country_code <chr>
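The empty result above follows from the samples having no key in common. For contrast, here is a minimal sketch of an inner join where one key does match. Both toy tables and their values are hypothetical, chosen only to illustrate the behaviour.

```r
library(dplyr)

# Hypothetical toy tables (not drawn from the TidyTuesday files).
sightings_toy <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
places_toy <- tibble(
  city         = c("Phoenix", "Eaton"),
  state        = c("AZ", "CO"),
  country_code = c("US", "US")
)

# Only the (Phoenix, AZ) pair occurs in both tables, so only that row survives.
inner_toy <- inner_join(sightings_toy, places_toy, by = c("city", "state"))
inner_toy
```

The result keeps the key columns once and appends the non-key columns from both sides.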
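Describe the resulting data:
left_join() keeps all 10 rows of ufo_sightings_small and adds country_code from places_small. Because no keys match, country_code is NA in every row.
How is it different from the original two datasets?
It has the same rows as ufo_sightings_small plus one extra column. No rows from places_small appear.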
ufo_sightings_small %>% left_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 4
## city state reported_date_time country_code
## <chr> <chr> <dttm> <chr>
## 1 Minneapolis MN 2019-10-25 21:30:00 <NA>
## 2 Danville CA 2006-07-24 04:30:00 <NA>
## 3 Avon NY 2001-10-22 02:00:00 <NA>
## 4 Glendale AZ 2016-03-02 05:00:00 <NA>
## 5 Mayville NY 2007-12-01 20:00:00 <NA>
## 6 Dallas TX 2011-04-26 03:00:00 <NA>
## 7 Mount Horeb WI 1999-02-02 08:00:00 <NA>
## 8 Scarborough ME 2015-08-07 23:00:00 <NA>
## 9 Phoenix AZ 1990-08-22 05:19:00 <NA>
## 10 Slocomb AL 1970-03-31 03:00:00 <NA>
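Describe the resulting data:
right_join() keeps all 10 rows of places_small and adds reported_date_time from ufo_sightings_small. Because no keys match, reported_date_time is NA in every row.
How is it different from the original two datasets?
It has the same rows as places_small plus one extra column. No rows from ufo_sightings_small appear.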
ufo_sightings_small %>% right_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 4
## city state reported_date_time country_code
## <chr> <chr> <dttm> <chr>
## 1 Salmon Arm BC NA CA
## 2 Eaton CO NA US
## 3 Basin City WA NA US
## 4 Jacksonville NC NA US
## 5 New Paris IN NA US
## 6 Altoona IA NA US
## 7 Fairmont MN NA US
## 8 Middletown NJ NA US
## 9 Blanchardville WI NA US
## 10 Laguna Hills CA NA US
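Describe the resulting data:
full_join() keeps every row from both tables. With no matching keys, the result has 10 + 10 = 20 rows, and each row has NA in the column that comes from the other table.
How is it different from the original two datasets?
It stacks the rows of both tables into one table with 4 columns, so it is longer than either original and wider by one column.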
ufo_sightings_small %>% full_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 20 × 4
## city state reported_date_time country_code
## <chr> <chr> <dttm> <chr>
## 1 Minneapolis MN 2019-10-25 21:30:00 <NA>
## 2 Danville CA 2006-07-24 04:30:00 <NA>
## 3 Avon NY 2001-10-22 02:00:00 <NA>
## 4 Glendale AZ 2016-03-02 05:00:00 <NA>
## 5 Mayville NY 2007-12-01 20:00:00 <NA>
## 6 Dallas TX 2011-04-26 03:00:00 <NA>
## 7 Mount Horeb WI 1999-02-02 08:00:00 <NA>
## 8 Scarborough ME 2015-08-07 23:00:00 <NA>
## 9 Phoenix AZ 1990-08-22 05:19:00 <NA>
## 10 Slocomb AL 1970-03-31 03:00:00 <NA>
## 11 Salmon Arm BC NA CA
## 12 Eaton CO NA US
## 13 Basin City WA NA US
## 14 Jacksonville NC NA US
## 15 New Paris IN NA US
## 16 Altoona IA NA US
## 17 Fairmont MN NA US
## 18 Middletown NJ NA US
## 19 Blanchardville WI NA US
## 20 Laguna Hills CA NA US
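The 20 rows above are 0 matched rows plus 10 unmatched rows from each side. A minimal sketch of that row-count bookkeeping follows, assuming each key appears at most once per table. The toy tables `x` and `y` are hypothetical.

```r
library(dplyr)

# Hypothetical toy tables with one shared key, (Phoenix, AZ).
x <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY")
)
y <- tibble(
  city         = c("Phoenix", "Eaton"),
  state        = c("AZ", "CO"),
  country_code = c("US", "US")
)
keys <- c("city", "state")

n_matched    <- nrow(inner_join(x, y, by = keys))  # rows present in both
n_left_only  <- nrow(anti_join(x, y, by = keys))   # rows only in x
n_right_only <- nrow(anti_join(y, x, by = keys))   # rows only in y
n_full       <- nrow(full_join(x, y, by = keys))

# With unique keys, the full join is the sum of the three parts.
n_full == n_matched + n_left_only + n_right_only
```

When keys are duplicated on either side, matched rows multiply and this simple sum no longer holds.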
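Describe the resulting data:
semi_join() keeps the rows of the first table that have a match in the second, without adding any columns. No keys match, so the result has 0 rows in both directions.
How is it different from the original two datasets?
Each result has the same 3 columns as its first table but none of its rows.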
ufo_sightings_small %>% semi_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 0 × 3
## # ℹ 3 variables: city <chr>, state <chr>, reported_date_time <dttm>
places_small %>% semi_join(ufo_sightings_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 0 × 3
## # ℹ 3 variables: city <chr>, state <chr>, country_code <chr>
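With empty results, semi_join() and inner_join() look alike. The difference shows up when the second table repeats a key. This is a minimal sketch with hypothetical toy tables; the `alternate_name` column and its values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy tables: places_dup lists (Phoenix, AZ) twice.
sightings_toy <- tibble(
  city  = c("Phoenix", "Dallas"),
  state = c("AZ", "TX")
)
places_dup <- tibble(
  city           = c("Phoenix", "Phoenix"),
  state          = c("AZ", "AZ"),
  alternate_name = c("PHX", "Valley of the Sun")
)
keys <- c("city", "state")

# inner_join() returns one row per matching pair and adds the columns of the second table.
inner_toy <- inner_join(sightings_toy, places_dup, by = keys)

# semi_join() only filters: each row of the first table is kept at most once, with no new columns.
semi_toy <- semi_join(sightings_toy, places_dup, by = keys)

nrow(inner_toy)  # 2
nrow(semi_toy)   # 1
```

This makes semi_join() a filter on the first table rather than a way to combine columns.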
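Describe the resulting data:
anti_join() keeps the rows of the first table that have no match in the second. No keys match, so all 10 rows come back in each direction.
How is it different from the original two datasets?
Each result is identical to its first table: same rows, same 3 columns, nothing added from the second table.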
ufo_sightings_small %>% anti_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 3
## city state reported_date_time
## <chr> <chr> <dttm>
## 1 Minneapolis MN 2019-10-25 21:30:00
## 2 Danville CA 2006-07-24 04:30:00
## 3 Avon NY 2001-10-22 02:00:00
## 4 Glendale AZ 2016-03-02 05:00:00
## 5 Mayville NY 2007-12-01 20:00:00
## 6 Dallas TX 2011-04-26 03:00:00
## 7 Mount Horeb WI 1999-02-02 08:00:00
## 8 Scarborough ME 2015-08-07 23:00:00
## 9 Phoenix AZ 1990-08-22 05:19:00
## 10 Slocomb AL 1970-03-31 03:00:00
places_small %>% anti_join(ufo_sightings_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 3
## city state country_code
## <chr> <chr> <chr>
## 1 Salmon Arm BC CA
## 2 Eaton CO US
## 3 Basin City WA US
## 4 Jacksonville NC US
## 5 New Paris IN US
## 6 Altoona IA US
## 7 Fairmont MN US
## 8 Middletown NJ US
## 9 Blanchardville WI US
## 10 Laguna Hills CA US