1. Import your data

Import two related datasets from the TidyTuesday project.

ufo_sightings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/ufo_sightings.csv')
## Rows: 96429 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (7): city, state, country_code, shape, reported_duration, summary, day_...
## dbl  (1): duration_seconds
## lgl  (1): has_images
## dttm (2): reported_date_time, reported_date_time_utc
## date (1): posted_date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
places <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/places.csv')
## Rows: 14417 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): city, alternate_city_names, state, country, country_code, timezone
## dbl (4): latitude, longitude, population, elevation_m
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
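
The message printed after each import suggests two ways to quiet it. Here is a minimal sketch of both, using a small inline CSV as a hypothetical stand-in for the remote file so that it runs offline. The column names mirror three of the real columns, but the values are made up.

```r
library(readr)

# Hypothetical inline data; I() tells readr to treat the string as literal CSV text.
csv_text <- I("city,state,duration_seconds\nPhoenix,AZ,60\nDallas,TX,120\n")

# Option 1: keep type guessing but suppress the column specification message.
demo_quiet <- read_csv(csv_text, show_col_types = FALSE)

# Option 2: declare the column types up front, so nothing is guessed or reported.
demo_typed <- read_csv(
  csv_text,
  col_types = cols(
    city = col_character(),
    state = col_character(),
    duration_seconds = col_double()
  )
)
```

Either argument could be added to the two `read_csv()` calls above in the same way.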

2. Make data small

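Describe the two datasets:

Data 1: ufo_sightings. 96,429 rows and 12 columns, one row per reported sighting, with the location (city, state, country_code), the report time, the shape, the duration, and a text summary.

Data 2: places. 14,417 rows and 10 columns, one row per place, with the city, state, and country plus latitude, longitude, timezone, population, and elevation.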

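library(dplyr)  # provides %>%, select(), sample_n(), and the join functions

set.seed(1234)  # fix the seed so the random samples are reproducible
ufo_sightings_small <- ufo_sightings %>% select(city, state, reported_date_time) %>% sample_n(10)
places_small <- places %>% select(city, state, country_code) %>% sample_n(10)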

ufo_sightings_small
## # A tibble: 10 × 3
##    city        state reported_date_time 
##    <chr>       <chr> <dttm>             
##  1 Minneapolis MN    2019-10-25 21:30:00
##  2 Danville    CA    2006-07-24 04:30:00
##  3 Avon        NY    2001-10-22 02:00:00
##  4 Glendale    AZ    2016-03-02 05:00:00
##  5 Mayville    NY    2007-12-01 20:00:00
##  6 Dallas      TX    2011-04-26 03:00:00
##  7 Mount Horeb WI    1999-02-02 08:00:00
##  8 Scarborough ME    2015-08-07 23:00:00
##  9 Phoenix     AZ    1990-08-22 05:19:00
## 10 Slocomb     AL    1970-03-31 03:00:00
places_small
## # A tibble: 10 × 3
##    city           state country_code
##    <chr>          <chr> <chr>       
##  1 Salmon Arm     BC    CA          
##  2 Eaton          CO    US          
##  3 Basin City     WA    US          
##  4 Jacksonville   NC    US          
##  5 New Paris      IN    US          
##  6 Altoona        IA    US          
##  7 Fairmont       MN    US          
##  8 Middletown     NJ    US          
##  9 Blanchardville WI    US          
## 10 Laguna Hills   CA    US
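
Two independent 10-row samples are unlikely to share any (city, state) pair, and that affects every join that follows. One way to check the overlap before joining is sketched below on small made-up tables (`toy_sightings` and `toy_places` are hypothetical, not the real samples).

```r
library(dplyr)

# Hypothetical key columns only; the values are illustrative.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY")
)
toy_places <- tibble(
  city  = c("Eaton", "Altoona", "Phoenix"),
  state = c("CO", "IA", "AZ")
)

# dplyr's intersect() on data frames returns the rows present in both,
# here the (city, state) pairs that a join could match on.
shared_keys <- intersect(toy_sightings, toy_places)
n_shared <- nrow(shared_keys)
```

Applied to the city and state columns of ufo_sightings_small and places_small, the same check would return zero shared pairs for the seed used here.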

3. inner_join

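Describe the resulting data:

The result is empty: 0 rows and 4 columns (city, state, reported_date_time, country_code).

How is it different from the original two datasets?

inner_join() keeps only rows whose (city, state) pair appears in both tables. None of the 10 sampled sightings shares a pair with the 10 sampled places, so no rows survive and only the combined set of columns remains.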

ufo_sightings_small %>% inner_join(places_small, by = c("city", "state"))
## # A tibble: 0 × 4
## # ℹ 4 variables: city <chr>, state <chr>, reported_date_time <dttm>,
## #   country_code <chr>
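
Because the samples share no keys, the output above cannot show what inner_join() returns when there are matches. A minimal sketch on made-up tables with two shared pairs follows; the table names and the shape values are assumptions for illustration.

```r
library(dplyr)

# Hypothetical toy tables with two shared (city, state) pairs.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Dallas", "Eaton"),
  state = c("AZ", "TX", "CO"),
  country_code = c("US", "US", "US")
)

# Only Phoenix/AZ and Dallas/TX appear in both tables, so only they survive.
matched <- inner_join(toy_sightings, toy_places, by = c("city", "state"))
# matched has 2 rows and 4 columns: city, state, shape, country_code
```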

4. left_join

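Describe the resulting data:

The result has 10 rows and 4 columns: every row of ufo_sightings_small, plus the country_code column from places_small.

How is it different from the original two datasets?

left_join() keeps all rows of the left table in their original order and adds the columns of the right table. Because no sampled place matches a sampled sighting, country_code is NA in every row.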

ufo_sightings_small %>% left_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 4
##    city        state reported_date_time  country_code
##    <chr>       <chr> <dttm>              <chr>       
##  1 Minneapolis MN    2019-10-25 21:30:00 <NA>        
##  2 Danville    CA    2006-07-24 04:30:00 <NA>        
##  3 Avon        NY    2001-10-22 02:00:00 <NA>        
##  4 Glendale    AZ    2016-03-02 05:00:00 <NA>        
##  5 Mayville    NY    2007-12-01 20:00:00 <NA>        
##  6 Dallas      TX    2011-04-26 03:00:00 <NA>        
##  7 Mount Horeb WI    1999-02-02 08:00:00 <NA>        
##  8 Scarborough ME    2015-08-07 23:00:00 <NA>        
##  9 Phoenix     AZ    1990-08-22 05:19:00 <NA>        
## 10 Slocomb     AL    1970-03-31 03:00:00 <NA>
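
The "Joining with" message appears because no `by` argument was given. A sketch of the same kind of join with the keys named explicitly is shown below on hypothetical toy tables, where one row matches nothing so that the NA fill is visible next to real matches. It assumes dplyr 1.1.0 or later for `join_by()`.

```r
library(dplyr)

# Hypothetical toy tables; Avon/NY has no counterpart in toy_places.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Dallas", "Eaton"),
  state = c("AZ", "TX", "CO"),
  country_code = c("US", "US", "US")
)

# Naming the keys with join_by() silences the message.
kept_left <- left_join(toy_sightings, toy_places, by = join_by(city, state))

# All three sightings are kept; the unmatched one gets NA for country_code.
unmatched <- filter(kept_left, is.na(country_code))
```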

5. right_join

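Describe the resulting data:

The result has 10 rows and 4 columns: every row of places_small, plus the reported_date_time column from ufo_sightings_small.

How is it different from the original two datasets?

right_join() keeps all rows of the right table and adds the columns of the left table. Because no sampled sighting matches a sampled place, reported_date_time is NA in every row.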

ufo_sightings_small %>% right_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 4
##    city           state reported_date_time country_code
##    <chr>          <chr> <dttm>             <chr>       
##  1 Salmon Arm     BC    NA                 CA          
##  2 Eaton          CO    NA                 US          
##  3 Basin City     WA    NA                 US          
##  4 Jacksonville   NC    NA                 US          
##  5 New Paris      IN    NA                 US          
##  6 Altoona        IA    NA                 US          
##  7 Fairmont       MN    NA                 US          
##  8 Middletown     NJ    NA                 US          
##  9 Blanchardville WI    NA                 US          
## 10 Laguna Hills   CA    NA                 US
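
A right join can be read as a left join with the two tables swapped. The sketch below checks that on hypothetical toy tables; the two results hold the same rows and differ only in column order.

```r
library(dplyr)

# Hypothetical toy tables with two shared (city, state) pairs.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Dallas", "Eaton"),
  state = c("AZ", "TX", "CO"),
  country_code = c("US", "US", "US")
)

# Keep every place; attach sighting columns where a match exists.
right_result <- right_join(toy_sightings, toy_places, by = c("city", "state"))

# The same rows, obtained by swapping the tables and using left_join().
left_swapped <- left_join(toy_places, toy_sightings, by = c("city", "state"))

# Reorder the columns so the two results can be compared side by side.
left_swapped_reordered <- select(left_swapped, all_of(names(right_result)))
```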

6. full_join

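Describe the resulting data:

The result has 20 rows and 4 columns: the 10 sightings followed by the 10 places.

How is it different from the original two datasets?

full_join() keeps all rows of both tables. With no matching pairs, nothing is merged: the sighting rows have NA for country_code and the place rows have NA for reported_date_time.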

ufo_sightings_small %>% full_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 20 × 4
##    city           state reported_date_time  country_code
##    <chr>          <chr> <dttm>              <chr>       
##  1 Minneapolis    MN    2019-10-25 21:30:00 <NA>        
##  2 Danville       CA    2006-07-24 04:30:00 <NA>        
##  3 Avon           NY    2001-10-22 02:00:00 <NA>        
##  4 Glendale       AZ    2016-03-02 05:00:00 <NA>        
##  5 Mayville       NY    2007-12-01 20:00:00 <NA>        
##  6 Dallas         TX    2011-04-26 03:00:00 <NA>        
##  7 Mount Horeb    WI    1999-02-02 08:00:00 <NA>        
##  8 Scarborough    ME    2015-08-07 23:00:00 <NA>        
##  9 Phoenix        AZ    1990-08-22 05:19:00 <NA>        
## 10 Slocomb        AL    1970-03-31 03:00:00 <NA>        
## 11 Salmon Arm     BC    NA                  CA          
## 12 Eaton          CO    NA                  US          
## 13 Basin City     WA    NA                  US          
## 14 Jacksonville   NC    NA                  US          
## 15 New Paris      IN    NA                  US          
## 16 Altoona        IA    NA                  US          
## 17 Fairmont       MN    NA                  US          
## 18 Middletown     NJ    NA                  US          
## 19 Blanchardville WI    NA                  US          
## 20 Laguna Hills   CA    NA                  US
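
With no shared keys, the full join above simply stacks the two samples. When some keys match, the row count is matched rows plus the unmatched rows from each side, as this sketch on hypothetical toy tables illustrates.

```r
library(dplyr)

# Hypothetical toy tables: 2 shared pairs, 1 sighting-only row, 1 place-only row.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Dallas", "Eaton"),
  state = c("AZ", "TX", "CO"),
  country_code = c("US", "US", "US")
)

# 2 matched + 1 sighting-only + 1 place-only = 4 rows.
all_rows <- full_join(toy_sightings, toy_places, by = c("city", "state"))

# Unmatched rows carry NA in the columns that come from the other table.
n_missing_shape   <- sum(is.na(all_rows$shape))
n_missing_country <- sum(is.na(all_rows$country_code))
```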

7. semi_join

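Describe the resulting data:

Both results are empty: 0 rows, with only the 3 columns of the first table in each call.

How is it different from the original two datasets?

semi_join() filters the first table to the rows that have a match in the second table and never adds columns. No (city, state) pairs match, so nothing is kept in either direction.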

ufo_sightings_small %>% semi_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 0 × 3
## # ℹ 3 variables: city <chr>, state <chr>, reported_date_time <dttm>
places_small %>% semi_join(ufo_sightings_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 0 × 3
## # ℹ 3 variables: city <chr>, state <chr>, country_code <chr>
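
The difference between semi_join() and inner_join() shows up when a key is repeated in the second table. This is a minimal sketch on hypothetical toy tables in which one place is deliberately listed twice.

```r
library(dplyr)

# Hypothetical toy tables; Phoenix/AZ is duplicated in toy_places on purpose.
toy_sightings <- tibble(
  city  = c("Phoenix", "Avon"),
  state = c("AZ", "NY"),
  shape = c("light", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Phoenix"),
  state = c("AZ", "AZ"),
  country_code = c("US", "US")
)

# semi_join() only filters: one Phoenix row, and no columns from toy_places.
filtered <- semi_join(toy_sightings, toy_places, by = c("city", "state"))

# inner_join() merges: the Phoenix sighting is repeated once per matching place.
merged <- inner_join(toy_sightings, toy_places, by = c("city", "state"))
```

Here `filtered` has 1 row and 3 columns, while `merged` has 2 rows and 4 columns.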

8. anti_join

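Describe the resulting data:

Each result is identical to its first table: 10 rows and 3 columns.

How is it different from the original two datasets?

anti_join() keeps the rows of the first table that have no match in the second table and never adds columns. No (city, state) pairs match, so every row is kept in both directions.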

ufo_sightings_small %>% anti_join(places_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 3
##    city        state reported_date_time 
##    <chr>       <chr> <dttm>             
##  1 Minneapolis MN    2019-10-25 21:30:00
##  2 Danville    CA    2006-07-24 04:30:00
##  3 Avon        NY    2001-10-22 02:00:00
##  4 Glendale    AZ    2016-03-02 05:00:00
##  5 Mayville    NY    2007-12-01 20:00:00
##  6 Dallas      TX    2011-04-26 03:00:00
##  7 Mount Horeb WI    1999-02-02 08:00:00
##  8 Scarborough ME    2015-08-07 23:00:00
##  9 Phoenix     AZ    1990-08-22 05:19:00
## 10 Slocomb     AL    1970-03-31 03:00:00
places_small %>% anti_join(ufo_sightings_small)
## Joining with `by = join_by(city, state)`
## # A tibble: 10 × 3
##    city           state country_code
##    <chr>          <chr> <chr>       
##  1 Salmon Arm     BC    CA          
##  2 Eaton          CO    US          
##  3 Basin City     WA    US          
##  4 Jacksonville   NC    US          
##  5 New Paris      IN    US          
##  6 Altoona        IA    US          
##  7 Fairmont       MN    US          
##  8 Middletown     NJ    US          
##  9 Blanchardville WI    US          
## 10 Laguna Hills   CA    US
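
anti_join() is a handy diagnostic for finding the keys that would be lost in an inner join. The sketch below uses hypothetical toy tables and shows that semi_join() and anti_join() together split the first table into matched and unmatched rows.

```r
library(dplyr)

# Hypothetical toy tables; Avon/NY has no counterpart in toy_places.
toy_sightings <- tibble(
  city  = c("Phoenix", "Dallas", "Avon"),
  state = c("AZ", "TX", "NY"),
  shape = c("light", "disk", "orb")
)
toy_places <- tibble(
  city  = c("Phoenix", "Dallas", "Eaton"),
  state = c("AZ", "TX", "CO"),
  country_code = c("US", "US", "US")
)

# Sightings with no matching place: just Avon/NY.
no_place <- anti_join(toy_sightings, toy_places, by = c("city", "state"))

# Sightings with a matching place: Phoenix/AZ and Dallas/TX.
has_place <- semi_join(toy_sightings, toy_places, by = c("city", "state"))

# Every sighting falls into exactly one of the two groups.
covers_all <- nrow(no_place) + nrow(has_place) == nrow(toy_sightings)
```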