Import two related datasets from the TidyTuesday project.
survivalists <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/refs/heads/master/data/2023/2023-01-24/survivalists.csv')
loadouts <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/refs/heads/master/data/2023/2023-01-24/loadouts.csv')
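Describe the two datasets:
Data 1: survivalists (contestant data; the columns used below are season, name, and age)
Data 2: loadouts (contestant gear; the columns used below are season, name, and item_detailed)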
library(dplyr)  # provides %>%, select(), sample_n(), and the join functions
set.seed(1234)
survivalists_small <- survivalists %>% select(season, name, age) %>% sample_n(10)
loadouts_small <- loadouts %>% select(season, name, item_detailed) %>% sample_n(10)
survivalists_small
##    season               name age
## 1       3        Britt Ahart  40
## 2       8         Nate Weber  47
## 3       3 Carleigh Fairchild  28
## 4       1   Chris Weatherman  41
## 5       1       Dustin Feher  37
## 6       4       Brody Wilkes  33
## 7       2    Randy Champagne  28
## 8       1       Lucas Miller  32
## 9       9    Karie Lee Knoke  57
## 10      7       Joe Nicholas  31
loadouts_small
##    season             name
## 1       1 Chris Weatherman
## 2       9     Jessie Krebs
## 3       3      Jim Shields
## 4       4  Shannon Bosdell
## 5       2   Nicole Apelian
## 6       6    Barry Karcher
## 7       1         Alan Kay
## 8       9     Tom Garstang
## 9       7   Kielyn Marrone
## 10      6 Woniya Thibeault
## item_detailed
## 1 Knife
## 2 Trapping wire
## 3 Rations
## 4 Tarp – 12′ x 12′
## 5 200 yards of 30 lb test fishing line, 100 yards of 80 lb test fishing lines and hooks
## 6 Saw
## 7 Large 2-quart pot
## 8 Folding Saw
## 9 Snare wire
## 10 Pot
Describe the resulting data:
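How is it different from the original two datasets? * The inner join keeps only the rows whose season and name appear in both samples, so it returns 1 row (Chris Weatherman, season 1) instead of 10, with age and item_detailed combined in that single row.
survivalists_small %>% inner_join(loadouts_small, by = c("season", "name"))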
##   season             name age item_detailed
## 1      1 Chris Weatherman  41         Knife
Describe the resulting data:
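How is it different from the original two datasets? * The left join keeps all 10 rows of survivalists_small and adds an item_detailed column. Only Chris Weatherman has a match in loadouts_small (Knife); the other nine rows get NA, and the nine unmatched loadout rows are dropped.
survivalists_small %>% left_join(loadouts_small, by = c("season", "name"))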
##    season               name age item_detailed
## 1       3        Britt Ahart  40          <NA>
## 2       8         Nate Weber  47          <NA>
## 3       3 Carleigh Fairchild  28          <NA>
## 4       1   Chris Weatherman  41         Knife
## 5       1       Dustin Feher  37          <NA>
## 6       4       Brody Wilkes  33          <NA>
## 7       2    Randy Champagne  28          <NA>
## 8       1       Lucas Miller  32          <NA>
## 9       9    Karie Lee Knoke  57          <NA>
## 10      7       Joe Nicholas  31          <NA>
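The left join result above can be checked by counting the NA values in the added column. The following is a minimal sketch on toy data; the data frames `contestants` and `gear` and their values are hypothetical stand-ins, not the real samples.

```r
library(dplyr)

# Hypothetical toy tables that mimic the shape of the two samples
contestants <- data.frame(
  season = c(1, 1, 2),
  name   = c("Ann", "Bob", "Cy"),
  age    = c(30, 41, 28)
)
gear <- data.frame(
  season        = c(1, 3),
  name          = c("Bob", "Dee"),
  item_detailed = c("Knife", "Saw")
)

joined <- left_join(contestants, gear, by = c("season", "name"))

# Every row of the left table survives the join
nrow(joined) == nrow(contestants)

# Rows without a match carry NA in the column that came from the right table
n_unmatched <- sum(is.na(joined$item_detailed))
n_unmatched  # 2 (Ann and Cy have no gear row)
```

Applied to the real samples, the same count would be expected to give 9, since only one contestant matched.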
Describe the resulting data:
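How is it different from the original two datasets? * The right join keeps all 10 rows of loadouts_small and adds the age column from survivalists_small. Only Chris Weatherman has a match, so age is NA for the other nine rows. The item_detailed column is not discarded: it is part of the same table, and the console prints it in a second block only because the rows are too wide to fit on one line.
survivalists_small %>% right_join(loadouts_small, by = c("season", "name"))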
##    season             name age
## 1       1 Chris Weatherman  41
## 2       9     Jessie Krebs  NA
## 3       3      Jim Shields  NA
## 4       4  Shannon Bosdell  NA
## 5       2   Nicole Apelian  NA
## 6       6    Barry Karcher  NA
## 7       1         Alan Kay  NA
## 8       9     Tom Garstang  NA
## 9       7   Kielyn Marrone  NA
## 10      6 Woniya Thibeault  NA
## item_detailed
## 1 Knife
## 2 Trapping wire
## 3 Rations
## 4 Tarp – 12′ x 12′
## 5 200 yards of 30 lb test fishing line, 100 yards of 80 lb test fishing lines and hooks
## 6 Saw
## 7 Large 2-quart pot
## 8 Folding Saw
## 9 Snare wire
## 10 Pot
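Because the right join output above is printed in two blocks, it helps to inspect the structure of the result rather than its printed form. This is a small sketch on hypothetical toy tables (`contestants` and `gear` are made-up stand-ins) showing that the joined result is one data frame with all four columns.

```r
library(dplyr)

# Hypothetical toy tables that mimic the shape of the two samples
contestants <- data.frame(
  season = c(1, 1, 2),
  name   = c("Ann", "Bob", "Cy"),
  age    = c(30, 41, 28)
)
gear <- data.frame(
  season        = c(1, 3),
  name          = c("Bob", "Dee"),
  item_detailed = c("Knife", "Saw")
)

joined <- right_join(contestants, gear, by = c("season", "name"))

# One table: key columns, then age from the left table, then item_detailed
names(joined)  # "season" "name" "age" "item_detailed"

# All rows of the right table are kept; Dee has no age because she has no match
nrow(joined) == nrow(gear)
is.na(joined$age[joined$name == "Dee"])
```

If the split display is distracting, raising the console width with `options(width = 200)` before printing should let wider rows fit on one line.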
Describe the resulting data:
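How is it different from the original two datasets? * The full join keeps every row from both samples, giving 19 rows in total (10 + 10, minus the 1 row shared by both). Rows without a match have NA for age or for item_detailed. As with the right join, item_detailed is printed in a separate block only because of the output width.
survivalists_small %>% full_join(loadouts_small, by = c("season", "name"))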
##    season               name age
## 1       3        Britt Ahart  40
## 2       8         Nate Weber  47
## 3       3 Carleigh Fairchild  28
## 4       1   Chris Weatherman  41
## 5       1       Dustin Feher  37
## 6       4       Brody Wilkes  33
## 7       2    Randy Champagne  28
## 8       1       Lucas Miller  32
## 9       9    Karie Lee Knoke  57
## 10      7       Joe Nicholas  31
## 11      9       Jessie Krebs  NA
## 12      3        Jim Shields  NA
## 13      4    Shannon Bosdell  NA
## 14      2     Nicole Apelian  NA
## 15      6      Barry Karcher  NA
## 16      1           Alan Kay  NA
## 17      9       Tom Garstang  NA
## 18      7     Kielyn Marrone  NA
## 19      6   Woniya Thibeault  NA
## item_detailed
## 1 <NA>
## 2 <NA>
## 3 <NA>
## 4 Knife
## 5 <NA>
## 6 <NA>
## 7 <NA>
## 8 <NA>
## 9 <NA>
## 10 <NA>
## 11 Trapping wire
## 12 Rations
## 13 Tarp – 12′ x 12′
## 14 200 yards of 30 lb test fishing line, 100 yards of 80 lb test fishing lines and hooks
## 15 Saw
## 16 Large 2-quart pot
## 17 Folding Saw
## 18 Snare wire
## 19 Pot
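The 19 rows of the full join above follow from simple arithmetic: rows in the left table, plus rows in the right table, minus the matched rows. The sketch below illustrates that identity on hypothetical toy tables (`contestants` and `gear` are made up); it assumes each season and name pair occurs at most once per table, as it does in the two samples.

```r
library(dplyr)

# Hypothetical toy tables with unique (season, name) keys
contestants <- data.frame(
  season = c(1, 1, 2),
  name   = c("Ann", "Bob", "Cy"),
  age    = c(30, 41, 28)
)
gear <- data.frame(
  season        = c(1, 3),
  name          = c("Bob", "Dee"),
  item_detailed = c("Knife", "Saw")
)

keys <- c("season", "name")
n_inner <- nrow(inner_join(contestants, gear, by = keys))
n_full  <- nrow(full_join(contestants, gear, by = keys))

# With unique keys, matched rows are counted once rather than twice
expected <- nrow(contestants) + nrow(gear) - n_inner
n_full == expected  # TRUE: 3 + 2 - 1 = 4
```

The identity does not hold when a key repeats within a table, because each repeated match adds extra rows to both joins.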
Describe the resulting data:
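How is it different from the original two datasets? * The semi join returns only the columns of survivalists_small (no item_detailed is added) and only the rows that have a match in loadouts_small, which is 1 contestant (Chris Weatherman).
survivalists_small %>% semi_join(loadouts_small, by = c("season", "name"))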
##   season             name age
## 1      1 Chris Weatherman  41
Describe the resulting data:
How is it different from the original two datasets? * The anti join returns only the columns of survivalists_small and keeps the 9 rows that have no match in loadouts_small. It drops Chris Weatherman, the one contestant found in both samples.
survivalists_small %>% anti_join(loadouts_small, by = c("season", "name"))
##   season               name age
## 1      3        Britt Ahart  40
## 2      8         Nate Weber  47
## 3      3 Carleigh Fairchild  28
## 4      1       Dustin Feher  37
## 5      4       Brody Wilkes  33
## 6      2    Randy Champagne  28
## 7      1       Lucas Miller  32
## 8      9    Karie Lee Knoke  57
## 9      7       Joe Nicholas  31
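The semi join and anti join are complements: together they split the left table into matched and unmatched rows without adding any columns (1 + 9 = 10 in the samples above). Here is a minimal sketch of that relationship on hypothetical toy tables (`contestants` and `gear` are made-up stand-ins).

```r
library(dplyr)

# Hypothetical toy tables that mimic the shape of the two samples
contestants <- data.frame(
  season = c(1, 1, 2),
  name   = c("Ann", "Bob", "Cy"),
  age    = c(30, 41, 28)
)
gear <- data.frame(
  season        = c(1, 3),
  name          = c("Bob", "Dee"),
  item_detailed = c("Knife", "Saw")
)

keys <- c("season", "name")
matched   <- semi_join(contestants, gear, by = keys)  # rows with a match
unmatched <- anti_join(contestants, gear, by = keys)  # rows without a match

# Both results keep only the left table's columns
identical(names(matched), names(contestants))

# The two pieces account for every row of the left table
nrow(matched) + nrow(unmatched) == nrow(contestants)  # TRUE: 1 + 2 = 3
```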