inner_cust_orders <- inner_join(customers, orders, by ='customer_id')
4 rows
They do not have a matching Primary and foreign key in the two tables
inner_cust_orders
## # A tibble: 4 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
left_cust_orders <- left_join(customers, orders, by = 'customer_id')
6 rows
The number differs because inner only takes information that matches, but left join will take all inforatmion from the left table and put information that matches from the right
left_cust_orders
## # A tibble: 6 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 4 David Houston NA <NA> NA
## 6 5 Eve Phoenix NA <NA> NA
right_cust_orders <- right_join(customers, orders, by ='customer_id')
There are 6 rows in the result
6 and 7, because they do not have that information in the customer table but have that information in the order table and its the right table in the join.
right_cust_orders
## # A tibble: 6 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 6 <NA> <NA> 105 Camera 600
## 6 7 <NA> <NA> 106 Printer 150
full_cust_orders <- full_join(customers, orders, by ='customer_id')
There are 8 rows
Rows 5,6,7,8 have information have only one table but get included because it is a full join
##Display the result
full_cust_orders
## # A tibble: 8 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 4 David Houston NA <NA> NA
## 6 5 Eve Phoenix NA <NA> NA
## 7 6 <NA> <NA> 105 Camera 600
## 8 7 <NA> <NA> 106 Printer 150
semi_cust_orders <- semi_join(customers, orders, by ='customer_id')
There are three rows as a result
Bob only has a singular row in the semi compared to the two rows in the inner
semi_cust_orders
## # A tibble: 3 × 3
## customer_id name city
## <dbl> <chr> <chr>
## 1 1 Alice New York
## 2 2 Bob Los Angeles
## 3 3 Charlie Chicago
ant_cust_orders <- anti_join(customers, orders, by ='customer_id')
Dave and Eve
They have information in the customers tab but not the orders tab
Display the result
ant_cust_orders
## # A tibble: 2 × 3
## customer_id name city
## <dbl> <chr> <chr>
## 1 4 David Houston
## 2 5 Eve Phoenix
A left join with customers as the left table since it will reutrn all customers even those who haven’t placed a order.
A inner join since it will return only customer ID’s that have matches in both tables
inner_cust_orders <- inner_join(customers, orders, by ='customer_id')
left_cust_orders <- left_join(customers, orders, by = 'customer_id')
left_cust_orders
## # A tibble: 6 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 4 David Houston NA <NA> NA
## 6 5 Eve Phoenix NA <NA> NA
inner_cust_orders
## # A tibble: 4 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300