q1 <- inner_join(customers , orders )
## Joining with `by = join_by(customer_id)`
There are four rows
inner join returns all rows from both tables where there is a match, therefore the rows not returned did not have a match in the other table
head(q1)
## # A tibble: 4 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
q2 <- left_join(customers, orders, by = "customer_id")
There are six rows
A left join keeps all rows from the left table (customers), and adds matching rows from the right table (orders). If there’s no match in orders, the result will have NA values for those columns. Therefore, the number of rows equals the number of rows in ‘customers’(even if some customers have no matching orders).
head(q2)
## # A tibble: 6 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 4 David Houston NA <NA> NA
## 6 5 Eve Phoenix NA <NA> NA
q3 <- right_join(customers, orders, by = "customer_id")
There are six rows
q3[is.na(q3$name), "customer_id"]
## # A tibble: 2 × 1
## customer_id
## <dbl>
## 1 6
## 2 7
customer ids 6 and 7 have no matching entries in the ‘customers’ table, so their name and city are NA.
head(q3)
## # A tibble: 6 × 6
## customer_id name city order_id product amount
## <dbl> <chr> <chr> <dbl> <chr> <dbl>
## 1 1 Alice New York 101 Laptop 1200
## 2 2 Bob Los Angeles 102 Phone 800
## 3 2 Bob Los Angeles 104 Desktop 1500
## 4 3 Charlie Chicago 103 Tablet 300
## 5 6 <NA> <NA> 105 Camera 600
## 6 7 <NA> <NA> 106 Printer 150