Import your data

Chapter 13

What are primary keys in your data?
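
One way to check a candidate primary key is to count how often each key combination occurs and look for counts above one. The sketch below uses a small made-up table (`batting_toy` is a hypothetical stand-in, not real Batting rows), so its result only illustrates the technique.

```r
library(dplyr)

# Toy stand-in for Batting: made-up rows, for illustration only
batting_toy <- tibble(
  playerID = c("aaa01", "aaa01", "aaa01", "bbb01"),
  yearID   = c(1871L, 1871L, 1872L, 1871L),
  stint    = c(1L, 2L, 1L, 1L),
  HR       = c(0L, 1L, 2L, 0L)
)

# A set of columns is a primary key if no combination occurs more than once
dupes_two   <- batting_toy %>% count(playerID, yearID) %>% filter(n > 1)
dupes_three <- batting_toy %>% count(playerID, yearID, stint) %>% filter(n > 1)

nrow(dupes_two)    # 1: playerID + yearID alone repeats in this toy table
nrow(dupes_three)  # 0: adding stint makes every row unique
```

Running the same count() and filter() pipeline on the real data would show whether a candidate key holds there.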

Can you divide your data into two tables?

Divide it using dplyr::select so that both halves share at least one common variable (a key) that you can use to join them back together.

Batting_1half <- Batting %>% select(playerID:HR) 
Batting_2half <- Batting %>% select(playerID:yearID, stint, teamID, RBI:GIDP)

Can you join the two together?

Use dplyr::left_join or another joining function.

left_join(Batting_1half, Batting_2half)
## Joining, by = c("playerID", "yearID", "stint", "teamID")
## # A tibble: 110,495 × 22
##    playerID  yearID stint teamID lgID      G    AB     R     H   X2B   X3B    HR
##    <chr>      <int> <int> <fct>  <fct> <int> <int> <int> <int> <int> <int> <int>
##  1 abercda01   1871     1 TRO    NA        1     4     0     0     0     0     0
##  2 addybo01    1871     1 RC1    NA       25   118    30    32     6     0     0
##  3 allisar01   1871     1 CL1    NA       29   137    28    40     4     5     0
##  4 allisdo01   1871     1 WS3    NA       27   133    28    44    10     2     2
##  5 ansonca01   1871     1 RC1    NA       25   120    29    39    11     3     0
##  6 armstbo01   1871     1 FW1    NA       12    49     9    11     2     1     0
##  7 barkeal01   1871     1 RC1    NA        1     4     0     1     0     0     0
##  8 barnero01   1871     1 BS1    NA       31   157    66    63    10     9     0
##  9 barrebi01   1871     1 FW1    NA        1     5     1     1     1     0     0
## 10 barrofr01   1871     1 BS1    NA       18    86    13    13     2     1     0
## # … with 110,485 more rows, and 10 more variables: RBI <int>, SB <int>,
## #   CS <int>, BB <int>, SO <int>, IBB <int>, HBP <int>, SH <int>, SF <int>,
## #   GIDP <int>
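
The "Joining, by" message appears because no key was given, so left_join matched on every shared column. A minimal sketch, assuming a made-up two-row table (`toy` and its values are hypothetical), of naming the key explicitly and checking that the split and join round-trip:

```r
library(dplyr)

# Made-up rows; only the column layout mirrors the split used above
toy <- tibble(
  playerID = c("aaa01", "bbb01"),
  yearID   = c(1871L, 1871L),
  stint    = c(1L, 1L),
  teamID   = c("TRO", "RC1"),
  HR       = c(0L, 2L),
  RBI      = c(3L, 5L)
)

first_half  <- toy %>% select(playerID:HR)
second_half <- toy %>% select(playerID:teamID, RBI)

# Naming the key columns silences the message and documents the intent
key <- c("playerID", "yearID", "stint", "teamID")
rejoined <- left_join(first_half, second_half, by = key)

# The rejoined table should have the original columns and row count
identical(names(rejoined), names(toy))
nrow(rejoined) == nrow(toy)
```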

Chapter 14

Tools

Detect matches

Batting %>% filter(str_detect(playerID, "HR"))
## # A tibble: 0 × 22
## # … with 22 variables: playerID <chr>, yearID <int>, stint <int>, teamID <fct>,
## #   lgID <fct>, G <int>, AB <int>, R <int>, H <int>, X2B <int>, X3B <int>,
## #   HR <int>, RBI <int>, SB <int>, CS <int>, BB <int>, SO <int>, IBB <int>,
## #   HBP <int>, SH <int>, SF <int>, GIDP <int>
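
The filter returns zero rows because str_detect is case sensitive by default and the player IDs shown earlier are lowercase, so "HR" never matches. A short sketch on three IDs copied from the join output; the patterns themselves are only illustrative choices:

```r
library(stringr)

# Three player IDs taken from the join output earlier in this document
ids <- c("abercda01", "addybo01", "ansonca01")

no_match   <- str_detect(ids, "HR")    # uppercase pattern, lowercase IDs
starts_a   <- str_detect(ids, "^a")    # ^ anchors the match at the start
has_son    <- str_detect(ids, "son")
ignore_cap <- str_detect(ids, regex("ANSON", ignore_case = TRUE))

no_match    # FALSE FALSE FALSE
starts_a    # TRUE TRUE TRUE
has_son     # FALSE FALSE TRUE
ignore_cap  # FALSE FALSE TRUE
```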

Extract matches
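
As a possible starting point, str_extract returns the first piece of each string that matches a pattern. The sketch below reuses three IDs from the join output; splitting them into a letter part and a trailing number is an illustrative choice, not something the exercise prescribes.

```r
library(stringr)

# Three player IDs taken from the join output earlier in this document
ids <- c("abercda01", "addybo01", "ansonca01")

# Leading run of lowercase letters
name_part <- str_extract(ids, "[a-z]+")

# Trailing run of digits ($ anchors the match at the end)
num_part <- str_extract(ids, "\\d+$")

name_part  # "abercda" "addybo" "ansonca"
num_part   # "01" "01" "01"
```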

Replace matches
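
A minimal sketch of replacement, again on three IDs from the join output: str_replace changes only the first match in each string, while str_replace_all changes every match. The patterns are illustrative choices.

```r
library(stringr)

# Three player IDs taken from the join output earlier in this document
ids <- c("abercda01", "addybo01", "ansonca01")

# Drop the trailing digits
no_digits <- str_replace(ids, "\\d+$", "")

# First "a" only versus every "a"
first_a <- str_replace(ids, "a", "A")
all_a   <- str_replace_all(ids, "a", "A")

no_digits  # "abercda" "addybo" "ansonca"
first_a    # "Abercda01" "Addybo01" "Ansonca01"
all_a      # "AbercdA01" "Addybo01" "AnsoncA01"
```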