data <- read_excel("../00_data/myData.xlsx")
## New names:
## • `` -> `...1`
data
## # A tibble: 4,810 × 24
## ...1 rank position hand player years total…¹ status yr_st…² season age
## <dbl> <dbl> <chr> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
## 1 1 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18
## 2 2 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18
## 3 3 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18
## 4 4 1 C Left Wayne G… 1979… 894 Retir… 1979 1979-… 19
## 5 5 1 C Left Wayne G… 1979… 894 Retir… 1979 1980-… 20
## 6 6 1 C Left Wayne G… 1979… 894 Retir… 1979 1981-… 21
## 7 7 1 C Left Wayne G… 1979… 894 Retir… 1979 1982-… 22
## 8 8 1 C Left Wayne G… 1979… 894 Retir… 1979 1983-… 23
## 9 9 1 C Left Wayne G… 1979… 894 Retir… 1979 1984-… 24
## 10 10 1 C Left Wayne G… 1979… 894 Retir… 1979 1985-… 25
## # … with 4,800 more rows, 13 more variables: team <chr>, league <chr>,
## # season_games <dbl>, goals <dbl>, assists <dbl>, points <dbl>,
## # plus_minus <chr>, penalty_min <dbl>, goals_even <chr>,
## # goals_power_play <chr>, goals_short_handed <chr>, goals_game_winner <chr>,
## # headshot <chr>, and abbreviated variable names ¹total_goals, ²yr_start
Primary keys in my data are: Rank, season, and team
data %>% count(rank,season,team)
## # A tibble: 4,810 × 4
## rank season team n
## <dbl> <chr> <chr> <int>
## 1 1 1978-79 EDO 1
## 2 1 1978-79 INR 1
## 3 1 1978-79 TOT 1
## 4 1 1979-80 EDM 1
## 5 1 1980-81 EDM 1
## 6 1 1981-82 EDM 1
## 7 1 1982-83 EDM 1
## 8 1 1983-84 EDM 1
## 9 1 1984-85 EDM 1
## 10 1 1985-86 EDM 1
## # … with 4,800 more rows
Divide it using dplyr::select in a way the two have a common variable, which you could use to join the two.
data_1half <- data %>% select(rank:team) %>% head(250)
data2_half <- data %>% select(team:goals_game_winner) %>% head(250)
Use tidyr::left_join or other joining functions.
left_join(data_1half, data2_half)
## Joining, by = "team"
## # A tibble: 6,490 × 22
## rank position hand player years total…¹ status yr_st…² season age team
## <dbl> <chr> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr> <dbl> <chr>
## 1 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 2 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 3 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 4 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 5 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 6 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 7 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 8 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 9 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## 10 1 C Left Wayne G… 1979… 894 Retir… 1979 1978-… 18 TOT
## # … with 6,480 more rows, 11 more variables: league <chr>, season_games <dbl>,
## # goals <dbl>, assists <dbl>, points <dbl>, plus_minus <chr>,
## # penalty_min <dbl>, goals_even <chr>, goals_power_play <chr>,
## # goals_short_handed <chr>, goals_game_winner <chr>, and abbreviated variable
## # names ¹total_goals, ²yr_start