This is the first part of my Project 2 assignment for DATA607 in the Fall 2023 Term at CUNY SPS. In this assignment I import a wide data set, tidy it, and then analyze it. I created this first data set, which contains the seasonal home run totals for the starting player from each position for each team in the American League Eastern Division of Major League Baseball from 2018 to 2023 (excluding the pandemic-shortened 2020 season).
In this code block, I load the necessary libraries, import the data from my GitHub repository, and rename the columns.
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ readr 2.1.4
## ✔ ggplot2 3.4.4 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)
hr_data <- read.csv("https://raw.githubusercontent.com/Marley-Myrianthopoulos/Data607Project2/main/HR_Data_607.csv")
colnames(hr_data) <- c("Team", "Position", "2018", "2019", "2021", "2022", "2023")
kable(hr_data, format = "pipe", caption = "Initial Homerun Data", align = "lcccccc")
| Team | Position | 2018 | 2019 | 2021 | 2022 | 2023 |
|---|---|---|---|---|---|---|
| BAL | C | 3 | 13 | 11 | 13 | 20 |
|  | 1B | 16 | 12 | 33 | 22 | 18 |
|  | 2B | 17 | 24 | 5 | 13 | 13 |
|  | 3B | 24 | 6 | 9 | 13 | 7 |
|  | SS | 7 | 12 | 11 | 16 | 4 |
|  | LF | 24 | 13 | 22 | 16 | 16 |
|  | CF | 15 | 10 | 30 | 16 | 15 |
|  | RF | 8 | 35 | 18 | 33 | 28 |
|  | DH | 17 | 31 | 21 | 10 | 14 |
| BOS | C | 5 | 23 | 6 | 8 | 9 |
|  | 1B | 15 | 19 | 25 | 12 | 24 |
|  | 2B | 10 | 3 | 6 | 16 | 3 |
|  | 3B | 23 | 33 | 23 | 15 | 6 |
|  | SS | 21 | 32 | 38 | 27 | 33 |
|  | LF | 16 | 13 | 13 | 11 | 15 |
|  | CF | 13 | 21 | 20 | 6 | 8 |
|  | RF | 32 | 29 | 31 | 3 | 13 |
|  | DH | 43 | 36 | 28 | 16 | 23 |
| NYY | C | 18 | 34 | 23 | 11 | 10 |
|  | 1B | 11 | 21 | 8 | 32 | 12 |
|  | 2B | 24 | 26 | 10 | 24 | 25 |
|  | 3B | 27 | 16 | 9 | 4 | 21 |
|  | SS | 27 | 21 | 14 | 15 | 15 |
|  | LF | 12 | 13 | 13 | 8 | 5 |
|  | CF | 27 | 28 | 10 | 62 | 7 |
|  | RF | 27 | 27 | 39 | 12 | 37 |
|  | DH | 38 | 13 | 35 | 31 | 24 |
| TBR | C | 14 | 9 | 33 | 6 | 11 |
|  | 1B | 11 | 19 | 13 | 11 | 22 |
|  | 2B | 7 | 17 | 39 | 8 | 21 |
|  | 3B | 10 | 20 | 7 | 8 | 17 |
|  | SS | 4 | 14 | 11 | 9 | 31 |
|  | LF | 7 | 21 | 27 | 20 | 23 |
|  | CF | 7 | 14 | 4 | 7 | 25 |
|  | RF | 9 | 20 | 10 | 4 | 20 |
|  | DH | 30 | 33 | 13 | 6 | 12 |
| TOR | C | 10 | 13 | 1 | 14 | 8 |
|  | 1B | 25 | 22 | 48 | 32 | 26 |
|  | 2B | 11 | 16 | 45 | 7 | 11 |
|  | 3B | 18 | 18 | 29 | 24 | 20 |
|  | SS | 17 | 15 | 2 | 27 | 17 |
|  | LF | 22 | 20 | 21 | 5 | 20 |
|  | CF | 15 | 26 | 22 | 25 | 8 |
|  | RF | 25 | 31 | 32 | 25 | 21 |
|  | DH | 21 | 21 | 22 | 4 | 19 |
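A side note on the renaming step: by default, read.csv() passes column names through make.names(), so a header such as 2018 comes back as X2018. The sketch below uses a small made-up CSV string (the exact header row of the real file is an assumption here, not something shown in this write-up) to illustrate that behavior and how check.names = FALSE keeps the names as written.

```r
# Hypothetical three-line CSV that mimics the shape of the home run file.
csv_text <- paste(
  "Team,Position,2018,2019",
  "BAL,C,3,13",
  ",1B,16,12",
  sep = "\n"
)

# Default behavior: names that are not syntactically valid get an "X" prefix.
raw <- read.csv(text = csv_text)
default_names <- names(raw)

# check.names = FALSE leaves the header exactly as it appears in the file.
kept_names <- names(read.csv(text = csv_text, check.names = FALSE))

print(default_names)
print(kept_names)
```

Either approach should give the same year columns in the end. Renaming by position, as in the code above, depends on the column order in the file staying the same.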
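In this code block, I “tidy” the data. The raw table lists each team abbreviation only on the first of its nine rows, so I first convert the blank Team cells to NA and fill them downward. I then use pivot_longer() to convert the data into a format that has a single variable for the season, rather than a separate column for each season.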
tidy_hr_data <- hr_data %>%
  mutate(Team = na_if(Team, "")) %>%
  fill(Team) %>%
  pivot_longer(
    cols = -c("Team", "Position"),
    names_to = "Season",
    values_to = "HRs")
kable(tidy_hr_data, format = "pipe", caption = "Tidy Homerun Data", align = "lccc")
Team | Position | Season | HRs |
---|---|---|---|
BAL | C | 2018 | 3 |
BAL | C | 2019 | 13 |
BAL | C | 2021 | 11 |
BAL | C | 2022 | 13 |
BAL | C | 2023 | 20 |
BAL | 1B | 2018 | 16 |
BAL | 1B | 2019 | 12 |
BAL | 1B | 2021 | 33 |
BAL | 1B | 2022 | 22 |
BAL | 1B | 2023 | 18 |
BAL | 2B | 2018 | 17 |
BAL | 2B | 2019 | 24 |
BAL | 2B | 2021 | 5 |
BAL | 2B | 2022 | 13 |
BAL | 2B | 2023 | 13 |
BAL | 3B | 2018 | 24 |
BAL | 3B | 2019 | 6 |
BAL | 3B | 2021 | 9 |
BAL | 3B | 2022 | 13 |
BAL | 3B | 2023 | 7 |
BAL | SS | 2018 | 7 |
BAL | SS | 2019 | 12 |
BAL | SS | 2021 | 11 |
BAL | SS | 2022 | 16 |
BAL | SS | 2023 | 4 |
BAL | LF | 2018 | 24 |
BAL | LF | 2019 | 13 |
BAL | LF | 2021 | 22 |
BAL | LF | 2022 | 16 |
BAL | LF | 2023 | 16 |
BAL | CF | 2018 | 15 |
BAL | CF | 2019 | 10 |
BAL | CF | 2021 | 30 |
BAL | CF | 2022 | 16 |
BAL | CF | 2023 | 15 |
BAL | RF | 2018 | 8 |
BAL | RF | 2019 | 35 |
BAL | RF | 2021 | 18 |
BAL | RF | 2022 | 33 |
BAL | RF | 2023 | 28 |
BAL | DH | 2018 | 17 |
BAL | DH | 2019 | 31 |
BAL | DH | 2021 | 21 |
BAL | DH | 2022 | 10 |
BAL | DH | 2023 | 14 |
BOS | C | 2018 | 5 |
BOS | C | 2019 | 23 |
BOS | C | 2021 | 6 |
BOS | C | 2022 | 8 |
BOS | C | 2023 | 9 |
BOS | 1B | 2018 | 15 |
BOS | 1B | 2019 | 19 |
BOS | 1B | 2021 | 25 |
BOS | 1B | 2022 | 12 |
BOS | 1B | 2023 | 24 |
BOS | 2B | 2018 | 10 |
BOS | 2B | 2019 | 3 |
BOS | 2B | 2021 | 6 |
BOS | 2B | 2022 | 16 |
BOS | 2B | 2023 | 3 |
BOS | 3B | 2018 | 23 |
BOS | 3B | 2019 | 33 |
BOS | 3B | 2021 | 23 |
BOS | 3B | 2022 | 15 |
BOS | 3B | 2023 | 6 |
BOS | SS | 2018 | 21 |
BOS | SS | 2019 | 32 |
BOS | SS | 2021 | 38 |
BOS | SS | 2022 | 27 |
BOS | SS | 2023 | 33 |
BOS | LF | 2018 | 16 |
BOS | LF | 2019 | 13 |
BOS | LF | 2021 | 13 |
BOS | LF | 2022 | 11 |
BOS | LF | 2023 | 15 |
BOS | CF | 2018 | 13 |
BOS | CF | 2019 | 21 |
BOS | CF | 2021 | 20 |
BOS | CF | 2022 | 6 |
BOS | CF | 2023 | 8 |
BOS | RF | 2018 | 32 |
BOS | RF | 2019 | 29 |
BOS | RF | 2021 | 31 |
BOS | RF | 2022 | 3 |
BOS | RF | 2023 | 13 |
BOS | DH | 2018 | 43 |
BOS | DH | 2019 | 36 |
BOS | DH | 2021 | 28 |
BOS | DH | 2022 | 16 |
BOS | DH | 2023 | 23 |
NYY | C | 2018 | 18 |
NYY | C | 2019 | 34 |
NYY | C | 2021 | 23 |
NYY | C | 2022 | 11 |
NYY | C | 2023 | 10 |
NYY | 1B | 2018 | 11 |
NYY | 1B | 2019 | 21 |
NYY | 1B | 2021 | 8 |
NYY | 1B | 2022 | 32 |
NYY | 1B | 2023 | 12 |
NYY | 2B | 2018 | 24 |
NYY | 2B | 2019 | 26 |
NYY | 2B | 2021 | 10 |
NYY | 2B | 2022 | 24 |
NYY | 2B | 2023 | 25 |
NYY | 3B | 2018 | 27 |
NYY | 3B | 2019 | 16 |
NYY | 3B | 2021 | 9 |
NYY | 3B | 2022 | 4 |
NYY | 3B | 2023 | 21 |
NYY | SS | 2018 | 27 |
NYY | SS | 2019 | 21 |
NYY | SS | 2021 | 14 |
NYY | SS | 2022 | 15 |
NYY | SS | 2023 | 15 |
NYY | LF | 2018 | 12 |
NYY | LF | 2019 | 13 |
NYY | LF | 2021 | 13 |
NYY | LF | 2022 | 8 |
NYY | LF | 2023 | 5 |
NYY | CF | 2018 | 27 |
NYY | CF | 2019 | 28 |
NYY | CF | 2021 | 10 |
NYY | CF | 2022 | 62 |
NYY | CF | 2023 | 7 |
NYY | RF | 2018 | 27 |
NYY | RF | 2019 | 27 |
NYY | RF | 2021 | 39 |
NYY | RF | 2022 | 12 |
NYY | RF | 2023 | 37 |
NYY | DH | 2018 | 38 |
NYY | DH | 2019 | 13 |
NYY | DH | 2021 | 35 |
NYY | DH | 2022 | 31 |
NYY | DH | 2023 | 24 |
TBR | C | 2018 | 14 |
TBR | C | 2019 | 9 |
TBR | C | 2021 | 33 |
TBR | C | 2022 | 6 |
TBR | C | 2023 | 11 |
TBR | 1B | 2018 | 11 |
TBR | 1B | 2019 | 19 |
TBR | 1B | 2021 | 13 |
TBR | 1B | 2022 | 11 |
TBR | 1B | 2023 | 22 |
TBR | 2B | 2018 | 7 |
TBR | 2B | 2019 | 17 |
TBR | 2B | 2021 | 39 |
TBR | 2B | 2022 | 8 |
TBR | 2B | 2023 | 21 |
TBR | 3B | 2018 | 10 |
TBR | 3B | 2019 | 20 |
TBR | 3B | 2021 | 7 |
TBR | 3B | 2022 | 8 |
TBR | 3B | 2023 | 17 |
TBR | SS | 2018 | 4 |
TBR | SS | 2019 | 14 |
TBR | SS | 2021 | 11 |
TBR | SS | 2022 | 9 |
TBR | SS | 2023 | 31 |
TBR | LF | 2018 | 7 |
TBR | LF | 2019 | 21 |
TBR | LF | 2021 | 27 |
TBR | LF | 2022 | 20 |
TBR | LF | 2023 | 23 |
TBR | CF | 2018 | 7 |
TBR | CF | 2019 | 14 |
TBR | CF | 2021 | 4 |
TBR | CF | 2022 | 7 |
TBR | CF | 2023 | 25 |
TBR | RF | 2018 | 9 |
TBR | RF | 2019 | 20 |
TBR | RF | 2021 | 10 |
TBR | RF | 2022 | 4 |
TBR | RF | 2023 | 20 |
TBR | DH | 2018 | 30 |
TBR | DH | 2019 | 33 |
TBR | DH | 2021 | 13 |
TBR | DH | 2022 | 6 |
TBR | DH | 2023 | 12 |
TOR | C | 2018 | 10 |
TOR | C | 2019 | 13 |
TOR | C | 2021 | 1 |
TOR | C | 2022 | 14 |
TOR | C | 2023 | 8 |
TOR | 1B | 2018 | 25 |
TOR | 1B | 2019 | 22 |
TOR | 1B | 2021 | 48 |
TOR | 1B | 2022 | 32 |
TOR | 1B | 2023 | 26 |
TOR | 2B | 2018 | 11 |
TOR | 2B | 2019 | 16 |
TOR | 2B | 2021 | 45 |
TOR | 2B | 2022 | 7 |
TOR | 2B | 2023 | 11 |
TOR | 3B | 2018 | 18 |
TOR | 3B | 2019 | 18 |
TOR | 3B | 2021 | 29 |
TOR | 3B | 2022 | 24 |
TOR | 3B | 2023 | 20 |
TOR | SS | 2018 | 17 |
TOR | SS | 2019 | 15 |
TOR | SS | 2021 | 2 |
TOR | SS | 2022 | 27 |
TOR | SS | 2023 | 17 |
TOR | LF | 2018 | 22 |
TOR | LF | 2019 | 20 |
TOR | LF | 2021 | 21 |
TOR | LF | 2022 | 5 |
TOR | LF | 2023 | 20 |
TOR | CF | 2018 | 15 |
TOR | CF | 2019 | 26 |
TOR | CF | 2021 | 22 |
TOR | CF | 2022 | 25 |
TOR | CF | 2023 | 8 |
TOR | RF | 2018 | 25 |
TOR | RF | 2019 | 31 |
TOR | RF | 2021 | 32 |
TOR | RF | 2022 | 25 |
TOR | RF | 2023 | 21 |
TOR | DH | 2018 | 21 |
TOR | DH | 2019 | 21 |
TOR | DH | 2021 | 22 |
TOR | DH | 2022 | 4 |
TOR | DH | 2023 | 19 |
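The fill-then-pivot step can be seen on a much smaller scale in the sketch below. It uses four rows and two seasons taken from the table above. The object names wide and long are made up for illustration, and the final conversion of Season to an integer is an optional extra that the analysis itself does not do.

```r
library(dplyr)
library(tidyr)

# Four rows from the wide table; Team is blank on the second row of each team.
wide <- tibble(
  Team = c("BAL", "", "BOS", ""),
  Position = c("C", "1B", "C", "1B"),
  `2018` = c(3, 16, 5, 15),
  `2019` = c(13, 12, 23, 19)
)

long <- wide %>%
  mutate(Team = na_if(Team, "")) %>%   # blank strings become NA
  fill(Team) %>%                       # carry the last team name downward
  pivot_longer(
    cols = -c(Team, Position),
    names_to = "Season",
    values_to = "HRs"
  ) %>%
  mutate(Season = as.integer(Season))  # optional: treat the season as a number

print(long)
```

Each of the four wide rows becomes two long rows, one per season, so the result has eight rows and no missing team names.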
I wanted to replace the names of the fielding positions with the numbers that represent them on a baseball scorecard, and I have been meaning to practice joins. In this code block, I create a new data frame that maps each position to its scorecard number (the designated hitter has no scorecard number, so it keeps the label DH), use a join to add this information to the tidy data frame, and then reorder the columns, removing the old position column in the process. Finally, I sort the rows by position, season, and team. I have now finished tidying the data and am ready to analyze it.
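# Lookup table: position abbreviation -> scorecard number.
Position <- c("C", "1B", "2B", "3B", "SS", "LF", "CF", "RF", "DH")
# Mixing numbers with "DH" makes Pos a character vector.
Pos <- c(2,3,4,5,6,7,8,9,"DH")
positions_data <- data.frame(Position, Pos)
full_hr_data <- full_join(tidy_hr_data, positions_data, by = join_by(Position))
# Keep Team, Season, Pos, HRs (in that order); the old Position column is dropped.
full_hr_data <- full_hr_data[,c(1,3,5,4)]
full_hr_data <- full_hr_data[order(full_hr_data$Pos, full_hr_data$Season, full_hr_data$Team),]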
kable(full_hr_data, format = "pipe", caption = "Tidy Homerun Data with Scorecard Position Numbers", align = "lccc")
Team | Season | Pos | HRs |
---|---|---|---|
BAL | 2018 | 2 | 3 |
BOS | 2018 | 2 | 5 |
NYY | 2018 | 2 | 18 |
TBR | 2018 | 2 | 14 |
TOR | 2018 | 2 | 10 |
BAL | 2019 | 2 | 13 |
BOS | 2019 | 2 | 23 |
NYY | 2019 | 2 | 34 |
TBR | 2019 | 2 | 9 |
TOR | 2019 | 2 | 13 |
BAL | 2021 | 2 | 11 |
BOS | 2021 | 2 | 6 |
NYY | 2021 | 2 | 23 |
TBR | 2021 | 2 | 33 |
TOR | 2021 | 2 | 1 |
BAL | 2022 | 2 | 13 |
BOS | 2022 | 2 | 8 |
NYY | 2022 | 2 | 11 |
TBR | 2022 | 2 | 6 |
TOR | 2022 | 2 | 14 |
BAL | 2023 | 2 | 20 |
BOS | 2023 | 2 | 9 |
NYY | 2023 | 2 | 10 |
TBR | 2023 | 2 | 11 |
TOR | 2023 | 2 | 8 |
BAL | 2018 | 3 | 16 |
BOS | 2018 | 3 | 15 |
NYY | 2018 | 3 | 11 |
TBR | 2018 | 3 | 11 |
TOR | 2018 | 3 | 25 |
BAL | 2019 | 3 | 12 |
BOS | 2019 | 3 | 19 |
NYY | 2019 | 3 | 21 |
TBR | 2019 | 3 | 19 |
TOR | 2019 | 3 | 22 |
BAL | 2021 | 3 | 33 |
BOS | 2021 | 3 | 25 |
NYY | 2021 | 3 | 8 |
TBR | 2021 | 3 | 13 |
TOR | 2021 | 3 | 48 |
BAL | 2022 | 3 | 22 |
BOS | 2022 | 3 | 12 |
NYY | 2022 | 3 | 32 |
TBR | 2022 | 3 | 11 |
TOR | 2022 | 3 | 32 |
BAL | 2023 | 3 | 18 |
BOS | 2023 | 3 | 24 |
NYY | 2023 | 3 | 12 |
TBR | 2023 | 3 | 22 |
TOR | 2023 | 3 | 26 |
BAL | 2018 | 4 | 17 |
BOS | 2018 | 4 | 10 |
NYY | 2018 | 4 | 24 |
TBR | 2018 | 4 | 7 |
TOR | 2018 | 4 | 11 |
BAL | 2019 | 4 | 24 |
BOS | 2019 | 4 | 3 |
NYY | 2019 | 4 | 26 |
TBR | 2019 | 4 | 17 |
TOR | 2019 | 4 | 16 |
BAL | 2021 | 4 | 5 |
BOS | 2021 | 4 | 6 |
NYY | 2021 | 4 | 10 |
TBR | 2021 | 4 | 39 |
TOR | 2021 | 4 | 45 |
BAL | 2022 | 4 | 13 |
BOS | 2022 | 4 | 16 |
NYY | 2022 | 4 | 24 |
TBR | 2022 | 4 | 8 |
TOR | 2022 | 4 | 7 |
BAL | 2023 | 4 | 13 |
BOS | 2023 | 4 | 3 |
NYY | 2023 | 4 | 25 |
TBR | 2023 | 4 | 21 |
TOR | 2023 | 4 | 11 |
BAL | 2018 | 5 | 24 |
BOS | 2018 | 5 | 23 |
NYY | 2018 | 5 | 27 |
TBR | 2018 | 5 | 10 |
TOR | 2018 | 5 | 18 |
BAL | 2019 | 5 | 6 |
BOS | 2019 | 5 | 33 |
NYY | 2019 | 5 | 16 |
TBR | 2019 | 5 | 20 |
TOR | 2019 | 5 | 18 |
BAL | 2021 | 5 | 9 |
BOS | 2021 | 5 | 23 |
NYY | 2021 | 5 | 9 |
TBR | 2021 | 5 | 7 |
TOR | 2021 | 5 | 29 |
BAL | 2022 | 5 | 13 |
BOS | 2022 | 5 | 15 |
NYY | 2022 | 5 | 4 |
TBR | 2022 | 5 | 8 |
TOR | 2022 | 5 | 24 |
BAL | 2023 | 5 | 7 |
BOS | 2023 | 5 | 6 |
NYY | 2023 | 5 | 21 |
TBR | 2023 | 5 | 17 |
TOR | 2023 | 5 | 20 |
BAL | 2018 | 6 | 7 |
BOS | 2018 | 6 | 21 |
NYY | 2018 | 6 | 27 |
TBR | 2018 | 6 | 4 |
TOR | 2018 | 6 | 17 |
BAL | 2019 | 6 | 12 |
BOS | 2019 | 6 | 32 |
NYY | 2019 | 6 | 21 |
TBR | 2019 | 6 | 14 |
TOR | 2019 | 6 | 15 |
BAL | 2021 | 6 | 11 |
BOS | 2021 | 6 | 38 |
NYY | 2021 | 6 | 14 |
TBR | 2021 | 6 | 11 |
TOR | 2021 | 6 | 2 |
BAL | 2022 | 6 | 16 |
BOS | 2022 | 6 | 27 |
NYY | 2022 | 6 | 15 |
TBR | 2022 | 6 | 9 |
TOR | 2022 | 6 | 27 |
BAL | 2023 | 6 | 4 |
BOS | 2023 | 6 | 33 |
NYY | 2023 | 6 | 15 |
TBR | 2023 | 6 | 31 |
TOR | 2023 | 6 | 17 |
BAL | 2018 | 7 | 24 |
BOS | 2018 | 7 | 16 |
NYY | 2018 | 7 | 12 |
TBR | 2018 | 7 | 7 |
TOR | 2018 | 7 | 22 |
BAL | 2019 | 7 | 13 |
BOS | 2019 | 7 | 13 |
NYY | 2019 | 7 | 13 |
TBR | 2019 | 7 | 21 |
TOR | 2019 | 7 | 20 |
BAL | 2021 | 7 | 22 |
BOS | 2021 | 7 | 13 |
NYY | 2021 | 7 | 13 |
TBR | 2021 | 7 | 27 |
TOR | 2021 | 7 | 21 |
BAL | 2022 | 7 | 16 |
BOS | 2022 | 7 | 11 |
NYY | 2022 | 7 | 8 |
TBR | 2022 | 7 | 20 |
TOR | 2022 | 7 | 5 |
BAL | 2023 | 7 | 16 |
BOS | 2023 | 7 | 15 |
NYY | 2023 | 7 | 5 |
TBR | 2023 | 7 | 23 |
TOR | 2023 | 7 | 20 |
BAL | 2018 | 8 | 15 |
BOS | 2018 | 8 | 13 |
NYY | 2018 | 8 | 27 |
TBR | 2018 | 8 | 7 |
TOR | 2018 | 8 | 15 |
BAL | 2019 | 8 | 10 |
BOS | 2019 | 8 | 21 |
NYY | 2019 | 8 | 28 |
TBR | 2019 | 8 | 14 |
TOR | 2019 | 8 | 26 |
BAL | 2021 | 8 | 30 |
BOS | 2021 | 8 | 20 |
NYY | 2021 | 8 | 10 |
TBR | 2021 | 8 | 4 |
TOR | 2021 | 8 | 22 |
BAL | 2022 | 8 | 16 |
BOS | 2022 | 8 | 6 |
NYY | 2022 | 8 | 62 |
TBR | 2022 | 8 | 7 |
TOR | 2022 | 8 | 25 |
BAL | 2023 | 8 | 15 |
BOS | 2023 | 8 | 8 |
NYY | 2023 | 8 | 7 |
TBR | 2023 | 8 | 25 |
TOR | 2023 | 8 | 8 |
BAL | 2018 | 9 | 8 |
BOS | 2018 | 9 | 32 |
NYY | 2018 | 9 | 27 |
TBR | 2018 | 9 | 9 |
TOR | 2018 | 9 | 25 |
BAL | 2019 | 9 | 35 |
BOS | 2019 | 9 | 29 |
NYY | 2019 | 9 | 27 |
TBR | 2019 | 9 | 20 |
TOR | 2019 | 9 | 31 |
BAL | 2021 | 9 | 18 |
BOS | 2021 | 9 | 31 |
NYY | 2021 | 9 | 39 |
TBR | 2021 | 9 | 10 |
TOR | 2021 | 9 | 32 |
BAL | 2022 | 9 | 33 |
BOS | 2022 | 9 | 3 |
NYY | 2022 | 9 | 12 |
TBR | 2022 | 9 | 4 |
TOR | 2022 | 9 | 25 |
BAL | 2023 | 9 | 28 |
BOS | 2023 | 9 | 13 |
NYY | 2023 | 9 | 37 |
TBR | 2023 | 9 | 20 |
TOR | 2023 | 9 | 21 |
BAL | 2018 | DH | 17 |
BOS | 2018 | DH | 43 |
NYY | 2018 | DH | 38 |
TBR | 2018 | DH | 30 |
TOR | 2018 | DH | 21 |
BAL | 2019 | DH | 31 |
BOS | 2019 | DH | 36 |
NYY | 2019 | DH | 13 |
TBR | 2019 | DH | 33 |
TOR | 2019 | DH | 21 |
BAL | 2021 | DH | 21 |
BOS | 2021 | DH | 28 |
NYY | 2021 | DH | 35 |
TBR | 2021 | DH | 13 |
TOR | 2021 | DH | 22 |
BAL | 2022 | DH | 10 |
BOS | 2022 | DH | 16 |
NYY | 2022 | DH | 31 |
TBR | 2022 | DH | 6 |
TOR | 2022 | DH | 4 |
BAL | 2023 | DH | 14 |
BOS | 2023 | DH | 23 |
NYY | 2023 | DH | 24 |
TBR | 2023 | DH | 12 |
TOR | 2023 | DH | 19 |
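The lookup-table join can also be sketched with left_join(), which keeps every row of the home run data and attaches the matching scorecard number. This is an alternative phrasing rather than the code used above. Because every position in this data has a match in the lookup table, it should give the same rows as the full join. The names positions_lookup, sample_rows, joined, and unmatched are made up for this example, and the three sample rows are taken from the tidy table.

```r
library(dplyr)

# Lookup table; Pos is written as character because "DH" has no number.
positions_lookup <- tibble(
  Position = c("C", "1B", "2B", "3B", "SS", "LF", "CF", "RF", "DH"),
  Pos = c("2", "3", "4", "5", "6", "7", "8", "9", "DH")
)

# Three rows from the tidy home run table.
sample_rows <- tibble(
  Team = c("BAL", "BAL", "NYY"),
  Position = c("C", "DH", "CF"),
  Season = c("2018", "2018", "2022"),
  HRs = c(3, 17, 62)
)

# Attach the scorecard number, then select columns by name instead of by index.
joined <- sample_rows %>%
  left_join(positions_lookup, by = "Position") %>%
  select(Team, Season, Pos, HRs)

# Sanity check: rows whose Position has no entry in the lookup table.
unmatched <- anti_join(sample_rows, positions_lookup, by = "Position")

print(joined)
print(nrow(unmatched))
```

Selecting columns by name keeps working if the column order changes, and the anti_join() check gives a quick way to spot a misspelled position before it turns into a missing value.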
To analyze this data, I want to see which team got the most (and least) “plus” value from one of its players in terms of home runs. To determine this, I divide each player’s home run total for a season by the typical home run total at that player’s position in the same season. Grouping by position matters because some positions in baseball are more demanding defensively, so players at those positions are not expected to produce as much offense as players at less demanding positions. To consider more than one way of interpreting the data, I compare each player’s home runs to both the mean and the median number of home runs at their position for the season.
In this code block, I add columns with the mean and median home runs for that season and position, and then columns showing each player’s home run total for the season divided by the mean and median for their position that season.
expanded_hr_data <- full_hr_data %>%
  group_by(Season, Pos) %>%
  mutate(yr_pos_mean = mean(HRs)) %>%
  mutate(yr_pos_median = median(HRs)) %>%
  mutate(mean_adj = round(HRs / yr_pos_mean, 2)) %>%
  mutate(median_adj = round(HRs / yr_pos_median, 2))
kable(expanded_hr_data, format = "pipe", caption = "Tidy Homerun Data with Grouped Home Run Information", align = "lccccccc")
Team | Season | Pos | HRs | yr_pos_mean | yr_pos_median | mean_adj | median_adj |
---|---|---|---|---|---|---|---|
BAL | 2018 | 2 | 3 | 10.0 | 10 | 0.30 | 0.30 |
BOS | 2018 | 2 | 5 | 10.0 | 10 | 0.50 | 0.50 |
NYY | 2018 | 2 | 18 | 10.0 | 10 | 1.80 | 1.80 |
TBR | 2018 | 2 | 14 | 10.0 | 10 | 1.40 | 1.40 |
TOR | 2018 | 2 | 10 | 10.0 | 10 | 1.00 | 1.00 |
BAL | 2019 | 2 | 13 | 18.4 | 13 | 0.71 | 1.00 |
BOS | 2019 | 2 | 23 | 18.4 | 13 | 1.25 | 1.77 |
NYY | 2019 | 2 | 34 | 18.4 | 13 | 1.85 | 2.62 |
TBR | 2019 | 2 | 9 | 18.4 | 13 | 0.49 | 0.69 |
TOR | 2019 | 2 | 13 | 18.4 | 13 | 0.71 | 1.00 |
BAL | 2021 | 2 | 11 | 14.8 | 11 | 0.74 | 1.00 |
BOS | 2021 | 2 | 6 | 14.8 | 11 | 0.41 | 0.55 |
NYY | 2021 | 2 | 23 | 14.8 | 11 | 1.55 | 2.09 |
TBR | 2021 | 2 | 33 | 14.8 | 11 | 2.23 | 3.00 |
TOR | 2021 | 2 | 1 | 14.8 | 11 | 0.07 | 0.09 |
BAL | 2022 | 2 | 13 | 10.4 | 11 | 1.25 | 1.18 |
BOS | 2022 | 2 | 8 | 10.4 | 11 | 0.77 | 0.73 |
NYY | 2022 | 2 | 11 | 10.4 | 11 | 1.06 | 1.00 |
TBR | 2022 | 2 | 6 | 10.4 | 11 | 0.58 | 0.55 |
TOR | 2022 | 2 | 14 | 10.4 | 11 | 1.35 | 1.27 |
BAL | 2023 | 2 | 20 | 11.6 | 10 | 1.72 | 2.00 |
BOS | 2023 | 2 | 9 | 11.6 | 10 | 0.78 | 0.90 |
NYY | 2023 | 2 | 10 | 11.6 | 10 | 0.86 | 1.00 |
TBR | 2023 | 2 | 11 | 11.6 | 10 | 0.95 | 1.10 |
TOR | 2023 | 2 | 8 | 11.6 | 10 | 0.69 | 0.80 |
BAL | 2018 | 3 | 16 | 15.6 | 15 | 1.03 | 1.07 |
BOS | 2018 | 3 | 15 | 15.6 | 15 | 0.96 | 1.00 |
NYY | 2018 | 3 | 11 | 15.6 | 15 | 0.71 | 0.73 |
TBR | 2018 | 3 | 11 | 15.6 | 15 | 0.71 | 0.73 |
TOR | 2018 | 3 | 25 | 15.6 | 15 | 1.60 | 1.67 |
BAL | 2019 | 3 | 12 | 18.6 | 19 | 0.65 | 0.63 |
BOS | 2019 | 3 | 19 | 18.6 | 19 | 1.02 | 1.00 |
NYY | 2019 | 3 | 21 | 18.6 | 19 | 1.13 | 1.11 |
TBR | 2019 | 3 | 19 | 18.6 | 19 | 1.02 | 1.00 |
TOR | 2019 | 3 | 22 | 18.6 | 19 | 1.18 | 1.16 |
BAL | 2021 | 3 | 33 | 25.4 | 25 | 1.30 | 1.32 |
BOS | 2021 | 3 | 25 | 25.4 | 25 | 0.98 | 1.00 |
NYY | 2021 | 3 | 8 | 25.4 | 25 | 0.31 | 0.32 |
TBR | 2021 | 3 | 13 | 25.4 | 25 | 0.51 | 0.52 |
TOR | 2021 | 3 | 48 | 25.4 | 25 | 1.89 | 1.92 |
BAL | 2022 | 3 | 22 | 21.8 | 22 | 1.01 | 1.00 |
BOS | 2022 | 3 | 12 | 21.8 | 22 | 0.55 | 0.55 |
NYY | 2022 | 3 | 32 | 21.8 | 22 | 1.47 | 1.45 |
TBR | 2022 | 3 | 11 | 21.8 | 22 | 0.50 | 0.50 |
TOR | 2022 | 3 | 32 | 21.8 | 22 | 1.47 | 1.45 |
BAL | 2023 | 3 | 18 | 20.4 | 22 | 0.88 | 0.82 |
BOS | 2023 | 3 | 24 | 20.4 | 22 | 1.18 | 1.09 |
NYY | 2023 | 3 | 12 | 20.4 | 22 | 0.59 | 0.55 |
TBR | 2023 | 3 | 22 | 20.4 | 22 | 1.08 | 1.00 |
TOR | 2023 | 3 | 26 | 20.4 | 22 | 1.27 | 1.18 |
BAL | 2018 | 4 | 17 | 13.8 | 11 | 1.23 | 1.55 |
BOS | 2018 | 4 | 10 | 13.8 | 11 | 0.72 | 0.91 |
NYY | 2018 | 4 | 24 | 13.8 | 11 | 1.74 | 2.18 |
TBR | 2018 | 4 | 7 | 13.8 | 11 | 0.51 | 0.64 |
TOR | 2018 | 4 | 11 | 13.8 | 11 | 0.80 | 1.00 |
BAL | 2019 | 4 | 24 | 17.2 | 17 | 1.40 | 1.41 |
BOS | 2019 | 4 | 3 | 17.2 | 17 | 0.17 | 0.18 |
NYY | 2019 | 4 | 26 | 17.2 | 17 | 1.51 | 1.53 |
TBR | 2019 | 4 | 17 | 17.2 | 17 | 0.99 | 1.00 |
TOR | 2019 | 4 | 16 | 17.2 | 17 | 0.93 | 0.94 |
BAL | 2021 | 4 | 5 | 21.0 | 10 | 0.24 | 0.50 |
BOS | 2021 | 4 | 6 | 21.0 | 10 | 0.29 | 0.60 |
NYY | 2021 | 4 | 10 | 21.0 | 10 | 0.48 | 1.00 |
TBR | 2021 | 4 | 39 | 21.0 | 10 | 1.86 | 3.90 |
TOR | 2021 | 4 | 45 | 21.0 | 10 | 2.14 | 4.50 |
BAL | 2022 | 4 | 13 | 13.6 | 13 | 0.96 | 1.00 |
BOS | 2022 | 4 | 16 | 13.6 | 13 | 1.18 | 1.23 |
NYY | 2022 | 4 | 24 | 13.6 | 13 | 1.76 | 1.85 |
TBR | 2022 | 4 | 8 | 13.6 | 13 | 0.59 | 0.62 |
TOR | 2022 | 4 | 7 | 13.6 | 13 | 0.51 | 0.54 |
BAL | 2023 | 4 | 13 | 14.6 | 13 | 0.89 | 1.00 |
BOS | 2023 | 4 | 3 | 14.6 | 13 | 0.21 | 0.23 |
NYY | 2023 | 4 | 25 | 14.6 | 13 | 1.71 | 1.92 |
TBR | 2023 | 4 | 21 | 14.6 | 13 | 1.44 | 1.62 |
TOR | 2023 | 4 | 11 | 14.6 | 13 | 0.75 | 0.85 |
BAL | 2018 | 5 | 24 | 20.4 | 23 | 1.18 | 1.04 |
BOS | 2018 | 5 | 23 | 20.4 | 23 | 1.13 | 1.00 |
NYY | 2018 | 5 | 27 | 20.4 | 23 | 1.32 | 1.17 |
TBR | 2018 | 5 | 10 | 20.4 | 23 | 0.49 | 0.43 |
TOR | 2018 | 5 | 18 | 20.4 | 23 | 0.88 | 0.78 |
BAL | 2019 | 5 | 6 | 18.6 | 18 | 0.32 | 0.33 |
BOS | 2019 | 5 | 33 | 18.6 | 18 | 1.77 | 1.83 |
NYY | 2019 | 5 | 16 | 18.6 | 18 | 0.86 | 0.89 |
TBR | 2019 | 5 | 20 | 18.6 | 18 | 1.08 | 1.11 |
TOR | 2019 | 5 | 18 | 18.6 | 18 | 0.97 | 1.00 |
BAL | 2021 | 5 | 9 | 15.4 | 9 | 0.58 | 1.00 |
BOS | 2021 | 5 | 23 | 15.4 | 9 | 1.49 | 2.56 |
NYY | 2021 | 5 | 9 | 15.4 | 9 | 0.58 | 1.00 |
TBR | 2021 | 5 | 7 | 15.4 | 9 | 0.45 | 0.78 |
TOR | 2021 | 5 | 29 | 15.4 | 9 | 1.88 | 3.22 |
BAL | 2022 | 5 | 13 | 12.8 | 13 | 1.02 | 1.00 |
BOS | 2022 | 5 | 15 | 12.8 | 13 | 1.17 | 1.15 |
NYY | 2022 | 5 | 4 | 12.8 | 13 | 0.31 | 0.31 |
TBR | 2022 | 5 | 8 | 12.8 | 13 | 0.62 | 0.62 |
TOR | 2022 | 5 | 24 | 12.8 | 13 | 1.88 | 1.85 |
BAL | 2023 | 5 | 7 | 14.2 | 17 | 0.49 | 0.41 |
BOS | 2023 | 5 | 6 | 14.2 | 17 | 0.42 | 0.35 |
NYY | 2023 | 5 | 21 | 14.2 | 17 | 1.48 | 1.24 |
TBR | 2023 | 5 | 17 | 14.2 | 17 | 1.20 | 1.00 |
TOR | 2023 | 5 | 20 | 14.2 | 17 | 1.41 | 1.18 |
BAL | 2018 | 6 | 7 | 15.2 | 17 | 0.46 | 0.41 |
BOS | 2018 | 6 | 21 | 15.2 | 17 | 1.38 | 1.24 |
NYY | 2018 | 6 | 27 | 15.2 | 17 | 1.78 | 1.59 |
TBR | 2018 | 6 | 4 | 15.2 | 17 | 0.26 | 0.24 |
TOR | 2018 | 6 | 17 | 15.2 | 17 | 1.12 | 1.00 |
BAL | 2019 | 6 | 12 | 18.8 | 15 | 0.64 | 0.80 |
BOS | 2019 | 6 | 32 | 18.8 | 15 | 1.70 | 2.13 |
NYY | 2019 | 6 | 21 | 18.8 | 15 | 1.12 | 1.40 |
TBR | 2019 | 6 | 14 | 18.8 | 15 | 0.74 | 0.93 |
TOR | 2019 | 6 | 15 | 18.8 | 15 | 0.80 | 1.00 |
BAL | 2021 | 6 | 11 | 15.2 | 11 | 0.72 | 1.00 |
BOS | 2021 | 6 | 38 | 15.2 | 11 | 2.50 | 3.45 |
NYY | 2021 | 6 | 14 | 15.2 | 11 | 0.92 | 1.27 |
TBR | 2021 | 6 | 11 | 15.2 | 11 | 0.72 | 1.00 |
TOR | 2021 | 6 | 2 | 15.2 | 11 | 0.13 | 0.18 |
BAL | 2022 | 6 | 16 | 18.8 | 16 | 0.85 | 1.00 |
BOS | 2022 | 6 | 27 | 18.8 | 16 | 1.44 | 1.69 |
NYY | 2022 | 6 | 15 | 18.8 | 16 | 0.80 | 0.94 |
TBR | 2022 | 6 | 9 | 18.8 | 16 | 0.48 | 0.56 |
TOR | 2022 | 6 | 27 | 18.8 | 16 | 1.44 | 1.69 |
BAL | 2023 | 6 | 4 | 20.0 | 17 | 0.20 | 0.24 |
BOS | 2023 | 6 | 33 | 20.0 | 17 | 1.65 | 1.94 |
NYY | 2023 | 6 | 15 | 20.0 | 17 | 0.75 | 0.88 |
TBR | 2023 | 6 | 31 | 20.0 | 17 | 1.55 | 1.82 |
TOR | 2023 | 6 | 17 | 20.0 | 17 | 0.85 | 1.00 |
BAL | 2018 | 7 | 24 | 16.2 | 16 | 1.48 | 1.50 |
BOS | 2018 | 7 | 16 | 16.2 | 16 | 0.99 | 1.00 |
NYY | 2018 | 7 | 12 | 16.2 | 16 | 0.74 | 0.75 |
TBR | 2018 | 7 | 7 | 16.2 | 16 | 0.43 | 0.44 |
TOR | 2018 | 7 | 22 | 16.2 | 16 | 1.36 | 1.38 |
BAL | 2019 | 7 | 13 | 16.0 | 13 | 0.81 | 1.00 |
BOS | 2019 | 7 | 13 | 16.0 | 13 | 0.81 | 1.00 |
NYY | 2019 | 7 | 13 | 16.0 | 13 | 0.81 | 1.00 |
TBR | 2019 | 7 | 21 | 16.0 | 13 | 1.31 | 1.62 |
TOR | 2019 | 7 | 20 | 16.0 | 13 | 1.25 | 1.54 |
BAL | 2021 | 7 | 22 | 19.2 | 21 | 1.15 | 1.05 |
BOS | 2021 | 7 | 13 | 19.2 | 21 | 0.68 | 0.62 |
NYY | 2021 | 7 | 13 | 19.2 | 21 | 0.68 | 0.62 |
TBR | 2021 | 7 | 27 | 19.2 | 21 | 1.41 | 1.29 |
TOR | 2021 | 7 | 21 | 19.2 | 21 | 1.09 | 1.00 |
BAL | 2022 | 7 | 16 | 12.0 | 11 | 1.33 | 1.45 |
BOS | 2022 | 7 | 11 | 12.0 | 11 | 0.92 | 1.00 |
NYY | 2022 | 7 | 8 | 12.0 | 11 | 0.67 | 0.73 |
TBR | 2022 | 7 | 20 | 12.0 | 11 | 1.67 | 1.82 |
TOR | 2022 | 7 | 5 | 12.0 | 11 | 0.42 | 0.45 |
BAL | 2023 | 7 | 16 | 15.8 | 16 | 1.01 | 1.00 |
BOS | 2023 | 7 | 15 | 15.8 | 16 | 0.95 | 0.94 |
NYY | 2023 | 7 | 5 | 15.8 | 16 | 0.32 | 0.31 |
TBR | 2023 | 7 | 23 | 15.8 | 16 | 1.46 | 1.44 |
TOR | 2023 | 7 | 20 | 15.8 | 16 | 1.27 | 1.25 |
BAL | 2018 | 8 | 15 | 15.4 | 15 | 0.97 | 1.00 |
BOS | 2018 | 8 | 13 | 15.4 | 15 | 0.84 | 0.87 |
NYY | 2018 | 8 | 27 | 15.4 | 15 | 1.75 | 1.80 |
TBR | 2018 | 8 | 7 | 15.4 | 15 | 0.45 | 0.47 |
TOR | 2018 | 8 | 15 | 15.4 | 15 | 0.97 | 1.00 |
BAL | 2019 | 8 | 10 | 19.8 | 21 | 0.51 | 0.48 |
BOS | 2019 | 8 | 21 | 19.8 | 21 | 1.06 | 1.00 |
NYY | 2019 | 8 | 28 | 19.8 | 21 | 1.41 | 1.33 |
TBR | 2019 | 8 | 14 | 19.8 | 21 | 0.71 | 0.67 |
TOR | 2019 | 8 | 26 | 19.8 | 21 | 1.31 | 1.24 |
BAL | 2021 | 8 | 30 | 17.2 | 20 | 1.74 | 1.50 |
BOS | 2021 | 8 | 20 | 17.2 | 20 | 1.16 | 1.00 |
NYY | 2021 | 8 | 10 | 17.2 | 20 | 0.58 | 0.50 |
TBR | 2021 | 8 | 4 | 17.2 | 20 | 0.23 | 0.20 |
TOR | 2021 | 8 | 22 | 17.2 | 20 | 1.28 | 1.10 |
BAL | 2022 | 8 | 16 | 23.2 | 16 | 0.69 | 1.00 |
BOS | 2022 | 8 | 6 | 23.2 | 16 | 0.26 | 0.38 |
NYY | 2022 | 8 | 62 | 23.2 | 16 | 2.67 | 3.88 |
TBR | 2022 | 8 | 7 | 23.2 | 16 | 0.30 | 0.44 |
TOR | 2022 | 8 | 25 | 23.2 | 16 | 1.08 | 1.56 |
BAL | 2023 | 8 | 15 | 12.6 | 8 | 1.19 | 1.88 |
BOS | 2023 | 8 | 8 | 12.6 | 8 | 0.63 | 1.00 |
NYY | 2023 | 8 | 7 | 12.6 | 8 | 0.56 | 0.88 |
TBR | 2023 | 8 | 25 | 12.6 | 8 | 1.98 | 3.12 |
TOR | 2023 | 8 | 8 | 12.6 | 8 | 0.63 | 1.00 |
BAL | 2018 | 9 | 8 | 20.2 | 25 | 0.40 | 0.32 |
BOS | 2018 | 9 | 32 | 20.2 | 25 | 1.58 | 1.28 |
NYY | 2018 | 9 | 27 | 20.2 | 25 | 1.34 | 1.08 |
TBR | 2018 | 9 | 9 | 20.2 | 25 | 0.45 | 0.36 |
TOR | 2018 | 9 | 25 | 20.2 | 25 | 1.24 | 1.00 |
BAL | 2019 | 9 | 35 | 28.4 | 29 | 1.23 | 1.21 |
BOS | 2019 | 9 | 29 | 28.4 | 29 | 1.02 | 1.00 |
NYY | 2019 | 9 | 27 | 28.4 | 29 | 0.95 | 0.93 |
TBR | 2019 | 9 | 20 | 28.4 | 29 | 0.70 | 0.69 |
TOR | 2019 | 9 | 31 | 28.4 | 29 | 1.09 | 1.07 |
BAL | 2021 | 9 | 18 | 26.0 | 31 | 0.69 | 0.58 |
BOS | 2021 | 9 | 31 | 26.0 | 31 | 1.19 | 1.00 |
NYY | 2021 | 9 | 39 | 26.0 | 31 | 1.50 | 1.26 |
TBR | 2021 | 9 | 10 | 26.0 | 31 | 0.38 | 0.32 |
TOR | 2021 | 9 | 32 | 26.0 | 31 | 1.23 | 1.03 |
BAL | 2022 | 9 | 33 | 15.4 | 12 | 2.14 | 2.75 |
BOS | 2022 | 9 | 3 | 15.4 | 12 | 0.19 | 0.25 |
NYY | 2022 | 9 | 12 | 15.4 | 12 | 0.78 | 1.00 |
TBR | 2022 | 9 | 4 | 15.4 | 12 | 0.26 | 0.33 |
TOR | 2022 | 9 | 25 | 15.4 | 12 | 1.62 | 2.08 |
BAL | 2023 | 9 | 28 | 23.8 | 21 | 1.18 | 1.33 |
BOS | 2023 | 9 | 13 | 23.8 | 21 | 0.55 | 0.62 |
NYY | 2023 | 9 | 37 | 23.8 | 21 | 1.55 | 1.76 |
TBR | 2023 | 9 | 20 | 23.8 | 21 | 0.84 | 0.95 |
TOR | 2023 | 9 | 21 | 23.8 | 21 | 0.88 | 1.00 |
BAL | 2018 | DH | 17 | 29.8 | 30 | 0.57 | 0.57 |
BOS | 2018 | DH | 43 | 29.8 | 30 | 1.44 | 1.43 |
NYY | 2018 | DH | 38 | 29.8 | 30 | 1.28 | 1.27 |
TBR | 2018 | DH | 30 | 29.8 | 30 | 1.01 | 1.00 |
TOR | 2018 | DH | 21 | 29.8 | 30 | 0.70 | 0.70 |
BAL | 2019 | DH | 31 | 26.8 | 31 | 1.16 | 1.00 |
BOS | 2019 | DH | 36 | 26.8 | 31 | 1.34 | 1.16 |
NYY | 2019 | DH | 13 | 26.8 | 31 | 0.49 | 0.42 |
TBR | 2019 | DH | 33 | 26.8 | 31 | 1.23 | 1.06 |
TOR | 2019 | DH | 21 | 26.8 | 31 | 0.78 | 0.68 |
BAL | 2021 | DH | 21 | 23.8 | 22 | 0.88 | 0.95 |
BOS | 2021 | DH | 28 | 23.8 | 22 | 1.18 | 1.27 |
NYY | 2021 | DH | 35 | 23.8 | 22 | 1.47 | 1.59 |
TBR | 2021 | DH | 13 | 23.8 | 22 | 0.55 | 0.59 |
TOR | 2021 | DH | 22 | 23.8 | 22 | 0.92 | 1.00 |
BAL | 2022 | DH | 10 | 13.4 | 10 | 0.75 | 1.00 |
BOS | 2022 | DH | 16 | 13.4 | 10 | 1.19 | 1.60 |
NYY | 2022 | DH | 31 | 13.4 | 10 | 2.31 | 3.10 |
TBR | 2022 | DH | 6 | 13.4 | 10 | 0.45 | 0.60 |
TOR | 2022 | DH | 4 | 13.4 | 10 | 0.30 | 0.40 |
BAL | 2023 | DH | 14 | 18.4 | 19 | 0.76 | 0.74 |
BOS | 2023 | DH | 23 | 18.4 | 19 | 1.25 | 1.21 |
NYY | 2023 | DH | 24 | 18.4 | 19 | 1.30 | 1.26 |
TBR | 2023 | DH | 12 | 18.4 | 19 | 0.65 | 0.63 |
TOR | 2023 | DH | 19 | 18.4 | 19 | 1.03 | 1.00 |
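The adjustment can be checked by hand for a single group. The sketch below redoes the calculation in base R for the five center fielders (Pos 8) in 2022, using the home run totals from the table above. The variable names are made up for this example.

```r
# 2022 home run totals for the five AL East center fielders (Pos 8).
cf_2022 <- c(BAL = 16, BOS = 6, NYY = 62, TBR = 7, TOR = 25)

pos_mean <- mean(cf_2022)      # 116 / 5 = 23.2
pos_median <- median(cf_2022)  # middle value of 6, 7, 16, 25, 62

# Each player's total relative to the positional mean and median.
mean_adj <- round(cf_2022 / pos_mean, 2)
median_adj <- round(cf_2022 / pos_median, 2)

print(rbind(mean_adj, median_adj))
```

The 62-homer season pulls the mean well above the median, so the same player looks considerably further ahead of his peers when measured against the median than against the mean. That gap is the reason for reporting both versions.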
In this code block, I determine which player ranked best relative to the mean number of home runs for their position in a given season and display this information along with the results for their counterparts in that same season. The top spot goes to Aaron Judge, who set the American League record for home runs in a season in 2022 with 62 home runs. No other American League East center fielder hit more than 25 that year, and the division average for center fielders was 23.2.
expanded_hr_data_highmean <- expanded_hr_data[order(expanded_hr_data$mean_adj,decreasing = TRUE),]
expanded_hr_data_highmean_df <- subset(expanded_hr_data_highmean, Season == expanded_hr_data_highmean$Season[1] & Pos == expanded_hr_data_highmean$Pos[1])
kable(expanded_hr_data_highmean_df, format = "pipe", caption = "Position and Season Data for Best Home Run Value (Mean)", align = "lccccccc")
Team | Season | Pos | HRs | yr_pos_mean | yr_pos_median | mean_adj | median_adj |
---|---|---|---|---|---|---|---|
NYY | 2022 | 8 | 62 | 23.2 | 16 | 2.67 | 3.88 |
TOR | 2022 | 8 | 25 | 23.2 | 16 | 1.08 | 1.56 |
BAL | 2022 | 8 | 16 | 23.2 | 16 | 0.69 | 1.00 |
TBR | 2022 | 8 | 7 | 23.2 | 16 | 0.30 | 0.44 |
BOS | 2022 | 8 | 6 | 23.2 | 16 | 0.26 | 0.38 |
In this code block, I determine which player ranked best relative to the median number of home runs for their position in a given season and display this information along with the results for their counterparts in that same season. The top spot goes to Marcus Semien, who hit 45 home runs in 2021 as a second baseman for the Toronto Blue Jays. Second base is a “defense-first” position, and the median number of home runs for American League Eastern Division second basemen in 2021 was only 10.
expanded_hr_data_highmedian <- expanded_hr_data[order(expanded_hr_data$median_adj,decreasing = TRUE),]
expanded_hr_data_highmedian_df <- subset(expanded_hr_data_highmedian, Season == expanded_hr_data_highmedian$Season[1] & Pos == expanded_hr_data_highmedian$Pos[1])
kable(expanded_hr_data_highmedian_df, format = "pipe", caption = "Position and Season Data for Best Home Run Value (Median)", align = "lccccccc")
Team | Season | Pos | HRs | yr_pos_mean | yr_pos_median | mean_adj | median_adj |
---|---|---|---|---|---|---|---|
TOR | 2021 | 4 | 45 | 21 | 10 | 2.14 | 4.5 |
TBR | 2021 | 4 | 39 | 21 | 10 | 1.86 | 3.9 |
NYY | 2021 | 4 | 10 | 21 | 10 | 0.48 | 1.0 |
BOS | 2021 | 4 | 6 | 21 | 10 | 0.29 | 0.6 |
BAL | 2021 | 4 | 5 | 21 | 10 | 0.24 | 0.5 |
In this code block, I determine which player ranked worst relative to the mean number of home runs for their position in a given season and display this information along with the results for their counterparts in that same season. The dubious honor goes to Reese McGuire, the Blue Jays catcher in 2021. He hit a single home run in a season when AL East catchers averaged 14.8, a ratio of 0.07, which is the lowest for any position on any team in any of the seasons being considered.
expanded_hr_data_lowmean <- expanded_hr_data[order(expanded_hr_data$mean_adj,decreasing = FALSE),]
expanded_hr_data_lowmean_df <- subset(expanded_hr_data_lowmean, Season == expanded_hr_data_lowmean$Season[1] & Pos == expanded_hr_data_lowmean$Pos[1])
kable(expanded_hr_data_lowmean_df, format = "pipe", caption = "Position and Season Data for Worst Home Run Value (Mean)", align = "lccccccc")
Team | Season | Pos | HRs | yr_pos_mean | yr_pos_median | mean_adj | median_adj |
---|---|---|---|---|---|---|---|
TOR | 2021 | 2 | 1 | 14.8 | 11 | 0.07 | 0.09 |
BOS | 2021 | 2 | 6 | 14.8 | 11 | 0.41 | 0.55 |
BAL | 2021 | 2 | 11 | 14.8 | 11 | 0.74 | 1.00 |
NYY | 2021 | 2 | 23 | 14.8 | 11 | 1.55 | 2.09 |
TBR | 2021 | 2 | 33 | 14.8 | 11 | 2.23 | 3.00 |
In this code block, I determine which player ranked worst relative to the median number of home runs for their position in a given season and display this information along with the results for their counterparts in that same season. Once again, Reese McGuire’s single home run in 2021 puts him last, this time with a ratio of 0.09 against a median of 11.
expanded_hr_data_lowmedian <- expanded_hr_data[order(expanded_hr_data$median_adj,decreasing = FALSE),]
expanded_hr_data_lowmedian_df <- subset(expanded_hr_data_lowmedian, Season == expanded_hr_data_lowmedian$Season[1] & Pos == expanded_hr_data_lowmedian$Pos[1])
kable(expanded_hr_data_lowmedian_df, format = "pipe", caption = "Position and Season Data for Worst Home Run Value (Median)", align = "lccccccc")
Team | Season | Pos | HRs | yr_pos_mean | yr_pos_median | mean_adj | median_adj |
---|---|---|---|---|---|---|---|
TOR | 2021 | 2 | 1 | 14.8 | 11 | 0.07 | 0.09 |
BOS | 2021 | 2 | 6 | 14.8 | 11 | 0.41 | 0.55 |
BAL | 2021 | 2 | 11 | 14.8 | 11 | 0.74 | 1.00 |
NYY | 2021 | 2 | 23 | 14.8 | 11 | 1.55 | 2.09 |
TBR | 2021 | 2 | 33 | 14.8 | 11 | 2.23 | 3.00 |
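The four blocks above repeat one pattern: find the row with the highest or lowest ratio, then show everyone from the same season and position. As a rough sketch, that pattern could be wrapped in a single function. The helper extreme_group() and the toy data frame below are hypothetical and are not part of the analysis above. The toy values are the mean_adj figures for 2022 center fielders and 2021 catchers from the earlier table.

```r
# Hypothetical helper: return the season/position group that contains the
# highest (or lowest) value of `column`, sorted so the extreme row comes first.
extreme_group <- function(data, column, highest = TRUE) {
  data <- as.data.frame(data)  # drops any grouping left over from group_by()
  idx <- if (highest) which.max(data[[column]]) else which.min(data[[column]])
  same_group <- data$Season == data$Season[idx] & data$Pos == data$Pos[idx]
  group <- data[same_group, ]
  group[order(group[[column]], decreasing = highest), ]
}

# Toy data: two season/position groups with their mean-adjusted ratios.
toy <- data.frame(
  Team = rep(c("BAL", "BOS", "NYY", "TBR", "TOR"), times = 2),
  Season = rep(c("2022", "2021"), each = 5),
  Pos = rep(c("8", "2"), each = 5),
  mean_adj = c(0.69, 0.26, 2.67, 0.30, 1.08,
               0.74, 0.41, 1.55, 2.23, 0.07)
)

best <- extreme_group(toy, "mean_adj", highest = TRUE)
worst <- extreme_group(toy, "mean_adj", highest = FALSE)

print(best)
print(worst)
```

With the real data, a call such as extreme_group(expanded_hr_data, "median_adj", highest = FALSE) would be expected to stand in for the corresponding order-and-subset block. Like order(), which.max() and which.min() return the first match when there are ties.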
I got to experiment more with the tidyverse in this analysis and had a lot of fun. I’d definitely be interested in extending it to all Major League Baseball teams rather than one division. For this analysis, the number of home runs given for each position is the number hit by the player who started the most games for the team at that position during the year. It might be interesting to redo the analysis so that each position’s total is built from the home runs hit by whoever started each game at that position, rather than from the season total of the player who started there most often.