Consider the dataset below, which records the season performance of five baseball players. It contains the variables PlayerID, Hits, At-Bats, Home Runs (HR), Walks (BB), and Strikeouts (SO).
PlayerID Hits At-Bats HR BB SO
1 120 400 15 40 80
2 140 450 12 50 75
3 110 380 8 30 60
4 160 500 20 60 90
5 130 420 10 45 70
Compute the on-base percentage (OBP) for each player and select the player with the highest OBP.
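To calculate OBP, use the simplified formula below. The dataset has no hit-by-pitch or sacrifice-fly counts, so those terms of the full OBP formula are left out:
OBP = (Hits + Walks) / (At-Bats + Walks)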
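As a quick sanity check of the formula, the arithmetic for Player 1 can be worked by hand. The sketch below wraps the simplified formula in a small helper; the name `simple_obp` is hypothetical and is not part of the original exercise.

```r
# Hypothetical helper (not from the exercise): simplified OBP,
# ignoring hit-by-pitch and sacrifice flies because the data lacks them
simple_obp <- function(hits, walks, at_bats) {
  (hits + walks) / (at_bats + walks)
}

# Player 1: 120 hits, 40 walks, 400 at-bats
# (120 + 40) / (400 + 40) = 160 / 440
player1_obp <- simple_obp(hits = 120, walks = 40, at_bats = 400)
round(player1_obp, 4)
## [1] 0.3636
```

The helper is vectorized, so it would also accept whole columns of a data frame.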
data <- data.frame(
  PlayerID = c(1, 2, 3, 4, 5),
  Hits     = c(120, 140, 110, 160, 130),
  At_Bats  = c(400, 450, 380, 500, 420),
  HR       = c(15, 12, 8, 20, 10),
  BB       = c(40, 50, 30, 60, 45),
  SO       = c(80, 75, 60, 90, 70)
)
data
# On-base percentage (OBP) for each player
data$OBP <- (data$Hits + data$BB) / (data$At_Bats + data$BB)
data$OBP
## [1] 0.3636364 0.3800000 0.3414634 0.3928571 0.3763441
# Select the player with the highest OBP
# (which.max() returns the first maximum if several players are tied)
player_highest_obp <- data$PlayerID[which.max(data$OBP)]
player_highest_obp
## [1] 4
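Player 4 leads with an OBP of about 0.393. If a full ranking is wanted, or if ties for first place need to be reported, something like the following sketch may help. It rebuilds the relevant columns so it runs on its own; the names `players`, `ranked`, and `top_ids` are chosen here for illustration and do not appear in the exercise.

```r
# Rebuild the columns needed for OBP so this block is self-contained
players <- data.frame(
  PlayerID = 1:5,
  Hits     = c(120, 140, 110, 160, 130),
  At_Bats  = c(400, 450, 380, 500, 420),
  BB       = c(40, 50, 30, 60, 45)
)
players$OBP <- (players$Hits + players$BB) / (players$At_Bats + players$BB)

# Rank all players from highest to lowest OBP
ranked <- players[order(players$OBP, decreasing = TRUE), ]
ranked$PlayerID
## [1] 4 2 5 1 3

# Every player sharing the top OBP (which.max() would report only the first)
top_ids <- players$PlayerID[players$OBP == max(players$OBP)]
top_ids
## [1] 4
```

With this dataset there is no tie, so both approaches agree on Player 4.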