Introduction

The data presented in this document was scraped from https://mykbostats.com/players/ and saved to a CSV file beforehand. The file, named “KBO_Batters.csv”, is available in the following GitHub repo: https://github.com/Ryungje/DATA607/tree/main/Assignment%201

I originally collected this data for a personal project in which I wanted to build a predictive model of player performance in the KBO (Korea Baseball Organization). Though the project did not get very far, the data is still useful.


Import File from CSV

This is a large data set: each row is one player's season, described by 28 columns of batting statistics. The first six rows are shown below.

batters <- read.csv("KBO_Batters.csv")
head(batters)
##           Name Year         Team    BA   OBP   SLG   OPS  G  PA  AB  R  H X2B
## 1 Ko Young-min 2016 Doosan Bears 0.250 0.400 0.500 0.900  8   5   4  1  1   1
## 2 Ko Young-min 2015 Doosan Bears 0.328 0.403 0.478 0.881 41  77  67 13 22   1
## 3 Ko Young-min 2014 Doosan Bears 0.287 0.355 0.340 0.695 52 108  94 18 27   2
## 4 Ko Young-min 2013 Doosan Bears 0.286 0.412 0.571 0.983 10  17  14  3  4   1
## 5 Ko Young-min 2012 Doosan Bears 0.265 0.335 0.404 0.739 58 173 151 33 40  10
## 6 Ko Young-min 2011 Doosan Bears 0.210 0.305 0.301 0.606 93 208 176 31 37   5
##   X3B HR RBI SB CS BB SO TB GDP HBP SH SF IBB RISP  PHBA
## 1   0  0   1  0  0  1  1  2   0   0  0  0   0 0.50 0.333
## 2   0  3  11  4  2  6 22 32   4   3  0  1   0 0.24 0.444
## 3   0  1   7  1  1 11 18 32   3   0  1  2   0 0.20 0.294
## 4   0  1   1  1  0  3  5  8   1   0  0  1   0 0.25 0.333
## 5   1  3  26  7  1 13 28 61   5   3 NA NA  NA   NA    NA
## 6   1  3  16  6  6 18 50 53   4   7 NA NA  NA   NA    NA

Create Subset

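For our current purposes, we only need each player's Name, Team, and Year, along with three batting measures: On-Base Percentage (OBP), Slugging Average (SLG), and On-base Plus Slugging (OPS), which is the sum of OBP and SLG.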

batters <- batters[c('Name', 'Team', 'Year', 'OBP', 'SLG', 'OPS')]
head(batters)
##           Name         Team Year   OBP   SLG   OPS
## 1 Ko Young-min Doosan Bears 2016 0.400 0.500 0.900
## 2 Ko Young-min Doosan Bears 2015 0.403 0.478 0.881
## 3 Ko Young-min Doosan Bears 2014 0.355 0.340 0.695
## 4 Ko Young-min Doosan Bears 2013 0.412 0.571 0.983
## 5 Ko Young-min Doosan Bears 2012 0.335 0.404 0.739
## 6 Ko Young-min Doosan Bears 2011 0.305 0.301 0.606
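
Since OPS is defined as OBP plus SLG, the subset can be sanity-checked by recomputing it. The sketch below rebuilds the six rows shown above as a stand-in data frame and confirms that the columns agree up to rounding. The names `preview` and `tolerance`, and the 0.0015 allowance itself, are assumptions made for illustration and are not part of the original analysis.

```r
# Stand-in for head(batters): the six rows printed above
preview <- data.frame(
  Name = rep("Ko Young-min", 6),
  Team = rep("Doosan Bears", 6),
  Year = 2016:2011,
  OBP  = c(0.400, 0.403, 0.355, 0.412, 0.335, 0.305),
  SLG  = c(0.500, 0.478, 0.340, 0.571, 0.404, 0.301),
  OPS  = c(0.900, 0.881, 0.695, 0.983, 0.739, 0.606)
)

# OPS should equal OBP + SLG, up to rounding of the three-decimal values
tolerance <- 0.0015  # assumed allowance for rounding
ops_matches <- abs(preview$OBP + preview$SLG - preview$OPS) < tolerance

# TRUE only if every row is consistent
all(ops_matches)
```

The same comparison could be run on `batters` directly. Rows with missing values would produce NA in the comparison, so they would need `na.rm = TRUE` in `all()` or separate handling.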

Find Best OPS

As the simplest approach, I look for the single season by any player (regardless of year or team) with the highest OPS.

batters[which.max(batters$OPS),]
##          Name             Team Year OBP SLG OPS
## 2984 Heo Joon Hyundai Unicorns 2006   1   4   5
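
An OPS of 5.000 requires an OBP of 1.000 and an SLG of 4.000, which happens only when every at-bat is a home run, so the top row most likely comes from a very small sample. One common remedy is to require a minimum number of plate appearances (PA) before ranking. The sketch below shows one way to do that and is not the method used above. It uses the six rows from the import preview as stand-in data, and the names `preview` and `min_pa`, as well as the cutoff of 100, are assumptions chosen for illustration.

```r
# Stand-in data: PA and OPS for the six rows shown in the import preview
preview <- data.frame(
  Name = rep("Ko Young-min", 6),
  Year = 2016:2011,
  PA   = c(5, 77, 108, 17, 173, 208),
  OPS  = c(0.900, 0.881, 0.695, 0.983, 0.739, 0.606)
)

# Without a threshold, the 17-PA season from 2013 has the highest OPS
best_raw <- preview[which.max(preview$OPS), ]

# Hypothetical minimum plate appearances: an assumed cutoff, not a KBO rule
min_pa <- 100
qualified <- preview[!is.na(preview$PA) & preview$PA >= min_pa, ]

# Highest OPS among seasons that meet the cutoff
best_qualified <- qualified[which.max(qualified$OPS), ]

best_raw
best_qualified
```

On the real data, this filter would have to run before the subset step, because the PA column is dropped there.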

Conclusion

There are many, often quite complicated, ways to measure a baseball player’s performance. The method presented here is trivial and does not take into account nearly all the data that is available for each player. With that said, by raw single-season OPS alone and with no minimum playing time, the best player in the KBO is Heo Joon.