This analysis evaluates catchers’ throwing ability. The provided data includes information on game situations and throw metrics. The written evaluations for the five catchers are at the bottom of the document.

Explanation of Process

There are many factors that contribute to whether a runner is thrown out, several of which are entirely out of the catcher’s control. Much of run game control is based on the pitcher, and runner jumps and speeds are variable.

Using catchers’ actual CS rates would strongly reflect the noise described above. Instead, I created a model to predict whether a runner is thrown out based on the factors within the catchers’ control. When considering pop time and throw location, I called the variable xCS. Then, I replaced pop time with throw velocity to remove the footwork and exchange and focus on the throw. I called this variable xCS_throw. I also calculated average velocity and a “best zone” percentage. I used these variables separate and evaluate the aspects of the process, including the exchange, footwork, throw, and noise.

Data Wrangling

#library(tidyverse)
#library(xgboost)
#library(forecast)
#library(caret)

# Load data
throw_df <- read.csv('throws.csv')

# Add names for prompt-specified catchers

# From prompt: Focus on the following catchers: mbfa77f1d7d2b05a, m9db84fc73b25d29, m4f63b6eb7e0cc67, m2bdcfa35ae599eb, m7144941874a1d61.

throw_df$catcher_name <- if_else(throw_df$catcher_id == 'mbfa77f1d7d2b05a', 'Player A',
                          if_else(throw_df$catcher_id == 'm9db84fc73b25d29', 'Player B',
                           if_else(throw_df$catcher_id == 'm4f63b6eb7e0cc67', 'Player C',
                            if_else(throw_df$catcher_id == 'm2bdcfa35ae599eb', 'Player D',
                             if_else(throw_df$catcher_id == 'm7144941874a1d61', 'Player E',
                                                                throw_df$catcher_id)))))

# Create necessary columns

# Determine whether steal attempt is 2B or 3B
throw_df$throw_base <- if_else(throw_df$pre_base_state == '1-0-0', 2, 3)

# Determine how many outs were recorded on the play (>0 does not necessarily mean runner was out, could be K)
throw_df$outs_on_play <- throw_df$post_outs - throw_df$pre_outs

# Separate 2B and 3B throws
throws_2b <- throw_df %>%
  filter(throw_base == 2)

throws_3b <- throw_df %>%
  filter(throw_base == 3)

In order to evaluate the effectiveness of throws, we need to know whether the runner was thrown out. This can be deduced from the game event columns.

# Deduce whether runner was thrown out. Because of the data structure, some plays where the third out of an inning clears the bases are ambiguous (could be a strikeout or runner thrown out).

throws_2b <- throws_2b %>%
                              # if no out, not thrown out
  mutate(runner_out = if_else(outs_on_play == 0, 0,
                              # if bases clear and not end of inning, thrown out
                        if_else(post_base_state == '0-0-0' & post_outs < 3, 1,
                              # if walk (inning not over) and base runner out, thrown out
                        if_else(post_base_state == '1-0-0', 1,
                              # if two outs recorded, K / throw out double play
                          if_else(post_base_state == '0-0-0' & outs_on_play == 2, 1,
                              # if bases clear and inning over and no double play, 
                              # could be K or runner thrown out (ambiguous)
                            if_else(post_base_state == '0-0-0' & post_outs == 3, NA,
                                    0))))))
# Repeat for 3B

throws_3b <- throws_3b %>%
  mutate(runner_out = if_else(outs_on_play == 0, 0,
                        if_else(post_base_state == '0-0-0' & post_outs < 3, 1,
                        if_else(post_base_state == '1-0-0', 1,
                          if_else(post_base_state == '0-0-0' & outs_on_play == 2, 1,
                            if_else(post_base_state == '0-0-0' & post_outs == 3, NA,
                                    0))))))

Predictive Modeling

Using catchers’ actual CS rates would reflect the noise described above. Instead, I create a model to predict whether a runner is thrown out based on the factors within the catchers’ control.

XG Boost

I decided to use an xg boost model because of its ability to handle imbalanced data sets. xgboost is also a good choice because it creates averages at each end node. This is effective in standardizing each throw across different pitcher/runner situations.

# Remove rows with NA values, including ambiguous plays
# Necessary for xgboost because response cannot include NAs
boost_data_2b <- throws_2b %>%
  na.omit()

Boostrapping: This technique accounts for the limited and imbalanced data set.

RNGkind(sample.kind = "Rounding")

## Warning in RNGkind(sample.kind = "Rounding"): non-uniform 'Rounding' sampler
## used

set.seed(1986) # Let's Go Mets
# Correct the imbalance in the data set by bootstrapping
sampled_rows_2b <- sample(1:nrow(boost_data_2b), nrow(boost_data_2b)*2, replace = TRUE)
  
bootstraps_2b <- boost_data_2b[sampled_rows_2b, ]

# Check the new distribution of 1s and 0s
table(bootstraps_2b$runner_out)

## 
##    0    1 
## 1586  384

# Create training and validation data sets
RNGkind(sample.kind = "Rounding")

## Warning in RNGkind(sample.kind = "Rounding"): non-uniform 'Rounding' sampler
## used

set.seed(1986)
train_rows_2b <- sample(1:dim(bootstraps_2b)[1], dim(bootstraps_2b)[1] * .8) # 80% training, 20% validation
train_2b <- bootstraps_2b[train_rows_2b,] 
valid_2b <- bootstraps_2b[-train_rows_2b,]

The XG Boost uses the following variables to predict whether a runner is thrown out, thus controlling for all pitcher and runner variables.

throw_end_position_x: The x coordinate of the throw’s end position
throw_end_position_z: The z coordinate of the throw’s end position
pop_time: this encompasses throw velocity and exchange time

# Create matrices suitable for xgboost
dtrain_2b <- xgb.DMatrix(data = as.matrix(train_2b[,c('throw_end_position_x', 'throw_end_position_z',                                                    'pop_time')]), label = train_2b$runner_out)
dvalid_2b <- xgb.DMatrix(data = as.matrix(valid_2b[,c('throw_end_position_x','throw_end_position_z',
                                                   'pop_time')]), label = valid_2b$runner_out)
# Create xgboost model
xgb_2b <- xgboost(data = dtrain_2b, 
                  label = train_2b$runner_out, 
                  nrounds = 500, 
                  eta = 0.05, 
                  eval_metric = "error",
                  verbose = 0)

## Warning in xgb.get.DMatrix(data, label, missing, weight, nthread =
## merged$nthread): xgboost: label will be ignored.

boost_preds_2b <- predict(xgb_2b, dvalid_2b)

pred_dat_2b <- cbind.data.frame(boost_preds_2b, valid_2b$runner_out)

accuracy_2b <- accuracy(boost_preds_2b, valid_2b$runner_out)

xg_df_2b <- data.frame(predicted_value = boost_preds_2b, actual_value = valid_2b$runner_out, pop_time = valid_2b$pop_time)

head(xg_df_2b)

##   predicted_value actual_value pop_time
## 1     0.170701995            0    1.927
## 2     0.030102080            0    1.902
## 3     0.023372684            0    1.979
## 4     0.008492525            0    1.929
## 5     0.082195602            0    1.926
## 6     0.047497850            0    2.010

cutoff <- 0.5 # Any throw predicted above 0.5 is considered a throw out
pred_class_2b <- if_else(boost_preds_2b > cutoff, 1, 0)

confusionMatrix(as.factor(pred_class_2b), as.factor(valid_2b$runner_out))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   0   1
##          0 306  12
##          1   9  67
##                                           
##                Accuracy : 0.9467          
##                  95% CI : (0.9197, 0.9667)
##     No Information Rate : 0.7995          
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.8314          
##                                           
##  Mcnemar's Test P-Value : 0.6625          
##                                           
##             Sensitivity : 0.9714          
##             Specificity : 0.8481          
##          Pos Pred Value : 0.9623          
##          Neg Pred Value : 0.8816          
##              Prevalence : 0.7995          
##          Detection Rate : 0.7766          
##    Detection Prevalence : 0.8071          
##       Balanced Accuracy : 0.9098          
##                                           
##        'Positive' Class : 0               
##

The accuracy metrics above show that the model is effective at predicting whether a runner is thrown out. It should not be perfectly accurate, as the degree of inaccuracy accounts for throws where the catcher is not responsible for the steal

dfull_2b <- xgb.DMatrix(data = as.matrix(boost_data_2b[,c('throw_end_position_x',
                    'throw_end_position_z','pop_time')]), label = boost_data_2b$runner_out)

full_preds_2b <- predict(xgb_2b, dfull_2b)

catcher_data_2b <- data.frame(catcher = boost_data_2b$catcher_name, 
                           predicted_value = full_preds_2b, 
                           actual_value = boost_data_2b$runner_out)

catcher_stats_2b <- catcher_data_2b %>%
  group_by(catcher) %>%
  summarize(xCS_rate_2b = mean(predicted_value),
            CS_rate_2b = mean(actual_value),
            n_2b = n()) %>%
  arrange(desc(n_2b))

head(catcher_stats_2b)

## # A tibble: 6 × 4
##   catcher          xCS_rate_2b CS_rate_2b  n_2b
##   <chr>                  <dbl>      <dbl> <int>
## 1 m071bba992b6f220       0.258      0.2      30
## 2 Player C               0.163      0.143    28
## 3 m92aa1be6311678f       0.271      0.296    27
## 4 mc3e5cf8c95d750f       0.262      0.148    27
## 5 Player A               0.193      0.115    26
## 6 Player B               0.173      0.192    26

# Plot x_cs_rate against cs_rate
# Add a 45 degree line to show where x_cs_rate = cs_rate
# Only include catchers with at least 10 throws

plot_2b <- catcher_stats_2b %>% 
  filter(n_2b >= 10) %>%
  ggplot( aes(x = CS_rate_2b, y = xCS_rate_2b)) +
  geom_point(color = 'blue') +
  geom_abline(intercept = 0, slope = 1, color = 'darkorange1') +
  scale_x_continuous(limits = c(0,0.6), breaks = seq(0,1,0.1)) +
  scale_y_continuous(limits = c(0,0.6), breaks = seq(0,1,0.1)) +
  labs(x = 'Actual CS Rate', y = 'Predicted CS Rate', title = 'xgboost Model: 2B Throws', 
       subtitle = 'Minimum 10 Throws')

plot_2b

This plot shows the relationship between the actual CS rate and the predicted CS rate. The 45 degree line shows where the predicted CS rate is equal to the actual CS rate. Points above the line can be understood to “deserve” a higher CS rate than they actually have, given the quality of their performance.

catcher_eval_2b <- catcher_stats_2b %>%
  filter(catcher %in% c('Player A', 'Player B', 'Player C', 'Player D', 'Player E')) %>%
  arrange(desc(xCS_rate_2b))

catcher_eval_2b

## # A tibble: 5 × 4
##   catcher  xCS_rate_2b CS_rate_2b  n_2b
##   <chr>          <dbl>      <dbl> <int>
## 1 Player D       0.329      0.278    18
## 2 Player E       0.244      0.25     12
## 3 Player A       0.193      0.115    26
## 4 Player B       0.173      0.192    26
## 5 Player C       0.163      0.143    28

XG Boost repeated for 3B, not included in report.

Run an XG Boost using only throw information, observe differences between throw xCS and regular xCS.

This XG Boost uses the following variables, thus additionally controlling for exchange time and focusing on the quality of the throw itself.

throw_end_position_x: The x coordinate of the throw’s end position
throw_end_position_z: The z coordinate of the throw’s end position
throw_velo

# Create matrices suitable for xgboost
dtrain_2b_throw <- xgb.DMatrix(data = as.matrix(train_2b[,c('throw_end_position_x', 'throw_end_position_z',                                                    'throw_velo')]), label = train_2b$runner_out)
dvalid_2b_throw <- xgb.DMatrix(data = as.matrix(valid_2b[,c('throw_end_position_x','throw_end_position_z',
                                                   'throw_velo')]), label = valid_2b$runner_out)

The remainder of the xgboost is not included in this output.

catcher_eval_2b_throw

## # A tibble: 5 × 4
##   catcher  xCS_rate_2b_throw CS_rate_2b  n_2b
##   <chr>                <dbl>      <dbl> <int>
## 1 Player D             0.273      0.278    18
## 2 Player E             0.217      0.25     12
## 3 Player A             0.170      0.115    26
## 4 Player C             0.167      0.143    28
## 5 Player B             0.165      0.192    26

Repeat for 3B (not included in output)

Join the various expected stats into one dataframe

catcher_eval_full <- catcher_eval_2b %>%
  left_join(catcher_eval_2b_throw, by = 'catcher') %>%
  left_join(catcher_eval_3b, by = 'catcher') %>%
  left_join(catcher_eval_3b_throw, by = 'catcher')

catcher_eval_full <- catcher_eval_full %>%
  select(catcher, n_2b.x, CS_rate_2b.x, xCS_rate_2b, xCS_rate_2b_throw, n_3b.x, CS_rate_3b.x, xCS_rate_3b, xCS_rate_3b_throw) %>%
  rename(n_2b = n_2b.x, 
         CS_2b = CS_rate_2b.x, 
         xCS_2b = xCS_rate_2b,
         xCS_2b_throw = xCS_rate_2b_throw,
         n_3b = n_3b.x, 
         CS_3b = CS_rate_3b.x,
         xCS_3b = xCS_rate_3b,
         xCS_3b_throw = xCS_rate_3b_throw) %>%
  mutate(CS_2b = round(CS_2b, 3),
         xCS_2b = round(xCS_2b, 3),
         xCS_2b_throw = round(xCS_2b_throw, 3),
         CS_3b = round(CS_3b, 3),
         xCS_3b = round(xCS_3b, 3),
         xCS_3b_throw = round(xCS_3b_throw, 3))

catcher_eval_full

## # A tibble: 5 × 9
##   catcher   n_2b CS_2b xCS_2b xCS_2b_throw  n_3b CS_3b xCS_3b xCS_3b_throw
##   <chr>    <int> <dbl>  <dbl>        <dbl> <int> <dbl>  <dbl>        <dbl>
## 1 Player D    18 0.278  0.329        0.273     6 0      0            0.093
## 2 Player E    12 0.25   0.244        0.217     2 0      0            0    
## 3 Player A    26 0.115  0.193        0.17      1 0      0            0    
## 4 Player B    26 0.192  0.173        0.165     5 0     -0.001        0.005
## 5 Player C    28 0.143  0.163        0.167     3 0.333  0.335        0.552

Calculate the average throw velocities and “best zone” percentage. The “best zone” is a 2x3 foot zone that is in the ideal tag location for each base.

throw_eval_2b <- throws_2b %>%
  group_by(catcher_name) %>%
  summarize(avg_velo_2b = mean(throw_velo, na.rm = T))

throw_eval_3b <- throws_3b %>%
  group_by(catcher_name) %>%
  summarize(avg_velo_3b = mean(throw_velo, na.rm = T))

throws_2b$best_zone <- ifelse(
  throws_2b$throw_end_position_z <= 2 & throws_2b$throw_end_position_z > 0 &
  throws_2b$throw_end_position_x >= 0 & throws_2b$throw_end_position_x <= 3, 
  1, 0)

throws_3b$best_zone <- ifelse(
  throws_3b$throw_end_position_z <= 2 & throws_3b$throw_end_position_z > 0 &
  throws_3b$throw_end_position_x >= -60 & throws_3b$throw_end_position_x <= -57,
  1, 0)

throw_eval_2b <- throws_2b %>%
  rename(catcher = catcher_name) %>%
  group_by(catcher) %>%
  summarize(avg_velo_2b = round(mean(throw_velo, na.rm = T),3),
            best_zone_2b = round(mean(best_zone, na.rm = T),3))

catcher_eval_full <- catcher_eval_full %>%
  left_join(throw_eval_2b, by = 'catcher')

throw_eval_3b <- throws_3b %>%
  rename(catcher = catcher_name) %>%
  group_by(catcher) %>%
  summarize(avg_velo_3b = round(mean(throw_velo, na.rm = T),3),
            best_zone_3b = round(mean(best_zone, na.rm = T),3))

catcher_eval_full <- catcher_eval_full %>%
  left_join(throw_eval_3b, by = 'catcher')

catcher_eval_full

## # A tibble: 5 × 13
##   catcher   n_2b CS_2b xCS_2b xCS_2b_throw  n_3b CS_3b xCS_3b xCS_3b_throw
##   <chr>    <int> <dbl>  <dbl>        <dbl> <int> <dbl>  <dbl>        <dbl>
## 1 Player D    18 0.278  0.329        0.273     6 0      0            0.093
## 2 Player E    12 0.25   0.244        0.217     2 0      0            0    
## 3 Player A    26 0.115  0.193        0.17      1 0      0            0    
## 4 Player B    26 0.192  0.173        0.165     5 0     -0.001        0.005
## 5 Player C    28 0.143  0.163        0.167     3 0.333  0.335        0.552
## # ℹ 4 more variables: avg_velo_2b <dbl>, best_zone_2b <dbl>, avg_velo_3b <dbl>,
## #   best_zone_3b <dbl>

Evaluations

Player A (mbfa77f1d7d2b05a)

Player A has a very low CS rate to 2B at 12%, but based on his xCS, he should have a significantly higher rate, around 19%. Based on his throws alone, he should have a CS rate of 17%. This is still below average, so improvement is needed. The most glaring weakness is in his lack of velocity (avg 73mph). The sample size to 3B makes this evaluation difficult, but the 68mph throw velocity is still a concern.

Improvement can be expected, but Player A’s throwing ability and run game control is still below-average.

Player B (m9db84fc73b25d29)

Player B’s CS rate is 19%, but his throwing ability can be expected to yield a CS rate of only 17%. Like Player A, Player B struggles to generate velocity (avg 72 mph to both 2B and 3B). This is likely an arm strength issue because he loses more xCS in the throw itself, when the exchange and footwork removed.

Player B is below-average, and his arm strength is a concern.

Player C (m4f63b6eb7e0cc67)

Player C has a very low CS rate of 14%, but his xCS is slightly higher at 16%. Unlike Player B, he loses more xCS when the exchange and footwork are included. When only considering the throw, the xCS is 17%. Likely, Player C’s footwork gets him in the right position, though not efficiently. Of note, Player C throws into the “best zone” 26% of the time to 2B. This suggests flashes of arm ability beyond his current performance.

Player C has the potential to be average at controlling the run game, but his footwork needs to improve.

Player D (m2bdcfa35ae599eb)

Player D has a very high xCS at 33%. This jump from his 28% CS rate is entirely due to his plus footwork and exchange. His 27% xCS based on the throw alone is above average. His velocities at 80mph and 78mph average.

Player D is above-average at controlling the run game, and his footwork and exchange are his strengths.

Player E (m7144941874a1d61)

Player E has an average CS rate of 25%, but his xCS based on the throw alone is only 22%. He has 0 throws into the “best zone”. He has the strongest arm of the group at 81mph and 83 mph, but control seems to be an issue. This issue can be difficult to correct.

Player E is average at controlling the run game, but his performance is unlikely to improve.

Catcher Evaluations

Ben Scartz

2023-12-08