NBA Draft 2024 Prospect Rankings

Intro

The 2024 NBA draft has been widely regarded as one of the weirdest classes in recent history. There is no overarching MVP or ALL-NBA level talent like last year with Victor Wembanyama, and no definitive answer to what the Atlanta Hawks will do with the first overall pick. Even with the negative sentiment surrounding this draft class, many prospects have the potential to be a decent starter or role player in the league, if not better. In my opinion and from what I gather from NBA and Draft analysts, the pick variation for prospects this year is quite high as there is no extreme talent gap between players, meaning teams will generally draft for fit (which I believe could start as early as 7 with the Blazers or 9 with the Grizzlies).

To figure out if high variation of pick selection actually exists for this draft class, I compiled complete 2-round mock drafts from 12 different sources to see the average mocked pick for each prospect and the standard deviation of that average. The mock draft sources included: Bleacher Report, CBS Sports, ESPN, nbadraft.net, NBA Draft Room, Net Scouts Basketball, SB Nation, Sports Illustrated, Tankathon, The Athletic, The Ringer, and Yahoo Sports. After gathering the mock averages and standard deviations, I then modeled the mock average variable using Random Forest and XGBoost regression to determine which features affect mock draft position the most. Features for these models came from prospect bios, combine data, and box-score and advanced stats for the prospects’ 2023-24 season. More information of the data and models will be provided in upcoming sections.

Set Up

To start my analysis, I first collected the data previously mentioned from NBA.com (Bios and Combine), basketballreference.com (College/G-League Stats), and RealGM.com (International Stats). Biography data included variables such as position, age, and status (college, international, or G-League) and combine data included statistics for all combine drills. Previous season data consisted of regular box score and advanced statistics.

library(tidyverse)
library(caret)
library(car)
library(vip)
prospect_data <- read_csv("prospect_data.csv")

prospect_data <- prospect_data %>%
  mutate(across(where(is.character), ~ iconv(.x, 
                                             to = "UTF-8")))

Calculating Average and Standard Deviation Mock Pick

After loading in the data, I first calculated the mean mock draft pick for each player by averaging the mock pick for all 12 mock drafts then subtracting the sum of NA values from that average to penalize prospects that did not appear in one or many mock drafts (mocked to be undrafted). Subtracting the sum of NAs was a basic method for penalization and different methods would and should be applied in future work. Along with the average mock pick, I also calculated the standard deviation for each mock pick average to see if the variation in mock draft pick is as pronounced as it is claimed to be. The table below displays each prospect that appeared in at least 3 of the chosen mock drafts.

prospect_data <- prospect_data %>%
  rowwise() %>%
  mutate(mock_average = mean(c_across(contains("Mock")), 
                             na.rm = TRUE) - 
           sum(is.na(c_across(contains("Mock")))),
         mock_sd = sd(c_across(contains("Mock")), na.rm = TRUE)
  ) %>%
  arrange(mock_average) %>%
  ungroup() %>%
  mutate(Mock_Pick = row_number())

mock_mean_sd <- prospect_data %>%
  select(Name, Mock_Pick, mock_average, mock_sd) %>%
  arrange(Mock_Pick)

knitr::kable(mock_mean_sd)

Name	Mock_Pick	mock_average	mock_sd
Zaccharie Risacher	1	1.333333	0.8498366
Alexandre Sarr	2	1.833333	0.3726780
Reed Sheppard	3	3.666667	1.1785113
Stephon Castle	4	4.833333	1.5723302
Matas Buzelis	5	5.416667	0.9537936
Donovan Clingan	6	5.750000	1.4790199
Dalton Knecht	7	9.083333	2.0999339
Ron Holland II	8	9.750000	4.2056510
Cody Williams	9	9.750000	2.9474565
Tidjane Salaun	10	10.666667	2.2852182
Devin Carter	11	10.916667	2.7525241
Rob Dillingham	12	11.083333	3.6846152
Nikola Topic	13	11.666667	4.2295258
Jared McCain	14	16.083333	1.3202483
JaKobe Walter	15	16.500000	3.7969286
Zach Edey	16	18.333333	4.4783429
Tristan da Silva	17	18.583333	4.6629449
Isaiah Collier	18	18.833333	5.4594465
Carlton Carrington	19	19.750000	4.7980031
Kelel Ware	20	20.250000	5.1498382
Kyshawn George	21	20.916667	4.3866907
Yves Missi	22	21.250000	4.0645828
Tyler Kolek	23	22.083333	3.3530666
Kyle Filipowski	24	24.166667	5.4441610
Johnny Furphy	25	24.333333	3.6817870
DaRon Holmes II	26	24.916667	4.9406196
Tyler Smith	27	27.500000	7.3541372
Terrence Shannon Jr.	28	28.416667	4.5727150
Jaylon Tyson	29	29.416667	3.9255219
Ryan Dunn	30	29.666667	5.0881125
Pacome Dadiet	31	31.333333	3.9440532
Bobi Klintman	32	31.916667	6.2777163
Baylor Scheierman	33	32.250000	5.4791271
Cam Christie	34	36.500000	7.3200638
Kevin McCullar Jr.	35	37.750000	6.7838657
AJ Johnson	36	38.272727	7.4290492
NFaly Dante	37	39.333333	10.8230721
KJ Simpson	38	40.200000	10.2138943
Trentyn Flowers	39	40.833333	7.7855687
Dillon Jones	40	41.272727	7.2431691
Jamal Shead	41	41.363636	6.6705739
Adem Bona	42	41.583333	5.9225323
Justin Edwards	43	41.666667	8.6922699
Harrison Ingram	44	41.818182	6.4846184
Trey Alexander	45	41.833333	8.5993263
PJ Hall	46	41.857143	6.1959911
Jonathan Mogbo	47	42.250000	5.1659946
Nikola Djurisic	48	43.083333	4.7338908
Enrique Freeman	49	44.285714	8.1705357
Ulrich Chomche	50	44.444444	3.7464386
Keshad Johnson	51	45.100000	5.6083542
Pelle Larsson	52	45.300000	6.1622753
Melvin Ajinca	53	46.000000	6.2589330
Juan Nunez	54	46.083333	6.6640620
Ajay Mitchell	55	46.333333	8.0966385
Cam Spencer	56	47.750000	6.3415517
Tristen Newton	57	48.200000	3.5670207
Jaylen Wells	58	48.250000	7.9175438
Antonio Reeves	59	48.666667	6.2902040
Reece Beekman	60	48.750000	9.4597833
Jalen Bridges	61	49.200000	4.4411301
Oso Ighodaro	62	49.727273	5.2492292
Isaac Jones	63	50.500000	6.2888791
Bronny James	64	52.666667	5.0387388
Boogie Ellis	65	53.000000	7.6376262

From the table, you can see that Zaccharie Risacher had the lowest average mock pick, followed by Alex Sarr, Reed Sheppard, and Stephon Castle (the most frequent first four picks in mocks). Among the 65 prospects looked at, USC teammates Boogie Ellis and Bronny James had the lowest average mock pick. Ellis was only picked in a few mocks and the majority of mocks had the Suns taking Bronny at pick 55. Looking at standard deviation, the lowest variation in mock draft pick was Alex Sarr, who was either the first or second in all mocks (sd under 1). Sarr was followed by fellow lottery picks such as Risacher, Sheppard, Matt Buzelis, and Donovan Clingan. On the opposite side of the spectrum, the prospects with the highest variation in mock draft position was N’Faly Dante from Oregon (mock pick 37) and KJ Simpson from Colorado (mock pick 38), who both had standard deviations of over 10 (could be picked around 10 spots higher or lower than average pick). Non-surprisingly, both these players are projected to go in the second round, where team’s draft decisions change frequently. Some other interesting observations to note are Tyler Kolek having the lowest standard deviation of projected non-lottery picks (many mocks have him going 22 to Suns), Nikola Topic and Ron Holland having the largest standard deviation of projected lottery picks, Tristen Newton having the lowest standard deviation of a projected second rounder (13th lowest overall), and Jared Mccain having the 5th lowest standard deviation despite being projected as the 14th pick. Overall, 30 of the 65 prospects had a standard deviation below 5, meaning half the draft class could be picked 5+ spots higher or lower than their average mock pick.

Random Forest

To better understand why some players are mocked (on average) higher than others and which variables most influence average mock draft pick, I ran a random forest using the caret package in R. The response variable for the model was the average mock pick and the explanatory variables included: Offensive Win Shares, Defensive Win Shares, Two-point percentage, Three-point rate, Free Throw Percentage, Wingspan, Standing Vertical Leap, Age, Personal Fouls, Position, Minutes Per Game, Usage Percentage, Turnover Percentage, and Block Percentage. Explanatory features were chosen based on low RMSE, relatively high R^2, reasonable predictions, and overall relevance to the model. Win shares were the only advanced encapsulating statistics that was calculated for all prospects and positively contributed to the model (PER did not). The combination of efficiency statistics chosen (2P%, 3PAr, and FT%) produced the smallest RMSE compared to all other combinations of efficiency metrics (TS%, eFG%, 3P%, FTr, and attempts and makes instead of percentages). Wingspan and standing vertical leap were the only statistically relevant combine metrics. Age was chosen over prospect status (college year, international, or g-league) due to better model performance. The only counting stats chosen were personal fouls and minutes per game, as the variation in games played between prospects altered counting statistics substantially. However, contrary to the other counting stats, both chosen metrics improved model performance while not noticeably changing predictions. Once features were selected, the data was split, scaled, and centered and the model was ran with 5-fold cross-validation.

features <- prospect_data %>%
  select(Name, mock_average, OWS, DWS, `2P%`, `FT%`,
         Wingspan, `Standing Vertical Leap`, Age, PF, Pos,
         MPG, `USG%`, `TOV%`, `BLK%`, `3PAr`)

set.seed(123)
trainIndex <- createDataPartition(features$mock_average, 
                                  p = .7, 
                                  list = FALSE, 
                                  times = 1)
train <- features[trainIndex, ]
test  <- features[-trainIndex, ]

fitControl <- trainControl(
  method = "repeatedcv",
  number = 5,
  repeats = 10)

rf_model <- train(
  mock_average ~ ., 
  data = train[, -1],
  method = "rf",
  trControl = fitControl,
  preProcess = c("center", "scale")
)

The final random forest model explained 20.63% of the variation in mock pick average for the training set, not particularly good but relative to models with other combinations of variables it provided the best trade-off between model complexity and predictive power. The test set RMSE for this model was 11.88, which was relatively low as well. Using this model, mock draft pick was predicted for each player, producing an RMSE of 7.73. The models draft predictions, as well as the initial mock pick average and the residuals are shown below.

From the table, you can see predictions ranged from 6 to 46, with a mean of 30.89 and a majority of prospects having pick predictions between 20 and 40. Because of this, I decided to rank the prospects given their model predictions and used that ranking as the models “mock draft”. Pittsburgh point guard Bub Carrington and Baylor big Yves Missi both jumped up into the top 10 while projected lottery picks Nikola Topic and Dalton Knecht fell to picks 22 and 41 respectively. Knecht had the biggest negative residual, accomponied by other projected first rounders Tristan da Silva and Kyshawn George. Prospects with the largest positive residuals were Melvin Ajinca, Isaac Jones, Nikola Djurisic, and Bronny James, all projected second round prospects.

test_prediction <- predict(rf_model, test)

rmse <- function(actual, predicted) {
  sqrt(mean((predicted - actual)^2))
}
rmse(test$mock_average, test_prediction)

## [1] 11.87708

rf_predictions <- predict(rf_model, features)
rmse(features$mock_average, rf_predictions)

## [1] 7.727638

mock_picks <- features %>%
  mutate(rf_pred = rf_predictions) %>%
  arrange(rf_pred) %>%
  mutate(Rank = row_number(),
         rf_residual = mock_average - rf_pred) %>%
  select(Name, mock_average, rf_pred, rf_residual, Rank)

knitr::kable(mock_picks)

Name	mock_average	rf_pred	rf_residual	Rank
Zaccharie Risacher	1.333333	6.997912	-5.6645787	1
Stephon Castle	4.833333	9.272980	-4.4396467	2
Matas Buzelis	5.416667	10.185224	-4.7685577	3
Alexandre Sarr	1.833333	11.858385	-10.0250512	4
Donovan Clingan	5.750000	13.763617	-8.0136168	5
Ron Holland II	9.750000	14.927029	-5.1770285	6
Reed Sheppard	3.666667	15.724277	-12.0576108	7
Tidjane Salaun	10.666667	16.082777	-5.4161108	8
Cody Williams	9.750000	17.785251	-8.0352508	9
Carlton Carrington	19.750000	19.505192	0.2448080	10
Yves Missi	21.250000	20.766727	0.4832731	11
Kelel Ware	20.250000	21.355050	-1.1050502	12
Devin Carter	10.916667	21.555295	-10.6386284	13
Rob Dillingham	11.083333	21.645070	-10.5617370	14
Tyler Smith	27.500000	22.120956	5.3790441	15
Kyle Filipowski	24.166667	22.598074	1.5685925	16
Zach Edey	18.333333	22.764336	-4.4310025	17
Jared McCain	16.083333	22.966886	-6.8835524	18
Johnny Furphy	24.333333	23.194785	1.1385479	19
JaKobe Walter	16.500000	23.862587	-7.3625872	20
Isaiah Collier	18.833333	24.101406	-5.2680726	21
Nikola Topic	11.666667	25.034453	-13.3677863	22
DaRon Holmes II	24.916667	27.220231	-2.3035645	23
Tyler Kolek	22.083333	27.893503	-5.8101696	24
Pacome Dadiet	31.333333	28.914096	2.4192370	25
Melvin Ajinca	46.000000	29.334860	16.6651402	26
Terrence Shannon Jr.	28.416667	30.968809	-2.5521421	27
Nikola Djurisic	43.083333	32.057005	11.0263286	28
Ryan Dunn	29.666667	32.307017	-2.6403505	29
Jaylon Tyson	29.416667	33.376324	-3.9596570	30
Baylor Scheierman	32.250000	33.872855	-1.6228553	31
Jamal Shead	41.363636	34.021793	7.3418435	32
Bobi Klintman	31.916667	34.390368	-2.4737014	33
Cam Christie	36.500000	34.786834	1.7131659	34
Ulrich Chomche	44.444444	34.937049	9.5073958	35
Justin Edwards	41.666667	35.100633	6.5660336	36
AJ Johnson	38.272727	35.193657	3.0790707	37
Kyshawn George	20.916667	35.597078	-14.6804113	38
Juan Nunez	46.083333	35.930116	10.1532178	39
Tristan da Silva	18.583333	36.108122	-17.5247882	40
Dalton Knecht	9.083333	36.118379	-27.0350457	41
Trey Alexander	41.833333	36.243704	5.5896292	42
Dillon Jones	41.272727	36.374544	4.8981829	43
Kevin McCullar Jr.	37.750000	36.583071	1.1669294	44
NFaly Dante	39.333333	36.700985	2.6323484	45
Trentyn Flowers	40.833333	37.167548	3.6657850	46
Adem Bona	41.583333	37.495626	4.0877074	47
Enrique Freeman	44.285714	37.910549	6.3751653	48
Isaac Jones	50.500000	38.313360	12.1866398	49
PJ Hall	41.857143	38.353300	3.5038425	50
KJ Simpson	40.200000	38.560479	1.6395207	51
Tristen Newton	48.200000	39.281590	8.9184100	52
Keshad Johnson	45.100000	39.415559	5.6844414	53
Jonathan Mogbo	42.250000	40.147405	2.1025951	54
Jaylen Wells	48.250000	41.554553	6.6954468	55
Harrison Ingram	41.818182	41.698179	0.1200029	56
Bronny James	52.666667	42.496154	10.1705129	57
Ajay Mitchell	46.333333	42.946228	3.3871049	58
Oso Ighodaro	49.727273	43.154260	6.5730130	59
Antonio Reeves	48.666667	43.226551	5.4401159	60
Jalen Bridges	49.200000	43.693157	5.5068431	61
Cam Spencer	47.750000	44.529400	3.2206000	62
Reece Beekman	48.750000	44.689514	4.0604864	63
Pelle Larsson	45.300000	44.814170	0.4858296	64
Boogie Ellis	53.000000	46.145090	6.8549104	65

Variable importance was calculated using the R package vip (variable importance score) and is plotted below. Age was by far and away the most important variable, followed by Wingspan, BLK%, 2P%, and USG%. Both position variables (center the one left out) were the least important but were kept in the model as predictive performance decreased with them removed.

vip(rf_model, n = 15)

XGBoost

Although the random forest produced interesting results, I chose to employ an XGBoost regression model to identify any improvement in model performance. Due to time constraints and computational efficiency, I decided to use the basic parameter values for my model. The model was ran with the same centered and scaled variables and was 5-fold cross-validated.

xgb_grid <- expand.grid(
  nrounds = 100,
  max_depth = 6,
  eta = 0.3,
  gamma = 0,
  colsample_bytree = 0.8,
  min_child_weight = 1,
  subsample = 0.8
)

set.seed(123)
xgb_model <- train(mock_average ~ ., 
                   data = train[, -1], 
                   method = "xgbTree", 
                   trControl = fitControl, 
                   tuneGrid = xgb_grid,
                   preProcess = c("center", "scale"))
print(xgb_model)

## eXtreme Gradient Boosting 
## 
## 48 samples
## 14 predictors
## 
## Pre-processing: centered (15), scaled (15) 
## Resampling: Cross-Validated (5 fold, repeated 10 times) 
## Summary of sample sizes: 40, 39, 38, 37, 38, 39, ... 
## Resampling results:
## 
##   RMSE     Rsquared   MAE     
##   15.2571  0.1825577  12.92009
## 
## Tuning parameter 'nrounds' was held constant at a value of 100
## Tuning
##  held constant at a value of 1
## Tuning parameter 'subsample' was held
##  constant at a value of 0.8

For the training set, RMSE was 15.26 and the R^2 was 18.26%, both worse than that of the random forest. The test set RMSE, however, was 13.63, which can partially be attributed to the small sample of data tested on (65 prospects total). Given the limited time frame (wanting to finish project before start of draft), I decided to use this as my final XGBoost model. After predicting the mock pick average for every prospect, the RMSE was 6.97. The XGBoost models predictions and residuals were added to the random forest predictions table which is presented below.

Overall, the XGBoost model vastly outperformed the random forest, with the predictions first 6 picks being the same as the average mock pick. Predictions ranged from 1 to 53 with a mean of 30.54. The first deviation from the average mock draft comes with Kansas forward Johnny Furphy, who jumped to the 7th overall pick (third highest positive residual). Other notable observations include Nikola Topic and Dalton Knecht again falling out of lottery albeit with smaller residuals (picks 23 and 27) and projected second rounders Melvin Ajinca and Jamal Shead entering the first round. The largest negative residuals remained the same (Knecht, da Silva, and George) while the largest positive residuals included Ajinca, Shead, Furphy, and Tristen Newton.

test_predictions <- predict(xgb_model, test)
rmse(test$mock_average, test_predictions)

## [1] 13.62956

xgb_predictions <- predict(xgb_model, features)
rmse(features$mock_average, xgb_predictions)

## [1] 6.970269

mock_picks <- mock_picks %>%
  mutate(xgb_pred = xgb_predictions) %>%
  arrange(xgb_pred) %>%
  mutate(Rank = row_number(),
         xgb_residual = mock_average - xgb_pred) %>%
  select(Name, mock_average, xgb_pred, xgb_residual, 
         rf_pred, rf_residual, Rank)
knitr::kable(mock_picks)

Name	mock_average	xgb_pred	xgb_residual	rf_pred	rf_residual	Rank
Zaccharie Risacher	1.333333	1.333115	0.0002181	6.997912	-5.6645787	1
Stephon Castle	4.833333	1.833295	3.0000383	9.272980	-4.4396467	2
Matas Buzelis	5.416667	3.665965	1.7507021	10.185224	-4.7685577	3
Alexandre Sarr	1.833333	4.834532	-3.0011989	11.858385	-10.0250512	4
Donovan Clingan	5.750000	5.416933	0.3330674	13.763617	-8.0136168	5
Ron Holland II	9.750000	5.750646	3.9993539	14.927029	-5.1770285	6
Pacome Dadiet	31.333333	6.154612	25.1787213	28.914096	2.4192370	7
Tidjane Salaun	10.666667	9.750295	0.9163720	16.082777	-5.4161108	8
Cody Williams	9.750000	10.435095	-0.6850948	17.785251	-8.0352508	9
Carlton Carrington	19.750000	10.666607	9.0833931	19.505192	0.2448080	10
Yves Missi	21.250000	10.917864	10.3321362	20.766727	0.4832731	11
Kelel Ware	20.250000	12.703759	7.5462408	21.355050	-1.1050502	12
Rob Dillingham	11.083333	16.083364	-5.0000312	21.645070	-10.5617370	13
Tyler Smith	27.500000	16.500229	10.9997711	22.120956	5.3790441	14
Kyle Filipowski	24.166667	18.333504	5.8331629	22.598074	1.5685925	15
Keshad Johnson	45.100000	18.751251	26.3487488	39.415559	5.6844414	16
Johnny Furphy	24.333333	19.749514	4.5838197	23.194785	1.1385479	17
JaKobe Walter	16.500000	20.250557	-3.7505569	23.862587	-7.3625872	18
Nikola Topic	11.666667	21.250176	-9.5835088	25.034453	-13.3677863	19
DaRon Holmes II	24.916667	22.083941	2.8327262	27.220231	-2.3035645	20
Dalton Knecht	9.083333	22.233538	-13.1502043	36.118379	-27.0350457	21
Jared McCain	16.083333	23.921852	-7.8385188	22.966886	-6.8835524	22
Devin Carter	10.916667	24.021381	-13.1047147	21.555295	-10.6386284	23
Melvin Ajinca	46.000000	24.916666	21.0833340	29.334860	16.6651402	24
Terrence Shannon Jr.	28.416667	27.499571	0.9170958	30.968809	-2.5521421	25
Nikola Djurisic	43.083333	28.416672	14.6666616	32.057005	11.0263286	26
Reed Sheppard	3.666667	28.781212	-25.1145452	15.724277	-12.0576108	27
Ryan Dunn	29.666667	29.416868	0.2497985	32.307017	-2.6403505	28
Jaylon Tyson	29.416667	29.666752	-0.2500852	33.376324	-3.9596570	29
Tyler Kolek	22.083333	30.212368	-8.1290347	27.893503	-5.8101696	30
Baylor Scheierman	32.250000	31.333406	0.9165936	33.872855	-1.6228553	31
Jamal Shead	41.363636	31.917181	9.4464553	34.021793	7.3418435	32
Bobi Klintman	31.916667	32.249912	-0.3332456	34.390368	-2.4737014	33
NFaly Dante	39.333333	35.865795	3.4675382	36.700985	2.6323484	34
KJ Simpson	40.200000	36.243801	3.9561989	38.560479	1.6395207	35
Cam Christie	36.500000	36.500771	-0.0007706	34.786834	1.7131659	36
Enrique Freeman	44.285714	36.586975	7.6987392	37.910549	6.3751653	37
Zach Edey	18.333333	37.498913	-19.1655795	22.764336	-4.4310025	38
Bronny James	52.666667	37.633964	15.0327021	42.496154	10.1705129	39
Ulrich Chomche	44.444444	37.749859	6.6945856	34.937049	9.5073958	40
Justin Edwards	41.666667	38.272659	3.3940074	35.100633	6.5660336	41
AJ Johnson	38.272727	39.332607	-1.0598800	35.193657	3.0790707	42
Isaiah Collier	18.833333	39.799248	-20.9659144	24.101406	-5.2680726	43
Kyshawn George	20.916667	40.200134	-19.2834676	35.597078	-14.6804113	44
Tristan da Silva	18.583333	41.272423	-22.6890895	36.108122	-17.5247882	45
Trey Alexander	41.833333	41.583473	0.2498601	36.243704	5.5896292	46
Dillon Jones	41.272727	41.665756	-0.3930290	36.374544	4.8981829	47
Kevin McCullar Jr.	37.750000	41.818790	-4.0687904	36.583071	1.1669294	48
Trentyn Flowers	40.833333	41.856918	-1.0235850	37.167548	3.6657850	49
Adem Bona	41.583333	42.250164	-0.6668307	37.495626	4.0877074	50
Reece Beekman	48.750000	43.599632	5.1503677	44.689514	4.0604864	51
Isaac Jones	50.500000	44.284580	6.2154198	38.313360	12.1866398	52
PJ Hall	41.857143	44.444447	-2.5873037	38.353300	3.5038425	53
Tristen Newton	48.200000	45.299885	2.9001152	39.281590	8.9184100	54
Jonathan Mogbo	42.250000	46.082729	-3.8327293	40.147405	2.1025951	55
Jaylen Wells	48.250000	46.333057	1.9169426	41.554553	6.6954468	56
Harrison Ingram	41.818182	47.749931	-5.9317495	41.698179	0.1200029	57
Ajay Mitchell	46.333333	48.249256	-1.9159228	42.946228	3.3871049	58
Oso Ighodaro	49.727273	48.666973	1.0602996	43.154260	6.5730130	59
Antonio Reeves	48.666667	48.749405	-0.0827382	43.226551	5.4401159	60
Jalen Bridges	49.200000	49.199665	0.0003349	43.693157	5.5068431	61
Cam Spencer	47.750000	49.726696	-1.9766960	44.529400	3.2206000	62
Juan Nunez	46.083333	49.946163	-3.8628298	35.930116	10.1532178	63
Pelle Larsson	45.300000	52.666843	-7.3668434	44.814170	0.4858296	64
Boogie Ellis	53.000000	52.999729	0.0002708	46.145090	6.8549104	65

Just as for the random forest model, vip was used to calculate variable importance. There wasn’t much change in importance between models, as age also dominated the XGBoost model. Small changes can be seen between the other top variables as well as the rise of defensive win shares as the fourth most important variable. Position is still has little importance along with 3PAr and FT%.

vip(xgb_model, n = 15)

Conclusion

Despite the lack of a definitive top-tier talent, the 2024 NBA draft presents a challenging landscape for teams due to the high variability in mock draft positions. Teams are likely to prioritize fit and potential over established hierarchy, making this draft unpredictable and potentially yielding surprises in player selections.

Both random forest and XGBoost were utilized to model average mock pick based on several important game- and player-level statistics. Results found XGBoost outperformed random forest and that attributes such as Age, Wingspan, BLK%, Defensive Win Shares, 2P%, and USG% all greatly impact mock draft position while a players position on the court, 3PAr, and FT% have little to no impact relative to other variables in the models.

This analysis provides an understanding of how statistical modeling can assist in predicting draft outcomes, although further refinements and data enhancements could improve predictive accuracy and robustness. Future work could account for the ordinal nature of draft picks and how standard deviation in mock draft pick affects model predictions.