1 Introduction

The research question our group addresses with our multiple regression model is whether an NBA player’s draft position and average points scored per game are associated with their win shares per 48 minutes over their professional career.

The data set we are using contains a range of information for each drafted player, such as round number, pick number, number of years, games, and minutes played, as well as average points scored per game and win shares, among other information. The original data set can be found here: NBA Draft Data.

We modified the data set by creating pick tiers based on a player’s draft position and removing the variables we were not analyzing. The three tiers are picks 1-10, picks 11-20, and picks 21-30.
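As a minimal sketch of the tier assignment described above, the mapping from pick number to tier could look like the following. The function name `tier_number` is borrowed from the column name in our data frame; the implementation itself is an assumption for illustration, not the code used to build the data set.

```python
def tier_number(pick: int) -> int:
    """Map a first-round pick number (1-30) to its tier (1, 2, or 3)."""
    if not 1 <= pick <= 30:
        raise ValueError(f"pick must be between 1 and 30, got {pick}")
    # Picks 1-10 -> tier 1, picks 11-20 -> tier 2, picks 21-30 -> tier 3.
    return (pick - 1) // 10 + 1
```

For example, `tier_number(10)` falls in tier 1 while `tier_number(11)` falls in tier 2.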

Win shares is a sabermetric player statistic that attempts to calculate the number of wins an individual contributed to his team. The calculation is complicated, but it involves taking player, team, and league-wide statistics and reducing them to a single number. Summing the win shares of all the players on a team at the end of the season gives approximately the number of wins for that team. The higher a player’s win shares, the better the player was for his team.

We chose to use win shares per 48 minutes rather than raw win shares. This statistic scales a player’s total win shares by the minutes they have played, estimating the player’s contribution for every 48 minutes on the floor, which is the length of one NBA game. Because win shares accumulate as a player’s career lengthens, the raw statistic favors longevity rather than showing how valuable a player is while on the floor. It also gives win shares a strongly right-skewed distribution in our data set. Win shares per 48 minutes is much closer to normally distributed and shows how valuable a player is once the minutes played are equalized across the data set.
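The scaling described above can be sketched as follows, assuming the usual definition of the statistic (total win shares divided by minutes played, multiplied by 48). The function and argument names are hypothetical, and the example values are made up for illustration.

```python
def win_shares_per_48(win_shares: float, minutes_played: float) -> float:
    """Scale total win shares to a per-48-minute rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    # 48 minutes is the length of one regulation NBA game.
    return 48.0 * win_shares / minutes_played


# Hypothetical player: 5.0 win shares over 2,400 minutes played.
example_rate = win_shares_per_48(5.0, 2400.0)
```

Two players with the same total win shares but different minutes played end up with different rates, which is why this statistic does not reward longevity on its own.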

Note: we excluded players who never made it to the NBA (due to injury, or a lack of skill that led them to play in developmental leagues or overseas). Their points scored and win shares appear as NA in the data frame we were working with.

For more information about the win share statistic: Win Share Description.

This is why we chose win shares per 48 minutes over simply win shares:

2 Exploratory data analysis

First six rows of our data set:

pick	points	win_share	win_shares_s_48	tier_number
1	NA	NA	NA	1
2	8.0	-0.2	-0.006	1
3	4.7	0.3	0.029	1
4	3.1	-0.1	-0.008	1
5	3.9	0.2	0.012	1
6	9.2	0.5	0.028	1

tier_number	mean	median
1	0.0930609	0.0920
2	0.0639242	0.0710
3	0.0613910	0.0705
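The per-tier mean and median of win shares per 48 minutes in the table above could be computed along these lines. This is a sketch using only the Python standard library; the function name and the sample rows are hypothetical and are not taken from our data set.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_tier(rows):
    """Return {tier: (mean, median)} of win shares per 48 minutes.

    rows is an iterable of (tier_number, ws_per_48) pairs; a ws_per_48
    of None marks a player with no NBA statistics and is skipped.
    """
    grouped = defaultdict(list)
    for tier, ws_per_48 in rows:
        if ws_per_48 is not None:
            grouped[tier].append(ws_per_48)
    return {tier: (mean(values), median(values))
            for tier, values in sorted(grouped.items())}


# Made-up rows for illustration only.
sample = [(1, 0.10), (1, 0.08), (1, None), (2, 0.07), (2, 0.02), (2, 0.09)]
summary = summarize_by_tier(sample)
```

In the made-up tier 2 rows, one low value pulls the mean below the median, which is the same pattern the table shows for tiers 2 and 3.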

From the visualizations, four initial trends emerge:

The numbers we are working with are quite small. This is because it is hard for most players to contribute high value to their team when only looking at their play in an average game. Most players have relatively similar win shares per 48 minutes, but these small differences lead to large differences over the course of an entire season. Again, we are using win shares per 48 minutes because its distribution is very close to normal.
The median values for win shares per 48 minutes are quite a bit higher than the mean values for tiers 2 and 3. This is because medians are more robust against outliers than means. There are several players in these tiers with very low win shares per 48 minutes. These players probably did not have very long careers, so their poor early play meant they never got a chance to redeem themselves and improve on this statistic. This observation is backed up by the outliers on the low end seen in the box plot.
Tier 1 players seem to contribute consistently better play to their teams (based on the box plot of win shares per 48 minutes) than tier 2 and tier 3 players, who look basically equivalent in the visualization.
As a player’s average points scored per game increases, that player’s win shares per 48 minutes tend to increase. This makes sense since a player who scores a lot is most likely adding a lot of value to his team.

3 Multiple regression

Our numerical outcome variable is win shares per 48 minutes played. There are two explanatory variables, one categorical and one numerical. The categorical explanatory variable is the player’s pick number separated into 3 tiers: picks 1-10, 11-20, and 21-30. The numerical explanatory variable is average points scored per game.

term	estimate	std_error	statistic	p_value	conf_low	conf_high
intercept	0.025	0.007	3.572	0.000	0.011	0.039
points	0.006	0.001	10.556	0.000	0.005	0.007
tier_number2	-0.027	0.009	-2.883	0.004	-0.045	-0.009
tier_number3	-0.023	0.009	-2.504	0.012	-0.041	-0.005
points:tier_number2	0.003	0.001	3.311	0.001	0.001	0.005
points:tier_number3	0.004	0.001	3.738	0.000	0.002	0.006
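To make the terms in the regression table concrete, the sketch below shows how one player’s values would be encoded for an interaction model of this form and how a fitted value follows from the rounded estimates above. The function names are hypothetical, and because the coefficients are rounded to three decimals the fitted values are only approximate.

```python
# Rounded estimates copied from the regression table.
COEFFICIENTS = {
    "intercept": 0.025,
    "points": 0.006,
    "tier_number2": -0.027,
    "tier_number3": -0.023,
    "points:tier_number2": 0.003,
    "points:tier_number3": 0.004,
}


def design_row(points: float, tier: int) -> dict:
    """Encode one player as the model terms; tier 1 is the baseline."""
    if tier not in (1, 2, 3):
        raise ValueError(f"tier must be 1, 2, or 3, got {tier}")
    is_tier2 = 1.0 if tier == 2 else 0.0
    is_tier3 = 1.0 if tier == 3 else 0.0
    return {
        "intercept": 1.0,
        "points": points,
        "tier_number2": is_tier2,
        "tier_number3": is_tier3,
        "points:tier_number2": points * is_tier2,
        "points:tier_number3": points * is_tier3,
    }


def predict_ws48(points: float, tier: int) -> float:
    """Fitted win shares per 48 minutes from the rounded coefficients."""
    row = design_row(points, tier)
    return sum(COEFFICIENTS[term] * value for term, value in row.items())
```

For a tier 1 player every tier indicator is 0, so only the intercept and the `points` term contribute; for tiers 2 and 3 the indicator and interaction terms shift the intercept and the slope.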

3.1 Statistical interpretation

The intercept of 0.025 corresponds to the predicted win shares per 48 minutes for a player in tier 1 with 0 points scored per game. This means that a player drafted in picks 1-10 who averages 0 points per game will “add” 0.025 wins to his team for every 48 minutes he plays. This small number makes sense because a player who is not scoring points is not going to help his team greatly, but it is interesting to note that a top 10 pick who doesn’t score any points is, on average, still slightly adding wins to his team (possibly due to his defensive abilities or other important parts of the game like assists and rebounds). In reality, though, there are probably very few top 10 picks who score 0 points per game.

The equation for the first tier of picks is \(\widehat{winshare} = 0.025 + (0.006)points\). The equation for the second tier of picks is \(\widehat{winshare} = -0.002 + (0.009)points\). Finally, the equation for the third tier of picks is \(\widehat{winshare} = 0.002 + (0.010)points\).
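The tier-specific equations follow from the regression table by adding each tier’s offset to the baseline intercept and each tier’s interaction term to the baseline slope. A small sketch of that arithmetic, using the rounded estimates from the table (the function name `tier_line` is hypothetical):

```python
# Rounded estimates from the regression table; tier 1 is the baseline.
INTERCEPT = 0.025
POINTS_SLOPE = 0.006
INTERCEPT_OFFSET = {1: 0.0, 2: -0.027, 3: -0.023}
SLOPE_OFFSET = {1: 0.0, 2: 0.003, 3: 0.004}


def tier_line(tier: int) -> tuple:
    """Return (intercept, slope) of the fitted line for one tier."""
    intercept = INTERCEPT + INTERCEPT_OFFSET[tier]
    slope = POINTS_SLOPE + SLOPE_OFFSET[tier]
    # Round to three decimals, the precision reported in the table.
    return round(intercept, 3), round(slope, 3)


lines = {tier: tier_line(tier) for tier in (1, 2, 3)}
```

For tier 2 this gives an intercept of 0.025 - 0.027 = -0.002 and a slope of 0.006 + 0.003 = 0.009, matching the equation above.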

In the exploratory data analysis, we saw a scatter plot of our data, and it is clear how these slopes correspond to that visualization. All else being equal, as average points scored per game increase, win shares per 48 minutes also tend to increase. The slopes are higher for tiers 2 and 3, meaning that as players in those tiers score more points per game their win shares per 48 minutes increase faster than those of players in tier 1, but the intercepts for tiers 2 and 3 are much lower than the intercept for tier 1. This is backed by the box plot shown in the EDA, where players in tier 1 seemed to be performing better than players in the other tiers. Tier 1 players who score few points are contributing, on average, more wins to their team than low-scoring players in tiers 2 and 3.
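Because tier 1 has the higher intercept and tiers 2 and 3 have the steeper slopes, each of those fitted lines crosses the tier 1 line at some scoring level. The sketch below finds those crossing points from the rounded tier equations; with these rounded values the crossings land at roughly 9 points per game for tier 2 and roughly 5.75 for tier 3, but the exact figures are sensitive to rounding and should be read as approximate. The function name is hypothetical.

```python
# (intercept, slope) pairs from the rounded tier equations.
TIER_LINES = {1: (0.025, 0.006), 2: (-0.002, 0.009), 3: (0.002, 0.010)}


def crossover_points(line_a: tuple, line_b: tuple) -> float:
    """Points per game at which two fitted lines predict the same value."""
    (a0, a1), (b0, b1) = line_a, line_b
    if a1 == b1:
        raise ValueError("parallel lines never cross")
    # Solve a0 + a1 * p = b0 + b1 * p for p.
    return (b0 - a0) / (a1 - b1)


tier2_crossover = crossover_points(TIER_LINES[1], TIER_LINES[2])
tier3_crossover = crossover_points(TIER_LINES[1], TIER_LINES[3])
```

Below the crossing point the tier 1 line predicts the higher win shares per 48 minutes; above it the other tier’s line does.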

One limitation of our model is the small differences in win shares. We are looking at these players’ performances from a single game perspective. It would be easier to understand the impact of these small differences in win shares per 48 minutes over an entire season.
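To give a sense of scale, a difference in win shares per 48 minutes can be converted into total win shares over a block of playing time. The sketch below uses the tier 1 and tier 2 means from the EDA table, rounded to three decimals; the figure of 2,400 minutes is a hypothetical season workload chosen for illustration, not a value from our data.

```python
def win_shares_over_minutes(ws_per_48: float, minutes: float) -> float:
    """Total win shares implied by a per-48-minute rate over given minutes."""
    return ws_per_48 * minutes / 48.0


HYPOTHETICAL_MINUTES = 2400.0  # assumed workload, for illustration only
tier1_total = win_shares_over_minutes(0.093, HYPOTHETICAL_MINUTES)
tier2_total = win_shares_over_minutes(0.064, HYPOTHETICAL_MINUTES)
season_gap = tier1_total - tier2_total
```

Under that assumption, a gap of 0.029 in win shares per 48 minutes grows to about 1.45 win shares.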

Additionally, there may be trends in the longevity of a player based on their pick tier leading to career trends in win shares that are missed by conducting the analysis on a per 48 minute basis. It may be that the top picks not only average a higher win share per 48 minutes, but also average more total minutes in their careers leading to much higher total win share values. A trend like this would be missed by our model.

Finally, our decision to group picks by tiers was done to make an entire round of a draft easier to comprehend. It is easier to look at 3 categories than 30. However, trends on a smaller scale of draft position may be missed with this approach. It could be the case that #1 overall picks are on average much better than picks 2-5 in terms of win share per 48 minutes, but our model would not reveal this trend.

3.2 Non-statistical interpretation

As expected, as average points scored per game increased, a player’s win shares per 48 minutes increased no matter where in the first round of the draft the player was selected. However, players picked in the top 10 had, on average, higher win shares per 48 minutes throughout their NBA careers than players picked in draft positions 11-30. Interestingly, as points per game increased, win shares per 48 minutes increased at the fastest rate for players picked 21-30, then players picked 11-20, and finally players picked 1-10.

The most consistent trend in our data is that as average points scored per game increased, so did win shares per 48 minutes, regardless of draft pick. The draft pick tiers also revealed differences in the value of players selected at different points in the first round, but these differences were less consistent: Tier 1 players have a higher starting point and a slower increase in win shares per 48 minutes, while Tier 2 and 3 players have lower starting points but faster rates of increase.

4 Inference for multiple regression

95% confidence intervals

Slope for Tier 1 players [0.005, 0.007]

Difference in Slope for Tier 2 players (Compared to Tier 1) [0.001, 0.005]

Difference in Slope for Tier 3 players (Compared to Tier 1) [0.002, 0.006]

The confidence interval for the slope of average points scored per game to win shares per 48 minutes for Tier 1 players does not contain 0, suggesting a significant positive relationship between average points scored per game and win shares per 48 minutes.

Furthermore, we chose to use an interaction model for our data because neither of the other two confidence intervals contains 0. These intervals assess whether the slopes for Tiers 2 and 3 differ from the slope for Tier 1. Because 0 is not in either interval, we have evidence that the association between average points scored per game and win shares per 48 minutes differs between draft tiers. In other words, an increase in a player’s points per game is unlikely to be associated with the same increase in win shares per 48 minutes in every tier. We therefore reject the idea that the relationship between average points scored per game and win shares per 48 minutes is the same across tiers.
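The check applied to each interval above is simply whether 0 lies between its endpoints. A short sketch using the three reported intervals (the dictionary keys and the function name are hypothetical labels):

```python
# 95% confidence intervals as reported above.
INTERVALS = {
    "tier 1 slope": (0.005, 0.007),
    "tier 2 slope difference": (0.001, 0.005),
    "tier 3 slope difference": (0.002, 0.006),
}


def excludes_zero(low: float, high: float) -> bool:
    """True when the interval lies entirely above or entirely below 0."""
    return low > 0 or high < 0


results = {name: excludes_zero(low, high)
           for name, (low, high) in INTERVALS.items()}
```

All three intervals lie entirely above 0, which is the basis for keeping the interaction terms in the model.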

Hypothesis tests for the slopes within each tier:
H₀: slope equal to 0, vs. H_A: slope greater than 0. For all tiers, we test at a significance level of α = 0.05.

p-values

Tier 1 p-value: < 0.001

Tier 2 p-value: 0.001

Tier 3 p-value: < 0.001

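Since each p-value is less than 0.05, we reject the null hypothesis that the corresponding coefficient is equal to 0. Note that the Tier 2 and Tier 3 p-values match the interaction terms in the regression table, so strictly they test whether those tiers’ slopes differ from the Tier 1 slope; because both differences are positive and the Tier 1 slope is itself significantly positive, the estimated slopes within Tiers 2 and 3 (0.009 and 0.010) are positive as well. Essentially, within each tier we are rejecting the assertion that there is no relationship between points per game and win shares per 48 minutes. Our data suggest there is a statistically significant relationship between points per game and win shares per 48 minutes. Thus, we believe that a player who scores more points per game will tend to have higher win shares per 48 minutes.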

5 Conclusion

As expected, as average points scored per game increased, a player’s win shares per 48 minutes increased no matter where in the first round of the draft the player was selected. Surprisingly, the slope for players in Tier 3 (picks 21-30) was the highest, meaning that for 1 more point scored per game there was a larger associated increase in win shares per 48 minutes. Players in Tier 2 (picks 11-20) had the next highest slope, and Tier 1 (picks 1-10) had the lowest. Using statistical tests, we were able to reject the assertion that there is no relationship between points per game and win shares per 48 minutes for every tier of players we analyzed. The results suggest that there is a statistically significant positive relationship between average points scored per game and win shares per 48 minutes in every tier, and that the strength of this relationship differs between tiers.

Take-home message: Most importantly, regardless of where a player is drafted within the first round, if they score more points per game, they will tend to have higher win shares per 48 minutes. Since win shares is a measure of value added to a team, our data suggest that a player who scores more points per game will, on average, add more value to his team. Secondly, players drafted in the top 10 of the NBA draft are, on average, more likely to be valuable to their team in terms of win shares per 48 minutes than players taken in the rest of the first round. Players drafted in Tier 2 (picks 11-20) and Tier 3 (picks 21-30) provide relatively equal levels of win shares per 48 minutes at a given level of average points scored per game. These findings are based on the EDA box plot and the multiple regression equations. Finally, our data suggest that players in Tiers 2 and 3, although they begin with lower win shares per 48 minutes when average points per game = 0, have, on average, a larger increase in win shares per 48 minutes for every 1 unit increase in points scored per game compared to players in Tier 1.

Limitations

The first limitation of our model is the small differences in win shares. We are looking at these players’ performances from a single-game perspective. These small differences on a 48-minute scale would become much larger if we were to analyze these players from a season-long perspective.

Also, there may be trends tied to the longevity of a player leading to career trends in win shares that are missed by conducting the analysis on a per 48 minute basis. In a future study, we could look at the effect of average points scored on total win shares for a player’s entire career, which would take into account a player’s longevity and ability to stay healthy. This would address both limitations mentioned above.

Future Research

One avenue for potential future work is to break our data set down by player position to see if any additional trends emerge. Another avenue is to use player tracking data supplied by SportVU to determine whether excess movement around the court is related to win shares per 48 minutes. Specifically, we could look at whether players who move efficiently have higher win shares per 48 minutes. If that were the case, it would suggest that fitness is likely an important prerequisite for higher win shares.

6 Citations and References

Win shares definition: https://www.basketball-reference.com/about/ws.html

Link to our data set: https://data.world/gmoney/nba-drafts-2016-1989

SportVU Basketball Player Tracking: https://www.stats.com/sportvu-basketball/

Supplementary Materials

This page gives information on the NBA leaders of statistical categories this season, including win shares and win shares per 48 minutes: https://www.basketball-reference.com/leagues/NBA_2018_leaders.html.

NBA Players’ Effectiveness Compared to Draft Position

WSJ

Thursday, March 29, 2018