The 2016-17 NBA season is having one of the most hotly contested MVP races of all time at a time when the definition of an MVP is at its most blurry. How do we quantify one individual’s value to his team? One way to tackle this issue is to differentiate between a player’s performance in wins and in losses. This may help us determine which players have the largest effect on the outcome of their team, so we chose to go down that analytical path.
The media, the incredible box scores these players produce, and the success of their teams have allowed us to focus on the five primary players in contention for this award: Russell Westbrook, James Harden, LeBron James, Kawhi Leonard, and Isaiah Thomas. We know that each is the best player on his team, and we know that each of their teams wins frequently, but we are unable to quantify exactly how much value each adds to his team. How much do their individual performances affect whether their teams win or lose? To determine this, we constructed a logistic regression model based on a few key variables.
The outcome variable we are attempting to predict is a binary value that is 1 for a win and 0 for a loss. The first variable we added to the model is called Game Score. This statistic is a linear combination of the important basic statistics a player accumulates in a game, and its purpose is to roughly estimate a player’s productivity in that game. This is the formula for Game Score:
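\[ \text{Game Score} = \text{Points} + 0.4 \times \text{Field Goals Made} - 0.7 \times \text{Field Goals Attempted} - 0.4 \times (\text{Free Throws Attempted} - \text{Free Throws Made}) + 0.7 \times \text{Offensive Rebounds} + 0.3 \times \text{Defensive Rebounds} + \text{Steals} + 0.7 \times \text{Assists} + 0.7 \times \text{Blocks} - 0.4 \times \text{Fouls} - \text{Turnovers} \]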
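As a quick illustration, the formula above can be written as a small function. This is only a sketch: the `BoxScore` class, its field names, and the stat line are made up for the example and do not come from our data.

```python
from dataclasses import dataclass


@dataclass
class BoxScore:
    """One player's basic stats for a single game (hypothetical field names)."""
    points: int
    fg_made: int
    fg_attempted: int
    ft_made: int
    ft_attempted: int
    off_reb: int
    def_reb: int
    steals: int
    assists: int
    blocks: int
    fouls: int
    turnovers: int


def game_score(b: BoxScore) -> float:
    """Apply the Game Score formula term by term."""
    return (
        b.points
        + 0.4 * b.fg_made
        - 0.7 * b.fg_attempted
        - 0.4 * (b.ft_attempted - b.ft_made)
        + 0.7 * b.off_reb
        + 0.3 * b.def_reb
        + b.steals
        + 0.7 * b.assists
        + 0.7 * b.blocks
        - 0.4 * b.fouls
        - b.turnovers
    )


# Made-up stat line, for illustration only.
line = BoxScore(points=30, fg_made=10, fg_attempted=20, ft_made=8,
                ft_attempted=10, off_reb=2, def_reb=8, steals=2,
                assists=10, blocks=1, fouls=3, turnovers=5)
print(round(game_score(line), 1))  # 26.5
```

Missed shots, fouls, and turnovers pull the score down, while everything else pushes it up, so an inefficient high-scoring night can rate lower than a quieter, cleaner one.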
The next variable we added to the model was a binary variable signifying whether the game was played at the player’s home arena or at the opposing team’s arena. Since players tend to play better at home with their fans cheering for them (rather than opposing fans booing them), we wanted to account for this potential shift. Our final variable was an interaction between the days of rest a player gets before a game (how many days it has been since the previous game) and the number of minutes he plays in that game. This is important because it serves as a proxy for a player’s fatigue on a given night, which could also be a significant factor in his level of play. Back-to-backs (playing games on consecutive days) are some of the most grueling stretches of scheduling in the NBA, and fatigue sets in the most on such occasions. Furthermore, a player is more likely to grow exhausted when playing close to or more than 40 minutes of a 48-minute game. We wanted to account for this in our model as well, as it could also alter a player’s level of play.
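For readers who want to see the shape of such a model, here is a minimal sketch of a logistic regression on the three predictors described above: Game Score, a home indicator, and the rest-days-times-minutes interaction. It is not our actual fitting code. The game rows are made up, the function names are hypothetical, and the fit uses plain gradient descent on standardized features rather than a statistics package.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def raw_features(game_score, is_home, rest_days, minutes):
    """Predictors described in the text: Game Score, home flag, rest x minutes."""
    return [game_score, 1.0 if is_home else 0.0, rest_days * minutes]


def standardize(rows):
    """Center and scale each column, then prepend an intercept column of 1s."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[1.0] + [(v - m) / s for v, m, s in zip(r, means, sds)]
            for r in rows]


def predict(w, x):
    """Predicted win probability for one standardized feature row."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def log_loss(w, X, y):
    """Mean negative log-likelihood of the observed results."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the mean log loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


# Made-up games: (game_score, is_home, rest_days, minutes, won)
games = [
    (32.1, True, 2, 36, 1), (18.4, False, 1, 41, 0), (25.0, True, 3, 34, 1),
    (12.7, False, 2, 38, 0), (28.3, False, 1, 39, 1), (21.5, True, 2, 35, 0),
    (35.6, True, 4, 33, 1), (16.9, False, 2, 40, 0), (23.8, False, 3, 37, 1),
    (19.2, True, 1, 42, 1), (27.4, False, 2, 36, 0), (30.9, True, 2, 34, 1),
]
X = standardize([raw_features(*g[:4]) for g in games])
y = [g[4] for g in games]
w = fit_logistic(X, y)
probs = [predict(w, xi) for xi in X]
for g, p in zip(games, probs):
    print(g, round(p, 2))
```

In this sketch a back-to-back is coded as one rest day, matching the definition of rest as days since the previous game. Standardizing the columns only helps gradient descent converge; it does not change the fitted probabilities in any meaningful way.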
We then used this model to create a win probability for each game a player played in. Based on a player’s fatigue, location, and box score statistics (all of which can be determinants of a player’s performance), these probabilities estimate how likely a player’s team is to win a given game. Here is a plot of the range of win probabilities for each of our five MVP candidates’ teams:
We can see general trends in each player’s play from this plot. For example, we see that Russell Westbrook and LeBron James have around 30 and 18 games respectively in which their team is expected to lose (less than .50 win probability) based on their play. The other three candidates, however, each have approximately 10 such games. This could speak to the player’s lack of impact, but it could also speak to the supporting cast surrounding that player. Westbrook (in some people’s opinions) has a weaker group of players on his team than any of the other candidates, so it makes sense that he may be less able to singlehandedly guarantee his team a win.
Furthermore, Isaiah Thomas often has a high win probability but he never seems to have those few games where his team is basically guaranteed to win (greater than .95 win probability). This could signify that he may not have those epic performances that will be talked about for years to come, while each of the other players does have at least a few such games. Kawhi Leonard has the most consistently good play, with his team’s win probability almost always above .60 based on his play. However, his team does have the most wins of any of these players’ teams, so that may speak to their skill as a team.
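The counts we read off the plot (games below .50, games above .95, and how often a player stays above .60) are simple tallies over the predicted probabilities. Here is a sketch with made-up numbers; `summarize` and the sample list are hypothetical and are not taken from any candidate’s season.

```python
def summarize(probs):
    """Tally the thresholds discussed in the text for one player's games."""
    n = len(probs)
    return {
        "games": n,
        "expected_losses": sum(p < 0.50 for p in probs),  # team expected to lose
        "near_locks": sum(p > 0.95 for p in probs),       # win almost guaranteed
        "share_above_60": sum(p > 0.60 for p in probs) / n,
    }


# Made-up predicted win probabilities, for illustration only.
sample = [0.42, 0.55, 0.97, 0.61, 0.48, 0.88, 0.73, 0.96, 0.35, 0.66]
print(summarize(sample))
```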
We next grew curious about how these predicted win probabilities from our model stack up against the actual results of the games. By comparing these predictions to the actual results, we could see how well a player has to play to significantly influence his team’s victory, and we could also take note of how often he actually achieved that victory. We decided to make side-by-side box plots for each player, one showing the predicted win probability in games the player’s team actually won and the other showing the predicted win probability in games the player’s team lost. Here is the plot for Russell Westbrook:
We see here quite a large disparity between Russell Westbrook’s play when his team wins and his play when they lose. When his play puts them between a 0.4 and a 0.6 (40% and 60%) win probability, they actually tend to lose more of those games than they win. This suggests that his team’s play is very heavily reliant on his play, which further suggests that he may not be surrounded by very talented players. However, his team won only a few fewer games than the other candidates’ teams, even though he gives his team the lowest average win probability of all the candidates. Next, we have these box plots for James Harden:
In James Harden’s case, the disparity in his play between wins and losses is not as significant. A reason for this may be that he has a better team than Westbrook, but he also is more consistent, with more games above a .50 win probability (as seen in the line graph). Furthermore, when Harden plays at around a 0.75 win probability level, his team almost never loses. However, the lack of disparity between wins and losses in average win probability shows that his team is less directly reliant on his play. Credit his coach and their system for putting the team in a position to win even if their star player has a bad game. Now, let’s look at LeBron James’ box plots:
LeBron plays at an extremely high level in his team’s victories, but this is unsurprising, as he is widely known as the best and most talented player in the world (Note: this does not mean that he necessarily should win MVP, as the MVP is an award commemorating an individual season). The surprising part of this graph is LeBron’s propensity to play poorly more often than other candidates and how directly that correlates to a loss for his team. When he plays at below a .50 win probability, his team is almost guaranteed to lose, and as seen in the line graph, he plays at this level in more than one in five games (18 out of 82). He is known to coast through the regular season, saving his full effort for the playoffs. Do we credit him for singlehandedly ensuring wins for his team more often than other candidates, or do we knock him for the games in which he lacks effort? Or both? It’s an interesting and common question that many have about him. Let us now see Kawhi Leonard’s box plots:
These plots show how consistently high Leonard’s level of play is, but also how little difference there is between the win probabilities his play provides in wins and in losses. Since his team wins so many games (61 out of 82 on the season), it makes sense that his win probability is so consistently high. He plays for an all-time great head coach and alongside a supporting cast full of smart, veteran players. We do see here, though, that when he does dip below the .60 mark, his team usually loses. Thus, although his team sometimes loses even when he plays well, the rare occasions when he does slip up significantly raise the likelihood of a loss. This implies that the team is dependent on his play to a certain degree. Finally, let’s look at Isaiah Thomas’ box plots:
Thomas appears to play well more often than LeBron does, but he does not play as consistently as Leonard does. His team also relies on him greatly, in the same way that Leonard’s team relies on Leonard: when Thomas plays at below a 0.60 win probability, the chance of the team losing increases sharply. However, as we saw in the line graph, his peak level of play just doesn’t reach that of the other players, which hurts his case.
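The comparisons behind these box plots come down to two things: splitting a player’s predicted win probabilities by the actual result, and checking how often his team really won inside a given probability band. The sketch below shows both using only made-up numbers; the function names are hypothetical, and the quartile method is one common convention that may differ slightly from what a plotting library draws.

```python
from statistics import quantiles


def split_by_result(probs, results):
    """Separate predicted win probabilities into actual wins and actual losses."""
    wins = [p for p, r in zip(probs, results) if r == 1]
    losses = [p for p, r in zip(probs, results) if r == 0]
    return wins, losses


def five_number_summary(values):
    """The five values a box plot is built from."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def win_rate_in_band(probs, results, low, high):
    """Actual share of wins among games with low <= predicted probability <= high."""
    in_band = [r for p, r in zip(probs, results) if low <= p <= high]
    return sum(in_band) / len(in_band) if in_band else None


# Made-up predictions and results (1 = win, 0 = loss), for illustration only.
probs = [0.82, 0.45, 0.91, 0.38, 0.67, 0.55, 0.74, 0.29, 0.60, 0.51]
results = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]

wins, losses = split_by_result(probs, results)
print("wins  ", five_number_summary(wins))
print("losses", five_number_summary(losses))
print("win rate in 0.4-0.6 band:", win_rate_in_band(probs, results, 0.4, 0.6))
```

A wide gap between the two medians points to a team whose results track the star’s play closely, while a band win rate well below the band’s own probabilities hints that the supporting cast is not picking up the slack.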
So, how do we use these analyses to determine exactly who the most valuable player is? It isn’t so simple. Although they provide us with some insight into the value of a player to his team, some non-quantitative variables are also at play. For example, the emotional impact of Russell Westbrook producing a playoff-level team after his team’s other star, Kevin Durant, left in the previous offseason is difficult to put into a model. There is also a subjective aspect in weighting the different factors of an MVP. How much do you value team success? How much do you value individual statistics? How do you judge the impact of the strength of a supporting cast? Since the definition of most valuable is inherently vague, there are many conflicting philosophies on what exactly to consider and how strongly to consider each aspect. What we have developed is simply an important tool to add to the arsenal, a way to measure a player’s statistical impact on his team’s winning.