Objective

The purpose of this assignment is to find NBA players who perform well but earn low salaries, in the hope of recruiting them to your team (currently the worst team in the NBA).

Points:

  • Determine a way to use clustering to estimate, based on performance, whether players are underpaid or overpaid.

  • Select the players expected to be best for the team and explain why. Include examples of good, middling, and not-so-good cases.

Methodology: Variable Selection for Clustering

Clustering emphasizes Field Goal Percentage, Free Throw Percentage, and Points as measures of player performance, each set against Salary. Field Goal Percentage is the share of field goal attempts (two- and three-point shots) that a player makes. Free Throw Percentage is the share of free throw attempts made; free throws account for the points that are not scored from field goals. Together, the success rates captured by these two percentages could be indicative of performance, since both contribute to total points scored. Salary was chosen to gauge recruitment possibilities, because the purpose of this task is to find players who are underpaid by their current teams despite performing well. Offering such players a salary that better matches their performance should increase the likelihood of recruiting them and, in turn, improve our team's performance.
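Because Salary is measured in dollars while the two percentages lie between 0 and 1, distance-based clustering can be dominated by Salary unless the variables are put on a common scale. The sketch below shows one way to do that with z-scores. The standardization step is an assumption (the report does not say whether the variables were scaled), and the four rows are hypothetical placeholders, not real player data.

```python
import statistics


def standardize(column):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    stdev = statistics.pstdev(column)
    return [(value - mean) / stdev for value in column]


# Hypothetical rows: (field goal %, free throw %, points, salary in USD).
players = [
    (0.45, 0.80, 1500, 4_000_000),
    (0.52, 0.70, 900, 2_500_000),
    (0.48, 0.88, 2000, 35_000_000),
    (0.60, 0.55, 600, 1_800_000),
]

# Standardize each variable (column) separately, then rebuild the rows.
columns = [list(column) for column in zip(*players)]
scaled_columns = [standardize(column) for column in columns]
scaled_players = list(zip(*scaled_columns))

for row in scaled_players:
    print([round(value, 2) for value in row])
```

After scaling, each variable contributes on a comparable footing to the distances that the clustering relies on.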

Plots of the Cluster Models per Performance Variable

Cluster Model of Field Goal Percentage vs Average Salary per Player

From the graph, a considerable number of players have field goal percentages between 0.50 and 0.75. Even so, the majority, as shown by the density of both clusters, have low salaries that do not reflect these percentages. Salaries span from low to high, but the great majority of players earn lower salaries despite equivalent (or comparable) field goal percentages.

Cluster Model of Free Throw Percentage vs Average Salary per Player

From the graph, a considerable number of players have relatively high free throw percentages. Even so, the majority, as shown by the density of both clusters, have low salaries. Some players have correspondingly high salaries, but they are far fewer than the dense group with high free throw percentages and low salaries.

Cluster Model of (Total) Points Earned vs Average Salary per Player

In contrast with the other two variables, points scored seems at first glance to track salary more closely. Players with few points tend to have low salaries, and those with many points tend to have salaries that match their performance. However, some players still have low salaries despite high performance, and others have high salaries despite low performance. Cluster density is highest where both points and salaries are low; the other cluster spans a wider range and covers most of the remaining cases.

Visual Representations to find ‘k’

Elbow Method

The Elbow Method computes the percentage of variance explained by the clusters for a range of cluster counts. The resulting line graph plots the ratio (inter-cluster variance / total variance) against k, and the goal is to find the kink in that curve, which marks the optimal k: beyond it, adding clusters explains little additional variance. In this case, the kink appeared at k=2.
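The ratio behind the Elbow Method can be sketched in a few lines. The example below is a minimal illustration, not the code used for this report: it runs a plain k-means with a deterministic farthest-first initialization (an assumption made so the output is reproducible), computes inter-cluster variance / total variance as 1 - within / total, and picks the kink as the k where the gain in explained variance drops the most (also an assumption; in the report the kink was read off the graph). The points are hypothetical, already-scaled (performance, salary) pairs.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(dim) / len(points) for dim in zip(*points))


def kmeans(points, k, iterations=100):
    """Plain k-means with deterministic farthest-first initial centers."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest chosen center.
        centers.append(max(points, key=lambda p: min(squared_distance(p, c) for c in centers)))
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        updated = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, clusters


def explained_variance(points, k):
    """Inter-cluster variance / total variance, computed as 1 - within / total."""
    overall = centroid(points)
    total = sum(squared_distance(p, overall) for p in points)
    centers, clusters = kmeans(points, k)
    within = sum(squared_distance(p, centers[i])
                 for i, cluster in enumerate(clusters) for p in cluster)
    return 1 - within / total


def elbow_k(ratios):
    """Pick the k after which the gain in explained variance drops the most."""
    ks = sorted(ratios)
    gains = {k: ratios[k] - ratios[k - 1] for k in ks[1:]}
    return max(ks[1:-1], key=lambda k: gains[k] - gains[k + 1])


# Hypothetical scaled (performance, salary) points forming two groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.1, 5.1)]

ratios = {k: explained_variance(points, k) for k in range(1, 5)}
for k, ratio in ratios.items():
    print(f"k={k}: explained variance ratio = {ratio:.4f}")
print("elbow at k =", elbow_k(ratios))
```

On this toy data the ratio jumps from 0 at k=1 to nearly 1 at k=2 and then flattens, which is the shape of curve the Elbow Method looks for.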

NbClust Method

The histogram of the results shows how often each candidate ‘k’ was selected as best across multiple methods, with values of k up to 15 considered. NbClust runs 30 different tests and provides a “majority vote” for the best k (# of clusters) to use. Overall, k=2 is the recommended ‘majority vote’ with 9 votes using the NbClust method.
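The “majority vote” itself is a simple tally, sketched below. This is not a call to NbClust (an R package); it only illustrates the counting step. In the list of votes, only the 9 votes for k=2 come from this report; every other entry is a made-up placeholder.

```python
from collections import Counter

# One "best k" per index. Only the 9 votes for k=2 match the report;
# all other entries are hypothetical placeholders for illustration.
best_k_per_index = [2] * 9 + [3] * 5 + [4] * 4 + [6] * 3 + [15] * 2

votes = Counter(best_k_per_index)
recommended_k, vote_count = votes.most_common(1)[0]

# Text version of the histogram: one bar per candidate k.
for k in sorted(votes):
    print(f"k={k:>2} {'#' * votes[k]}")
print(f"majority vote: k={recommended_k} with {vote_count} votes")
```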

Finding ‘k’ Analysis

What differences and similarities did you see between how the clustering worked for the datasets?

Similarities: Both methods resulted in the same k value. They are also somewhat alike in how they select it: each looks for a sudden peak or change relative to neighboring k values.

Differences: The differences lie in the process. NbClust compares the best k values across multiple methods and takes the most frequent one, using a ‘majority vote’ system. The Elbow method instead computes (inter-cluster variance / total variance) for every k, and the optimal k is the one at which the curve, which resembles a logarithmic curve, has a kink. In summary, the Elbow method depends on the (inter-cluster variance / total variance) ratio, whereas NbClust does not rely on that single ratio.

Both methods ultimately indicate that the optimal k (# of clusters) is k=2, which was the value used for the cluster models in all cases with the specified variables.

Player Analysis

Recruit!

Cases of ‘Yes’: these players should be recruited to our team because of their high performance and low salaries with their current teams. Adding them could improve our team's standing thanks to their strong performance in games. Using their low salaries to our advantage, we could increase the chances of recruiting them by offering salaries that better match their point contributions.

  1. Luka Doncic

  2. Trae Young

  3. Zion Williamson

Maybe~ Could be Considered

Cases of ‘Maybe’: these players are not a complete ‘Yes’ but could be considered. Their salaries are not as low as those of the ‘Yes’ cases, but they still fall below the approximate median relative to other players despite equally good performance. As with the ‘Yes’ cases, these players could be convinced to join our team given the right monetary incentive (a competitive salary).

  1. Zach LaVine

  2. Julius Randle

  3. Jerami Grant

Not Likely

Cases of ‘No’: these players have high performance and are paid adequately, with high salaries that reflect their work. This implies that there is little chance of recruiting them without offering a higher salary (better incentives) than they have at the moment. Doing so may mean paying more than they are worth to convince them to join our team, especially as other factors may contribute to their decision, such as team popularity or standing relative to others (i.e., some teams have more famous names or stronger game histories).

  1. Stephen Curry

  2. LeBron James

  3. Damian Lillard
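The three groups above can be approximated with simple thresholds, as in the sketch below. This is an illustrative rule, not the procedure used to pick the players listed here: the cutoffs (median points, lower-quartile salary, median salary), the function name `recruitment_tier`, and the extra "low performance" label are all assumptions, and the players are hypothetical placeholders rather than real data.

```python
import statistics

# Hypothetical players: name -> (total points, salary in USD). Not real data.
players = {
    "Player A": (2100, 1_500_000),
    "Player B": (1900, 9_000_000),
    "Player C": (2200, 38_000_000),
    "Player D": (400, 2_000_000),
    "Player E": (700, 12_000_000),
    "Player F": (900, 20_000_000),
}

all_points = [points for points, _ in players.values()]
all_salaries = [salary for _, salary in players.values()]

# Assumed cutoffs: median points, lower-quartile salary, median salary.
median_points = statistics.median(all_points)
low_salary_cut, median_salary, _ = statistics.quantiles(all_salaries, n=4)


def recruitment_tier(points, salary):
    """Label a player using the assumed point and salary cutoffs."""
    if points <= median_points:
        return "low performance"  # not a recruitment target in this sketch
    if salary <= low_salary_cut:
        return "recruit"          # high performance, low salary
    if salary <= median_salary:
        return "maybe"            # high performance, salary below the median
    return "not likely"           # high performance, already well paid


tiers = {name: recruitment_tier(*stats) for name, stats in players.items()}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```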

3D Visualizations

Plot of Free Throw Percentage & Points against Salary

Plot of Field Goal Percentage & Points against Salary