Overview

Goal: Find players who are high performing but not highly paid, whom you could steal to get the team to the playoffs!

Details:

  • Use clustering to estimate, based on performance, whether players are generally underpaid or overpaid.
  • Then select the three players you believe would be best for your team and explain why.
  • Provide a well-commented and clean (knitted) report of your findings that can be presented to your GM. Include a rationale for variable selection, details on your approach, and an overview of the results with supporting visualizations.

Read in Data, Merge Data

We first imported both data sets and merged them. We then cleaned the merged data set by removing any rows that contain NAs and any duplicate players, and normalized the variables we are interested in.
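The merge, clean, and normalize steps described above can be sketched in base R as follows. The two small data frames, the `Player` join key, the column names, and the min-max `normalize` helper are all assumptions made for illustration; the report does not state the actual file names, columns, or normalization method.

```r
# Hypothetical stand-ins for the two source tables; the real files and
# column names are not given in this report.
stats <- data.frame(
  Player = c("A", "B", "B", "C", "D"),
  PTS    = c(20, 12, 12, NA, 8),
  MP     = c(2400, 1800, 1800, 900, 600),
  Age    = c(22, 27, 27, 31, 24)
)
salaries <- data.frame(
  Player = c("A", "B", "C", "D"),
  Salary = c(3e6, 9e6, 1.5e7, 1e6)
)

# Merge on the shared player key (assumed here to be `Player`).
merged <- merge(stats, salaries, by = "Player")

# Drop rows with any NA, then keep only the first row for each player.
merged <- merged[complete.cases(merged), ]
merged <- merged[!duplicated(merged$Player), ]

# Min-max normalization to [0, 1]; one common choice, assumed here.
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features <- c("PTS", "MP", "Age")
merged[features] <- lapply(merged[features], normalize)

print(merged)
```

Normalizing matters for k-means because the algorithm uses Euclidean distance, so a variable on a large scale (such as minutes played) would otherwise dominate one on a small scale (such as age).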

Examine Relationship Between Variables and Salary to Narrow Down Features

The three plots below show that points scored, minutes played, and age each correlate well with salary. Because of this, we use these variables to create our clusters.
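A numeric check can complement the plots. The sketch below computes the Pearson correlation of each candidate feature with salary; the data are synthetic and the column names (`PTS`, `MP`, `Age`, `Salary`) are assumed, so the printed values say nothing about the real players.

```r
set.seed(1)
n <- 50

# Synthetic player data, for illustration only.
players <- data.frame(
  PTS = runif(n, 2, 30),
  MP  = runif(n, 200, 3000),
  Age = sample(19:38, n, replace = TRUE)
)

# Synthetic salary that rises with points and minutes, plus noise.
players$Salary <- 2e5 * players$PTS + 1e3 * players$MP + rnorm(n, sd = 1e6)

# Pearson correlation of each candidate feature with salary.
salary_cor <- sapply(players[c("PTS", "MP", "Age")], cor, y = players$Salary)
print(round(salary_cor, 2))
```

Features with a correlation near zero carry little salary signal and are weaker candidates for the clustering step.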

Begin Clustering with K = 2

We begin clustering with K = 2 to see how it performs. We will consider other values of K after examining the elbow and NbClust methods.
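A minimal sketch of this step with base R's `kmeans()` is shown here. The feature values are synthetic and already scaled to [0, 1], and the choice of `nstart = 25` is an assumption rather than the setting used in this report.

```r
set.seed(42)

# Synthetic, already-normalized features standing in for the real data.
features <- data.frame(
  PTS = c(runif(20, 0, 0.4), runif(20, 0.6, 1)),
  MP  = c(runif(20, 0, 0.4), runif(20, 0.6, 1)),
  Age = runif(40)
)

# k-means with two centers; nstart reruns the algorithm from several
# random starts and keeps the solution with the lowest within-cluster
# sum of squares.
km2 <- kmeans(features, centers = 2, nstart = 25)

print(table(km2$cluster))  # cluster sizes
print(km2$centers)         # cluster centroids in feature space
```

Salary is deliberately left out of the clustering features in this sketch, so that clusters describe performance and salary can then be compared within each cluster.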

Visualizing Output for K = 2

We want to see what our two clusters look like in terms of salary across three-pointers and minutes played. The dark blue points in the top right are the players we may be interested in recruiting: they score a lot of threes and play a lot of minutes, yet are paid at the lower end of the pay scale.
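The "top right but low paid" reading of the plot can also be expressed as a simple filter. This is only a sketch: the player rows are made up, the column names (`X3P`, `MP`, `Salary`) are assumed, and the cutoffs (upper quartile for performance, below the median for salary) are illustrative choices, not thresholds used in this report.

```r
# Made-up, normalized values for eight hypothetical players.
players <- data.frame(
  Player = paste0("P", 1:8),
  X3P    = c(0.90, 0.80, 0.20, 0.85, 0.10, 0.50, 0.95, 0.30),
  MP     = c(0.90, 0.85, 0.30, 0.80, 0.20, 0.50, 0.90, 0.40),
  Salary = c(0.20, 0.90, 0.10, 0.30, 0.05, 0.50, 0.95, 0.20)
)

# Illustrative thresholds: upper quartile for performance,
# below the median for salary.
high_3p  <- players$X3P >= quantile(players$X3P, 0.75)
high_mp  <- players$MP  >= quantile(players$MP, 0.75)
low_paid <- players$Salary < median(players$Salary)

# Players in the top right of the plot who are also paid below the median.
candidates <- players[high_3p & high_mp & low_paid, ]
print(candidates)
```

A filter like this gives a shortlist to check against the plot rather than a replacement for it; moving the cutoffs changes who appears on the list.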

Visualizing in 3D K = 2

We also view the data in 3D so we can bring in our third variable, age. This lets us see how age relates to minutes played and three-pointers scored, and whether it gives us additional information about which players we should consider underpaid.

Assessing How Good our Clustering Was

The variance accounted for by the clusters is about 0.44. We will use this figure as a baseline for comparison when we try other values of K.

## [1] 0.4448182
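For reference, this figure is the ratio of the between-cluster sum of squares to the total sum of squares, both of which are returned by `kmeans()`. The sketch below computes it on synthetic data, so its value will not match the one reported above.

```r
set.seed(42)

# Synthetic two-group data, for illustration only.
features <- data.frame(
  x = c(rnorm(30, mean = 0), rnorm(30, mean = 4)),
  y = c(rnorm(30, mean = 0), rnorm(30, mean = 4))
)

km <- kmeans(features, centers = 2, nstart = 25)

# Between-cluster sum of squares divided by total sum of squares.
var_explained <- km$betweenss / km$totss
print(var_explained)
```

A value near 1 means most of the spread in the data lies between clusters rather than within them; a value near 0 means the clusters explain little.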

Running the Elbow Method to Assess Whether There Is a Better K

We run the elbow method to assess whether it would be worth using a K other than 2. The curve shows markedly diminishing returns beyond K = 3. It may therefore be worth raising K to 3, but going past K = 3 does not add much to the ratio of inter-cluster variance to total variance.
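One way to build the elbow curve is to rerun k-means over a range of K and record the variance ratio each time. The helper name `explained_by_k`, the range 2 to 6, and the synthetic three-group data are assumptions for this sketch.

```r
# Ratio of between-cluster to total sum of squares for each K.
# `explained_by_k` is a hypothetical helper name used for this sketch.
explained_by_k <- function(data, k_values, nstart = 25) {
  sapply(k_values, function(k) {
    km <- kmeans(data, centers = k, nstart = nstart)
    km$betweenss / km$totss
  })
}

set.seed(7)

# Synthetic data with three well-separated groups.
features <- data.frame(
  x = c(rnorm(25, mean = 0), rnorm(25, mean = 5), rnorm(25, mean = 10)),
  y = c(rnorm(25, mean = 0), rnorm(25, mean = 5), rnorm(25, mean = 0))
)

k_values <- 2:6
ratio <- explained_by_k(features, k_values)

# Marginal gain from adding one more cluster; the elbow is where it flattens.
elbow <- data.frame(k = k_values, ratio = ratio, gain = c(NA, diff(ratio)))
print(elbow)
```

Plotting `ratio` against `k` gives the usual elbow curve; the `gain` column shows the same information numerically.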

Run NbClust to See Which K It Recommends

We also use NbClust to see how many clusters each of its methods recommends. The histogram shows that K = 2, 3, and 5 are the most frequently suggested numbers of clusters. Because we have already run K = 2, and the elbow method showed diminishing returns after K = 3, my suggestion is to also run K = 3 and see whether three clusters give us additional insights.
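The histogram is a tally of how many indices vote for each K. The sketch below reproduces that tally in base R. It assumes the input has the shape of NbClust's `Best.nc` result (one column per index, with a row named `Number_clusters`); the `vote_for_k` helper is hypothetical and the votes shown are invented for illustration, not the results of this report.

```r
# Count how many indices vote for each K. `best_nc` is assumed to have
# the shape of NbClust's `Best.nc`: one column per index and a row
# named "Number_clusters".
vote_for_k <- function(best_nc) {
  votes <- table(best_nc["Number_clusters", ])
  sort(votes, decreasing = TRUE)
}

# Invented votes, for illustration only (not this report's results).
example_best_nc <- rbind(
  Number_clusters = c(KL = 2, CH = 3, Hartigan = 2, Silhouette = 5,
                      Dunn = 3, DB = 2),
  Value_Index     = c(1.10, 250, 12.4, 0.45, 0.12, 0.90)
)

votes <- vote_for_k(example_best_nc)
print(votes)
```

With the NbClust package installed, the real input would come from a call along the lines of `NbClust(data, distance = "euclidean", min.nc = 2, max.nc = 10, method = "kmeans", index = "all")`, whose `Best.nc` element could then be passed to `vote_for_k`. The argument values in that call are assumptions, not the settings used here.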

Compare 2 Clusters with 3 Clusters to See Which Is Better

In the steps below we repeat our first few steps, now using K = 3. We then re-plot the results in order to pick the players we believe are underpaid and should consider recruiting for the Wizards.

Visualizing Output for K = 3

Assessing How Good Our Clustering Was for K = 3

As expected, the clusters account for more of the variance with K = 3 (about 0.62) than with K = 2 (about 0.44).

## [1] 0.6212126

Conclusions

1. Donovan Mitchell - Utah Jazz

2. Trae Young - Atlanta Hawks

3. Luka Dončić - Dallas Mavericks

Ultimately, after looking at our visualizations, my suggestion is to recruit Donovan Mitchell from the Utah Jazz, Trae Young from the Atlanta Hawks, and Luka Dončić from the Dallas Mavericks. I selected these three players after the rigorous clustering process above. They are our best choices because they are high performing yet underpaid. Our visualizations show that these players play a lot of minutes and score a lot of three-pointers, but are paid relatively little compared with other top performers. They are also relatively young compared with the rest of the players in our data. Thus, based on this clustering exercise, our best bet for making it to the finals this year is to recruit these players to the Wizards.