Link to the dataset: https://github.com/fivethirtyeight/data/blob/master/nba-draymond/draymond.csv
Link to the data dictionary: https://github.com/fivethirtyeight/data/blob/master/nba-draymond/README.md
This dataset offers an alternative way to evaluate a defensive player's production. It looks at the space between a defender and the player he is guarding. In the current era of high-scoring basketball, it is quickly being adopted as a worthy evaluation metric. The dataset provides the number of possessions played and the DRAYMOND rating for each player in the NBA since 2014. The DRAYMOND rating is measured relative to the league average, so it can be positive or negative.
The cell below loads the required library, downloads the data, and converts it into a data frame.
#load libraries
library(RCurl)
# load the dataset from the GitHub repo
csv_dl <- getURL('https://raw.githubusercontent.com/fivethirtyeight/data/master/nba-draymond/draymond.csv')
# convert to dataframe
df <- read.csv( text = csv_dl)
# print the first 6 rows (the default for head())
head(df)
##   season       player possessions   DRAYMOND
## 1   2017   AJ Hammons    331.0258 -0.1766801
## 2   2014     AJ Price    211.7156  5.9121720
## 3   2015     AJ Price    633.5186 -1.7909210
## 4   2014 Aaron Brooks   3257.9340 -0.9529003
## 5   2015 Aaron Brooks   3984.0440 -0.1861272
## 6   2016 Aaron Brooks   2276.0170  2.2965770
The cells below take an introductory look at the data. The last cell creates a subset of the data containing only ratings for the 2014 season and prints summary statistics of DRAYMOND for that year.
# summary statistics of the DRAYMOND column
summary(df$DRAYMOND)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -45.07268  -1.09302  -0.06452  -0.08387   0.97151  61.77634
# plots a histogram of DRAYMOND
hist(df$DRAYMOND,
     breaks = 10,
     xlab = 'DRAYMOND',
     main = 'Distribution of DRAYMOND')
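The summary above shows a minimum near -45 and a maximum near 62, while the middle half of the ratings sits roughly between -1 and 1. One possible explanation, which is an assumption and not something the data dictionary is quoted as saying here, is that players with very few possessions produce noisy ratings. A minimal sketch of filtering on possessions follows. It uses only the six rows printed by head(df) so that it runs without a download; `sample_df`, `min_possessions`, and the cutoff of 1000 are hypothetical choices for illustration.

```r
# Six rows copied from the head(df) output; replace sample_df with df for the full data
sample_df <- data.frame(
  season      = c(2017, 2014, 2015, 2014, 2015, 2016),
  player      = c("AJ Hammons", "AJ Price", "AJ Price",
                  "Aaron Brooks", "Aaron Brooks", "Aaron Brooks"),
  possessions = c(331.0258, 211.7156, 633.5186, 3257.9340, 3984.0440, 2276.0170),
  DRAYMOND    = c(-0.1766801, 5.9121720, -1.7909210, -0.9529003, -0.1861272, 2.2965770)
)

# Hypothetical cutoff: keep only player-seasons with at least this many possessions
min_possessions <- 1000
filtered <- subset(sample_df, possessions >= min_possessions)

# Compare the spread of DRAYMOND before and after filtering
range(sample_df$DRAYMOND)
range(filtered$DRAYMOND)
```

Whether a cutoff like this is appropriate, and where to set it, would need to be checked against the full dataset.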
# create a new data frame containing only the 2014 season
df_2014 <- subset(df, season == 2014)
# print summary stats of subset
summary(df_2014$DRAYMOND)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -25.02556  -1.05128  -0.07111   0.07766   1.16220  32.00608
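The single-season subset above could be extended to every season at once. The sketch below uses base R's aggregate() to compute the mean DRAYMOND per season. It runs on the six rows printed by head(df) so that it is self-contained; `sample_df` and `season_mean` are hypothetical names, and the resulting means describe only this small sample, not the league.

```r
# Six rows copied from the head(df) output; replace sample_df with df for the full data
sample_df <- data.frame(
  season      = c(2017, 2014, 2015, 2014, 2015, 2016),
  player      = c("AJ Hammons", "AJ Price", "AJ Price",
                  "Aaron Brooks", "Aaron Brooks", "Aaron Brooks"),
  possessions = c(331.0258, 211.7156, 633.5186, 3257.9340, 3984.0440, 2276.0170),
  DRAYMOND    = c(-0.1766801, 5.9121720, -1.7909210, -0.9529003, -0.1861272, 2.2965770)
)

# Mean DRAYMOND for each season, one row per season
season_mean <- aggregate(DRAYMOND ~ season, data = sample_df, FUN = mean)
print(season_mean)
```

Passing `FUN = summary` instead of `FUN = mean` would give the same six-number summary per season that summary(df_2014$DRAYMOND) gives for 2014.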
In conclusion, I might add additional stats from the same seasons and see how the DRAYMOND score relates to other statistics such as steals or blocks. I might also group the stats by team and see whether they could be used to predict points allowed per game.