To access R and RStudio, which are installed on the Saint Ann’s server, go to http://rstudio.saintannsny.org:8787/ and log in with your Saint Ann’s email address.
Once again we’ll be using the filter, arrange, select and mutate functions from the dplyr package in R. This time we’ll be working with NBA data and focusing on the piping operator, %>%, to combine operations.
The SportsAnalytics package can be used to fetch NBA data. Below, we use it to fetch data from the 2015-2016 NBA season and assign it to the dataframe nba. We also load the dplyr package once again to help us explore the data. Finally, we look at the first six rows of the nba dataframe to learn the column names and see what our data looks like.
library(SportsAnalytics)
nba <- fetch_NBAPlayerStatistics("15-16")
library(dplyr)
head(nba)
For the most part, I’ll be expecting you to look back at our first data transformation lab for help on how to use the filter, arrange and select functions. However, for the sake of a little review, here are three lines of code that will: find the top 10 players in the NBA by minutes played, find the top 10 Knicks by minutes played, and show the Knicks roster (limited to players with at least 100 minutes played).
nba %>% top_n(10, TotalMinutesPlayed) %>% arrange(desc(TotalMinutesPlayed))
nba %>% filter(Team=="NYK") %>% top_n(10, TotalMinutesPlayed) %>% arrange(desc(TotalMinutesPlayed))
nba %>% filter(Team=="NYK" & TotalMinutesPlayed>=100) %>% select(Name, Position)
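As a rough sketch of what the piping operator does in the lines above: x %>% f(y) is the same as f(x, y), so a chain of pipes is another way of writing nested function calls. The example below uses a small made-up dataframe called toy (hypothetical players and values, with column names borrowed from nba) so that it runs without fetching any data.

```r
library(dplyr)

# Hypothetical toy data; values are made up, column names mirror the nba dataframe
toy <- data.frame(
  Name = c("A", "B", "C", "D"),
  Team = c("NYK", "NYK", "BOS", "NYK"),
  TotalMinutesPlayed = c(2000, 50, 1500, 900),
  stringsAsFactors = FALSE
)

# Nested calls: you have to read from the inside out
nested <- arrange(filter(toy, Team == "NYK"), desc(TotalMinutesPlayed))

# Piped calls: the same steps, read left to right and top to bottom
piped <- toy %>%
  filter(Team == "NYK") %>%
  arrange(desc(TotalMinutesPlayed))

# Both approaches give the same dataframe
identical(nested, piped)
```

The piped version is usually easier to read once there are more than two steps, which is why we use it throughout this lab.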
We might also want to use the mutate() function to add columns based on our own calculations. Our data set does not include each player’s shooting percentage, but we can calculate it:
nba %>% mutate(FieldGoalPercentage = FieldGoalsMade/FieldGoalsAttempted)
The line above calculates every player’s shooting percentage and displays the result, but it doesn’t permanently add this calculation to our dataframe. To do so, we need to overwrite the dataframe as follows:
nba <- nba %>% mutate(FieldGoalPercentage = FieldGoalsMade/FieldGoalsAttempted)
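Here is a small sketch of the difference between displaying a mutate() result and saving it, again using a made-up dataframe called toy (hypothetical values, not real NBA data). It also shows one possible way to handle a player with zero shot attempts: in R, 0/0 evaluates to NaN, so the sketch uses ifelse() to record NA instead. Whether the real nba data contains such players is something you would need to check.

```r
library(dplyr)

# Hypothetical toy data; values are made up, column names mirror the nba dataframe
toy <- data.frame(
  Name = c("A", "B", "C"),
  FieldGoalsMade = c(300, 0, 45),
  FieldGoalsAttempted = c(600, 0, 100)
)

# Without assignment the result is only displayed; toy itself is unchanged
toy %>% mutate(FieldGoalPercentage = FieldGoalsMade / FieldGoalsAttempted)
had_column_before <- "FieldGoalPercentage" %in% names(toy)

# With assignment the new column is kept.
# Rows with zero attempts get NA rather than the NaN produced by 0/0.
toy <- toy %>%
  mutate(FieldGoalPercentage = ifelse(FieldGoalsAttempted > 0,
                                      FieldGoalsMade / FieldGoalsAttempted,
                                      NA_real_))
has_column_after <- "FieldGoalPercentage" %in% names(toy)
```

Players with very few attempts can still end up with extreme percentages, which is one reason to filter on a minimum number of minutes or shots before ranking.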
Google “true shooting percentage” and then calculate it for every player. Decide on a reasonable minimum amount of either playing time or shots taken and then produce a top 5 list by true shooting percentage for every position.