Loading Data

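# Assumes `nba` (player-season stats) is already in memory and dplyr is attached.
# Keep one row per player, season, and team; drop players with 5 or fewer games.
nba <- nba %>%
  distinct(Year, Player, Tm, .keep_all = TRUE) %>%
  filter(G > 5)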

Fix Year Column

Add day and month to Year column

Use January 1st for each row and convert the result to a date column. For example, the year 1980 becomes 1980/01/01 (January 1, 1980).

nba <- nba %>%
  mutate(year_updated = as.Date(paste(Year,'/01/01', sep = "")), .after = Year)
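
As a quick check of the conversion, here is a minimal base R sketch on a made-up vector of years (the `years` vector is illustrative, not taken from the dataset). It builds the January 1st date for each year and confirms that the year can be recovered from it.

```r
# Hypothetical season years, standing in for the Year column
years <- c(1980, 1995, 2020)

# Append month and day, then parse as a Date
year_updated <- as.Date(paste0(years, "/01/01"))

# Recover the year from the Date to confirm the round trip
recovered <- as.integer(format(year_updated, "%Y"))

class(year_updated)  # "Date"
recovered
```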

Create tsibble

nba_ <- nba %>%
  group_by(year_updated) %>%
  summarise(total_3PA = sum(`3PA`))

Use the year() function to pull the year out of the date and index by it. With a daily date index, fill_gaps() would otherwise add an NA row for each of the other 364 days of every year.

nba_ts <- as_tsibble(nba_, index = year_updated) %>%
  index_by(date = year(year_updated)) %>%
  summarise(total_3PA = sum(total_3PA)) %>%
  fill_gaps()
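
The difference between the two indexes can be seen on a small example. The sketch below uses a made-up four-row table (the `toy` data frame and its values are illustrative only) with one season missing, and compares how many rows fill_gaps() produces with a Date index versus an integer year index. It assumes the tsibble package infers a daily interval for these dates and a yearly interval for the integer years.

```r
library(tsibble)

# Toy season totals with 1983 deliberately missing (values are made up)
toy <- data.frame(
  year_updated = as.Date(c("1980-01-01", "1981-01-01", "1982-01-01", "1984-01-01")),
  total_3PA = c(100, 120, 150, 210)
)

# Date index: gaps are filled at the inferred (daily) interval
daily <- fill_gaps(as_tsibble(toy, index = year_updated))

# Integer year index: gaps are filled one season at a time
toy$season <- as.integer(format(toy$year_updated, "%Y"))
yearly <- fill_gaps(as_tsibble(toy[, c("season", "total_3PA")], index = season))

nrow(daily)   # one row per calendar day in the range
nrow(yearly)  # one row per season, with 1983 added as NA
```

With the yearly index, the only row added is the season that is actually missing, which is the behavior wanted here.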

Create xts

nba_xts <- xts(x = nba_ts$total_3PA,
               order.by = as.Date(paste(nba_ts$date,'/01/01', sep = '')))
nba_xts <- setNames(nba_xts, "total_threes")

Plotting

Full Plot

nba_ts %>%
  ggplot() +
  geom_line(mapping = aes(x = date, y = total_3PA)) +
  labs(title = "3 Point Attempts by Season")

Here we can see a pretty dramatic increase over time with a couple of dips. These dips are associated with NBA lockouts, which resulted in shortened seasons.

Pre Daryl Morey

nba_ts %>%
  filter(date < year(as.Date('2002/01/01'))) %>%
  ggplot() +
  geom_line(mapping = aes(x = date, y = total_3PA)) +
  ylim(0,90000) +
  labs(title = '3 Point Attempts by Season 1980-2001',
       subtitle = 'Pre Daryl Morey',
       x = 'Season',
       y = 'Total 3 Point Attempts')

This graph plots the window of time before Daryl Morey took his first big-time basketball operations job with an NBA team. He was very influential in using new-age analytics to build his teams, which led them to shoot a lot of threes, sometimes too many for their own good.

Insert Daryl Morey

nba_ts %>%
  filter(date >= year(as.Date('2002/01/01'))) %>%
  ggplot() +
  geom_line(mapping = aes(x = date, y = total_3PA))+
  ylim(0,90000) +
  labs(title = '3 Point Attempts by Season 2002-2020',
       subtitle = 'Daryl Morey has arrived',
       x = 'Season',
       y = 'Total 3 Point Attempts')

This is the window of time since Daryl Morey has been in a role as President or Vice President of Basketball Operations. His direct influence would be seen better by looking specifically at the Rockets. The trend here instead reflects people across the league adopting similar strategies.

Smoothing

Rolling Average

nba_xts %>%
  # fill = NA leaves seasons without a full 10-season window as missing
  rollapply(width = 10, FUN = \(x) mean(x, na.rm = TRUE), fill = NA) %>%
  ggplot(mapping = aes(x = Index, y = total_threes)) +
  geom_line() +
  labs(title = "3 Point Attempts by Season",
       subtitle = "10 Season Rolling Average",
       x = 'Season',
       y = 'Total 3 Point Attempts')
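
To make the rolling average concrete, here is a minimal base R sketch on a made-up vector with a window of 3 instead of 10. It assumes a right-aligned (trailing) window, where each value is the mean of the current and preceding observations, and it leaves positions without a full window as NA. The names `x`, `width`, and `roll` are illustrative.

```r
# Made-up season totals and a short window for readability
x <- c(10, 20, 30, 40, 50)
width <- 3

# Trailing mean: average of the current value and the (width - 1) before it
roll <- sapply(seq_along(x), function(i) {
  if (i < width) NA_real_ else mean(x[(i - width + 1):i])
})

roll  # NA NA 20 30 40
```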

LO(W)ESS

nba_ts %>%
  ggplot(mapping = aes(x= date, y = total_3PA)) +
  geom_point(size = 1, shape = 'o') +
  geom_smooth(span = 0.4, se = FALSE) +
  labs(title = "3 Point Attempts by Season",
       subtitle = "loess smoothing - span = 0.4",
       x = 'Season',
       y = 'Total 3 Point Attempts')
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

AC / PAC

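# Use the univariate xts series; passing the whole tsibble would also correlate the date column
acf(nba_xts, ci = 0.95, na.action = na.exclude,
    main = "ACF for 3 Point Attempts")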

pacf(nba_xts, na.action = na.exclude,
     xlab = "Lag", main = "PACF for 3 Point Attempts")

AC/PAC Conclusion

Based on the two plots above, we can conclude that there is no major seasonality in the data. The dominant component is the overall year-to-year trend. This makes sense because there is no league-wide cycle, such as an Olympic cycle, that would introduce a seasonal period longer than one year. At a day-by-day level seasonality may be more apparent, since one team shooting a lot could raise the total attempts on its game days, but from NBA season to season there is none.
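
One way to see why a trend dominates these plots is to look at a simulated series. The sketch below is an illustration on made-up data, not the NBA totals: it builds a steadily increasing yearly series with noise, then compares its autocorrelations with those of its first differences. The series length, slope, and noise level are arbitrary assumptions.

```r
set.seed(42)  # make the simulated noise reproducible

# Simulated yearly totals: a steady upward trend plus noise (not real NBA data)
n <- 40
trend_series <- 1000 * seq_len(n) + rnorm(n, sd = 2000)

# Autocorrelations of the raw series; element 1 is lag 0, element 2 is lag 1
acf_level <- acf(trend_series, lag.max = 10, plot = FALSE)$acf[, 1, 1]

# First differences remove the linear trend
acf_diff <- acf(diff(trend_series), lag.max = 10, plot = FALSE)$acf[, 1, 1]

round(acf_level[2:6], 2)  # lags 1-5 of the trending series
round(acf_diff[2:6], 2)   # lags 1-5 after differencing
```

In a trending series the low-lag autocorrelations stay large because neighboring years sit on the same side of the overall mean. That pattern reflects the trend rather than a repeating seasonal cycle.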