Introduction

I have always enjoyed following the NBA, having both played and watched basketball for most of my life. As I get older and watch some of my favorite players age, I have wondered what effect age has on NBA players and teams. The summary statistics below cover player age, points, rebounds, and 3-pointers made to give a first view of the dataset.

summary(nba_player_stats$AGE, title = 'Summary of Player Age')
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   19.00   23.00   25.00   25.99   29.00   41.00
summary(nba_player_stats$PTS, title = 'Summary of Player Points')
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   4.100   7.350   8.942  12.400  32.000
summary(nba_player_stats$REB, title = 'Summary of Player Rebounding')
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.900   3.200   3.635   4.800  14.300
summary(nba_player_stats$`3PM`, title = 'Summary of Player 3-Pointers Made')
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.300   0.800   1.003   1.600   5.300

Dataset

The dataset contains individual player stats for a single season, the 2019-2020 NBA regular season. I have grouped the data by age and by team to give readers several ways to view it. For classification, younger players are 19-24 years old, middle-aged players are 25-31 years old, and older players are 32 and older. The dataset started with 540 observations and 34 columns, including points per game, age, and team, to name a few.
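The age classification described above can be sketched with base R's cut(). This is only an illustration on made-up rows, not the code used for the report; the PLAYER values are hypothetical, and placing 32-year-olds in the older group is an assumption based on the middle bracket ending at 31.

```r
# Hypothetical toy data; the real nba_player_stats has 540 rows and 34 columns.
players <- data.frame(
  PLAYER = c("A", "B", "C", "D", "E", "F"),
  AGE    = c(19, 24, 25, 31, 32, 41)
)

# Bracket edges are an assumption: 19-24, 25-31, and 32 or older.
# cut() uses right-closed intervals by default: (18,24], (24,31], (31,Inf].
players$AGE_GROUP <- cut(
  players$AGE,
  breaks = c(18, 24, 31, Inf),
  labels = c("younger", "middle", "older")
)

# Number of players in each bracket.
table(players$AGE_GROUP)
```

Keeping the bracket edges in one breaks vector makes it easy to test how sensitive the findings are to where the cutoffs sit.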

Findings

My initial findings show that age does have an effect on player stats and win-loss records during the regular season. Based on the dataset, younger and middle-aged players appear to win more, score more, and play more minutes during the regular season. Average 3-pointers made per game is also higher once players reach the 30 age group. Interestingly, however, 3-pointers made appears to be negatively correlated with overall regular season wins in the 2019-2020 season.

Tab 1

The graph below shows total team wins for the season in ascending order, along with the average age of each team's players. Most of the oldest teams in the league fall within the first two quartiles of wins. Based on the data, teams with younger and middle-aged rosters, on average, tended to have more regular season wins.

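library(ggplot2)   # ggplot(), geoms, labs(), theme()
library(ggthemes)  # theme_tufte(), theme_hc()
library(ggrepel)   # geom_label_repel()

# Wins on the x-axis, average roster age on the y-axis, one labelled point per team
ggplot(team_wins_age, aes(x = Wins, y = AVG_AGE)) +
  geom_line(color = 'black') +
  geom_point(shape = 20, size = 4, color = 'red', fill = 'white') +
  labs(x = "Wins", y = "Average Age", title = "Do young teams have what it takes to win?") +
  theme_tufte() +
  theme(plot.title = element_text(hjust = 0.5)) +
  geom_label_repel(aes(label = Category),
                   size = 3, color = 'Grey', segment.color = 'Blue')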
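The quartile observation can also be checked numerically. The sketch below uses made-up teams with the same column names as the plot (Category, Wins, AVG_AGE). Treating the top quarter of teams by average age as the "oldest" is an assumption, as are the helper names WINS_Q, age_cutoff, and oldest.

```r
# Hypothetical team-level data using the column names from the plot above.
team_wins_age <- data.frame(
  Category = c("T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8"),
  Wins     = c(15, 20, 25, 30, 35, 40, 45, 50),
  AVG_AGE  = c(28.0, 27.5, 25.0, 27.0, 25.5, 26.0, 24.5, 26.5)
)

# Assign each team to a wins quartile (1 = fewest wins, 4 = most).
wins_breaks <- quantile(team_wins_age$Wins, probs = seq(0, 1, 0.25))
team_wins_age$WINS_Q <- cut(team_wins_age$Wins, breaks = wins_breaks,
                            labels = 1:4, include.lowest = TRUE)

# Treat the top quarter of teams by average age as the "oldest" (an assumption).
age_cutoff <- quantile(team_wins_age$AVG_AGE, probs = 0.75)
oldest <- team_wins_age[team_wins_age$AVG_AGE >= age_cutoff, ]

# How many of the oldest teams land in each wins quartile.
table(oldest$WINS_Q)
```

A table concentrated in quartiles 1 and 2 would support the reading of the graph; a spread across all four would not.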

Tab 2

Here we have broken down all players with a minimum of 8 field goal attempts into individual age brackets. The graph shows the average points per game for each bracket. It has the shape of an upside-down parabola: the younger players have lower scoring averages, while the middle age group is the most consistent, with higher scoring averages. The older players' averages are again on the lower end, apart from one outlier that is really due to a small sample size.

# Average points per game (column x) for each age
ggplot(nba_fga_min_AGE, aes(x = AGE, y = x, colour = AGE)) +
  geom_point(shape = 17, size = 3) +
  ggtitle("Average points per game by age") +
  xlab("AGE") +
  ylab("Points") +
  guides(colour = guide_legend(title = "AGE")) +
  # theme_hc() is a complete theme, so it must come before theme();
  # otherwise it resets the centered title
  theme_hc() +
  theme(plot.title = element_text(hjust = 0.5))
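The report does not show how nba_fga_min_AGE was built, so the following is only one plausible sketch on made-up rows. It assumes a field goal attempts column named FGA (a hypothetical name) and uses aggregate() with x = and by =, which produces a value column called x like the one mapped in the plot. It also counts players per age, which helps flag averages that rest on a very small sample.

```r
# Hypothetical toy data; FGA (field goal attempts per game) is an assumed column name.
nba_player_stats <- data.frame(
  AGE = c(22, 22, 27, 27, 27, 38),
  PTS = c(10.0, 14.0, 20.0, 18.0, 5.0, 21.0),
  FGA = c(9.0, 12.0, 16.0, 15.0, 4.0, 17.0)
)

# Keep players with at least 8 field goal attempts per game.
shooters <- nba_player_stats[nba_player_stats$FGA >= 8, ]

# aggregate() with x = / by = names the value column "x".
avg_pts <- aggregate(x = shooters$PTS, by = list(AGE = shooters$AGE), FUN = mean)

# Count players per age so thin brackets are easy to spot.
n_players <- aggregate(x = shooters$PTS, by = list(AGE = shooters$AGE), FUN = length)
avg_pts$n <- n_players$x

avg_pts
```

An age whose n is 1 or 2 is a candidate for the kind of small-sample outlier described above.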

Tab 3

The pie chart shows the average number of 3-pointers made per game by age. Here I wanted to see whether older players made more 3-pointers, perhaps to avoid a more physical style of play than younger players. Many of the younger age groups averaged under one successful 3-pointer a game, while around the age of 30 there was a large jump in 3-pointers made for all groups with a minimum of one 3-pointer made a game.

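# One slice per age; slice size is the average 3-pointers made (column x)
ggplot(age_sum, aes(x = "", y = x, fill = AGE)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  labs(fill = "Age", x = NULL, y = NULL, title = "3 Pointers by Age") +
  # text size is a constant, so it is set outside aes() to avoid a stray size legend;
  # 3.5 is an arbitrary choice, adjust to taste
  geom_text(aes(label = x), size = 3.5, position = position_stack(vjust = 0.5)) +
  theme_light()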

Tab 4

This graph shows each team's wins, with a line representing total 3-pointers made during the season. Many of the top-winning teams made fewer total 3-pointers than some teams with fewer wins, which shows that in this season, teams that made more 3-pointers were not necessarily more successful.

# Horizontal bars of wins, with teams ordered from fewest to most wins
first_plot <- ggplot(team_data_final, aes(x = reorder(Category, Wins))) +
  geom_bar(aes(y = Wins), stat = "identity", position = "dodge", fill = team_data_final$Wins) +
  coord_flip() +
  labs(title = "Do 3 pointers made by team coincide with their wins?") +
  theme(plot.title = element_text(hjust = 0.5)) +
  ylab("Wins") +
  xlab("NBA Team")
  #scale_fill_manual("legend", values = c("OKC" = "blue"))

# Overlay total 3-pointers made, divided by 20 so the line fits on the wins axis
first_plot + geom_line(inherit.aes = FALSE, data = team_data_final,
                aes(x = Category, y = `3PM`/20, colour = "Total 3s", group = 1), size = 1) +
  scale_color_manual(NULL, values = "black")
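The visual impression from the overlay can be backed by a correlation coefficient. The sketch below uses made-up team totals with the column names from the plot (Category, Wins, 3PM); the numbers are hypothetical and say nothing about the real 2019-2020 values. cor() gives the Pearson correlation and cor.test() adds a p-value.

```r
# Hypothetical team totals; check.names = FALSE keeps the `3PM` column name intact.
team_data_final <- data.frame(
  Category = c("T1", "T2", "T3", "T4", "T5", "T6"),
  Wins     = c(20, 25, 30, 40, 45, 50),
  `3PM`    = c(900, 950, 870, 820, 860, 780),
  check.names = FALSE
)

# Pearson correlation between wins and total 3-pointers made.
r <- cor(team_data_final$Wins, team_data_final$`3PM`)

# Same coefficient with a significance test attached.
ct <- cor.test(team_data_final$Wins, team_data_final$`3PM`)

r
ct$p.value
```

With only 30 teams in a season, even a moderately sized coefficient may not be statistically significant, so the p-value is worth reporting alongside r.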

Tab 5

The graph below displays average minutes per player in each age bracket, limited to players averaging at least 10 minutes a game. Given the current NBA trend of resting players to preserve them for the playoffs, it is surprising to see many of the older age groups average more minutes than most of the younger groups. One possible explanation is that younger players see less playing time due to inexperience.

ggplot(min_sum, aes(x= AGE, y= x))+
  geom_col(fill= min_sum$AGE)+
  labs( x = "AGE", y = "Average Minutes Played", title = "Do older players play fewer minutes?")+
  theme_hc()+
  theme(plot.title = element_text(hjust=0.5))

Conclusion

From this dataset, we can infer that age does reduce production, in both team wins and individual stats, during the 2019-2020 regular season. We have seen that players between the ages of 28 and 32 hold higher scoring averages and play more minutes than most age groups. We also analyzed 3-pointers made, and during this season there was a low correlation between 3-pointers made and team wins, which was a bit surprising given the NBA's big push toward a more outside-centric game. Using only one year of data, however, does not allow us to compare against prior years and determine whether this season is an outlier. Based on this single-season analysis, there are indications that 3-pointers made and age affect overall wins in the regular season. To strengthen the analysis and test for statistically significant correlations, we should dive deeper into historical datasets or even track individual players over the course of their careers to better understand how age factors into NBA stats and wins.