Final Project

# Load packages
library(dplyr)

library(tidyr)
library(ggplot2)
library(readr)
# Load data
nba <-
  read.csv("https://myxavier-my.sharepoint.com/:x:/g/personal/reddyj_xavier_edu/EWZIHPs9uIRHk8BEZM2EMVoBsUBx1ZHIk0oTM6128qJpVQ?download=1")

I would like to explore how the NBA game has evolved over time. One change that has become very apparent to viewers is that the number of three-point attempts has skyrocketed in the modern era. Using a data set of player stats from 1991 to 2020, this project shows how the game has changed over those 30 seasons. Because the data set covers the full range of per-game stats, I will look at how many 3-pointers have been attempted by year, as well as other stats such as steals, rebounds, assists, and turnovers, to see whether the game has evolved in other ways.

Link: https://myxavier-my.sharepoint.com/:x:/g/personal/reddyj_xavier_edu/EWZIHPs9uIRHk8BEZM2EMVoBsUBx1ZHIk0oTM6128qJpVQ?download=1

Data Dictionary

| Variable | Description |
|---|---|
| Player | Name of player |
| Pos | Position |
| Age | Age of player during the season |
| Tm | Team |
| G | Games played |
| GS | Games started |
| MP | Minutes played per game |
| FG | Field goals made per game |
| FGA | Field goals attempted per game |
| FG. | Field goal percentage |
| X3P | 3 pointers made per game |
| X3PA | 3 pointers attempted per game |
| X3P. | 3 point percentage |
| FT | Free throws made per game |
| FTA | Free throws attempted per game |
| FT. | Free throw percentage |
| ORB | Offensive rebounds per game |
| DRB | Defensive rebounds per game |
| TRB | Rebounds per game |
| AST | Assists per game |
| STL | Steals per game |
| BLK | Blocks per game |
| TOV | Turnovers per game |
| PTS | Points per game |
| Year | Year of the player’s stats |

Summary Statistics

| Statistic | Average Value |
|---|---|
| Age | 26.80 years |
| Games Played (G) | 53 games |
| Minutes Played (MP) | 20.40 minutes/game |
| Field Goals (FG) | 3.09 made/game |
| Field Goal Attempts (FGA) | 6.87 attempts/game |
| Field Goal % (FG%) | 43.66% |
| 3-Point Field Goals (3P) | 0.52 made/game |
| 3-Point Attempts (3PA) | 1.50 attempts/game |
| 3-Point % (3P%) | 23.26% |
| Free Throws Made (FT) | 1.53 made/game |
| Free Throw Attempts (FTA) | 2.05 attempts/game |
| Free Throw % (FT%) | 70.05% |
| Offensive Rebounds (ORB) | 1.00/game |
| Defensive Rebounds (DRB) | 2.58/game |
| Total Rebounds (TRB) | 3.57/game |
| Assists (AST) | 1.84/game |
| Steals (STL) | 0.66/game |
| Blocks (BLK) | 0.42/game |
| Turnovers (TOV) | 1.21/game |
| Points (PTS) | 8.23/game |
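
Averages like the ones in this table can be computed in one step. The following is a minimal sketch, not necessarily how the table was produced: `summarise_means()` is a hypothetical helper, and the `toy` data frame holds made-up values standing in for the real `nba` data.

```r
library(dplyr)

# Hypothetical helper: mean of every numeric column except Year, ignoring NAs.
summarise_means <- function(df) {
  df %>%
    select(-any_of("Year")) %>%
    summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
}

# Toy rows (made-up values) standing in for the real `nba` data frame.
toy <- data.frame(
  Player = c("A", "B", "C"),
  Age    = c(24, 28, 30),
  G      = c(82, 60, NA),
  PTS    = c(10, 20, 6),
  Year   = c(1991, 1991, 1992)
)

toy_means <- summarise_means(toy)
print(toy_means)
```

Calling `summarise_means(nba)` on the full data set should return one row with a column per numeric stat. These are unweighted means across player rows, so a player with 5 games counts as much as one with 82.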

How many 3-Point attempts per game does a player average each year?

nba %>%
  group_by(Year) %>%
  summarise(avg_3PA = mean(X3PA, na.rm = TRUE)) %>%
  arrange(Year) %>%
  ggplot(aes(x = factor(Year), y = avg_3PA)) +
  geom_col(fill = "blue") +
  scale_x_discrete(breaks = function(x) x[as.integer(x) %% 5 == 0]) +
  labs(
    title = "3-Point Attempts per Game by Year",
    x = "Season Year",
    y = "Average 3PA per Game"
  )

This bar graph shows the consistent increase in 3-Point attempts by players over the past three decades. Back in 1991, players were attempting a 3-Pointer about once every other game. The increase is steady over time, but the boom in the past decade is the most intriguing. In 2020, the average player attempted more than 2.5 3-Pointers a game.

How many minutes per game do starters play based on their age?

nba %>%
  filter(GS >= 30) %>%
  group_by(Age) %>%
  summarise(avg_MPG = mean(MP, na.rm = TRUE)) %>%
  arrange(Age) %>%
  ggplot(aes(x = factor(Age), y = avg_MPG)) +
  geom_col(fill = "red") +
  labs(
    title = "Average Minutes Per Game by Age (Minimum 30 Starts)",
    x = "Age",
    y = "Average Minutes Per Game"
  )

One of the questions asked in basketball is: what are the prime years of a player’s career? This graph shows the average minutes per game by age, restricted to player seasons with at least 30 starts to keep players with limited roles from skewing the averages. Averaging more than 30 minutes a night looks like a reasonable marker of a player’s prime, which on this graph spans roughly ages 21 to 33. Once players reach their mid-30s, their minutes per game clearly fall.
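
The 30-minute rule of thumb can also be applied in code instead of being read off the chart. This is a rough sketch: `prime_age_range()` is a hypothetical helper, the thresholds are parameters rather than fixed choices, and the `toy` rows are made up.

```r
library(dplyr)

# Hypothetical helper: range of ages whose average minutes exceed a threshold,
# using only player seasons with at least `min_starts` games started.
prime_age_range <- function(df, min_starts = 30, min_minutes = 30) {
  by_age <- df %>%
    filter(GS >= min_starts) %>%
    group_by(Age) %>%
    summarise(avg_MPG = mean(MP, na.rm = TRUE), .groups = "drop") %>%
    filter(avg_MPG > min_minutes)
  range(by_age$Age)
}

# Made-up player seasons for illustration only.
toy <- data.frame(
  Age = c(20, 20, 25, 25, 30, 36),
  GS  = c(40, 50, 70, 10, 60, 35),
  MP  = c(28, 30, 34, 12, 31, 24)
)

prime <- prime_age_range(toy)
print(prime)
```

Note that `range()` returns only the youngest and oldest qualifying ages, so it would hide any age in between that dips below the threshold.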

Are players sitting out more games than in the past?

nba %>%
  group_by(Year) %>%
  summarise(
    total_players = n(),
    full_82_games = sum(G == 82, na.rm = TRUE),
    percent_82 = 100 * full_82_games / total_players
  ) %>%
  ggplot(aes(x = factor(Year), y = percent_82)) +
  geom_col(fill = "orange") +
  scale_x_discrete(breaks = function(x) x[as.integer(x) %% 5 == 0]) + 
  labs(
    title = "Percentage of Players Who Played All 82 Games by Season",
    x = "Season Year",
    y = "Percentage of Players"
  )

One of the big topics in the NBA right now is load management, or rest days for players. This graph shows the percentage of players who played all 82 games in each season. The outliers are 1999, 2012, and 2020, which were shortened by lockouts (1999 and 2012) or COVID (2020), so no one could reach 82 games. Apart from these outliers, the trend is that a smaller share of players is appearing in all 82 games than in the past.
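
One way to keep the shortened seasons from distracting from the trend is to drop them before summarising. The sketch below assumes the same `Year` and `G` columns as the main data; `pct_full_82()` is a hypothetical helper and the `toy` rows are made up.

```r
library(dplyr)

# Seasons the text identifies as shortened (lockouts or COVID).
shortened <- c(1999, 2012, 2020)

# Hypothetical helper: percent of player rows with G == 82,
# computed only for full-length seasons.
pct_full_82 <- function(df, skip_years = shortened) {
  df %>%
    filter(!Year %in% skip_years) %>%
    group_by(Year) %>%
    summarise(
      percent_82 = 100 * sum(G == 82, na.rm = TRUE) / n(),
      .groups = "drop"
    )
}

# Made-up rows for illustration only.
toy <- data.frame(
  Year = c(1998, 1998, 1999, 1999, 2000, 2000, 2000, 2000),
  G    = c(82, 70, 50, 50, 82, 82, 82, 60)
)

full_seasons <- pct_full_82(toy)
print(full_seasons)
```

The result of `pct_full_82(nba)` could be piped into the same `ggplot()` call used above to redraw the chart without the three zero bars.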

Is the NBA Becoming More Offensive?

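nba %>%
  # Convert each player's per-game scoring average into a season total
  mutate(total_points = PTS * G) %>%
  group_by(Year) %>%
  summarise(
    total_pts_year = sum(total_points, na.rm = TRUE),
    # Use the most games any player logged as the season length
    max_games = max(G, na.rm = TRUE)
  ) %>%
  mutate(
    # Team-games in the season; assumes 30 teams every year, although the
    # league had fewer teams before the 2004-05 season
    total_games = max_games * 30,
    points_per_game_league = total_pts_year / total_games
  ) %>%
  ggplot(aes(x = factor(Year), y = points_per_game_league)) +
  geom_col(fill = "darkgreen") +
  scale_x_discrete(breaks = function(x) x[as.integer(x) %% 5 == 0]) +
  labs(
    title = "Average Team Points Per Game by Season",
    x = "Season Year",
    y = "Points Per Team Per Game"
  )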

Another topic of discussion is how much defensive effort players are putting in. As the graph shows, the number of points a team scores per game is on the rise, and the two explanations fans argue over are a lack of defensive effort and a faster pace of play, which allows more shots per game. Although scoring has increased over the past two decades, offensive production actually dipped in the late 1990s and early 2000s.
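
Because the league has not always had 30 teams, the per-team figure is sensitive to the team count used in the denominator. The sketch below counts the teams that actually appear in each season instead. It rests on two assumptions that are not confirmed by the data dictionary: that `Tm` holds one code per team, and that combined rows for traded players are marked `"TOT"`. `team_points_per_game()` is a hypothetical helper and the `toy` rows are made up.

```r
library(dplyr)

# Hypothetical helper: points per team per game, using the number of distinct
# teams seen in each season rather than a fixed 30.
# Assumption: traded players' combined rows carry the code in `combined_code`.
team_points_per_game <- function(df, combined_code = "TOT") {
  df %>%
    filter(Tm != combined_code) %>%
    group_by(Year) %>%
    summarise(
      total_pts = sum(PTS * G, na.rm = TRUE),
      n_teams   = n_distinct(Tm),
      max_games = max(G, na.rm = TRUE),
      .groups   = "drop"
    ) %>%
    mutate(pts_per_team_game = total_pts / (n_teams * max_games))
}

# Made-up rows for illustration only.
toy <- data.frame(
  Year = c(1991, 1991, 1991, 1991),
  Tm   = c("BOS", "BOS", "LAL", "TOT"),
  G    = c(82, 82, 82, 80),
  PTS  = c(20, 10, 30, 5)
)

league <- team_points_per_game(toy)
print(league)
```

Dropping the combined rows also avoids counting a traded player's points twice, provided the data set lists that player's per-team rows separately.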

Does the Increase in 3-Point Attempts Affect the Number of Free Throw Attempts?

nba %>%
  group_by(Year) %>%
  summarise(
    total_fta = sum(FTA * G, na.rm = TRUE),
    total_3pa = sum(X3PA * G, na.rm = TRUE),
    max_games = max(G, na.rm = TRUE)
  ) %>%
  mutate(
    total_games = max_games * 30,
    fta_per_game = total_fta / total_games,
    x3pa_per_game = total_3pa / total_games
  ) %>%
  pivot_longer(cols = c(fta_per_game, x3pa_per_game),
               names_to = "Stat", values_to = "AttemptsPerGame") %>%
  ggplot(aes(x = Year, y = AttemptsPerGame, color = Stat)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  scale_color_manual(
    values = c("fta_per_game" = "orange", "x3pa_per_game" = "purple"),
    labels = c("Free Throw Attempts", "3-Point Attempts")
  ) +
  labs(
    title = "League-Wide FTA vs 3PA Per Game by Season",
    x = "Season Year",
    y = "Attempts Per Game"
  )

NBA fans argue that more 3-point attempts lead to fewer fouls being called. The graph shows the number of 3-pointers and the number of free throws each team attempts per game. Since 2006, free throw attempts have decreased by 5 a game, while 3-point attempts have more than doubled. However, the decline in free throws does not track the rise in 3-point attempts closely enough to show a direct relationship between the two.
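
When two series both trend over time, correlating their raw levels can overstate how related they are. One common alternative is to correlate their season-to-season changes. The sketch below illustrates that idea only; `change_correlation()` is a hypothetical helper, and the two vectors are made-up numbers, not values from the data set.

```r
# Hypothetical helper: correlation between year-over-year changes in two
# per-season series (both vectors must be in the same season order).
change_correlation <- function(x, y) {
  cor(diff(x), diff(y))
}

# Made-up per-game series, built so the changes move in exactly opposite
# directions. Real data would be much noisier.
toy_3pa <- c(10, 12, 15, 19, 24)
toy_fta <- c(26, 25, 23.5, 21.5, 19)

r <- change_correlation(toy_3pa, toy_fta)
print(r)
```

With the real data, the inputs would be the `x3pa_per_game` and `fta_per_game` columns from the summary above, taken before the `pivot_longer()` step and sorted by `Year`.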

Scraped Data

# Load data
nba2 <- 
  read.csv("https://myxavier-my.sharepoint.com/:x:/g/personal/reddyj_xavier_edu/EX5zyo8pSRNHt-71Eb_ZF64BYWbAIVJ20Yr5nuqeUT5dTA?download=1")

Link: https://myxavier-my.sharepoint.com/:x:/g/personal/reddyj_xavier_edu/EX5zyo8pSRNHt-71Eb_ZF64BYWbAIVJ20Yr5nuqeUT5dTA?download=1

This data was ethically scraped using the rvest package from https://www.basketball-reference.com/leagues/NBA_2025_totals.html. It contains season totals for NBA players in the 2024-25 season. Using this data, we can look at the current state of the NBA and see whether it backs up the changes we noticed in the previous analysis.
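
The scraping script itself is not shown, so the following is only a sketch of what reading an HTML table with rvest generally looks like. It parses a small inline stand-in page with made-up rows so that it runs without a network connection. For the live page, `read_html()` on the URL above would take the place of `minimal_html()`, and the selector might need to be more specific, since the real page's table structure is not confirmed here.

```r
library(rvest)

# Stand-in page with made-up rows. The real page has many more columns.
page <- minimal_html('
  <table>
    <tr><th>Player</th><th>G</th><th>3PA</th></tr>
    <tr><td>Player A</td><td>70</td><td>410</td></tr>
    <tr><td>Player B</td><td>65</td><td>120</td></tr>
  </table>
')

# Select the first <table> on the page and convert it to a data frame.
stats <- page %>%
  html_element("table") %>%
  html_table()

print(stats)
```

Column names such as `3PA` are not valid R names, which is why they appear as `X3PA` once the saved file is loaded back in with `read.csv()`.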

Scraped Data Dictionary

| Variable | Description |
|---|---|
| Rk | Player ID |
| Player | Player’s name |
| Pos | Position |
| Age | Player’s current age |
| Team | Player’s current team |
| G | Total games played for the season |
| MP | Total minutes played for the season |
| FG | Total field goals made for the season |
| FGA | Total field goal attempts for the season |
| FG. | Field goal percentage |
| X3P | Total 3-Pointers made for the season |
| X3PA | Total 3-Point attempts for the season |
| X3P. | 3-Point percentage |
| FT | Total free throws made for the season |
| FTA | Total free throw attempts for the season |
| FT. | Free throw percentage |
| TRB | Total rebounds for the season |
| AST | Total assists for the season |
| STL | Total steals for the season |
| BLK | Total blocks for the season |
| TOV | Total turnovers for the season |
| PF | Total fouls for the season |
| PTS | Total points for the season |
| league | League player plays in |

Do Teams That Attempt More 3-Point Shots Attempt Fewer Free Throws?

nba2 %>%
  # Drop combined rows for players who appeared for more than one team
  filter(!Team %in% c("2TM", "3TM")) %>%
  group_by(Team) %>%
  summarise(
    # Sum attempts (X3PA, FTA) rather than makes, to match the question
    total_3PA = sum(X3PA, na.rm = TRUE),
    total_FTA = sum(FTA, na.rm = TRUE)
  ) %>%
  arrange(desc(total_3PA)) %>%
  ggplot(aes(x = reorder(Team, -total_3PA))) +
  geom_col(aes(y = total_3PA), fill = "blue") +
  geom_line(aes(y = total_FTA, group = 1), color = "red", linewidth = 1) +
  geom_point(aes(y = total_FTA), color = "red") +
  labs(
    title = "3-Point Attempts (Bar) vs Free Throw Attempts (Line) by Team",
    y = "Total Attempts",
    x = "Team"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The previous graph did not show enough correlation to prove that an increase in 3-Point attempts lowers the number of free throw attempts, and this graph does not support that theory either. With teams ordered from most to fewest 3-Pointers, the free throw line rises and falls without any consistent pattern, which suggests little relationship between 3-Point attempts and free throw attempts at the team level.
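
A single number can back up what the chart suggests visually. The sketch below computes the correlation between team totals of 3-Point attempts and free throw attempts. `team_attempt_cor()` is a hypothetical helper, it assumes the `Team`, `X3PA`, and `FTA` columns from the scraped data dictionary, and the `toy` rows are made up.

```r
library(dplyr)

# Hypothetical helper: correlation between team totals of 3-Point attempts
# and free throw attempts, after dropping multi-team rows.
team_attempt_cor <- function(df, multi_team = c("2TM", "3TM")) {
  totals <- df %>%
    filter(!Team %in% multi_team) %>%
    group_by(Team) %>%
    summarise(
      total_3PA = sum(X3PA, na.rm = TRUE),
      total_FTA = sum(FTA, na.rm = TRUE),
      .groups   = "drop"
    )
  cor(totals$total_3PA, totals$total_FTA)
}

# Made-up player totals for illustration only.
toy <- data.frame(
  Team = c("AAA", "AAA", "BBB", "CCC", "2TM"),
  X3PA = c(100, 200, 400, 500, 900),
  FTA  = c(50, 50, 200, 300, 10)
)

r_team <- team_attempt_cor(toy)
print(r_team)
```

A value near zero from `team_attempt_cor(nba2)` would be consistent with the reading of the chart, while a clearly negative value would instead support the fans' theory.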

Conclusion

Based on the data, it is clear that the style of play in basketball has changed over the last three decades. The changes that stood out the most were the increases in points per game and 3-Point attempts per game, and the decline in the share of players appearing in all 82 games in a season.