library(tidyverse)
player_data_url <- "https://myxavier-my.sharepoint.com/:x:/g/personal/augt_xavier_edu/IQBautRyLm1BRoQIgkU4cD3aAcoMwKtEcyYygWBw5GPxYo8?e=zPttoa&download=1"
team_data_url <- "https://myxavier-my.sharepoint.com/:x:/g/personal/augt_xavier_edu/IQA556z4MxpCRJzwDSrp7QARAVtCKj67kcy0MGB7eBdcjIk?e=7GgceK&download=1"
player_stats <- read_csv(player_data_url)
team_stats <- read_csv(team_data_url)
Positionless Spacing: How the NBA’s Three-Point Growth Changed Player Roles
BAIS 462 Final Project
Introduction
As a basketball fan and business analytics student, I am interested in how data can explain the way the NBA changes over time. One of the biggest changes in modern basketball is the growth of three-point shooting. The league is not only taking more threes, but also asking different kinds of players to become floor spacers.
This project asks:
How has NBA three-point shooting changed from 2015 to 2025, and what player types are driving the change?
This question matters because it connects directly to roster construction and player evaluation. Teams do not only need to know that the league is taking more threes. They also need to understand whether the trend is concentrated among guards or whether it has spread to forwards, centers, and more regular rotation players. That distinction changes how teams value prospects, role players, and big men.
Data and Research Design
This project uses two related datasets from Basketball-Reference.
The primary dataset is a player-level dataset of NBA per-game statistics from the 2015 through 2025 seasons. It includes player name, position, age, team, games played, minutes per game, field goal attempts, three-point attempts, three-point percentage, points per game, season, and source URL.
The secondary dataset is a team-level dataset of NBA per-game statistics from the same seasons. This secondary source is used to compare player-level three-point trends to team-level offensive trends.
Both datasets were collected using HTML scraping tools in separate R scripts and then saved as static CSV files. The final report imports the static CSV files rather than scraping Basketball-Reference live.
Dataset Links
The player data is hosted from the Assignment 7 dataset.
Inspecting the Data
names(player_stats)
 [1] "season"      "player"      "pos"         "age"         "team"
 [6] "games"       "minutes"     "fga"         "three_pa"    "three_pct"
[11] "pts"         "source_url"  "primary_pos"
names(team_stats)
 [1] "season"         "team"           "games"          "team_fg"
 [5] "team_fga"       "team_fg_pct"    "team_three_p"   "team_three_pa"
 [9] "team_three_pct" "team_pts"       "source_url"
glimpse(player_stats)
Rows: 5,878
Columns: 13
$ season <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015…
$ player <chr> "A.J. Price", "Aaron Brooks", "Aaron Gordon", "Adreian Pay…
$ pos <chr> "PG", "PG", "PF", "PF", "C", "C", "SF", "SG", "SG", "C", "…
$ age <dbl> 28, 30, 19, 23, 28, 30, 24, 32, 23, 23, 21, 26, 26, 22, 27…
$ team <chr> "3TM", "CHI", "ORL", "2TM", "ATL", "CHO", "DAL", "BRK", "U…
$ games <dbl> 26, 82, 47, 32, 76, 65, 74, 74, 27, 5, 69, 42, 68, 51, 54,…
$ minutes <dbl> 12.5, 23.0, 17.0, 23.1, 30.5, 30.6, 18.5, 23.6, 33.3, 2.8,…
$ fga <dbl> 5.3, 10.0, 4.4, 6.9, 12.7, 15.5, 4.8, 5.9, 11.1, 0.8, 5.1,…
$ three_pa <dbl> 2.2, 3.8, 1.0, 0.3, 0.5, 0.1, 1.7, 2.8, 2.5, 0.0, 0.0, 3.3…
$ three_pct <dbl> 0.263, 0.387, 0.271, 0.111, 0.306, 0.400, 0.274, 0.348, 0.…
$ pts <dbl> 5.1, 11.6, 5.2, 6.7, 15.2, 16.6, 5.6, 7.4, 13.9, 0.8, 6.3,…
$ source_url <chr> "https://www.basketball-reference.com/leagues/NBA_2015_per…
$ primary_pos <chr> "PG", "PG", "PF", "PF", "C", "C", "SF", "SG", "SG", "C", "…
glimpse(team_stats)
Rows: 341
Columns: 11
$ season <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2…
$ team <chr> "Golden State Warriors*", "Los Angeles Clippers*", "Dal…
$ games <dbl> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ team_fg <dbl> 41.6, 39.4, 39.7, 38.8, 37.9, 37.0, 39.1, 37.7, 38.7, 3…
$ team_fga <dbl> 87.0, 83.3, 85.8, 86.8, 83.3, 83.3, 83.6, 82.2, 86.0, 8…
$ team_fg_pct <dbl> 0.478, 0.473, 0.463, 0.447, 0.455, 0.444, 0.468, 0.458,…
$ team_three_p <dbl> 10.8, 10.1, 8.9, 7.7, 8.9, 11.4, 8.3, 10.1, 9.8, 10.0, …
$ team_three_pa <dbl> 27.0, 26.9, 25.4, 22.7, 25.1, 32.7, 22.5, 27.5, 27.2, 2…
$ team_three_pct <dbl> 0.398, 0.376, 0.352, 0.339, 0.352, 0.348, 0.367, 0.367,…
$ team_pts <dbl> 110.0, 106.7, 105.2, 104.0, 104.0, 103.9, 103.2, 103.1,…
$ source_url <chr> "https://www.basketball-reference.com/leagues/NBA_2015.…
nrow(player_stats)
[1] 5878
ncol(player_stats)
[1] 13
nrow(team_stats)
[1] 341
ncol(team_stats)
[1] 11
Data Dictionary
Primary Dataset: Player-Level Statistics
| Variable | Meaning |
|---|---|
| season | NBA season endpoint year |
| player | player name |
| pos | listed position |
| age | player age |
| team | team abbreviation |
| games | games played |
| minutes | minutes per game |
| fga | field goal attempts per game |
| three_pa | three-point attempts per game |
| three_pct | three-point percentage |
| pts | points per game |
| source_url | Basketball-Reference source URL |
| primary_pos | simplified position extracted from pos |
Secondary Dataset: Team-Level Statistics
| Variable | Meaning |
|---|---|
| season | NBA season endpoint year |
| team | team name |
| games | games played |
| team_fg | team field goals per game |
| team_fga | team field goal attempts per game |
| team_fg_pct | team field goal percentage |
| team_three_p | team made threes per game |
| team_three_pa | team three-point attempts per game |
| team_three_pct | team three-point percentage |
| team_pts | team points per game |
| source_url | Basketball-Reference source URL |
Cleaning and Preparation
The player data includes some low-minute players. A player who only appears in a few games or plays very few minutes can create misleading percentage results. To focus on players with meaningful roles, I filtered to players with at least 20 games played and at least 10 minutes per game.
player_analysis <-
  player_stats %>%
  mutate(
    season = as.numeric(season),
    age = as.numeric(age),
    games = as.numeric(games),
    minutes = as.numeric(minutes),
    fga = as.numeric(fga),
    three_pa = as.numeric(three_pa),
    three_pct = as.numeric(three_pct),
    pts = as.numeric(pts),
    primary_pos = as.factor(primary_pos)
  ) %>%
  filter(
    games >= 20,
    minutes >= 10,
    !is.na(three_pa)
  )
team_analysis <-
  team_stats %>%
  mutate(
    team = str_remove(team, "\\*$"),
    season = as.numeric(season),
    games = as.numeric(games),
    team_three_pa = as.numeric(team_three_pa),
    team_three_pct = as.numeric(team_three_pct),
    team_pts = as.numeric(team_pts)
  ) %>%
  filter(
    !str_detect(team, "League Average"),
    !is.na(team_three_pa)
  )
Summary Statistics
player_summary_stats <-
  player_analysis %>%
  summarise(
    rows = n(),
    players = n_distinct(player),
    seasons = n_distinct(season),
    avg_3pa = mean(three_pa, na.rm = TRUE),
    median_3pa = median(three_pa, na.rm = TRUE),
    avg_3p_pct = mean(three_pct, na.rm = TRUE),
    avg_pts = mean(pts, na.rm = TRUE)
  )
player_summary_stats
# A tibble: 1 × 7
rows players seasons avg_3pa median_3pa avg_3p_pct avg_pts
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 4308 1033 11 3.02 2.7 0.325 10.5
team_summary_stats <-
  team_analysis %>%
  summarise(
    rows = n(),
    teams = n_distinct(team),
    seasons = n_distinct(season),
    avg_team_3pa = mean(team_three_pa, na.rm = TRUE),
    avg_team_3p_pct = mean(team_three_pct, na.rm = TRUE),
    avg_team_pts = mean(team_pts, na.rm = TRUE)
  )
team_summary_stats
# A tibble: 1 × 6
rows teams seasons avg_team_3pa avg_team_3p_pct avg_team_pts
<int> <int> <int> <dbl> <dbl> <dbl>
1 330 30 11 31.4 0.358 109.
The player dataset provides the individual view of how roles changed. The team dataset provides the strategic view of how team offenses changed. Looking at both levels makes the analysis more complete than only studying individual players.
Descriptive Analysis 1: Player Three-Point Volume Over Time
The first question is whether the average NBA player’s role became more three-point oriented.
player_season_summary <-
  player_analysis %>%
  group_by(season) %>%
  summarise(
    avg_player_3pa = mean(three_pa, na.rm = TRUE),
    median_player_3pa = median(three_pa, na.rm = TRUE),
    avg_player_3p_pct = mean(three_pct, na.rm = TRUE),
    players = n()
  ) %>%
  arrange(season)
player_season_summary
# A tibble: 11 × 5
season avg_player_3pa median_player_3pa avg_player_3p_pct players
<dbl> <dbl> <dbl> <dbl> <int>
1 2015 2.17 1.9 0.299 386
2 2016 2.27 2 0.306 377
3 2017 2.57 2.35 0.312 374
4 2018 2.78 2.6 0.325 384
5 2019 3.03 2.7 0.324 392
6 2020 3.24 3 0.334 381
7 2021 3.41 3.1 0.335 395
8 2022 3.37 3 0.329 409
9 2023 3.29 3 0.333 399
10 2024 3.35 3.1 0.342 399
11 2025 3.59 3.3 0.334 412
player_season_summary %>%
  ggplot(aes(x = season, y = avg_player_3pa)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Average Player Three-Point Attempts Per Game by Season",
    x = "Season",
    y = "Average Player 3PA Per Game"
  )
Average player three-point attempts rose from 2.17 per game in 2015 to 3.59 in 2025, with a small dip in 2022 and 2023. This supports the idea that the average NBA player’s offensive role became more three-point oriented.
Descriptive Analysis 2: Player Three-Point Efficiency Over Time
The next question is whether the increase in attempts came with a similar improvement in shooting percentage.
player_season_summary %>%
  ggplot(aes(x = season, y = avg_player_3p_pct)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Average Player Three-Point Percentage by Season",
    x = "Season",
    y = "Average Player 3P%"
  )
This chart separates volume from efficiency. Average attempts rose by roughly two-thirds (from 2.17 to 3.59 per game), while the average player three-point percentage rose only from 0.299 to 0.334. Because attempts grew much faster than accuracy, this visual frames the three-point boom mainly as a change in shot selection, spacing, and offensive strategy.
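One way to make the volume-versus-efficiency comparison concrete is to compute the percent change in each series between the first and last season. The sketch below is base R and self-contained: it retypes the rounded 2015 and 2025 values from the season summary table above instead of reusing `player_season_summary`, and `pct_change` is a hypothetical helper name that is not part of the original analysis.

```r
# Rounded values copied from the player_season_summary table (2015 and 2025).
first_last <- data.frame(
  season = c(2015, 2025),
  avg_player_3pa = c(2.17, 3.59),
  avg_player_3p_pct = c(0.299, 0.334)
)

# Hypothetical helper: percent change from the first to the last value.
pct_change <- function(x) {
  (x[length(x)] / x[1] - 1) * 100
}

volume_change <- pct_change(first_last$avg_player_3pa)
efficiency_change <- pct_change(first_last$avg_player_3p_pct)

round(c(volume = volume_change, efficiency = efficiency_change), 1)
```

With these rounded inputs, attempts grow by about 65 percent, while the average percentage grows by about 12 percent in relative terms (3.5 percentage points). Note that `avg_player_3p_pct` is an unweighted average of player percentages, so it is not the same as the league-wide make rate.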
Descriptive Analysis 3: Position Trends
Three-point shooting used to be associated mostly with guards and wings. This section checks whether the growth is also visible among forwards and centers.
position_summary <-
  player_analysis %>%
  group_by(season, primary_pos) %>%
  summarise(
    avg_3pa = mean(three_pa, na.rm = TRUE),
    avg_3p_pct = mean(three_pct, na.rm = TRUE),
    players = n(),
    .groups = "drop"
  ) %>%
  filter(primary_pos %in% c("PG", "SG", "SF", "PF", "C"))
position_summary
# A tibble: 55 × 5
season primary_pos avg_3pa avg_3p_pct players
<dbl> <fct> <dbl> <dbl> <int>
1 2015 C 0.260 0.190 68
2 2015 PF 1.34 0.252 84
3 2015 PG 2.94 0.319 80
4 2015 SF 2.80 0.346 72
5 2015 SG 3.30 0.344 82
6 2016 C 0.347 0.171 73
7 2016 PF 1.65 0.291 78
8 2016 PG 2.85 0.335 77
9 2016 SF 3.05 0.340 72
10 2016 SG 3.43 0.355 77
# ℹ 45 more rows
position_summary %>%
  ggplot(aes(x = season, y = avg_3pa, color = primary_pos)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Average Three-Point Attempts by Position",
    x = "Season",
    y = "Average 3PA Per Game",
    color = "Position"
  )
The position trend is the most important part of the project. It shows that three-point growth is broader than a guard-only trend and connects the increase to changes in how NBA positions function.
Descriptive Analysis 4: High-Volume Shooters
The league can increase average three-point attempts because a few stars take a lot more threes or because more players across the league take more threes. This table and chart focus on the second possibility.
shooter_counts <-
  player_analysis %>%
  group_by(season) %>%
  summarise(
    players_5plus_3pa = sum(three_pa >= 5, na.rm = TRUE),
    players_7plus_3pa = sum(three_pa >= 7, na.rm = TRUE),
    players_10plus_3pa = sum(three_pa >= 10, na.rm = TRUE),
    total_players = n()
  )
shooter_counts
# A tibble: 11 × 5
season players_5plus_3pa players_7plus_3pa players_10plus_3pa total_players
<dbl> <int> <int> <int> <int>
1 2015 38 4 0 386
2 2016 31 8 1 377
3 2017 48 12 1 374
4 2018 64 14 1 384
5 2019 69 17 2 392
6 2020 78 26 2 381
7 2021 99 30 3 395
8 2022 104 35 1 409
9 2023 85 33 4 399
10 2024 96 30 2 399
11 2025 111 37 5 412
shooter_counts %>%
  select(season, players_5plus_3pa, players_7plus_3pa, players_10plus_3pa) %>%
  pivot_longer(
    cols = -season,
    names_to = "threshold",
    values_to = "players"
  ) %>%
  ggplot(aes(x = season, y = players, color = threshold)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Number of High-Volume Three-Point Shooters by Season",
    x = "Season",
    y = "Number of Players",
    color = "Threshold"
  )
This analysis shows that the three-point boom is not only about a few star shooters. The number of qualifying players attempting at least five threes per game rose from 38 in 2015 to 111 in 2025, and the number attempting at least seven rose from 4 to 37.
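Because the number of qualifying players also changes from season to season (386 in 2015, 412 in 2025), raw counts can slightly overstate or understate the shift. A minimal sketch of one way to check this is to express the five-plus group as a share of qualifying players. The block is base R and self-contained; `shooter_counts_copy` and `share_5plus` are illustrative names, and the values are retyped from the `shooter_counts` table above.

```r
# Values copied from the shooter_counts table above.
shooter_counts_copy <- data.frame(
  season = 2015:2025,
  players_5plus_3pa = c(38, 31, 48, 64, 69, 78, 99, 104, 85, 96, 111),
  total_players = c(386, 377, 374, 384, 392, 381, 395, 409, 399, 399, 412)
)

# Share of qualifying players at or above 5 three-point attempts per game.
shooter_counts_copy$share_5plus <-
  shooter_counts_copy$players_5plus_3pa / shooter_counts_copy$total_players

round(shooter_counts_copy$share_5plus, 3)
```

On these numbers the share rises from about 10 percent of qualifying players in 2015 to about 27 percent in 2025, so the increase is not explained by a larger player pool.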
Descriptive Analysis 5: Volume and Efficiency Relationship
player_analysis %>%
  filter(!is.na(three_pct)) %>%
  ggplot(aes(x = three_pa, y = three_pct)) +
  geom_point(alpha = 0.35) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Relationship Between Three-Point Volume and Efficiency",
    x = "Three-Point Attempts Per Game",
    y = "Three-Point Percentage"
  )
The scatterplot compares three-point volume and accuracy at the player level. This helps show that high volume and high efficiency are related but not identical skills.
Descriptive Analysis 6: Most Recent Season by Position
most_recent_season <- max(player_analysis$season, na.rm = TRUE)
player_analysis %>%
  filter(
    season == most_recent_season,
    primary_pos %in% c("PG", "SG", "SF", "PF", "C")
  ) %>%
  ggplot(aes(x = primary_pos, y = three_pa)) +
  geom_boxplot() +
  labs(
    title = paste("Three-Point Attempts by Position in", most_recent_season),
    x = "Position",
    y = "Three-Point Attempts Per Game"
  )
The boxplot shows how three-point volume is distributed across positions in the most recent season. It gives a snapshot of the modern NBA’s position landscape.
Secondary Data Source: Team-Level Trends
The final project requires a secondary data source. I collected team-level per-game statistics from Basketball-Reference for the same seasons. This gives team context for the player-level findings.
team_season_summary <-
  team_analysis %>%
  group_by(season) %>%
  summarise(
    avg_team_3pa = mean(team_three_pa, na.rm = TRUE),
    avg_team_3p_pct = mean(team_three_pct, na.rm = TRUE),
    avg_team_pts = mean(team_pts, na.rm = TRUE),
    teams = n()
  ) %>%
  arrange(season)
team_season_summary
# A tibble: 11 × 5
season avg_team_3pa avg_team_3p_pct avg_team_pts teams
<dbl> <dbl> <dbl> <dbl> <int>
1 2015 22.4 0.349 100. 30
2 2016 24.1 0.353 103. 30
3 2017 27.0 0.357 106. 30
4 2018 29.0 0.362 106. 30
5 2019 32.0 0.356 111. 30
6 2020 34.1 0.358 112. 30
7 2021 34.6 0.366 112. 30
8 2022 35.2 0.354 111. 30
9 2023 34.2 0.360 115. 30
10 2024 35.1 0.366 114. 30
11 2025 37.6 0.360 114. 30
team_season_summary %>%
  ggplot(aes(x = season, y = avg_team_3pa)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Average Team Three-Point Attempts Per Game by Season",
    x = "Season",
    y = "Average Team 3PA Per Game"
  )
Average team three-point attempts rose from 22.4 per game in 2015 to 37.6 in 2025, which supports the idea that player roles and team offensive strategy changed together.
Comparing Primary and Secondary Data
To compare player-level and team-level data on the same scale, I indexed each series to 100 in the first season. Indexing allows the growth rates to be compared even though raw player attempts and raw team attempts are on different scales.
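The indexing step can be written as a single formula. The notation here is introduced only for illustration: $x_t$ is a series value in season $t$ (either average player 3PA or average team 3PA) and $I_t$ is its index.

```latex
I_t = 100 \times \frac{x_t}{x_{2015}},
\qquad
I_{2025}^{\text{team}} = 100 \times \frac{37.6}{22.4} \approx 167.9
```

The worked value uses the rounded team averages from the team season summary above (22.4 in 2015 and 37.6 in 2025), so team attempts in 2025 sit at roughly 168 percent of their 2015 level.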
comparison_summary <-
  player_season_summary %>%
  select(season, avg_player_3pa) %>%
  left_join(
    team_season_summary %>%
      select(season, avg_team_3pa),
    by = "season"
  ) %>%
  arrange(season) %>%
  mutate(
    player_3pa_index = avg_player_3pa / first(avg_player_3pa) * 100,
    team_3pa_index = avg_team_3pa / first(avg_team_3pa) * 100
  )
comparison_summary
# A tibble: 11 × 5
season avg_player_3pa avg_team_3pa player_3pa_index team_3pa_index
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2015 2.17 22.4 100 100
2 2016 2.27 24.1 105. 107.
3 2017 2.57 27.0 118. 120.
4 2018 2.78 29.0 128. 129.
5 2019 3.03 32.0 140. 143.
6 2020 3.24 34.1 150. 152.
7 2021 3.41 34.6 157. 155.
8 2022 3.37 35.2 156. 157.
9 2023 3.29 34.2 152. 153.
10 2024 3.35 35.1 155. 157.
11 2025 3.59 37.6 166. 168.
comparison_summary %>%
  select(season, player_3pa_index, team_3pa_index) %>%
  pivot_longer(
    cols = -season,
    names_to = "series",
    values_to = "index_value"
  ) %>%
  ggplot(aes(x = season, y = index_value, color = series)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Indexed Growth in Player and Team Three-Point Attempts",
    subtitle = "2015 set equal to 100",
    x = "Season",
    y = "Indexed 3PA",
    color = "Series"
  )
The indexed comparison connects the two datasets. Team three-point attempts and player three-point attempts rise together, which supports the idea that individual role changes and team strategy changes are part of the same league-wide trend.
comparison_summary %>%
  summarise(
    correlation_player_team_3pa = cor(avg_player_3pa, avg_team_3pa)
  )
# A tibble: 1 × 1
correlation_player_team_3pa
<dbl>
1 0.997
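Because both series rise steadily over only 11 seasons, a very high correlation of levels is close to guaranteed by the shared upward trend. One possible robustness check, which is not part of the original analysis, is to correlate the season-to-season changes instead. The sketch below is base R and self-contained; it retypes the rounded averages from the `comparison_summary` table, and `level_cor` and `change_cor` are illustrative names.

```r
# Rounded season averages copied from the comparison_summary table above.
avg_player_3pa <- c(2.17, 2.27, 2.57, 2.78, 3.03, 3.24, 3.41, 3.37, 3.29, 3.35, 3.59)
avg_team_3pa <- c(22.4, 24.1, 27.0, 29.0, 32.0, 34.1, 34.6, 35.2, 34.2, 35.1, 37.6)

# Correlation of the levels (both series trend upward over time).
level_cor <- cor(avg_player_3pa, avg_team_3pa)

# Correlation of season-to-season changes, which removes the shared trend.
change_cor <- cor(diff(avg_player_3pa), diff(avg_team_3pa))

round(c(levels = level_cor, changes = change_cor), 3)
```

If the correlation of changes is also clearly positive, the two series move together from year to year and not only across the whole decade. With ten differences, this remains a descriptive check, not a formal test.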
Recommendations and Takeaways
The analysis suggests that NBA three-point growth is not only about stars taking more difficult shots. It is also about more players across more roles being asked to shoot from outside. This matters for player evaluation because shooting volume and spacing ability are now important across more positions.
The main takeaways are:
- Average player three-point attempts increased from 2015 to 2025.
- Three-point efficiency did not increase as sharply as three-point volume.
- Position trends show that the change is connected to broader role changes.
- The number of high-volume shooters increased substantially.
- Team-level three-point growth supports the player-level trend.
For teams, this means player development and roster construction should continue to value shooting across multiple positions. For analysts, it also means player evaluation should consider both three-point volume and efficiency rather than relying only on shooting percentage.
Closing Thoughts
This project reflects the type of basketball analytics question I enjoy: a question that starts with something fans can see on the court and then uses data to explain how the game is changing. The NBA’s three-point growth is not just a shooting trend. It is a role-change trend, a roster-construction trend, and a strategy trend.
Source Attribution
Primary and secondary data were scraped from Basketball-Reference.
Source homepage: Basketball-Reference (https://www.basketball-reference.com/)
Player per-game page pattern:
https://www.basketball-reference.com/leagues/NBA_YEAR_per_game.html
Team per-game page pattern:
https://www.basketball-reference.com/leagues/NBA_YEAR.html