NBA Positional Importance

Author

Austin Blumenthal

The Claim

Many claim that the current NBA is “positionless” and that players at every position can do similar things. In this document I compare position-based totals and rates from the last three NBA seasons to evaluate this claim.

Data Preparation

To start, download the CSV containing this data from “https://myxavier-my.sharepoint.com/:u:/g/personal/blumenthala_xavier_edu/IQCtmWtWZJT7TLkx-1Exg9kJAcut8TJ6tSuum_JrHzuToeU?e=JmAZ1b”. We then load the packages needed for this analysis.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)

# Load the data

nba_data <- read_csv("nba_player_totals_24_to_26.csv")
Rows: 2206 Columns: 33
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (4): Player, Team, Pos, Awards
dbl (29): Rk, Age, G, GS, MP, FG, FGA, FG%, 3P, 3PA, 3P%, 2P, 2PA, 2P%, eFG%...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Clean up the data so that the analysis is simpler
nba_clean <- nba_data %>%
  drop_na(Pos) %>%
  mutate(
    # Keep only the first position listed, i.e. the text before the first "-"
    Primary_Pos = str_extract(Pos, "^[^-]+"),
    PTS = as.numeric(PTS),
    TRB = as.numeric(TRB),
    AST = as.numeric(AST),
    MP = as.numeric(MP)
  ) %>%
  # Remove players who received very little playing time so that
  # comparisons are reasonable
  filter(MP > 500)
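
To make the cleaning step concrete, here is a minimal sketch on a made-up three-row table. The player names and numbers are invented for illustration and are not taken from the CSV. It shows how the regular expression keeps only the text before the first hyphen and how the minutes filter drops low-minute rows.

```r
library(dplyr)
library(stringr)

# Hypothetical rows, invented for illustration only
toy <- tibble(
  Player = c("Player A", "Player B", "Player C"),
  Pos    = c("SG-PG", "C", "PF-C"),
  MP     = c(1800, 350, 900)
)

toy_clean <- toy %>%
  # "^[^-]+" matches from the start of the string up to the first hyphen
  mutate(Primary_Pos = str_extract(Pos, "^[^-]+")) %>%
  # Same threshold as above: keep rows with more than 500 minutes
  filter(MP > 500)

print(toy_clean)
```

Player B is dropped by the filter, and the two remaining rows are labeled SG and PF.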

General Position Totals

To understand what each position is doing overall, we will grab just the raw totals by position.

position_totals <- nba_clean %>%
  group_by(Primary_Pos) %>%
  summarize(
    Total_Points = sum(PTS, na.rm = TRUE),
    Total_Rebounds = sum(TRB, na.rm = TRUE),
    Total_Assists = sum(AST, na.rm = TRUE),
    Total_Minutes = sum(MP, na.rm = TRUE),
    Player_Count = n()
  ) %>%
  arrange(desc(Total_Points))

kable(position_totals, caption = "Aggregate Production by Position (2024-2026)")
Table: Aggregate Production by Position (2024-2026)

|Primary_Pos | Total_Points| Total_Rebounds| Total_Assists| Total_Minutes| Player_Count|
|:-----------|------------:|--------------:|-------------:|-------------:|------------:|
|SG          |       218990|          60451|         49744|        464738|          336|
|PG          |       179315|          44941|         64773|        352260|          251|
|SF          |       166821|          57664|         32293|        353170|          238|
|PF          |       160965|          68077|         30136|        336787|          237|
|C           |       147749|          95422|         27510|        312407|          229|
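
Raw totals depend on how many rows and minutes fall into each group (shooting guards have 336 rows here, centers 229), so the ranking above partly reflects group size. As a rough check, the sketch below copies the totals from the table and rescales them to a per-36-minute rate. The object names `totals` and `rates` are just illustrative.

```r
# Totals copied from the table above
totals <- data.frame(
  Primary_Pos    = c("SG", "PG", "SF", "PF", "C"),
  Total_Points   = c(218990, 179315, 166821, 160965, 147749),
  Total_Rebounds = c(60451, 44941, 57664, 68077, 95422),
  Total_Assists  = c(49744, 64773, 32293, 30136, 27510),
  Total_Minutes  = c(464738, 352260, 353170, 336787, 312407)
)

# Scale each total to a 36-minute rate so group size no longer matters
rates <- transform(
  totals,
  PTS_per_36 = round(Total_Points   / Total_Minutes * 36, 1),
  TRB_per_36 = round(Total_Rebounds / Total_Minutes * 36, 1),
  AST_per_36 = round(Total_Assists  / Total_Minutes * 36, 1)
)

print(rates[, c("Primary_Pos", "PTS_per_36", "TRB_per_36", "AST_per_36")])
```

On these totals, scoring rates come out fairly close across positions, while centers rebound at more than twice the guards' rate and point guards lead clearly in assists.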

Visualizing Specialties

After looking at the overall totals by position, we should scale by minutes played to see how the positions perform during their time on the court, which puts them on an even playing field.

# Let's make this a per 36 minutes played basis
nba_per_36 <- nba_clean %>%
  mutate(
    PTS_per_36 = (PTS / MP) * 36,
    TRB_per_36 = (TRB / MP) * 36,
    AST_per_36 = (AST / MP) * 36
  )
ggplot(nba_per_36, aes(x = AST_per_36, y = TRB_per_36, color = Primary_Pos)) +
  geom_point(alpha = 0.6, size = 2) +
  stat_ellipse(level = 0.8) +
  theme_minimal() +
  labs(
    title = "The specialties of positions: Rebounds vs. Assists (Per 36 Minutes)",
    subtitle = "Highlighting the distinct roles of Centers (Rebounds) vs. Point Guards (Assists)",
    x = "Assists per 36 Minutes",
    y = "Rebounds per 36 Minutes",
    color = "Position"
  )
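
The ellipses show where each position clusters, and a table of group medians can serve as a compact numeric companion to the plot. The sketch below uses a small invented table rather than the real data, so the values are hypothetical. With the real data, `nba_per_36` could be passed through the same `group_by()` and `summarize()` steps.

```r
library(dplyr)

# Hypothetical per-36 values, invented for illustration only
toy_per_36 <- tibble(
  Primary_Pos = c("PG", "PG", "PG", "C", "C", "C"),
  AST_per_36  = c(6, 7, 8, 2, 3, 4),
  TRB_per_36  = c(3, 4, 5, 9, 10, 12)
)

# Median assists and rebounds per 36 minutes for each position
position_medians <- toy_per_36 %>%
  group_by(Primary_Pos) %>%
  summarize(
    Median_AST = median(AST_per_36),
    Median_TRB = median(TRB_per_36)
  )

print(position_medians)
```

Medians are used here instead of means so that a handful of extreme players do not pull the summary for a whole position.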

Offensive Value: Scoring + Playmaking

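While many people only notice the points a player scores directly, the opportunities a player creates for teammates matter just as much when judging how important that player is to the team. Here each assist is counted as two points, which is a rough estimate since an assist can also lead to a three-pointer.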

# Calculate Total Offense Generated
nba_offense <- nba_clean %>%
  mutate(Total_Offense = PTS + (AST * 2))

ggplot(nba_offense, aes(x = fct_reorder(Primary_Pos, Total_Offense, .fun = median), y = Total_Offense, fill = Primary_Pos)) +
  geom_boxplot(alpha = 0.7, outlier.alpha = 0.4) +
  theme_light() +
  coord_flip() +
  labs(
    title = "Distribution of Total Offense Generated by Position",
    subtitle = "Total Offense = Points + (Assists * 2)",
    x = "Primary Position",
    y = "Estimated Points Created (Season Total)",
    fill = "Position"
  )
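
As a worked example of the formula, the sketch below applies it to two invented season lines (the players and numbers are hypothetical). It also adds a per-36 version, which is an optional variation and not part of the analysis above, to show how the same metric could be separated from playing time.

```r
library(dplyr)

# Hypothetical season lines, invented for illustration only
toy <- tibble(
  Player = c("Scorer", "Playmaker"),
  PTS    = c(1500, 900),
  AST    = c(150, 600),
  MP     = c(2400, 2400)
)

toy_offense <- toy %>%
  mutate(
    # Each assist is credited as a two-point basket, so assisted threes are undercounted
    Total_Offense = PTS + (AST * 2),
    # Optional variation: scale by minutes so heavy-minute players do not dominate
    Offense_per_36 = (Total_Offense / MP) * 36
  )

print(toy_offense)
```

The scorer ends up with 1800 estimated points created and the playmaker with 2100, even though the playmaker scored 600 fewer points directly.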

Scoring Efficiency: making the most of possessions

Shot efficiency has taken the NBA by storm. In the new world of sports analytics, let’s look at how efficiently each position scores.

# Compute FG% from the summed makes and attempts for each position
# rather than averaging the individual players' percentages.
fg_summary <- nba_clean %>%
  mutate(
    FG = as.numeric(FG),
    FGA = as.numeric(FGA)
  ) %>%
  group_by(Primary_Pos) %>%
  summarize(
    Total_FG = sum(FG, na.rm = TRUE),
    Total_FGA = sum(FGA, na.rm = TRUE),
    FG_Percentage = Total_FG / Total_FGA
  ) %>%
  arrange(desc(FG_Percentage))

# Let's graph the total field goal percentage by position
ggplot(fg_summary, aes(x = reorder(Primary_Pos, FG_Percentage), y = FG_Percentage, fill = Primary_Pos)) +
  geom_col(alpha = 0.85, width = 0.6) +
  # Add the exact percentage text inside the bars
  geom_text(aes(label = scales::percent(FG_Percentage, accuracy = 0.1)), 
            hjust = 1.2, 
            color = "white", 
            fontface = "bold",
            size = 4) +
  coord_flip() +
  theme_minimal() +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Shooting Efficiency by Position",
    subtitle = "True Field Goal Percentage (Total Makes / Total Attempts)",
    x = "Primary Position",
    y = "Field Goal Percentage (FG%)"
  ) +
  scale_fill_viridis_d(option = "mako", begin = 0.3, end = 0.8) +
  theme(
    legend.position = "none",
    panel.grid.major.y = element_blank()
  )
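
The reason for summing makes and attempts before dividing is easiest to see with two invented players (hypothetical numbers). Averaging their individual percentages gives a low-volume shooter the same weight as a high-volume one, while the pooled figure weights every attempt equally.

```r
library(dplyr)

# Hypothetical shooters, invented for illustration only
toy <- tibble(
  Player = c("Low volume", "High volume"),
  FG     = c(1, 90),
  FGA    = c(2, 200)
)

fg_compare <- toy %>%
  summarize(
    # Average of each player's own percentage: (50% + 45%) / 2
    Mean_of_Pcts = mean(FG / FGA),
    # Pooled percentage: 91 makes on 202 attempts
    Pooled_Pct = sum(FG) / sum(FGA)
  )

print(fg_compare)
```

The mean of percentages is 47.5%, while the pooled figure is just over 45%, much closer to the shooter who took almost all of the attempts.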

The Free Throw Complaints

“Foul-baiting” is all over the news and is supposedly a problem limited to star guards. Let’s look at free throw attempts by position and see whether the issue is the free throws themselves or just the players taking them.

ggplot(nba_clean, aes(x = fct_reorder(Primary_Pos, FTA, .fun = median), y = FTA, fill = Primary_Pos)) +
  geom_boxplot(alpha = 0.75, outlier.color = "red", outlier.size = 2, outlier.alpha = 0.8) +
  coord_flip() +
  theme_light() +
  labs(
    title = "Drawing Contact: Free Throw Attempts by Position",
    subtitle = "Red dots indicate statistical outliers (e.g., superstars and post-up specialists)",
    x = "Primary Position",
    y = "Total Free Throw Attempts (Season)",
    fill = "Position"
  ) +
  scale_fill_viridis_d(option = "plasma", begin = 0.2, end = 0.8) +
  theme(legend.position = "none")
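
The red dots in this plot come from the default rule in `geom_boxplot()`, which flags any point more than 1.5 times the interquartile range beyond the box edges. The sketch below reproduces that rule by hand on an invented vector of free throw attempts (hypothetical values), which may help when reading which players count as outliers.

```r
# Hypothetical season free throw attempts, invented for illustration only
fta <- c(50, 80, 100, 120, 150, 600)

# First and third quartiles (the edges of the box)
q <- unname(quantile(fta, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences at 1.5 * IQR beyond the box, the default used by geom_boxplot()
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

# Anything outside the fences is drawn as an individual point
outliers <- fta[fta < lower_fence | fta > upper_fence]

print(c(lower = lower_fence, upper = upper_fence))
print(outliers)
```

Because the fences are computed separately for each position, a season that is an outlier among point guards would not necessarily be one among centers.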

The Results

All of this data points strongly to the league still having distinct positional roles. While there are still players like Wemby who break these trends, the typical center is still different from the typical point guard. I will explore the free throw issue further when I reach my final, but I’d like to point out that quite a few of the outliers in the free throw data are players who are not complained about.