The NBA uses a salary cap to promote competitive fairness, but can money still buy wins? This project examines whether team payroll predicts win percentage across all 30 NBA teams over the 2021–2023 seasons.
This question connects to sociologist Pierre Bourdieu’s (1986) theory of capital, which argues that economic resources convert into structural advantage even in systems designed to be fair. Prior research supports this: Berri & Schmidt (2006) found a weak but positive relationship between NBA payroll and wins, and reporting from FiveThirtyEight (Paine, 2015) showed large-market teams consistently spend above the luxury tax line while small-market teams cluster near the floor.
I use two data sources joined by team name and season:
1. NBA Payroll (Kaggle)
Historical team payroll from 1990–2023, downloaded from Kaggle
(loganlauton/nba-players-and-team-data). Filtered to 2021–2023 seasons
and cleaned in R.
2. Win Percentage (ESPN via hoopR)
League win percentage pulled using the espn_nba_standings()
function from the hoopR R package for the same seasons.
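# Load packages used throughout
library(tidyverse)  # read_csv(), dplyr verbs, stringr, purrr, ggplot2
library(hoopR)      # espn_nba_standings()
library(ggrepel)    # geom_text_repel()
library(scales)     # label_dollar(), label_percent()
# Load payroll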
payroll_raw <- read_csv("~/Downloads/archive/NBA Payroll(1990-2023).csv")
head(payroll_raw)
## # A tibble: 6 × 5
## ...1 team seasonStartYear payroll inflationAdjPayroll
## <dbl> <chr> <dbl> <chr> <chr>
## 1 0 Cleveland 1990 $14,403,000 $32,854,246
## 2 1 New York 1990 $13,290,000 $30,315,416
## 3 2 Detroit 1990 $12,910,000 $29,448,608
## 4 3 LA Lakers 1990 $12,120,000 $27,646,565
## 5 4 Atlanta 1990 $11,761,000 $26,827,658
## 6 5 Dallas 1990 $11,693,000 $26,672,548
payroll <- payroll_raw %>%
  filter(seasonStartYear %in% c(2020, 2021, 2022)) %>%
  mutate(
    payroll_clean = str_remove_all(payroll, "\\$|,") %>% as.numeric(),
    season = seasonStartYear + 1,
    team = case_when(
      team == "Cleveland" ~ "Cleveland Cavaliers",
      team == "New York" ~ "New York Knicks",
      team == "Detroit" ~ "Detroit Pistons",
      team == "LA Lakers" ~ "Los Angeles Lakers",
      team == "Atlanta" ~ "Atlanta Hawks",
      team == "Dallas" ~ "Dallas Mavericks",
      team == "Philadelphia" ~ "Philadelphia 76ers",
      team == "Milwaukee" ~ "Milwaukee Bucks",
      team == "Phoenix" ~ "Phoenix Suns",
      team == "Brooklyn" ~ "Brooklyn Nets",
      team == "Boston" ~ "Boston Celtics",
      team == "Portland" ~ "Portland Trail Blazers",
      team == "Golden State" ~ "Golden State Warriors",
      team == "San Antonio" ~ "San Antonio Spurs",
      team == "Indiana" ~ "Indiana Pacers",
      team == "Utah" ~ "Utah Jazz",
      team == "Oklahoma City" ~ "Oklahoma City Thunder",
      team == "Houston" ~ "Houston Rockets",
      team == "Denver" ~ "Denver Nuggets",
      team == "Miami" ~ "Miami Heat",
      team == "Memphis" ~ "Memphis Grizzlies",
      team == "Minnesota" ~ "Minnesota Timberwolves",
      team == "New Orleans" ~ "New Orleans Pelicans",
      team == "LA Clippers" ~ "LA Clippers",
      team == "Chicago" ~ "Chicago Bulls",
      team == "Sacramento" ~ "Sacramento Kings",
      team == "Orlando" ~ "Orlando Magic",
      team == "Washington" ~ "Washington Wizards",
      team == "Charlotte" ~ "Charlotte Hornets",
      team == "Toronto" ~ "Toronto Raptors",
      TRUE ~ team
    )
  ) %>%
  select(team, season, payroll_clean)
head(payroll)
## # A tibble: 6 × 3
## team season payroll_clean
## <chr> <dbl> <dbl>
## 1 Golden State Warriors 2021 171105334
## 2 Brooklyn Nets 2021 170444633
## 3 Philadelphia 76ers 2021 147825311
## 4 LA Clippers 2021 139722606
## 5 Los Angeles Lakers 2021 139334713
## 6 Utah Jazz 2021 136881324
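The payroll column arrives as character strings such as "$14,403,000", so the symbols have to be stripped before conversion to numbers. The following is a minimal base R sketch of the same idea as the str_remove_all() step above, applied to values from the first rows of the raw table. The helper name parse_dollars is hypothetical and is not part of the project code.

```r
# Hypothetical helper: strip "$" and "," and convert to numeric.
parse_dollars <- function(x) {
  as.numeric(gsub("[$,]", "", x))
}

# First three payroll strings shown in head(payroll_raw)
raw <- c("$14,403,000", "$13,290,000", "$12,910,000")
clean <- parse_dollars(raw)
print(clean)

# Strings that fail to parse become NA, so count them before modelling.
n_failed <- sum(is.na(clean))
print(n_failed)
```

Counting NA values after the conversion is a cheap guard against unexpected formats in the source file.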
# Pull win percentage from ESPN via hoopR
seasons <- c(2021, 2022, 2023)
standings <- map_df(seasons, function(s) {
  espn_nba_standings(s) %>%
    mutate(season = s) %>%
    select(team, leaguewinpercent, season)
})
head(standings)
## # A tibble: 6 × 3
## team leaguewinpercent season
## <chr> <dbl> <dbl>
## 1 Utah Jazz 0.667 2021
## 2 Phoenix Suns 0.714 2021
## 3 Philadelphia 76ers 0.738 2021
## 4 Brooklyn Nets 0.619 2021
## 5 Denver Nuggets 0.619 2021
## 6 LA Clippers 0.643 2021
# Join payroll and standings
teams <- inner_join(standings, payroll, by = c("team", "season")) %>%
  mutate(
    leaguewinpercent = as.numeric(leaguewinpercent),
    payroll_M = payroll_clean / 1e6
  )
glimpse(teams)
## Rows: 90
## Columns: 5
## $ team <chr> "Utah Jazz", "Phoenix Suns", "Philadelphia 76ers", "B…
## $ leaguewinpercent <dbl> 0.6666667, 0.7142857, 0.7380952, 0.6190476, 0.6190476…
## $ season <dbl> 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021,…
## $ payroll_clean <dbl> 136881324, 128858241, 147825311, 170444633, 129793210…
## $ payroll_M <dbl> 136.8813, 128.8582, 147.8253, 170.4446, 129.7932, 139…
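Because inner_join() silently drops rows whose keys do not match, it is worth confirming that nothing was lost. The 90 rows above equal 30 teams × 3 seasons, which suggests every team-season matched. The sketch below shows one way to make that check explicit. It uses toy data frames whose names (toy_standings, toy_payroll) and contents are illustrative, not the project's data.

```r
# Toy stand-ins for the two sources; "LA Lakers" is deliberately left
# unstandardized to show what a mismatch looks like.
toy_standings <- data.frame(
  team   = c("Utah Jazz", "Los Angeles Lakers"),
  season = c(2021, 2021)
)
toy_payroll <- data.frame(
  team   = c("Utah Jazz", "LA Lakers"),
  season = c(2021, 2021)
)

# Build a composite key per row, then list keys present in one source only.
make_key <- function(df) paste(df$team, df$season, sep = " | ")
unmatched_standings <- setdiff(make_key(toy_standings), make_key(toy_payroll))
unmatched_payroll   <- setdiff(make_key(toy_payroll), make_key(toy_standings))

print(unmatched_standings)
print(unmatched_payroll)
```

With dplyr loaded, anti_join(standings, payroll, by = c("team", "season")) gives the same information as a data frame; an empty result means every row matched.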
Data limitations: Payroll does not capture player quality, injuries, or mid-season trades. Metro income is a rough proxy for market size.
I use a scatter plot with a linear regression line to examine the relationship between average team payroll and average win percentage across the 2021–2023 seasons.
Why linear regression? Win percentage is a continuous outcome and payroll is a continuous predictor. Linear regression is the most interpretable method for this relationship and appropriate for the sample size of 30 teams.
Pre-processing: Team names were standardized across
sources using case_when(). Payroll and win percentage were each
averaged across the three seasons to produce one observation per team.
# Average across seasons — one row per team
teams_avg <- teams %>%
  group_by(team) %>%
  summarise(
    avg_payroll_M = mean(payroll_M, na.rm = TRUE),
    avg_winpct = mean(leaguewinpercent, na.rm = TRUE)
  )
head(teams_avg)
## # A tibble: 6 × 3
## team avg_payroll_M avg_winpct
## <chr> <dbl> <dbl>
## 1 Atlanta Hawks 130. 0.524
## 2 Boston Celtics 135. 0.582
## 3 Brooklyn Nets 173. 0.604
## 4 Charlotte Hornets 117. 0.505
## 5 Chicago Bulls 134. 0.538
## 6 Cleveland Cavaliers 134. 0.473
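Averaging per team assumes that every team contributes the same number of seasons. The sketch below checks that assumption and repeats the averaging step in base R. The toy data frame and its values are illustrative only.

```r
# Toy data: two hypothetical teams, three seasons each.
toy <- data.frame(
  team      = rep(c("Team A", "Team B"), each = 3),
  season    = rep(2021:2023, times = 2),
  payroll_M = c(100, 110, 120, 150, 160, 170),
  winpct    = c(0.40, 0.45, 0.50, 0.55, 0.60, 0.65)
)

# Each team should appear exactly three times (one row per season).
seasons_per_team <- table(toy$team)
stopifnot(all(seasons_per_team == 3))

# One row per team: mean payroll and mean win percentage.
toy_avg <- aggregate(cbind(payroll_M, winpct) ~ team, data = toy, FUN = mean)
print(toy_avg)
```

If a team had fewer than three rows, its average would rest on less data than the others, which is worth knowing before fitting the model.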
# Linear regression: does payroll predict win percentage?
model <- lm(avg_winpct ~ avg_payroll_M, data = teams_avg)
summary(model)
##
## Call:
## lm(formula = avg_winpct ~ avg_payroll_M, data = teams_avg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.261601 -0.042897 0.004459 0.079719 0.241765
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.028014 0.177158 0.158 0.875
## avg_payroll_M 0.003497 0.001302 2.685 0.012 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1209 on 28 degrees of freedom
## Multiple R-squared: 0.2048, Adjusted R-squared: 0.1764
## F-statistic: 7.21 on 1 and 28 DF, p-value: 0.01205
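The summary reports a point estimate and a standard error for the slope; a 95% confidence interval makes the uncertainty easier to read. The sketch below rebuilds the interval by hand from the printed values. With the fitted object available, confint(model) should give the same interval up to rounding.

```r
estimate  <- 0.003497  # slope for avg_payroll_M, from the summary output
std_error <- 0.001302  # its standard error
df_resid  <- 28        # residual degrees of freedom (30 teams - 2 coefficients)

# Two-sided 95% critical value from the t distribution.
t_crit <- qt(0.975, df = df_resid)

# Interval: estimate plus or minus critical value times standard error.
ci <- estimate + c(-1, 1) * t_crit * std_error
print(round(ci, 5))
```

The interval excludes zero, in line with p = 0.012, but its lower end sits close to zero, so the size of the payroll effect is estimated imprecisely.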
ggplot(teams_avg, aes(x = avg_payroll_M, y = avg_winpct, label = team)) +
  geom_point(size = 3.5, alpha = 0.85, color = "#0D1B2A") +
  geom_smooth(method = "lm", se = TRUE, color = "#F5A623", fill = "#F5A62340") +
  geom_text_repel(size = 2.6, color = "#5F6368", max.overlaps = 30) +
  scale_x_continuous(labels = label_dollar(suffix = "M")) +
  scale_y_continuous(labels = label_percent(), limits = c(0, 1)) +
  labs(
    title = "Do Richer Teams Win More?",
    subtitle = "Average payroll vs. average win percentage per team (2021–2023)",
    x = "Average Team Payroll",
    y = "Win Percentage",
    caption = "Sources: Kaggle (NBA Payroll 1990-2023), ESPN via hoopR"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", color = "#0D1B2A"),
    plot.subtitle = element_text(color = "#5F6368", size = 10),
    plot.caption = element_text(color = "#9AA0A6", size = 8),
    panel.grid.minor = element_blank()
  )
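The regression line shows a positive relationship between payroll and win percentage: each additional $1M in average payroll is associated with an increase of about 0.35 percentage points in win percentage (slope = 0.0035, p = 0.012). The relationship is statistically significant but far from deterministic. Payroll explains only about 20% of the variance in win percentage (R² = 0.205), and the confidence band around the line reflects how imprecisely the slope is estimated from 30 teams. Payroll raises the floor but does not guarantee outcomes. Notable outliers include the Phoenix Suns (overperformed their budget) and the Houston Rockets (underperformed despite high spending).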
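To put the slope in basketball terms, the sketch below converts the reported coefficients into wins over a season. It assumes an 82-game regular season (the 2020–21 season was shortened to 72 games, so this is a rough scale rather than an exact figure), and the two payroll levels passed to the prediction are illustrative values within the observed range.

```r
intercept <- 0.028014  # from the model summary
slope     <- 0.003497  # change in win percentage per $1M of average payroll
games     <- 82        # assumption: standard regular-season length

# Expected change in win percentage and in wins for an extra $10M of payroll.
extra_winpct <- slope * 10
extra_wins   <- extra_winpct * games
print(c(extra_winpct = extra_winpct, extra_wins = extra_wins))

# Fitted win percentage at two illustrative payroll levels (in $M).
payroll_levels <- c(120, 170)
fitted_winpct  <- intercept + slope * payroll_levels
print(round(fitted_winpct, 3))
```

Under these assumptions an extra $10M of payroll corresponds to roughly three additional wins per season, although the scatter around the line means individual teams can land well above or below that figure.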
These results are consistent with the structural advantage hypothesis, though a correlation across 30 teams cannot by itself establish that spending causes winning. Teams at the bottom of the payroll range rarely break a 50% win rate, while high-spending teams cluster near the top. This pattern fits Bourdieu’s argument that economic capital converts into competitive advantage even in regulated systems like the NBA’s soft cap.
The luxury tax was designed to level the playing field, but large-market teams consistently absorb the penalty, maintaining a spending advantage small-market teams cannot match.
Outliers matter: Memphis and Dallas outperformed their payroll through drafting and development, showing that financial disadvantage can be partially offset, but these remain exceptions, not the rule.
Next steps: