From 2013 to 2018, the Golden State Warriors were inarguably one of the best teams in the NBA. Averaging over 62 wins per season, scoring on a high percentage of three-point shots, and appearing in five consecutive Finals (including three Finals wins), the Warriors made their case as one of the best teams in the history of the league.

However, the losses of Steph Curry, Klay Thompson (both to injury) and Kevin Durant (free agency) have yielded a lackluster 2019 campaign to date. The Warriors are statistically one of the worst teams in the league this year - the present analysis will assess how far the Warriors have fallen, and why it won’t last forever.


Imports

library(tidyverse)
library(knitr)
library(kableExtra)
library(reshape2)
library(readxl)
library(reticulate)
library(psych)


Data for this analysis was compiled via a custom Python script, which scrapes team statistics from TeamRankings.com. The source code can be found on my GitHub account: Link Here

# Set local file path
setwd(data.path)

# Read in scraped data
nba <- read_excel("NBA_Stats_2019.xlsx")

# Order data by win percentage
nba <- nba %>% 
        arrange(desc(winpct_tot))

# Displays results
nba %>% 
        select(c(team, games, winpct_tot, points)) %>% 
        headTail(top = 2, bottom = 2) %>% 
        kable() %>% kable_styling(kable.format)
team games winpct_tot points
Milwaukee 55 0.86 119.7
LA Lakers 54 0.78 114.8
NA
Cleveland 55 0.27 106.3
Golden State 56 0.21 106.3


Right off the bat, we see that Golden State has the lowest winning percentage in the league, which supports the necessity of this analysis. We also see that fewer than 15 points per game (PPG) separate Golden State from the best team in the league (Milwaukee). This indicates that winning games in the NBA is a much more nuanced proposition than merely scoring points. Let’s explore this further.


Assessing Golden State


Offensive Stats


Peak-dynasty Warriors offenses were characterized by motion, unpredictability, and dynamic three-point shooting that stretched defenses perilously thin. FiveThirtyEight details the unpredictable nature of Golden State’s then-championship-level play beautifully.

Let’s begin our analysis by observing how offensive stats are intercorrelated for teams this season (the 2019-20 season, as of 2/28/20).

Offensive Stat Matrix



Let’s unpack some observations:

  • First-half points and Points/Game have the highest correlation coefficient (r = .90) of any two offensive stats. Notably, this coefficient is higher than that of Second-half points and Points/Game; this suggests that scoring is more interrelated with first-half than second-half performance (don’t tell TNT)

  • % of Points from 2 and % of Points from 3 are almost perfectly anticorrelated (r = -.94) - this follows intuitively, as these metrics are complementary (in other words, together with free throws they account for all of a team’s PPG, so as one goes up the other must generally go down)

  • % of points from 2 is negatively correlated with offensive efficiency (r = -.28) while % of points from 3 is positively correlated with efficiency (r = .27) - more efficient offensive units, therefore, are those that can effectively score from beyond the three-point line
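
The coefficients in the list above come from a pairwise correlation matrix. The chunk that built it is not shown, so here is a minimal sketch of how such a matrix could be computed. It runs on a small made-up data frame: the values and the `pct_from_ft` column are illustrative assumptions, not the scraped data.

```r
# Toy data: made-up shares of points from twos, threes and free throws.
# These values are illustrative only, not taken from the scraped data set.
set.seed(42)
n.teams <- 30
pct_from3   <- runif(n.teams, 28, 42)
pct_from_ft <- runif(n.teams, 14, 18)
pct_from2   <- 100 - pct_from3 - pct_from_ft   # the three shares sum to 100

toy <- data.frame(pct_from2, pct_from3, pct_from_ft)

# Pairwise Pearson correlations between every pair of columns
cor.matrix <- round(cor(toy), 2)
print(cor.matrix)
```

Because the three shares sum to 100, the shares from two and from three move in nearly opposite directions; the free-throw share is what keeps the coefficient from reaching exactly -1.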

# Select offensive stat columns
offensive.stats <- colnames(nba)[5:15]

# Grab column of win % data
win.pct <- nba[, "winpct_tot"]

# Empty df to add to
winframe <- data.frame(test.var = NULL, correlation = numeric())

# Loop through offensive stats and assess correlation coefficient w/ win percentage
for (k in 1:length(offensive.stats)) {
        
        temp <- paste("Win %    *    ", offensive.stats[k], sep = "  ")
        winframe[k, "test.var"] <- temp
        winframe[k, "correlation"] <- round(cor(win.pct, nba[, offensive.stats[k]]), 3)
}


# Display results
winframe %>% 
        arrange(desc(correlation)) %>% 
        kable() %>% kable_styling(kable.format)
correlation test.var
0.780 Win % * off_eff
0.762 Win % * fg_pct
0.710 Win % * shoot
0.592 Win % * points
0.572 Win % * second_half
0.504 Win % * first_half
0.495 Win % * fastbreak_eff
0.250 Win % * points_paint
0.137 Win % * pct_from3
-0.122 Win % * pct_from2
-0.209 Win % * off_reb
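
The loop above can also be expressed without growing a data frame row by row. Here is a sketch using `sapply()`, run on a small made-up data frame so that it is self-contained. The toy values and the names `toy.nba`, `stat.cols` and `winframe.alt` are assumptions; only the column names `winpct_tot`, `off_eff` and `off_reb` mirror the post.

```r
# Toy stand-in for the `nba` data frame (values are made up for illustration)
toy.nba <- data.frame(
        winpct_tot = c(0.86, 0.78, 0.55, 0.40, 0.27, 0.21),
        off_eff    = c(1.13, 1.10, 1.08, 1.06, 1.04, 1.03),
        off_reb    = c(9.5, 10.7, 10.1, 9.8, 10.9, 10.2)
)

stat.cols <- c("off_eff", "off_reb")

# One correlation per stat column, returned as a named numeric vector
cors <- sapply(stat.cols, function(s) cor(toy.nba$winpct_tot, toy.nba[[s]]))

# Assemble the results table and sort from most positive to most negative
winframe.alt <- data.frame(test.var    = paste("Win % *", names(cors)),
                           correlation = round(unname(cors), 3))
winframe.alt <- winframe.alt[order(-winframe.alt$correlation), ]
print(winframe.alt)
```

Indexing with `[[s]]` hands `cor()` plain numeric vectors, which avoids the one-by-one matrix that results from correlating two single-column data frames.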


We observe that offensive efficiency has the most robust relationship with winning percentage for NBA teams this season, with field goal percentage and general shooting percentage following close behind. While we observed earlier that points and first-half performance were highly intercorrelated, it’s evident that second-half performance has a stronger relationship with overall winning percentage.
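
With only 30 teams, it is worth checking which of these coefficients can be distinguished from zero. A minimal sketch using the standard t-transformation of Pearson’s r, assuming one observation per team (n = 30) and using coefficients copied from the table above; the helper name `cor.p.value` is hypothetical.

```r
# Two-sided p-value for a Pearson correlation r observed on n pairs,
# using the t-transformation t = r * sqrt(n - 2) / sqrt(1 - r^2)
cor.p.value <- function(r, n) {
        t.stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
        2 * pt(-abs(t.stat), df = n - 2)
}

n.teams <- 30   # assumption: one observation per NBA team

# Coefficients copied from the table above
reported <- c(off_eff = 0.780, pct_from3 = 0.137, off_reb = -0.209)

p.values <- sapply(reported, cor.p.value, n = n.teams)
print(round(p.values, 4))
```

Under these assumptions, the strongest coefficients are clearly distinguishable from zero, while the weakest ones (such as `pct_from3` and `off_reb`) are not. When the raw data is available, `cor.test()` performs the same test directly.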

While we observe that using points as a standalone metric is a flawed approach (due to its low correlation coefficient, relative to other metrics), let’s observe the distribution of points scored across all 30 teams. We’ll first calculate the league average, before assessing each team’s relative distance from that average.

# Calculate league-wide average PPG (LA == League Average)
points.LA <- mean(nba$points)
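
The second step, each team’s distance from the average, could be computed along these lines. This sketch uses only the four teams displayed in the earlier table, so `points.avg` is the mean of that subset and not the true league average; the names `ppg`, `points.avg` and `diff.from.avg` are assumptions.

```r
# PPG for the four teams displayed earlier (top two and bottom two by win %)
ppg <- data.frame(
        team   = c("Milwaukee", "LA Lakers", "Cleveland", "Golden State"),
        points = c(119.7, 114.8, 106.3, 106.3)
)

# Mean of this subset only (NOT the league-wide figure stored in points.LA)
points.avg <- mean(ppg$points)

# Signed distance from the average: positive = above, negative = below
ppg$diff.from.avg <- round(ppg$points - points.avg, 2)
print(ppg[order(-ppg$diff.from.avg), ])
```

With the full 30-team data frame, the same subtraction against `points.LA` would give the values behind a league-wide distribution plot.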


Golden State isn’t the worst scoring team in the league, but they’re well below league average. While points alone are not the strongest predictor of a team’s winning percentage, it’s no coincidence that the teams on the low end of this distribution are those that will surely miss the playoffs (Charlotte, New York, Chicago … Orlando is a massive outlier here, as FiveThirtyEight gives them a >99% probability of making the playoffs. We’ll save that for another analysis…)


Golden State’s dominance was predicated on the prowess of the Splash Brothers - Curry and Thompson - who were able to efficiently score points from 30 feet out routinely. We see that this year’s Warriors squad falls a little below league average with a collective 29.4% from three.


Offensive efficiency is the most notable metric here - we observed that it is highly correlated with winning percentage (r = .78). The fact that Golden State has the lowest offensive efficiency in the league is certainly a factor in explaining their drop-off in production this year. We’ll observe the individual players on the Warriors later in this analysis.


Defensive Stats


While offense was the main course for the Warriors during their peak, their performance on the defensive end of the floor was massively important to their success - see NBA.com for a thorough writeup by John Schuhmann. We’ll take a similar methodological approach to the one we took with the offensive stats above.

Defensive Stat Matrix



Let’s unpack this:

  • Defensive rebounds have a strong negative relationship with opponents’ shooting percentage (r = -.83), and more specifically opponents’ percentage from inside the arc (r = -.87). This follows logically, as teams that make more shots in the paint offer fewer opportunities for rebounds; following that logic, the relationship between defensive rebounds and opponents’ three-point percentage is much less robust (r = -.25)

  • Opponents’ points per game are positively correlated with defensive efficiency (r = .79); keep in mind that efficient defenses have efficiency numbers closer to zero

Let’s correlate these statistics with winning percentage, to see which variables functionally translate into wins.

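# Select defensive stat columns
def.stats <- colnames(nba)[16:24]

# Grab column of win % data
win.pct <- nba[, "winpct_tot"]

# Empty df to add to
def.frame <- data.frame(test.var = NULL, correlation = numeric())

# Loop through defensive stats and assess correlation coefficient w/ win percentage
for (k in 1:length(def.stats)) {
        
        temp <- paste("Win % * ", def.stats[k], sep = " ")
        def.frame[k, "test.var"] <- temp
        def.frame[k, "correlation"] <- round(cor(win.pct, nba[, def.stats[k]]), 3)
}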

def.frame %>% 
        arrange(desc(correlation)) %>% 
        kable() %>% kable_styling(kable.format)
correlation test.var
0.671 Win % * def_reb
0.391 Win % * blocks
-0.047 Win % * steals
-0.622 Win % * opp_paint
-0.655 Win % * opp_ppg
-0.696 Win % * opp_2
-0.701 Win % * opp_3
-0.824 Win % * def_eff
-0.846 Win % * opp_shootpct


Logically, opponents’ shooting percentage is strongly negatively correlated with winning percentage - the more efficiently your opponent scores, the more difficult it is to consistently win games. Defensive rebounds have a moderately high correlation with winning percentage (r = .67); blocks are the only other defensive statistic with a positive correlation to win percentage.


Golden State is giving up the sixth-most points in the league this year, with opposing teams averaging 115.1 points against them. This wouldn’t be a fundamentally damning metric if the team had an offense that could routinely keep them afloat (see: the Memphis Grizzlies). However, as we observed earlier, the Warriors’ offense is simply not scoring when it has the chance (hence the relatively low offensive efficiency rating).


Defensive rebounds and blocks are the only two positively-correlated defensive metrics in this analysis. We observe that Golden State is well below average on both metrics, averaging 4.7 blocks per game and 32.9 defensive rebounds per game.

Finally, we observe that defensive efficiency and opponents’ shooting percentage are both strongly negatively correlated with wins - in other words, the higher these figures, the lower your win percentage. Let’s see how Golden State stacks up.


GSW ranks very high in both metrics, which is not only evidence of their defensive struggles but also an obvious contributor to their low winning percentage. These two variables are highly collinear (r = .86), but taken together it’s clear that Golden State’s defensive struggles have much to do with their drop-off this season.


Individual Player Performances


Digging into the Warriors’ statistical woes at a more granular level is going to be key to fully understanding what is happening on the court. Let’s read in the player totals from Basketball Reference using the htmltab package.

library(htmltab)

# Read in player data from Basketball Reference
players <- "https://www.basketball-reference.com/leagues/NBA_2020_totals.html"
pd <- htmltab(doc = players, which = "//caption[starts-with(text(), 'Player Totals')]/ancestor::table")

# Delete reference rows
pd <- pd %>% 
        filter(Player != "Player")

# Convert columns of interest to numeric data type
selected.cols <- c(1,4, 6:length(colnames(pd)))
pd[, selected.cols] <- sapply(pd[, selected.cols], function(x) as.numeric(x))

# Order data by points scored (high-low)
pd <- pd %>% 
        arrange(desc(PTS))

# Display results
pd %>% 
        select(Player, Age, Tm, MP, PTS) %>% 
        head() %>% 
        kable() %>% kable_styling(kable.format)
Player Age Tm MP PTS
James Harden 30 HOU 2061 1955
Trae Young 21 ATL 1981 1670
Giannis Antetokounmpo 25 MIL 1668 1616
Damian Lillard 29 POR 1996 1594
Bradley Beal 26 WAS 1864 1580
Zach LaVine 24 CHI 2085 1530
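
The two cleaning steps in the chunk above (dropping the header rows that repeat inside the table body, then converting character columns to numeric) can be illustrated on a toy data frame. The player names and values below are made up, and `raw`, `clean` and `num.cols` are hypothetical names.

```r
# Toy scrape result: every column arrives as character, and the header
# row is repeated inside the table body (values are made up)
raw <- data.frame(
        Player = c("Player A", "Player", "Player B"),
        Age    = c("30", "Age", "21"),
        PTS    = c("1955", "PTS", "1670"),
        stringsAsFactors = FALSE
)

# Step 1: drop the repeated header rows
clean <- raw[raw$Player != "Player", ]

# Step 2: convert the stat columns from character to numeric
num.cols <- c("Age", "PTS")
clean[num.cols] <- lapply(clean[num.cols], as.numeric)

str(clean)
```

The order matters: running the conversion before the filter would turn the header text into `NA` values with a coercion warning.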


Great! Data is loaded in successfully. Let’s start by creating a naive marker of player efficiency - points / minute. This is a highly reductionist approach to understanding player performance, especially given the more complex formula provided by Basketball Reference - see here. Given the fact that we’re working with season average data for the most part - and, frankly, that this analysis is purely recreational - we’ll stick to a more basic approach for now.

Essentially, this metric will boil down to how often a player scores when he’s on the court. Players that are on the court often without scoring will be considered less efficient, for the sake of argument.
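
As a concrete illustration of the metric, here is a minimal sketch that computes points per minute for the six leading scorers, using the minutes and points printed in the table above. The column name `pts.per.min` is an assumption; the post does not show the name it uses.

```r
# Minutes and points for the six leading scorers shown in the table above
top.scorers <- data.frame(
        Player = c("James Harden", "Trae Young", "Giannis Antetokounmpo",
                   "Damian Lillard", "Bradley Beal", "Zach LaVine"),
        MP  = c(2061, 1981, 1668, 1996, 1864, 2085),
        PTS = c(1955, 1670, 1616, 1594, 1580, 1530)
)

# Naive efficiency: points scored per minute on the court
top.scorers$pts.per.min <- round(top.scorers$PTS / top.scorers$MP, 3)
print(top.scorers[order(-top.scorers$pts.per.min), ])
```

The same division applied to the full `pd` data frame would give one value per player, which can then be plotted against minutes played.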


Author’s Note: James Harden is an absolute unicorn, but we knew that already. Whether you care for his game or not, he’s essentially scoring a point for every minute he’s on the court - you’ll find him by himself in the top-right corner of the graph above.

We see the Warriors players represented as blue dots here. They’re not entirely clustered at the bottom, but they don’t seem to exceed the middle of the pack here. This offers empirical support to the notion that the offensive dominance Golden State enjoyed for nearly a decade has not been sustained in the absence of Steph and Klay.

tm.data <- pd %>% 
        group_by(Tm) %>% 
        summarise(av.age = mean(Age, na.rm = T),
                  total.fouls = sum(PF),
                  total.TOs = sum(TOV),
                  av.FT.pct = mean(`FT%`, na.rm = T)) %>% 
        mutate(flag = ifelse(Tm == "GSW", T, F))


It can be argued that - given the lack of veteran leadership on the court - the Warriors are simply experiencing the growing pains that many young teams do. To test that theory, let’s check the average age of the Warriors’ roster compared to other teams in the league.

eligible.players GSW.average.age
13 24.08


Clearly the Warriors are a young team, with the fifth-youngest roster in the NBA this season. We observe that Houston, Milwaukee, and Los Angeles (Lakers) are the three oldest rosters on average - these teams are also enjoying ample on-court success. Does age actually track with production? As a rough proxy, let’s correlate age with points scored at the player level.

# Correlation coefficient b/w age and points scored
pd %>% 
        summarise(age.before.beauty = round(cor(Age, PTS), 2)) %>% 
        kable() %>% kable_styling(kable.format)
age.before.beauty
0.12


No, not really. The correlation between Age and Points (r = .12) does not suggest that there is a meaningful relationship between these factors at the player level.


So, what are the Warriors working with? In other words, what is driving the Warriors’ lackluster performance this season?

Player Offense

We’ll once again construct a naive efficiency variable, which will be a simple observation of points per minute for each player.

# Inactive players / Players traded away
filter.out <- c("D'Angelo Russell", "Stephen Curry", "Glenn Robinson", "Alec Burks",
                "Zach Norvell", "Willie Cauley-Stein", "Omari Spellman", "Jeremy Pargo",
                "Jacob Evans")

# Warriors players only
warriors <- pd %>% 
        filter(Tm == "GSW") %>% 
        filter(MP != 0) %>% # Filter out players who have recorded 0 minutes
        filter(!Player %in% filter.out)

Let’s unpack this:

  • We observe that Andrew Wiggins has the highest points per minute of the current active roster; he’s only been on the team since early February, which accounts for the low number of minutes. With that being said, he’s done a good job filling the void left by Curry, Thompson, and Russell (who the Warriors traded for Wiggins)

  • Rookie Eric Paschall has performed well thus far in his young career. At 6’6" and 255 pounds, he’s a big-bodied power forward that gets the majority of his points in the paint with an effective FG% of 0.52

  • While he’s been a pivotal player for the Warriors over the last decade or so, Draymond Green does not possess Curry or Thompson’s offensive skill set. He’s been asked to lead a team devoid of elite talent for the majority of the year - and has done so well by all accounts - but his skill set alone has not been enough to carry the team offensively.

What we’re really missing here is effective three-point shooting. Curry and Thompson boast remarkably high career percentages from outside the arc (0.44 and 0.42, respectively). Without that level of sharpshooting, the 2019 Warriors are struggling to stretch defenses in the same way they have in years past. Indeed, the only point guard on the active roster is Ky Bowman, who currently holds a 3P% of 0.31.


Player Defense

Lastly, let’s look at naive defensive rating for the Warriors. We’ll add key defensive metrics - blocks, steals, and defensive rebounds - and then divide this composite metric by minutes played. Following similar logic to our naive offensive metric, this figure will represent the frequency at which players provide beneficial defensive plays.

warriors %>% 
        mutate(def.metrics = (DRB + STL + BLK)) %>% 
        mutate(naive.def = round(def.metrics / MP, 3)) %>% 
        select(Player, Age, MP, def.metrics, naive.def) %>% 
        arrange(desc(naive.def)) %>% 
        head(5) %>% 
        kable() %>% kable_styling(kable.format)
Player Age MP def.metrics naive.def
Marquese Chriss 22 1082 319 0.295
Draymond Green 29 1222 335 0.274
Dragan Bender 22 117 29 0.248
Juan Toscano-Anderson 26 201 47 0.234
Kevon Looney 23 262 56 0.214




Notably, three of the five most efficient Warriors defenders are also three of the players with the lowest minutes. Let’s explore this further:

  • Kevon Looney has battled a variety of injuries this season (which can be said for most of the roster, truthfully). He’s only played in 20 games this year and has logged fewer than 300 minutes total. When Looney has played he’s maximized his role, even playing at point guard in some more avant-garde lineups, per NBC Sports

  • Juan Toscano-Anderson was called up from the G League early in February, which explains his lack of minutes. He’s added value to the team over 8 games played, albeit more so as a defender than a scorer

  • Dragan Bender has a similar story - he joined the Warriors on February 23rd, per Wikipedia

  • Notably, Jordan Poole and Eric Paschall have racked up a ton of minutes for the Dubs, but aren’t observed to contribute much on the defensive end of the floor

While it’s a stretch to say that Golden State has all the pieces they need to turn it around on defense this season, it is certainly worth noting the disconnect between minutes and defensive playmaking. With the highest rated defenders either hurt or newly added to the team, the severe lack of defense since the start of the season in October makes plenty of sense. Draymond Green and Marquese Chriss are currently holding their own as the lone defenders with quality minutes - it will be interesting to see if Golden State’s team defense overall improves as Bender, Toscano-Anderson, and Looney get healthy and up to speed.


Summary

The Warriors’ drop-off in winning percentage can obviously be attributed to the en masse on-court changes that have redefined the Warriors’ starting five on a night-by-night basis. The string of injuries that Golden State has endured has fundamentally depleted the team of consistent talent. This analysis also points to several metrics in particular in which the Warriors are lacking: offensive efficiency, defensive rebounds, blocks, defensive efficiency, and opponents’ shooting percentage.

In conclusion, after five years of dominating the league the Warriors are experiencing some regression. There is reason to suggest that the lack of three-point production in the absence of Steph and Klay has much to do with the team’s current offensive limitations. Not for nothing, the loss of three All-Stars virtually overnight is bound to throw a wrench in your plans and performance. It will be interesting to track the team’s defensive performance as Looney, Toscano-Anderson, and Bender get more quality minutes through the end of the season.

Thanks for reading! Follow me on Twitter at human_cactus and check out more analyses on my RPubs portfolio.