Over the last decade and a half, the National Basketball Association (NBA) has experienced some of the most extreme changes in playing style in the history of the sport. Since the three-point line was added in 1979, teams have had the option to shoot from beyond the arc, but for decades it was not fully utilized. That changed when Stephen Curry, who is widely regarded as the greatest shooter of all time, joined the league. Since his arrival in 2009, the league has seen a drastic uptick in three-point shots attempted and made. Teams have shifted from attacking the paint to spreading out on the perimeter, which opens up room inside the arc and creates the threat of more three-point attempts.
Based on the changes in the style of the different teams in the NBA, and how basketball is being played, we want to determine whether or not Steph Curry has affected the game of basketball since joining the league in the 2009-10 season. Our objective is to find whether or not his success has influenced the number of three-pointers shot in the NBA, increased the pace of the game, forced players to improve their shooting range, and decreased the overall physicality of the league.
We will collect the dataset from the website Basketball Reference. To best determine whether or not Steph Curry has changed the game, we will look at several different variables. These are the variables that we might use during our research:
Season: The NBA regular season
FG: Total field goals made, the number of baskets scored
FGA: Total field goals attempted, the number of times the ball was shot
3P: Total number of three-point shots made
3PA: Total number of three-point shots attempted
FT: Total number of free throws made
FTA: Total number of free throw attempts
Pace: An estimate of the number of possessions per 48 minutes
Avg_Dist: Average distance of shots from the rim
PF: Total personal fouls throughout the league
Stephen Curry is widely regarded as one of the greatest point guards of all time, and without a doubt the best shooter of all time. Given the success of both Curry and the Golden State Warriors, his game has been put under the microscope, and observers have examined how his efficient scoring and play style have changed the way the league approaches basketball. Indeed, Rob, in his article on Axis Talent, writes that by using technology and data analytics, the managers of the Golden State Warriors recognized Steph’s greatness from beyond the arc and allowed him to shoot more threes. In his article in The Cold Wire, Brandon Marcus notes that before 2015 only two players averaged 6 or more three-point attempts per game, while since 2015 there have been 37 players to do so. The Warriors’ success, specifically their championship in the 2014-15 season, has made a lasting impact on the way the game is played.
An important ethical consideration for this study is that we have not received consent from Steph Curry or any other players whose data we use. We use data collected on every player in the NBA from 2000 to 2022, although they did not necessarily consent to the data collection. The potential stakeholders are the players whose data was recorded, the NBA, in whose seasons the data was generated, and the website we used, basketball-reference.com. Another potential ethical concern is that we do not know how basketball-reference.com collected its data from the NBA.
As the initial dataset contains NBA data from the 1946-47 season to the current season, we will have to reduce it for analysis. We will keep the seasons from 2000-01 through 2021-22 and select 10 columns that we think might be useful for our research. As we also want to consider two-point makes and two-point attempts, we will add them to our dataset. We calculate the two variables using the following formulas:
\[ 2PA = FGA - 3PA \] \[ 2P = FG - 3P \]
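library(tidyverse)  # dplyr verbs (filter, select) and ggplot2
library(GGally)     # ggpairs(), used for the plot matrix later

# Keep the 2000-01 through 2021-22 seasons and the columns of interest.
NBA_cleaned <- NBA_season %>%
  filter(Season >= "2000-01" & Season < "2022-23") %>%
  select(Season, FG, FGA, `3P`, `3PA`, FT, FTA, Pace, Avg_Dist, PF)
# Derive two-point makes and attempts from the field-goal totals.
NBA_cleaned$`2P` <- with(NBA_cleaned, FG - `3P`)
NBA_cleaned$`2PA` <- with(NBA_cleaned, FGA - `3PA`)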
Here is a glimpse of the data after cleaning:
glimpse(NBA_cleaned)
## Rows: 22
## Columns: 12
## $ Season <chr> "2021-22", "2020-21", "2019-20", "2018-19", "2017-18", "2016-…
## $ FG <dbl> 99930, 89020, 86550, 101062, 97435, 96061, 94065, 92287, 9277…
## $ FGA <dbl> 216722, 190983, 188116, 219458, 211709, 210115, 208049, 20557…
## $ `3P` <dbl> 30598, 27427, 25862, 27955, 25807, 23748, 20953, 19300, 19054…
## $ `3PA` <dbl> 86535, 74822, 72252, 78742, 71340, 66422, 59241, 55137, 52974…
## $ FT <dbl> 41657, 36650, 37826, 43494, 40903, 43883, 43489, 42161, 43870…
## $ FTA <dbl> 53781, 47135, 48943, 56758, 53325, 56855, 57469, 56198, 58029…
## $ Pace <dbl> 98.2, 99.2, 100.3, 100.0, 97.3, 96.4, 95.8, 93.9, 93.9, 92.0,…
## $ Avg_Dist <dbl> 14.3, 14.1, 13.9, 13.5, 13.2, 13.3, 12.5, 12.4, 12.8, 12.6, 1…
## $ PF <dbl> 48306, 41669, 44004, 51425, 48837, 48950, 49854, 49728, 50923…
## $ `2P` <dbl> 69332, 61593, 60688, 73107, 71628, 72313, 73112, 72987, 73725…
## $ `2PA` <dbl> 130187, 116161, 115864, 140716, 140369, 143693, 148808, 15043…
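As a quick sanity check on the derived columns, the sketch below re-enters the first three seasons shown in the glimpse output and confirms that they satisfy the two formulas. The data frame `check` and its column names (`P3`, `PA3`, `P2`, `PA2`) are hypothetical stand-ins for the backticked names in `NBA_cleaned`, chosen only so the example runs on its own.

```r
# First three seasons copied from the glimpse output (2021-22, 2020-21, 2019-20).
# Column names are illustrative stand-ins for `3P`, `3PA`, `2P`, `2PA`.
check <- data.frame(
  Season = c("2021-22", "2020-21", "2019-20"),
  FG  = c(99930, 89020, 86550),
  FGA = c(216722, 190983, 188116),
  P3  = c(30598, 27427, 25862),
  PA3 = c(86535, 74822, 72252),
  P2  = c(69332, 61593, 60688),
  PA2 = c(130187, 116161, 115864)
)

# 2P = FG - 3P and 2PA = FGA - 3PA should hold row by row.
makes_ok    <- all(check$FG - check$P3 == check$P2)
attempts_ok <- all(check$FGA - check$PA3 == check$PA2)
stopifnot(makes_ok, attempts_ok)
```

Both checks pass for these rows, which is consistent with the derived columns having been computed as described.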
First, we look at three-point and two-point shooting in the NBA from 2000 to 2022:
ggplot(NBA_cleaned) +
  geom_col(aes(x = Season, y = `3PA`), fill = "cyan", color = "cyan4") +
  geom_line(aes(x = Season, y = `3P`, group = 1), size = 1.5, color = "blue") +
  labs(x = "Season", y = "Total Three-Point Attempts",
       title = "The Total Three-Point Made vs. \n the Total Three-point Attempt") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5)) +
  scale_y_continuous(sec.axis = sec_axis(~., name = "Total Three-Point Made"))
Other than the first two seasons of Curry’s NBA career and the shortened seasons (such as 2011-12, 2019-20, and 2020-21), the league has seen an increase in three-point attempts and makes. From 2000 to 2009, the number of three-point attempts increased slightly. However, once Curry joined the league, it rose much faster. In the 2021-22 season, total three-point attempts were 2.6 times the figure for the 2000-01 season, nearly a decade before Curry played in the NBA.
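Because season totals shrink whenever fewer games are played, one way to compare seasons on an equal footing is the share of all field goal attempts that are three-pointers. The sketch below is only an illustration on the five most recent seasons visible in the glimpse output; the data frame `recent` and the column `rate_3pa` are names introduced here, not part of the original analysis.

```r
# Five most recent seasons, in chronological order, copied from the glimpse output.
recent <- data.frame(
  Season = c("2017-18", "2018-19", "2019-20", "2020-21", "2021-22"),
  FGA    = c(211709, 219458, 188116, 190983, 216722),
  PA3    = c(71340, 78742, 72252, 74822, 86535)  # `3PA` in NBA_cleaned
)

# Share of field goal attempts taken from beyond the arc.
recent$rate_3pa <- recent$PA3 / recent$FGA
print(round(recent$rate_3pa, 3))
```

In these five seasons the share rises every year, including the two shortened seasons in which the raw totals drop.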
ggplot(NBA_cleaned) +
  geom_col(aes(x = Season, y = `2PA`), fill = "orange", color = "red") +
  geom_line(aes(x = Season, y = `2P`, group = 1), size = 1.5, color = "red3") +
  labs(x = "Season", y = "Total Two-point Attempt",
       title = "The Total Two-Point Made vs. \n the Total Two-point Attempt") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5)) +
  scale_y_continuous(sec.axis = sec_axis(~., name = "Total Two-Point Made"))
In the roughly 10-year period before the Golden State Warriors’ point guard was drafted, the seasonal totals of two-point attempts and two-point makes were fairly consistent. As three-point attempts increased, the number of two-point attempts has decreased moderately since Curry’s first year in the league. Notably, the league-wide two-point percentage has also increased, which suggests that teams are placing an emphasis on shot efficiency.
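The two-point percentage mentioned above is simply two-point makes divided by two-point attempts. As a hedged illustration, the sketch below computes it for the five most recent seasons using values copied from the glimpse output; `two_pt` and `pct_2p` are names introduced for this example only.

```r
# Two-point makes and attempts for the five most recent seasons (chronological order).
two_pt <- data.frame(
  Season = c("2017-18", "2018-19", "2019-20", "2020-21", "2021-22"),
  P2     = c(71628, 73107, 60688, 61593, 69332),      # `2P` in NBA_cleaned
  PA2    = c(140369, 140716, 115864, 116161, 130187)  # `2PA` in NBA_cleaned
)

# League-wide two-point percentage per season.
two_pt$pct_2p <- two_pt$P2 / two_pt$PA2
print(round(100 * two_pt$pct_2p, 1))
```

For these seasons the percentage climbs from about 51 percent to about 53 percent, in line with the trend described in the text.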
ggplot(NBA_cleaned) +
  geom_col(aes(x = Season, y = FTA), fill = "grey", color = "black") +
  geom_line(aes(x = Season, y = PF, group = 1), size = 1.5, color = "brown") +
  labs(x = "Season", y = "Total Free Throw Attempts",
       title = "The Total Free Throw Attempts vs Personal Fouls") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5)) +
  scale_y_continuous(sec.axis = sec_axis(~., name = "Personal Fouls"))
Since 2009, the total numbers of free throw attempts and personal fouls have been on the decline. From 2004 to 2007, total free throw attempts peaked at more than 60,000 per season, but this number dipped below 50,000 in the 2019-20 and 2020-21 seasons, at least partially due to the shortened seasons. Personal fouls follow the same trend and have also been lower in the past three seasons.
NBA_new <- NBA_cleaned %>% select(FGA, `3PA`, `2PA`, FTA, Pace, Avg_Dist)
ggpairs(NBA_new) + theme(axis.text.x = element_text(angle = 45, hjust = 1, size=8))
The plot matrix shows the pairwise relationships and correlations among FGA, 3PA, 2PA, FTA, Pace, and Avg_Dist. Taking 3PA as the main variable, we can see that it is related to every other variable in the matrix except FGA. The other variables also appear to be related to each other.
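The correlations in the plot matrix can also be read off as plain numbers with `cor()`. The sketch below is an illustration on only the five most recent seasons from the glimpse output, so its values will not match the full 22-season matrix; the data frame `recent_vars` and its column names are stand-ins introduced here.

```r
# Five most recent seasons (chronological order), copied from the glimpse output.
recent_vars <- data.frame(
  FGA      = c(211709, 219458, 188116, 190983, 216722),
  PA3      = c(71340, 78742, 72252, 74822, 86535),      # `3PA`
  PA2      = c(140369, 140716, 115864, 116161, 130187), # `2PA`
  FTA      = c(53325, 56758, 48943, 47135, 53781),
  Pace     = c(97.3, 100.0, 100.3, 99.2, 98.2),
  Avg_Dist = c(13.2, 13.5, 13.9, 14.1, 14.3)
)

# Pearson correlation between every pair of variables, rounded for display.
cor_matrix <- round(cor(recent_vars), 2)
print(cor_matrix)
```

Running the same call on `NBA_new` would give the full-sample correlations that the plot matrix displays.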
Based on the two bar graphs explored above, we use a two-sample t-test to test for a difference in mean total three-point attempts between two groups of seasons, those before and those after Stephen Curry joined the league:
season_before <- NBA_cleaned %>% filter(Season <= "2008-09")
season_after <- NBA_cleaned %>% filter(Season > "2008-09")
t.test(season_before$`3PA`, season_after$`3PA`)
##
## Welch Two Sample t-test
##
## data: season_before$`3PA` and season_after$`3PA`
## t = -4.9784, df = 14.682, p-value = 0.0001759
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -31957.23 -12771.03
## sample estimates:
## mean of x mean of y
## 38548.33 60912.46
The two-sample t-test gives a p-value smaller than 0.05, so we reject the null hypothesis and conclude that mean total three-point attempts per season differ before and after Curry joined the NBA. The t-value is -4.9784 with 14.682 degrees of freedom. The 95 percent confidence interval for the difference in means (before minus after) is -31957.23 to -12771.03.
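The fractional degrees of freedom arise because `t.test()` defaults to the Welch test, which does not assume equal variances and approximates the degrees of freedom with the Welch-Satterthwaite formula. The sketch below computes both quantities by hand and compares them with `t.test()`. The helper `welch_t` and the vectors `x` and `y` are made up for illustration and are not the report's data.

```r
# Hand-rolled Welch statistic and Welch-Satterthwaite degrees of freedom.
welch_t <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of mean(x)
  vy <- var(y) / length(y)  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  c(t = t_stat, df = df)
}

# Toy samples (NOT the NBA data), with unequal sizes and spreads.
x <- c(10, 12, 11, 13, 9)
y <- c(15, 18, 14, 20, 17, 16)

manual  <- welch_t(x, y)
builtin <- t.test(x, y)  # var.equal = FALSE by default, i.e. the Welch test

# The hand computation should agree with the built-in test.
stopifnot(isTRUE(all.equal(unname(manual["t"]), unname(builtin$statistic))))
stopifnot(isTRUE(all.equal(unname(manual["df"]), unname(builtin$parameter))))
```

Applying `welch_t(season_before$`3PA`, season_after$`3PA`)` should therefore reproduce the t-value and degrees of freedom reported above.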
ggplot(NBA_cleaned) +
  geom_point(aes(x = Season, y = Avg_Dist), color = "brown") +
  geom_line(aes(x = Season, y = Avg_Dist, group = 1), color = "orange4") +
  labs(x = "Season", y = "Average Distance From The Rim (ft)",
       title = "The Average Distance From The Rim Over Time") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5))
From the graph, in the roughly 10 years before Curry joined the NBA, the average shot distance from the rim increased slightly. Since he joined the league, it has grown at a noticeable rate. In the most recent season, the league-wide average distance was around 14.3 feet, compared to only 12.0 feet in 2000-01. We can see that NBA teams are putting more emphasis on spreading out their offenses and shooting from long distance. Deep three-pointers have also become more popular, as Stephen Curry has become known for his logo shots.
We will look into the connection between the average pace of the game and the total three-point attempts to see how the pace of the game changed by creating a scatter plot and a multiple linear regression model:
ggplot(NBA_cleaned, aes(x = `3PA`, y = `Pace`)) +
  geom_point(size = 3, aes(color = factor(Season))) +
  geom_smooth(method = 'lm', color = 'red') +
  # The x-axis shows `3PA`, so the labels refer to three-point attempts.
  labs(x = "Total Three-Point Attempts", y = "Average Pace", title = "Relationship between average pace and \n total three-point attempts") +
  theme(plot.title = element_text(hjust = 0.5))
As we can see from the graph above, there is a positive linear relationship between total three-point attempts and average pace, and the data points lie close to the regression line. Seasons with more three-point attempts tend to be played at a faster pace. At first, the average pace was around 90, but since 2013, about 3 years after Curry joined the NBA, it has risen quickly to more than 98 in the four most recent seasons.
mod <- lm(Pace ~ `3PA` + `2PA`, data = NBA_cleaned)
summary(mod)
##
## Call:
## lm(formula = Pace ~ `3PA` + `2PA`, data = NBA_cleaned)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.22454 -0.49757 0.02627 0.42918 1.66394
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.150e+01 3.683e+00 24.846 5.97e-16 ***
## `3PA` 1.693e-04 1.694e-05 9.993 5.32e-09 ***
## `2PA` -4.393e-05 2.052e-05 -2.141 0.0455 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9186 on 19 degrees of freedom
## Multiple R-squared: 0.931, Adjusted R-squared: 0.9238
## F-statistic: 128.3 on 2 and 19 DF, p-value: 9.255e-12
The overall F-test for the multiple linear regression model has a p-value less than 0.05, so the model is significant and supports a relationship between average pace and total three-point attempts:
3PA
With a p-value smaller than 0.05, total three-point attempts are a significant predictor of average pace. For each additional three-point attempt, holding 2PA constant, the average pace increases by 1.693e-04.
2PA
With a p-value smaller than 0.05, there is a significant negative relationship between two-point attempts and average pace. For each additional two-point attempt, holding 3PA constant, the average pace decreases by 4.393e-05.
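Per-attempt coefficients this small are easier to read when rescaled, for example per 1,000 attempts. The sketch below does that arithmetic with the rounded coefficients printed in the model summary, and also evaluates the fitted pace for the 2021-22 totals from the glimpse output. The vector `b` and its element names are introduced here for illustration, and results are approximate because the coefficients are rounded.

```r
# Rounded coefficients copied from the summary(mod) output.
b <- c(intercept = 91.50, PA3 = 1.693e-04, PA2 = -4.393e-05)

# Change in pace associated with 1,000 additional attempts of each type.
per_1000 <- b[c("PA3", "PA2")] * 1000
print(per_1000)

# Fitted pace for 2021-22 (3PA = 86535, 2PA = 130187 in the glimpse output).
fitted_2122 <- b[["intercept"]] + b[["PA3"]] * 86535 + b[["PA2"]] * 130187
print(round(fitted_2122, 2))
```

On this scale, 1,000 more three-point attempts in a season go with roughly 0.17 more possessions per 48 minutes, and 1,000 more two-point attempts with roughly 0.04 fewer. The fitted 2021-22 pace of about 100.4 sits above the observed 98.2, a gap close to the most negative residual in the summary.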
We also run a two-sample t-test to see the difference in average pace before and after Curry joined the league. Our null hypothesis is that there is no difference between the two groups.
t.test(season_before$Pace, season_after$Pace)
##
## Welch Two Sample t-test
##
## data: season_before$Pace and season_after$Pace
## t = -4.8473, df = 13.758, p-value = 0.0002716
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.431493 -2.481327
## sample estimates:
## mean of x mean of y
## 91.16667 95.62308
With a p-value smaller than 0.05, we reject the null hypothesis and conclude that average pace differs before and after Curry joined the NBA. The t-value is -4.8473 with 13.758 degrees of freedom. The 95 percent confidence interval for the difference in means (before minus after) is -6.431 to -2.481.
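The interval reported by the test is centered on the observed difference in group means, which makes the sign of the interval easier to interpret. The short sketch below checks this using only the numbers printed in the test output; the object names are introduced here for illustration.

```r
# Values copied from the t.test output for Pace.
est <- c(before = 91.16667, after = 95.62308)  # sample means of x and y
ci  <- c(-6.431493, -2.481327)                 # 95 percent confidence interval

diff_means <- est[["before"]] - est[["after"]]  # point estimate, before minus after
ci_mid     <- mean(ci)                          # center of the interval
half_width <- diff(ci) / 2                      # margin of error

print(c(diff_means = diff_means, ci_mid = ci_mid, half_width = half_width))
```

The point estimate and the interval center agree at about -4.46, so the average pace after Curry joined was roughly 4.5 possessions per 48 minutes higher, give or take about 2.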
Regarding the physicality of NBA games, we will look into the relationship between total three-point attempts and the total number of personal fouls using a scatter plot and a simple linear regression model:
ggplot(NBA_cleaned, aes(x = `3PA`, y = `PF`)) +
  geom_point(size = 3, aes(color = factor(Season))) +
  geom_smooth(method = 'lm', color = 'red') +
  labs(x = "Total Three-point Attempts", y = "Personal Fouls", title = "Relationship between Personal Fouls and \n Total Three-point Attempts") +
  theme(plot.title = element_text(hjust = 0.5))
From the graph above, it appears that there is a negative linear relationship between total three-point attempts and total personal fouls. In the earlier seasons, fewer three-point shots were taken; instead, teams focused on attacking the paint and getting closer to the rim with two-pointers. With this more aggressive and physical style of play, more personal fouls were called. However, when we fit a linear regression model of total personal fouls on three-point attempts, we find that we cannot come to that conclusion.
mod1 <- lm(PF ~ `3PA`, data = NBA_cleaned)
summary(mod1)
##
## Call:
## lm(formula = PF ~ `3PA`, data = NBA_cleaned)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12739.1 -405.6 613.8 1237.2 4787.5
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.503e+04 2.801e+03 19.651 1.51e-14 ***
## `3PA` -9.754e-02 5.169e-02 -1.887 0.0738 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3878 on 20 degrees of freedom
## Multiple R-squared: 0.1511, Adjusted R-squared: 0.1087
## F-statistic: 3.561 on 1 and 20 DF, p-value: 0.07377
The simple linear regression model has a p-value larger than 0.05 (0.0738), so the model is not significant at the 0.05 level and does not support a relationship between personal fouls and total three-point attempts:
3PA
With a p-value larger than 0.05, total three-point attempts are not a significant predictor of total personal fouls.
Although the graph suggests a negative trend, the simple linear regression model does not show a statistically significant relationship between three-point attempts and the number of personal fouls. It would make sense for such a relationship to exist: as the league has shifted its focus to efficient three-point shooting, the game has moved from one revolving around physicality to one built on spreading out the offense and creating open three-point attempts. However, the data do not support it at the 0.05 level. The game has become less physical over time, but that is not necessarily a result of shooting more threes.
In conclusion, the NBA has changed since Stephen Curry joined the league. He and the Golden State Warriors dynasty have significantly impacted the NBA by changing the pace of the game and influencing the league’s mindset toward shooting more three-pointers. Although we found no significant relationship between total personal fouls and total three-point attempts, the declining number of personal fouls suggests that the game has become less physical.
The first limitation that we encountered during our research is that 2011-12, 2012-13, 2019-20, and 2020-21 are seasons that had fewer total games played, which accounts for the outliers in our findings. In addition, many external factors besides Stephen Curry have also affected the evolution of the NBA. As we only consider regular-season data from a specific time range, we cannot draw an overall conclusion for our research question. Finally, when looking at the personal fouls variable, we do not know whether a foul occurred inside the paint or outside the three-point line. Thus, the conclusion about the physicality of the league might not be correct.
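One simple way to gauge how much the shortened seasons influence season totals is to flag them and compare summaries with and without them. The sketch below is an illustration on the five most recent seasons from the glimpse output. The data frame `pf`, the flag column, and the `shortened` vector are assumptions introduced here, and the vector lists only the season labels that are unambiguous in the text.

```r
# Personal-foul totals for the five most recent seasons (from the glimpse output).
pf <- data.frame(
  Season = c("2017-18", "2018-19", "2019-20", "2020-21", "2021-22"),
  PF     = c(48837, 51425, 44004, 41669, 48306)
)

# Seasons treated as shortened in this sketch (an assumption for illustration).
shortened <- c("2011-12", "2019-20", "2020-21")
pf$shortened <- pf$Season %in% shortened

# Mean total fouls with and without the shortened seasons.
mean_all  <- mean(pf$PF)
mean_full <- mean(pf$PF[!pf$shortened])
print(c(mean_all = mean_all, mean_full = mean_full))
```

Dropping the two flagged seasons raises the mean from about 46,850 to about 49,520 fouls per season, which shows how much the raw totals depend on the number of games played. The same flag could be used to refit the regressions on full-length seasons only.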
For future studies, we could look into other factors that may have been affected by Curry. One of them is the playing time of specific players. Because three-point shooting is crucial for an NBA team, we can consider how players’ minutes have changed. For example, if a team has a point guard whose shooting percentage from beyond the three-point line is low, his minutes on the court could decrease. On the other hand, a center or power forward who can shoot consistently and effectively will be a key player for his team. Moreover, we could gather more detailed game data and make predictions about the evolution of modern basketball.
Marcus, B. (2022, August 19). Stat shows how Steph Curry changed the NBA. The Cold Wire. Retrieved December 17, 2022, from https://www.thecoldwire.com/stat-shows-how-steph-curry-changed-the-nba/
NBA League Averages - Totals. Basketball Reference. (n.d.). Retrieved December 1, 2022, from https://www.basketball-reference.com/leagues/NBA_stats_totals.html
Rob. (n.d.). How Steph Curry changed the world of NBA through Data Analytics. Axis Talent. Retrieved December 17, 2022, from https://www.axistalent.io/blog/how-steph-curry-changed-the-world-of-nba-through-data-analytics