2026-03-28

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(plotly)
## 
## Attaching package: 'plotly'
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following object is masked from 'package:graphics':
## 
##     layout

About the dataset

This data set contains NBA player statistics from the 2017 season. It includes player name, team, games played, position, turnover percentage, rebound percentage, assist percentage, and field goal percentage. The data comes from a Kaggle data set that was originally scraped from Basketball Reference.

Load data

nba <- read.delim("nba_positions_full.tsv")
names(nba)
## [1] "Player"       "Team"         "Games"        "Position"     "TurnoverPct" 
## [6] "ReboundPct"   "AssistPct"    "FieldGoalPct"
head(nba)
##          Player Team Games      Position TurnoverPct ReboundPct AssistPct
## 1  Alex Abrines  OKC    68 ShootingGuard         8.3        4.5       5.5
## 2    Quincy Acy  TOT    38  PowerForward         9.7       11.0       4.9
## 3    Quincy Acy  DAL     6  PowerForward         9.8        9.7       0.0
## 4    Quincy Acy  BRK    32  PowerForward         9.6       11.1       5.4
## 5  Steven Adams  OKC    80        Center        16.0       14.2       5.4
## 6 Arron Afflalo  SAC    61 ShootingGuard         8.4        4.6       7.4
##   FieldGoalPct
## 1         39.3
## 2         41.2
## 3         29.4
## 4         42.5
## 5         57.1
## 6         44.0

Data cleaning

nba_df <- nba %>%
  mutate(
    RoleGroup = case_when(
      Position %in% c("PointGuard", "ShootingGuard") ~ "Guard",
      Position %in% c("SmallForward", "PowerForward") ~ "Forward",
      Position == "Center" ~ "Center",
      TRUE ~ "Other"
    )
  )

summary(nba_df)
##     Player              Team               Games         Position        
##  Length:595         Length:595         Min.   : 1.00   Length:595        
##  Class :character   Class :character   1st Qu.:24.00   Class :character  
##  Mode  :character   Mode  :character   Median :55.00   Mode  :character  
##                                        Mean   :48.43                     
##                                        3rd Qu.:73.00                     
##                                        Max.   :82.00                     
##                                                                          
##   TurnoverPct      ReboundPct      AssistPct      FieldGoalPct   
##  Min.   : 0.00   Min.   : 0.00   Min.   : 0.00   Min.   :  0.00  
##  1st Qu.: 9.70   1st Qu.: 6.20   1st Qu.: 6.20   1st Qu.: 40.00  
##  Median :12.50   Median : 8.90   Median :10.10   Median : 44.20  
##  Mean   :12.89   Mean   :10.03   Mean   :12.77   Mean   : 44.12  
##  3rd Qu.:15.60   3rd Qu.:13.00   3rd Qu.:17.50   3rd Qu.: 48.50  
##  Max.   :43.60   Max.   :56.40   Max.   :57.30   Max.   :100.00  
##  NA's   :2                                       NA's   :2       
##   RoleGroup        
##  Length:595        
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 

ggplot 1: Assist vs Turnover

This plot compares assist percentage and turnover percentage trends across player positions.
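
The code for this figure is not shown in the rendered report. A minimal sketch of how such a plot could be built, assuming a scatter plot with one linear trend line per position; the helper name `plot_assist_turnover` and the styling choices are illustrative assumptions, not the original code:

```r
library(ggplot2)

# Hypothetical helper (not from the original report): scatter plot of
# assist percentage against turnover percentage, coloured by position.
plot_assist_turnover <- function(df) {
  # Drop rows with a missing value in either variable
  # (the summary above reports 2 NA's in TurnoverPct).
  keep <- stats::complete.cases(df[, c("AssistPct", "TurnoverPct")])
  df <- df[keep, ]

  ggplot(df, aes(x = AssistPct, y = TurnoverPct, color = Position)) +
    geom_point(alpha = 0.6) +
    # One straight-line fit per position, because color is mapped globally.
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    labs(
      title = "Assist vs Turnover",
      x = "Assist percentage",
      y = "Turnover percentage",
      color = "Position"
    )
}
```

Calling `plot_assist_turnover(nba_df)` would return a ggplot object that prints as the figure.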

ggplot 2: Rebound by Position

The box plot shows how rebound percentage differs across positions. Centers and forwards tend to have higher rebound percentages than guards.
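
The plotting code is not included in the rendered output. One way a box plot like this might be drawn is sketched below; the helper name `plot_rebound_by_position` and the labels are assumptions made for illustration:

```r
library(ggplot2)

# Hypothetical helper (not from the original report): one box per
# position, showing the distribution of rebound percentage.
plot_rebound_by_position <- function(df) {
  ggplot(df, aes(x = Position, y = ReboundPct, fill = Position)) +
    geom_boxplot() +
    labs(
      title = "Rebound by Position",
      x = "Position",
      y = "Rebound percentage"
    ) +
    # The x axis already names each position, so the legend is redundant.
    theme(legend.position = "none")
}
```

Passing `nba_df` to this function would give one box per value of `Position`; mapping `x = RoleGroup` instead would compare the three role groups created during data cleaning.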

Interactive Plotly Plot

This interactive version of the assist percentage vs. turnover percentage plot lets you explore the relationship more closely.
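
The report does not show how the interactive figure was made. A common approach, sketched here as an assumption, is to build a ggplot scatter plot and convert it with `plotly::ggplotly()`; the helper name `interactive_assist_turnover` and the choice of tooltip fields are illustrative:

```r
library(ggplot2)
library(plotly)

# Hypothetical helper (not from the original report): interactive
# scatter plot with the player name shown on hover.
interactive_assist_turnover <- function(df) {
  # Remove rows with missing values so every point has a valid tooltip.
  keep <- stats::complete.cases(df[, c("AssistPct", "TurnoverPct")])
  df <- df[keep, ]

  # `text` is not a standard ggplot2 aesthetic; ggplotly() reads it
  # to populate the hover tooltip.
  p <- ggplot(df, aes(x = AssistPct, y = TurnoverPct,
                      color = Position, text = Player)) +
    geom_point(alpha = 0.6) +
    labs(x = "Assist percentage", y = "Turnover percentage")

  ggplotly(p, tooltip = c("text", "x", "y"))
}
```

`interactive_assist_turnover(nba_df)` would return an htmlwidget that supports zooming, panning, and hovering over individual players.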

Plotly 2: 3D Interactive Plot

This 3D interactive plot compares three variables at once across player positions.
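
The text does not say which three variables are plotted. The sketch below assumes assist, turnover, and rebound percentage on the three axes, with points coloured by position; that choice and the helper name `plot_3d_stats` are assumptions, not details from the report:

```r
library(plotly)

# Hypothetical helper (not from the original report): 3D scatter plot.
# The three axis variables are an assumption.
plot_3d_stats <- function(df) {
  # Keep only rows where all three plotted variables are present.
  keep <- stats::complete.cases(df[, c("AssistPct", "TurnoverPct", "ReboundPct")])
  df <- df[keep, ]

  fig <- plot_ly(
    df,
    x = ~AssistPct, y = ~TurnoverPct, z = ~ReboundPct,
    color = ~Position,
    type = "scatter3d", mode = "markers",
    marker = list(size = 3)
  )

  # plotly::layout() is called with its namespace because it masks
  # graphics::layout(), as the startup messages above note.
  plotly::layout(
    fig,
    scene = list(
      xaxis = list(title = "Assist %"),
      yaxis = list(title = "Turnover %"),
      zaxis = list(title = "Rebound %")
    )
  )
}
```

The figure returned by `plot_3d_stats(nba_df)` can be rotated with the mouse, which helps show how positions separate along the rebound axis.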

Statistical Analysis

[1] 0.2463049
Call:
lm(formula = TurnoverPct ~ AssistPct, data = nba_stats)

Residuals:
     Min       1Q   Median       3Q      Max 
-17.0308  -3.0953  -0.5802   2.1962  29.5726 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 10.94813    0.38539  28.408  < 2e-16 ***
AssistPct    0.15169    0.02455   6.178 1.21e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.42 on 591 degrees of freedom
Multiple R-squared:  0.06067,   Adjusted R-squared:  0.05908 
F-statistic: 38.17 on 1 and 591 DF,  p-value: 1.208e-09

Statistical Analysis Commentary

This analysis examines the relationship between assist percentage and turnover percentage using correlation and linear regression. The correlation of 0.246 shows a weak positive relationship between the two variables: players with higher assist percentages tend to have slightly higher turnover percentages, but the relationship is not very strong. The regression results support this pattern. The AssistPct coefficient is positive (about 0.152), so each additional percentage point of assist percentage is associated with roughly 0.15 percentage points more turnover percentage. The p-value is extremely small (1.21e-09), showing that this relationship is statistically significant. However, the R-squared value is only about 0.061, which means assist percentage explains only about 6% of the variation in turnover percentage. Overall, the results suggest a real but weak positive relationship between assists and turnovers.
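
The code behind the output above is not shown, and the `Call:` line refers to a data frame named `nba_stats` that is not defined in the visible code. The sketch below assumes it has the same `AssistPct` and `TurnoverPct` columns as `nba_df`; the helper name `assist_turnover_stats` is hypothetical:

```r
# Hypothetical helper (not from the original report): correlation and
# simple linear regression of turnover percentage on assist percentage.
assist_turnover_stats <- function(df) {
  # use = "complete.obs" skips rows with missing values; without it,
  # cor() returns NA when either column contains an NA.
  r <- cor(df$AssistPct, df$TurnoverPct, use = "complete.obs")

  # lm() drops incomplete rows by default (na.action = na.omit),
  # so both statistics are computed on the same rows.
  fit <- lm(TurnoverPct ~ AssistPct, data = df)

  list(
    r = r,
    fit = fit,
    r_squared = summary(fit)$r.squared
  )
}
```

In a regression with a single predictor, R-squared equals the squared correlation, which gives a quick consistency check on the reported numbers: 0.2463049^2 is approximately 0.0607, matching the Multiple R-squared of 0.06067. Calling `summary(assist_turnover_stats(nba_df)$fit)` would print a coefficient table in the same format as the output above.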

Conclusion

This analysis showed that player statistics vary by position and role group. Guards generally contribute more assists, while bigger players tend to post stronger rebound numbers. The visualizations and statistical analysis suggest that position helps explain differences in how players contribute on the court.