Overview

RAPTOR, which stands for Robust Algorithm (using) Player Tracking (and) On/Off Ratings, is FiveThirtyEight’s new NBA statistic. It was developed with two goals:

First, to create a publicly available statistic that takes advantage of modern NBA data, specifically player tracking and play-by-play data that isn’t available in traditional box scores.

Second, and relatedly, to create a statistic that better reflects how modern NBA teams actually evaluate players.

Introducing RAPTOR, Our New Metric For The Modern NBA: https://fivethirtyeight.com/features/introducing-raptor-our-new-metric-for-the-modern-nba/

Load Packages

library('tidyverse')
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Loading Data and Creating Dataframe

library(readr)
df <- read_csv("Downloads/nba-raptor/modern_RAPTOR_by_player.csv",show_col_types = FALSE)
head(df)
## # A tibble: 6 × 21
##   player_name     player_id season  poss    mp raptor_box_offe… raptor_box_defe…
##   <chr>           <chr>      <dbl> <dbl> <dbl>            <dbl>            <dbl>
## 1 Alex Abrines    abrinal01   2017  2387  1135            0.746           -0.373
## 2 Alex Abrines    abrinal01   2018  2546  1244            0.318           -1.73 
## 3 Alex Abrines    abrinal01   2019  1279   588           -3.22             1.08 
## 4 Precious Achiu… achiupr01   2021  1581   749           -4.12             1.36 
## 5 Precious Achiu… achiupr01   2022  3802  1892           -2.52             1.76 
## 6 Quincy Acy      acyqu01     2014  1716   847           -1.72             0.133
## # … with 14 more variables: raptor_box_total <dbl>, raptor_onoff_offense <dbl>,
## #   raptor_onoff_defense <dbl>, raptor_onoff_total <dbl>, raptor_offense <dbl>,
## #   raptor_defense <dbl>, raptor_total <dbl>, war_total <dbl>,
## #   war_reg_season <dbl>, war_playoffs <dbl>, predator_offense <dbl>,
## #   predator_defense <dbl>, predator_total <dbl>, pace_impact <dbl>

Summary Statistics

Get an overview of each variable: its data type, minimum, median, mean, and maximum.

summary(df)
##  player_name         player_id             season          poss     
##  Length:4685        Length:4685        Min.   :2014   Min.   :   2  
##  Class :character   Class :character   1st Qu.:2016   1st Qu.: 709  
##  Mode  :character   Mode  :character   Median :2018   Median :2252  
##                                        Mean   :2018   Mean   :2443  
##                                        3rd Qu.:2020   3rd Qu.:3880  
##                                        Max.   :2022   Max.   :8026  
##                                                                     
##        mp       raptor_box_offense raptor_box_defense raptor_box_total 
##  Min.   :   1   Min.   :-44.1863   Min.   :-48.1886   Min.   :-61.675  
##  1st Qu.: 341   1st Qu.: -2.4776   1st Qu.: -1.7985   1st Qu.: -3.426  
##  Median :1087   Median : -0.9246   Median : -0.2859   Median : -1.005  
##  Mean   :1187   Mean   : -1.1141   Mean   : -0.4223   Mean   : -1.536  
##  3rd Qu.:1894   3rd Qu.:  0.5053   3rd Qu.:  1.1696   3rd Qu.:  1.024  
##  Max.   :3948   Max.   : 47.5576   Max.   : 57.3057   Max.   : 54.871  
##                 NA's   :1          NA's   :1          NA's   :1        
##  raptor_onoff_offense raptor_onoff_defense raptor_onoff_total 
##  Min.   :-67.777      Min.   :-89.09612    Min.   :-156.8736  
##  1st Qu.: -3.326      1st Qu.: -1.86018    1st Qu.:  -4.0029  
##  Median : -1.067      Median :  0.01282    Median :  -0.9001  
##  Mean   : -1.585      Mean   :  0.11361    Mean   :  -1.4711  
##  3rd Qu.:  1.125      3rd Qu.:  1.97923    3rd Qu.:   2.0047  
##  Max.   : 85.407      Max.   : 66.52346    Max.   : 123.1519  
##  NA's   :1            NA's   :1            NA's   :1          
##  raptor_offense     raptor_defense      raptor_total       war_total       
##  Min.   :-45.3233   Min.   :-56.9825   Min.   :-67.356   Min.   :-7.38298  
##  1st Qu.: -2.6857   1st Qu.: -1.7698   1st Qu.: -3.512   1st Qu.:-0.09548  
##  Median : -1.0054   Median : -0.2324   Median : -1.111   Median : 0.55871  
##  Mean   : -1.2780   Mean   : -0.3342   Mean   : -1.612   Mean   : 1.74370  
##  3rd Qu.:  0.5177   3rd Qu.:  1.2485   3rd Qu.:  1.086   3rd Qu.: 2.72788  
##  Max.   : 53.2289   Max.   : 62.4692   Max.   : 72.622   Max.   :26.66687  
##                                                                            
##  war_reg_season      war_playoffs      predator_offense   predator_defense  
##  Min.   :-7.38298   Min.   :-1.37652   Min.   :-36.4333   Min.   :-37.8717  
##  1st Qu.:-0.09885   1st Qu.: 0.00000   1st Qu.: -2.5966   1st Qu.: -1.9725  
##  Median : 0.53869   Median : 0.00000   Median : -1.0569   Median : -0.4880  
##  Mean   : 1.56462   Mean   : 0.17907   Mean   : -1.1814   Mean   : -0.6181  
##  3rd Qu.: 2.55072   3rd Qu.: 0.03787   3rd Qu.:  0.4025   3rd Qu.:  1.0185  
##  Max.   :23.65932   Max.   : 6.18924   Max.   : 42.8903   Max.   : 42.9891  
##                                                                             
##  predator_total      pace_impact      
##  Min.   :-69.0924   Min.   :-7.19196  
##  1st Qu.: -3.9587   1st Qu.:-0.45629  
##  Median : -1.4353   Median :-0.01576  
##  Mean   : -1.7995   Mean   : 0.05633  
##  3rd Qu.:  0.9293   3rd Qu.: 0.46597  
##  Max.   : 49.1062   Max.   :15.81771  
##                     NA's   :1
str(df)
## spec_tbl_df [4,685 × 21] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ player_name         : chr [1:4685] "Alex Abrines" "Alex Abrines" "Alex Abrines" "Precious Achiuwa" ...
##  $ player_id           : chr [1:4685] "abrinal01" "abrinal01" "abrinal01" "achiupr01" ...
##  $ season              : num [1:4685] 2017 2018 2019 2021 2022 ...
##  $ poss                : num [1:4685] 2387 2546 1279 1581 3802 ...
##  $ mp                  : num [1:4685] 1135 1244 588 749 1892 ...
##  $ raptor_box_offense  : num [1:4685] 0.746 0.318 -3.216 -4.123 -2.522 ...
##  $ raptor_box_defense  : num [1:4685] -0.373 -1.725 1.078 1.359 1.764 ...
##  $ raptor_box_total    : num [1:4685] 0.373 -1.408 -2.137 -2.764 -0.758 ...
##  $ raptor_onoff_offense: num [1:4685] -0.419 -1.292 -6.159 -4.051 -1.688 ...
##  $ raptor_onoff_defense: num [1:4685] -3.857 -0.0497 4.9012 -0.9197 3.1034 ...
##  $ raptor_onoff_total  : num [1:4685] -4.28 -1.34 -1.26 -4.97 1.42 ...
##  $ raptor_offense      : num [1:4685] 0.5434 -0.0208 -4.0402 -4.3476 -2.5174 ...
##  $ raptor_defense      : num [1:4685] -1.145 -1.503 1.886 0.955 2.144 ...
##  $ raptor_total        : num [1:4685] -0.601 -1.523 -2.155 -3.393 -0.373 ...
##  $ war_total           : num [1:4685] 1.249 0.777 0.178 -0.246 2.263 ...
##  $ war_reg_season      : num [1:4685] 1.448 0.466 0.178 -0.247 2.31 ...
##  $ war_playoffs        : num [1:4685] -0.1987 0.311392 0 0.000721 -0.046953 ...
##  $ predator_offense    : num [1:4685] 0.0771 -0.1746 -4.5777 -3.8177 -2.484 ...
##  $ predator_defense    : num [1:4685] -1.039 -1.113 1.543 0.475 2.024 ...
##  $ predator_total      : num [1:4685] -0.962 -1.287 -3.034 -3.343 -0.46 ...
##  $ pace_impact         : num [1:4685] 0.326 -0.456 -0.268 0.329 -0.729 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   player_name = col_character(),
##   ..   player_id = col_character(),
##   ..   season = col_double(),
##   ..   poss = col_double(),
##   ..   mp = col_double(),
##   ..   raptor_box_offense = col_double(),
##   ..   raptor_box_defense = col_double(),
##   ..   raptor_box_total = col_double(),
##   ..   raptor_onoff_offense = col_double(),
##   ..   raptor_onoff_defense = col_double(),
##   ..   raptor_onoff_total = col_double(),
##   ..   raptor_offense = col_double(),
##   ..   raptor_defense = col_double(),
##   ..   raptor_total = col_double(),
##   ..   war_total = col_double(),
##   ..   war_reg_season = col_double(),
##   ..   war_playoffs = col_double(),
##   ..   predator_offense = col_double(),
##   ..   predator_defense = col_double(),
##   ..   predator_total = col_double(),
##   ..   pace_impact = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
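The summary above reports one NA in several columns. A minimal base R sketch of how those missing values could be counted and located is shown here. The `toy` data frame is a stand-in built from the first three rows printed by `str(df)`, and the single NA is placed by hand purely for illustration; it is not a value from the real file.

```r
# Toy stand-in for df: values copied from the str(df) output above,
# except the NA, which is inserted by hand for illustration only.
toy <- data.frame(
  player_name  = c("Alex Abrines", "Alex Abrines", "Alex Abrines"),
  season       = c(2017, 2018, 2019),
  raptor_total = c(-0.601, -1.523, -2.155),
  pace_impact  = c(0.326, NA, -0.268),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))

# Keep only the columns that actually contain missing values
na_counts[na_counts > 0]

# Rows that have at least one missing value
rows_with_na <- toy[!complete.cases(toy), ]
rows_with_na
```

Running the same three calls on `df` would show which columns and rows carry the NA values that `summary(df)` reports.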

Data Subset and Filter

I am only interested in the most recent season (2022). To make sure each player has enough data, I keep only players with at least 1,000 minutes played (mp) and 2,000 possessions (poss). I then create a subset that includes only the columns ‘player_name’, ‘mp’, ‘poss’, and ‘raptor_total’.

df_subset <- subset(df,season==2022 & mp>=1000 & poss>=2000 )
df_subset2 <- df_subset[,c('player_name','mp','poss','raptor_total')]
head(df_subset2)
## # A tibble: 6 × 4
##   player_name                 mp  poss raptor_total
##   <chr>                    <dbl> <dbl>        <dbl>
## 1 Precious Achiuwa          1892  3802       -0.373
## 2 Steven Adams              2113  4392        2.14 
## 3 Bam Adebayo               2439  4893        3.86 
## 4 LaMarcus Aldridge         1050  2205        1.59 
## 5 Nickeil Alexander-Walker  1471  3037       -3.00 
## 6 Grayson Allen             2110  4468        0.145
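Since the tidyverse is already loaded, the same filter and column selection could also be written as one dplyr pipeline. This is only a sketch of an alternative, not a replacement for the code above. The `toy` data frame is a stand-in built from rows printed by `head(df)` and `str(df)`, and using `max(season)` instead of the literal 2022 is my own variation.

```r
library(dplyr)

# Toy stand-in for df, using values printed earlier in this notebook
toy <- data.frame(
  player_name  = c("Alex Abrines", "Precious Achiuwa", "Precious Achiuwa"),
  season       = c(2019, 2021, 2022),
  mp           = c(588, 749, 1892),
  poss         = c(1279, 1581, 3802),
  raptor_total = c(-2.155, -3.393, -0.373),
  stringsAsFactors = FALSE
)

toy_subset <- toy %>%
  # Most recent season only, with minimum playing-time thresholds
  filter(season == max(season), mp >= 1000, poss >= 2000) %>%
  # Keep just the columns needed for the ranking
  select(player_name, mp, poss, raptor_total)

toy_subset
```

Using `max(season)` means the filter keeps working if a newer season is added to the file, at the cost of being less explicit than `season == 2022`.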

Data Cleaning and Manipulation

Check the number of distinct entries in each column

sapply(df_subset2, function(x) length(unique(x)))
##  player_name           mp         poss raptor_total 
##          280          256          269          280

Remove missing data and check for duplicates

df_subset3<-remove_missing(df_subset2, na.rm = FALSE, vars = names(df_subset2), name = "", finite = FALSE)
duplicated(df_subset3)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [157] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [205] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [217] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [229] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [241] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [253] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [265] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [277] FALSE FALSE FALSE FALSE
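Printing the full `duplicated()` vector works, but it is long, and it only checks for duplicates without removing them. Here is a more compact sketch that reduces the check to a single value and then drops incomplete and duplicated rows. The `toy` data frame reuses values from the subset printed earlier; the repeated row and the NA are added by hand for illustration and are not in the real data.

```r
library(dplyr)

# Toy stand-in for df_subset2. The duplicated "Steven Adams" row and the NA
# in mp are inserted by hand so there is something to clean.
toy <- data.frame(
  player_name  = c("Steven Adams", "Steven Adams", "Bam Adebayo", "Grayson Allen"),
  mp           = c(2113, 2113, 2439, NA),
  poss         = c(4392, 4392, 4893, 4468),
  raptor_total = c(2.14, 2.14, 3.86, 0.145),
  stringsAsFactors = FALSE
)

# One logical value instead of one per row
has_duplicates <- any(duplicated(toy))

# Number of rows with at least one missing value
n_incomplete <- sum(!complete.cases(toy))

# Drop incomplete rows, then drop exact duplicate rows
toy_clean <- toy[complete.cases(toy), ] %>%
  distinct()

toy_clean
```

On the real subset, `any(duplicated(df_subset3))` would return FALSE, since every element of the vector printed above is FALSE.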

Sort the data by ‘raptor_total’ in descending order so the top players are easy to spot

df_sort <- df_subset3[order(-df_subset3$raptor_total),]
head(df_sort)
## # A tibble: 6 × 4
##   player_name              mp  poss raptor_total
##   <chr>                 <dbl> <dbl>        <dbl>
## 1 Nikola Jokic           2647  5481        14.6 
## 2 Giannis Antetokounmpo  2652  5669         8.07
## 3 Joel Embiid            2682  5377         7.78
## 4 Rudy Gobert            2317  4762         6.89
## 5 Stephen Curry          2975  6218         6.80
## 6 Luka Doncic            2853  5683         6.38
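If only the top few players are needed, dplyr's `slice_max()` can sort and truncate in one step. This is a sketch of an alternative to the `order()` and `head()` approach above. The `toy` data frame is a stand-in built from the six rows printed for `df_subset2`, and the choice of `n = 3` is arbitrary.

```r
library(dplyr)

# Toy stand-in for df_subset3, using the six rows printed for df_subset2
toy <- data.frame(
  player_name  = c("Precious Achiuwa", "Steven Adams", "Bam Adebayo",
                   "LaMarcus Aldridge", "Nickeil Alexander-Walker", "Grayson Allen"),
  raptor_total = c(-0.373, 2.14, 3.86, 1.59, -3.00, 0.145),
  stringsAsFactors = FALSE
)

# Rows with the three largest raptor_total values, highest first
top3 <- toy %>%
  slice_max(raptor_total, n = 3)

top3
```

By default `slice_max()` keeps ties, so it can return more than `n` rows when players share the same score; passing `with_ties = FALSE` forces exactly `n` rows.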

Conclusion

RAPTOR thinks ball-dominant players such as Steph Curry are phenomenally good. It can have a love-hate relationship with centers, who are sometimes overvalued in other statistical systems. But it appreciates modern centers such as Nikola Jokić and Joel Embiid, as well as defensive stalwarts like Rudy Gobert.

As a next step, I recommend joining this data set with the RAPTOR team data, both historical and latest, to better understand how the RAPTOR variables relate to one another and to build a model that predicts team success.