RAPTOR, which stands for Robust Algorithm (using) Player Tracking (and) On/Off Ratings, is FiveThirtyEight’s new NBA statistic. RAPTOR has two main advantages:
First, it is a publicly available statistic that takes advantage of modern NBA data, specifically player tracking and play-by-play data that isn’t available in traditional box scores.
Second, and relatedly, it better reflects how modern NBA teams actually evaluate players.
Introducing RAPTOR, Our New Metric For The Modern NBA: https://fivethirtyeight.com/features/introducing-raptor-our-new-metric-for-the-modern-nba/
library('tidyverse')
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
# readr is already attached by library('tidyverse') above
df <- read_csv("Downloads/nba-raptor/modern_RAPTOR_by_player.csv", show_col_types = FALSE)
head(df)
## # A tibble: 6 × 21
## player_name player_id season poss mp raptor_box_offe… raptor_box_defe…
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Alex Abrines abrinal01 2017 2387 1135 0.746 -0.373
## 2 Alex Abrines abrinal01 2018 2546 1244 0.318 -1.73
## 3 Alex Abrines abrinal01 2019 1279 588 -3.22 1.08
## 4 Precious Achiu… achiupr01 2021 1581 749 -4.12 1.36
## 5 Precious Achiu… achiupr01 2022 3802 1892 -2.52 1.76
## 6 Quincy Acy acyqu01 2014 1716 847 -1.72 0.133
## # … with 14 more variables: raptor_box_total <dbl>, raptor_onoff_offense <dbl>,
## # raptor_onoff_defense <dbl>, raptor_onoff_total <dbl>, raptor_offense <dbl>,
## # raptor_defense <dbl>, raptor_total <dbl>, war_total <dbl>,
## # war_reg_season <dbl>, war_playoffs <dbl>, predator_offense <dbl>,
## # predator_defense <dbl>, predator_total <dbl>, pace_impact <dbl>
Get an overview of each variable's data type, minimum, median, maximum, and mean.
summary(df)
## player_name player_id season poss
## Length:4685 Length:4685 Min. :2014 Min. : 2
## Class :character Class :character 1st Qu.:2016 1st Qu.: 709
## Mode :character Mode :character Median :2018 Median :2252
## Mean :2018 Mean :2443
## 3rd Qu.:2020 3rd Qu.:3880
## Max. :2022 Max. :8026
##
## mp raptor_box_offense raptor_box_defense raptor_box_total
## Min. : 1 Min. :-44.1863 Min. :-48.1886 Min. :-61.675
## 1st Qu.: 341 1st Qu.: -2.4776 1st Qu.: -1.7985 1st Qu.: -3.426
## Median :1087 Median : -0.9246 Median : -0.2859 Median : -1.005
## Mean :1187 Mean : -1.1141 Mean : -0.4223 Mean : -1.536
## 3rd Qu.:1894 3rd Qu.: 0.5053 3rd Qu.: 1.1696 3rd Qu.: 1.024
## Max. :3948 Max. : 47.5576 Max. : 57.3057 Max. : 54.871
## NA's :1 NA's :1 NA's :1
## raptor_onoff_offense raptor_onoff_defense raptor_onoff_total
## Min. :-67.777 Min. :-89.09612 Min. :-156.8736
## 1st Qu.: -3.326 1st Qu.: -1.86018 1st Qu.: -4.0029
## Median : -1.067 Median : 0.01282 Median : -0.9001
## Mean : -1.585 Mean : 0.11361 Mean : -1.4711
## 3rd Qu.: 1.125 3rd Qu.: 1.97923 3rd Qu.: 2.0047
## Max. : 85.407 Max. : 66.52346 Max. : 123.1519
## NA's :1 NA's :1 NA's :1
## raptor_offense raptor_defense raptor_total war_total
## Min. :-45.3233 Min. :-56.9825 Min. :-67.356 Min. :-7.38298
## 1st Qu.: -2.6857 1st Qu.: -1.7698 1st Qu.: -3.512 1st Qu.:-0.09548
## Median : -1.0054 Median : -0.2324 Median : -1.111 Median : 0.55871
## Mean : -1.2780 Mean : -0.3342 Mean : -1.612 Mean : 1.74370
## 3rd Qu.: 0.5177 3rd Qu.: 1.2485 3rd Qu.: 1.086 3rd Qu.: 2.72788
## Max. : 53.2289 Max. : 62.4692 Max. : 72.622 Max. :26.66687
##
## war_reg_season war_playoffs predator_offense predator_defense
## Min. :-7.38298 Min. :-1.37652 Min. :-36.4333 Min. :-37.8717
## 1st Qu.:-0.09885 1st Qu.: 0.00000 1st Qu.: -2.5966 1st Qu.: -1.9725
## Median : 0.53869 Median : 0.00000 Median : -1.0569 Median : -0.4880
## Mean : 1.56462 Mean : 0.17907 Mean : -1.1814 Mean : -0.6181
## 3rd Qu.: 2.55072 3rd Qu.: 0.03787 3rd Qu.: 0.4025 3rd Qu.: 1.0185
## Max. :23.65932 Max. : 6.18924 Max. : 42.8903 Max. : 42.9891
##
## predator_total pace_impact
## Min. :-69.0924 Min. :-7.19196
## 1st Qu.: -3.9587 1st Qu.:-0.45629
## Median : -1.4353 Median :-0.01576
## Mean : -1.7995 Mean : 0.05633
## 3rd Qu.: 0.9293 3rd Qu.: 0.46597
## Max. : 49.1062 Max. :15.81771
## NA's :1
str(df)
## spec_tbl_df [4,685 × 21] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ player_name : chr [1:4685] "Alex Abrines" "Alex Abrines" "Alex Abrines" "Precious Achiuwa" ...
## $ player_id : chr [1:4685] "abrinal01" "abrinal01" "abrinal01" "achiupr01" ...
## $ season : num [1:4685] 2017 2018 2019 2021 2022 ...
## $ poss : num [1:4685] 2387 2546 1279 1581 3802 ...
## $ mp : num [1:4685] 1135 1244 588 749 1892 ...
## $ raptor_box_offense : num [1:4685] 0.746 0.318 -3.216 -4.123 -2.522 ...
## $ raptor_box_defense : num [1:4685] -0.373 -1.725 1.078 1.359 1.764 ...
## $ raptor_box_total : num [1:4685] 0.373 -1.408 -2.137 -2.764 -0.758 ...
## $ raptor_onoff_offense: num [1:4685] -0.419 -1.292 -6.159 -4.051 -1.688 ...
## $ raptor_onoff_defense: num [1:4685] -3.857 -0.0497 4.9012 -0.9197 3.1034 ...
## $ raptor_onoff_total : num [1:4685] -4.28 -1.34 -1.26 -4.97 1.42 ...
## $ raptor_offense : num [1:4685] 0.5434 -0.0208 -4.0402 -4.3476 -2.5174 ...
## $ raptor_defense : num [1:4685] -1.145 -1.503 1.886 0.955 2.144 ...
## $ raptor_total : num [1:4685] -0.601 -1.523 -2.155 -3.393 -0.373 ...
## $ war_total : num [1:4685] 1.249 0.777 0.178 -0.246 2.263 ...
## $ war_reg_season : num [1:4685] 1.448 0.466 0.178 -0.247 2.31 ...
## $ war_playoffs : num [1:4685] -0.1987 0.311392 0 0.000721 -0.046953 ...
## $ predator_offense : num [1:4685] 0.0771 -0.1746 -4.5777 -3.8177 -2.484 ...
## $ predator_defense : num [1:4685] -1.039 -1.113 1.543 0.475 2.024 ...
## $ predator_total : num [1:4685] -0.962 -1.287 -3.034 -3.343 -0.46 ...
## $ pace_impact : num [1:4685] 0.326 -0.456 -0.268 0.329 -0.729 ...
## - attr(*, "spec")=
## .. cols(
## .. player_name = col_character(),
## .. player_id = col_character(),
## .. season = col_double(),
## .. poss = col_double(),
## .. mp = col_double(),
## .. raptor_box_offense = col_double(),
## .. raptor_box_defense = col_double(),
## .. raptor_box_total = col_double(),
## .. raptor_onoff_offense = col_double(),
## .. raptor_onoff_defense = col_double(),
## .. raptor_onoff_total = col_double(),
## .. raptor_offense = col_double(),
## .. raptor_defense = col_double(),
## .. raptor_total = col_double(),
## .. war_total = col_double(),
## .. war_reg_season = col_double(),
## .. war_playoffs = col_double(),
## .. predator_offense = col_double(),
## .. predator_defense = col_double(),
## .. predator_total = col_double(),
## .. pace_impact = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
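The first rows printed by str() suggest that raptor_total is simply raptor_offense plus raptor_defense. A quick sanity check of that reading, using only the five values printed above (rounded as displayed, so a small tolerance is assumed); this is a sketch on hand-copied numbers, not a check of the full file:

```r
# Values copied from the str(df) output above (first five rows, as printed).
raptor_offense <- c(0.5434, -0.0208, -4.0402, -4.3476, -2.5174)
raptor_defense <- c(-1.145, -1.503, 1.886, 0.955, 2.144)
raptor_total   <- c(-0.601, -1.523, -2.155, -3.393, -0.373)

# Gap between the printed total and offense + defense for each row.
gap <- raptor_total - (raptor_offense + raptor_defense)

# The printed values are rounded, so compare against an assumed tolerance.
sums_match <- all(abs(gap) < 0.005)
print(sums_match)
```

If the relationship holds across the whole data set, the same comparison could be run on the real columns, keeping in mind that a row with missing values would yield NA.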
I am only interested in the most recent season (2022). To make sure each player has enough data behind the rating, keep only players with at least 1,000 minutes played (mp) and 2,000 possessions (poss). Then create a subset that includes only the columns ‘player_name’, ‘mp’, ‘poss’, and ‘raptor_total’.
df_subset <- subset(df, season == 2022 & mp >= 1000 & poss >= 2000)
df_subset2 <- df_subset[, c('player_name', 'mp', 'poss', 'raptor_total')]
head(df_subset2)
## # A tibble: 6 × 4
## player_name mp poss raptor_total
## <chr> <dbl> <dbl> <dbl>
## 1 Precious Achiuwa 1892 3802 -0.373
## 2 Steven Adams 2113 4392 2.14
## 3 Bam Adebayo 2439 4893 3.86
## 4 LaMarcus Aldridge 1050 2205 1.59
## 5 Nickeil Alexander-Walker 1471 3037 -3.00
## 6 Grayson Allen 2110 4468 0.145
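Since the tidyverse is already loaded, the same filter-and-select step could also be written as a single dplyr pipeline. A minimal sketch, run on a small hand-typed sample (the five rows whose values appear in the printed output above) rather than on the full file; the names `sample_df` and `recent` are only illustrative:

```r
library(dplyr)

# Hand-typed sample copied from the printed output above (not the full CSV).
sample_df <- tibble(
  player_name  = c("Alex Abrines", "Alex Abrines", "Alex Abrines",
                   "Precious Achiuwa", "Precious Achiuwa"),
  season       = c(2017, 2018, 2019, 2021, 2022),
  poss         = c(2387, 2546, 1279, 1581, 3802),
  mp           = c(1135, 1244, 588, 749, 1892),
  raptor_total = c(-0.601, -1.523, -2.155, -3.393, -0.373)
)

# Conditions separated by commas inside filter() are combined with AND.
recent <- sample_df %>%
  filter(season == 2022, mp >= 1000, poss >= 2000) %>%
  select(player_name, mp, poss, raptor_total)

print(recent)
```

Only the 2022 Precious Achiuwa row passes all three conditions in this sample. Like subset(), filter() drops rows where a condition evaluates to NA.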
Check the number of distinct entries in each column.
sapply(df_subset2, function(x) length(unique(x)))
## player_name mp poss raptor_total
## 280 256 269 280
Remove rows with missing values, then check for duplicate rows.
df_subset3 <- remove_missing(df_subset2, na.rm = FALSE, vars = names(df_subset2), name = "", finite = FALSE)
any(duplicated(df_subset3))
## [1] FALSE
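Note that duplicated() only reports which rows repeat an earlier row; it does not drop them. Here no duplicates were found, so nothing needs removing. If there had been any, a base R sketch along these lines would handle both steps. The `toy` data frame below is made up purely for illustration:

```r
# Made-up toy data: one exact duplicate row and one row with a missing value.
toy <- data.frame(
  player_name  = c("Player A", "Player B", "Player B", "Player C"),
  mp           = c(1200, 1500, 1500, 1100),
  poss         = c(2400, 3100, 3100, 2300),
  raptor_total = c(1.5, -0.2, -0.2, NA)
)

# Keep only rows with no missing values in any column.
no_missing <- toy[complete.cases(toy), ]

# Count rows that exactly repeat an earlier row.
dup_count <- sum(duplicated(no_missing))

# Drop the repeated rows, keeping the first occurrence of each.
clean <- no_missing[!duplicated(no_missing), ]

print(dup_count)
print(clean)
```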
Sort the data by ‘raptor_total’ in descending order so the top players are easy to spot.
df_sort <- df_subset3[order(-df_subset3$raptor_total),]
head(df_sort)
## # A tibble: 6 × 4
## player_name mp poss raptor_total
## <chr> <dbl> <dbl> <dbl>
## 1 Nikola Jokic 2647 5481 14.6
## 2 Giannis Antetokounmpo 2652 5669 8.07
## 3 Joel Embiid 2682 5377 7.78
## 4 Rudy Gobert 2317 4762 6.89
## 5 Stephen Curry 2975 6218 6.80
## 6 Luka Doncic 2853 5683 6.38
RAPTOR thinks ball-dominant players such as Steph Curry are phenomenally good. It can have a love-hate relationship with centers, who are sometimes overvalued in other statistical systems. But it appreciates modern centers such as Nikola Jokić and Joel Embiid, as well as defensive stalwarts like Rudy Gobert.
As a next step, I recommend joining this data set with the historical and latest RAPTOR team-level data to better understand how the RAPTOR variables relate to one another, and then building a model to predict team success.
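A rough sketch of what such a join might look like, in base R. Everything here is an assumption for illustration: the toy `players` and `teams` tables, the `team` column, the join keys, and the minutes-weighted average used as a team summary (this is not FiveThirtyEight's own team rating method):

```r
# Made-up player-level ratings (hypothetical IDs and values).
players <- data.frame(
  player_id    = c("p1", "p2", "p3", "p4"),
  season       = 2022,
  mp           = c(2000, 1000, 500, 1500),
  raptor_total = c(4.0, -2.0, 1.0, 3.0)
)

# Made-up lookup of which team each player appeared for (assumed layout).
teams <- data.frame(
  player_id = c("p1", "p2", "p3", "p4"),
  season    = 2022,
  team      = c("AAA", "AAA", "BBB", "BBB")
)

# Inner join on player and season.
joined <- merge(players, teams, by = c("player_id", "season"))

# Minutes-weighted average RAPTOR per team, as one possible team-level feature.
team_raptor <- sapply(
  split(joined, joined$team),
  function(d) weighted.mean(d$raptor_total, d$mp)
)

print(team_raptor)
```

A feature like `team_raptor` could then be compared against team results, once the real team-level files have been inspected and the actual join keys confirmed.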