I’m a big NBA/sports fan, and I’m curious to see which players in the NBA are most productive. To keep the example simple, I’m going to use the counting stats of points, assists, and rebounds to see the top 10 and bottom 10 players in each category by season total, then compute per-game averages.
I pulled 2021-22 NBA season data from basketball-reference.com.
df = read.csv('./data/nba-stats-21-22.csv', sep=',', header=TRUE)
head(df, 15)
## Rk Player Pos Age Tm G GS MP FG FGA FG. X3P X3PA
## 1 1 Precious Achiuwa C 22 TOR 73 28 1725 265 603 0.439 56 156
## 2 2 Steven Adams C 28 MEM 76 75 1999 210 384 0.547 0 1
## 3 3 Bam Adebayo C 24 MIA 56 56 1825 406 729 0.557 0 6
## 4 4 Santi Aldama PF 21 MEM 32 0 360 53 132 0.402 6 48
## 5 5 LaMarcus Aldridge C 36 BRK 47 12 1050 252 458 0.550 14 46
## 6 6 Nickeil Alexander-Walker SG 23 TOT 65 21 1466 253 680 0.372 105 338
## 7 7 Grayson Allen SG 26 MIL 66 61 1805 255 569 0.448 159 389
## 8 8 Jarrett Allen C 23 CLE 56 56 1809 369 545 0.677 1 10
## 9 9 Jose Alvarado PG 23 NOP 54 1 834 131 294 0.446 32 110
## 10 10 Justin Anderson SF 28 TOT 16 6 316 36 95 0.379 15 59
## 11 11 Kyle Anderson PF 28 MEM 69 11 1484 209 469 0.446 36 109
## 12 12 Giannis Antetokounmpo PF 27 MIL 67 67 2204 689 1245 0.553 71 242
## 13 13 Thanasis Antetokounmpo SF 29 MIL 48 6 473 70 128 0.547 2 14
## 14 14 Carmelo Anthony PF 37 LAL 69 3 1793 319 723 0.441 149 397
## 15 15 Cole Anthony PG 21 ORL 65 65 2059 357 912 0.391 132 391
## X3P. X2P X2PA X2P. eFG. FT FTA FT. ORB DRB TRB AST STL BLK TOV PF
## 1 0.359 209 447 0.468 0.486 78 131 0.595 146 327 473 82 37 41 84 151
## 2 0.000 210 383 0.548 0.547 108 199 0.543 349 411 760 256 65 60 115 153
## 3 0.000 406 723 0.562 0.557 256 340 0.753 137 427 564 190 80 44 148 171
## 4 0.125 47 84 0.560 0.424 20 32 0.625 33 54 87 21 6 10 16 36
## 5 0.304 238 412 0.578 0.566 89 102 0.873 73 185 258 42 14 47 44 78
## 6 0.311 148 342 0.433 0.449 81 109 0.743 37 150 187 156 46 23 93 103
## 7 0.409 96 180 0.533 0.588 64 74 0.865 32 190 222 100 46 18 43 96
## 8 0.100 368 535 0.688 0.678 165 233 0.708 192 410 602 92 44 75 94 97
## 9 0.291 99 184 0.538 0.500 36 53 0.679 25 75 100 152 71 7 40 73
## 10 0.254 21 36 0.583 0.458 15 19 0.789 4 42 46 33 8 6 8 22
## 11 0.330 173 360 0.481 0.484 67 105 0.638 69 299 368 183 77 45 71 108
## 12 0.293 618 1003 0.616 0.582 553 766 0.722 134 644 778 388 72 91 219 212
## 13 0.143 68 114 0.596 0.555 29 46 0.630 40 61 101 22 16 12 26 68
## 14 0.375 170 326 0.521 0.544 132 159 0.830 62 226 288 68 47 52 59 166
## 15 0.338 225 521 0.432 0.464 216 253 0.854 32 316 348 369 46 17 170 171
## PTS Player.additional
## 1 664 achiupr01
## 2 528 adamsst01
## 3 1068 adebaba01
## 4 132 aldamsa01
## 5 607 aldrila01
## 6 692 alexani01
## 7 733 allengr01
## 8 904 allenja01
## 9 330 alvarjo01
## 10 102 anderju01
## 11 521 anderky01
## 12 2002 antetgi01
## 13 171 antetth01
## 14 919 anthoca01
## 15 1062 anthoco01
summary(df)
## Rk Player Pos Age
## Min. : 1 Length:605 Length:605 Min. :19.00
## 1st Qu.:152 Class :character Class :character 1st Qu.:23.00
## Median :303 Mode :character Mode :character Median :25.00
## Mean :303 Mean :25.75
## 3rd Qu.:454 3rd Qu.:28.00
## Max. :605 Max. :41.00
##
## Tm G GS MP
## Length:605 Min. : 1.00 Min. : 0.00 Min. : 1.0
## Class :character 1st Qu.:17.00 1st Qu.: 0.00 1st Qu.: 195.0
## Mode :character Median :48.00 Median : 7.00 Median : 883.0
## Mean :43.04 Mean :20.33 Mean : 981.4
## 3rd Qu.:66.00 3rd Qu.:35.00 3rd Qu.:1669.0
## Max. :82.00 Max. :82.00 Max. :2854.0
##
## FG FGA FG. X3P
## Min. : 0.0 Min. : 0.0 Min. :0.0000 Min. : 0.00
## 1st Qu.: 24.0 1st Qu.: 55.0 1st Qu.:0.3940 1st Qu.: 3.00
## Median :116.0 Median : 262.0 Median :0.4435 Median : 27.00
## Mean :165.2 Mean : 358.2 Mean :0.4385 Mean : 50.58
## 3rd Qu.:255.0 3rd Qu.: 553.0 3rd Qu.:0.4983 3rd Qu.: 79.00
## Max. :774.0 Max. :1564.0 Max. :1.0000 Max. :285.00
## NA's :9
## X3PA X3P. X2P X2PA
## Min. : 0 Min. :0.0000 Min. : 0.0 Min. : 0.0
## 1st Qu.: 12 1st Qu.:0.2690 1st Qu.: 15.0 1st Qu.: 31.0
## Median : 85 Median :0.3330 Median : 74.0 Median : 142.0
## Mean :143 Mean :0.3047 Mean :114.6 Mean : 215.2
## 3rd Qu.:234 3rd Qu.:0.3740 3rd Qu.:174.0 3rd Qu.: 336.0
## Max. :750 Max. :1.0000 Max. :724.0 Max. :1393.0
## NA's :44
## X2P. eFG. FT FTA
## Min. :0.0000 Min. :0.0000 Min. : 0.00 Min. : 0.00
## 1st Qu.:0.4640 1st Qu.:0.4710 1st Qu.: 7.00 1st Qu.: 10.00
## Median :0.5230 Median :0.5220 Median : 40.00 Median : 54.00
## Mean :0.5108 Mean :0.5019 Mean : 68.85 Mean : 88.89
## 3rd Qu.:0.5810 3rd Qu.:0.5630 3rd Qu.: 93.00 3rd Qu.:125.00
## Max. :1.0000 Max. :1.0000 Max. :654.00 Max. :803.00
## NA's :16 NA's :9
## FT. ORB DRB TRB
## Min. :0.0000 Min. : 0.00 Min. : 0.0 Min. : 0.0
## 1st Qu.:0.6810 1st Qu.: 7.00 1st Qu.: 24.0 1st Qu.: 33.0
## Median :0.7675 Median : 26.00 Median :104.0 Median : 135.0
## Mean :0.7476 Mean : 42.02 Mean :138.7 Mean : 180.7
## 3rd Qu.:0.8420 3rd Qu.: 58.00 3rd Qu.:209.0 3rd Qu.: 271.0
## Max. :1.0000 Max. :349.00 Max. :813.0 Max. :1019.0
## NA's :59
## AST STL BLK TOV
## Min. : 0.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 12.0 1st Qu.: 5.00 1st Qu.: 3.00 1st Qu.: 8.00
## Median : 59.0 Median : 26.00 Median : 12.00 Median : 39.00
## Mean :100.2 Mean : 31.03 Mean : 19.16 Mean : 53.09
## 3rd Qu.:145.0 3rd Qu.: 49.00 3rd Qu.: 25.00 3rd Qu.: 75.00
## Max. :737.0 Max. :138.00 Max. :177.00 Max. :303.00
##
## PF PTS Player.additional
## Min. : 0.00 Min. : 0.0 Length:605
## 1st Qu.: 18.00 1st Qu.: 67.0 Class :character
## Median : 70.00 Median : 319.0 Mode :character
## Mean : 79.84 Mean : 449.8
## 3rd Qu.:128.00 3rd Qu.: 685.0
## Max. :286.00 Max. :2155.0
##
Next, the top 10 players in assists, points, and rebounds, in that order. We only want certain columns from our data.frame, so we’ll select them by name in the same indexing step that sorts the rows.
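# Drop rows with an empty team first; filtering inside order() would return
# positions in the shortened vector and could pick the wrong rows of df
players <- df[df$Tm != "", ]
# Sorting the data frame on the total assists (AST) column, in decreasing order
ast_sorted <- players[order(players$AST, decreasing=TRUE), c("Player", "Tm", "AST")]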
head(ast_sorted, 10)
## Player Tm AST
## 602 Trae Young ATL 737
## 438 Chris Paul PHO 702
## 218 James Harden TOT 667
## 214 Tyrese Haliburton TOT 628
## 400 Dejounte Murray SAS 627
## 290 Nikola Jokić DEN 584
## 184 Darius Garland CLE 583
## 25 LaMelo Ball CHO 571
## 141 Luka Dončić DAL 568
## 576 Russell Westbrook LAL 550
Now check out the players who didn’t involve their teammates:
tail(ast_sorted, 10)
## Player Tm AST
## 431 Daniel Oturu TOR 0
## 434 Trayvon Palmer DET 0
## 442 Norvel Pelle UTA 0
## 456 Micah Potter DET 0
## 487 Matt Ryan BOS 0
## 491 Jordan Schakel WAS 0
## 494 Tre Scott CLE 0
## 503 Marko Simonovic CHI 0
## 515 Jaden Springer PHI 0
## 530 Tyrell Terry MEM 0
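The same sort-then-slice pattern repeats for every stat, so it could be wrapped in a small helper. The sketch below is only an illustration: `rank_by()` and the toy data frame are made up for this example and are not part of the original analysis.

```r
# Hypothetical helper: sort a data frame by one stat column (descending)
# and keep only the columns we care about.
rank_by <- function(data, stat, keep = c("Player", stat)) {
  data[order(data[[stat]], decreasing = TRUE), keep, drop = FALSE]
}

# Toy data (invented values) standing in for the real season table
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  AST = c(10, 250, 0, 75),
  PTS = c(300, 120, 4, 900)
)

ast_ranked <- rank_by(toy, "AST")
head(ast_ranked, 2)  # top 2 by assists
tail(ast_ranked, 2)  # bottom 2 by assists
```

With a helper like this, the points and rebounds rankings would become `rank_by(df, "PTS")` and `rank_by(df, "TRB")`.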
Checking out the top 10 and bottom 10 in points
pts_sorted <- df[order(df$PTS, decreasing=TRUE), c("Player", "PTS")]
head(pts_sorted, 10)
## Player PTS
## 602 Trae Young 2155
## 134 DeMar DeRozan 2118
## 162 Joel Embiid 2079
## 526 Jayson Tatum 2046
## 290 Nikola Jokić 2004
## 12 Giannis Antetokounmpo 2002
## 141 Luka Dončić 1847
## 59 Devin Booker 1822
## 546 Karl-Anthony Towns 1818
## 383 Donovan Mitchell 1733
tail(pts_sorted, 10)
## Player PTS
## 320 Anthony Lamb 0
## 371 JaQuori McLaughlin 0
## 378 C.J. Miles 0
## 389 Matt Mooney 0
## 398 Ade Murkey 0
## 433 Jaysean Paige 0
## 434 Trayvon Palmer 0
## 529 Emanuel Terry 0
## 531 Jon Teske 0
## 564 M.J. Walker 0
Grabbing rebounds. (Get it?)
# Using total rebounds (TRB column name) to include offensive and defensive rebs
rbs_sorted <- df[order(df$TRB, decreasing=TRUE), c("Player", "TRB")]
head(rbs_sorted, 10)
## Player TRB
## 290 Nikola Jokić 1019
## 195 Rudy Gobert 968
## 93 Clint Capela 877
## 551 Jonas Valančiūnas 843
## 557 Nikola Vučević 804
## 162 Joel Embiid 796
## 12 Giannis Antetokounmpo 778
## 2 Steven Adams 760
## 488 Domantas Sabonis 752
## 546 Karl-Anthony Towns 727
tail(rbs_sorted, 10)
## Player TRB
## 331 Scottie Lewis 0
## 371 JaQuori McLaughlin 0
## 378 C.J. Miles 0
## 389 Matt Mooney 0
## 396 Emmanuel Mudiay 0
## 398 Ade Murkey 0
## 487 Matt Ryan 0
## 523 Craig Sword 0
## 530 Tyrell Terry 0
## 598 McKinley Wright IV 0
df$APG <- df$AST / df$G
df$PPG <- df$PTS / df$G
df$RPG <- df$TRB / df$G
head(df, 15)
## Rk Player Pos Age Tm G GS MP FG FGA FG. X3P X3PA
## 1 1 Precious Achiuwa C 22 TOR 73 28 1725 265 603 0.439 56 156
## 2 2 Steven Adams C 28 MEM 76 75 1999 210 384 0.547 0 1
## 3 3 Bam Adebayo C 24 MIA 56 56 1825 406 729 0.557 0 6
## 4 4 Santi Aldama PF 21 MEM 32 0 360 53 132 0.402 6 48
## 5 5 LaMarcus Aldridge C 36 BRK 47 12 1050 252 458 0.550 14 46
## 6 6 Nickeil Alexander-Walker SG 23 TOT 65 21 1466 253 680 0.372 105 338
## 7 7 Grayson Allen SG 26 MIL 66 61 1805 255 569 0.448 159 389
## 8 8 Jarrett Allen C 23 CLE 56 56 1809 369 545 0.677 1 10
## 9 9 Jose Alvarado PG 23 NOP 54 1 834 131 294 0.446 32 110
## 10 10 Justin Anderson SF 28 TOT 16 6 316 36 95 0.379 15 59
## 11 11 Kyle Anderson PF 28 MEM 69 11 1484 209 469 0.446 36 109
## 12 12 Giannis Antetokounmpo PF 27 MIL 67 67 2204 689 1245 0.553 71 242
## 13 13 Thanasis Antetokounmpo SF 29 MIL 48 6 473 70 128 0.547 2 14
## 14 14 Carmelo Anthony PF 37 LAL 69 3 1793 319 723 0.441 149 397
## 15 15 Cole Anthony PG 21 ORL 65 65 2059 357 912 0.391 132 391
## X3P. X2P X2PA X2P. eFG. FT FTA FT. ORB DRB TRB AST STL BLK TOV PF
## 1 0.359 209 447 0.468 0.486 78 131 0.595 146 327 473 82 37 41 84 151
## 2 0.000 210 383 0.548 0.547 108 199 0.543 349 411 760 256 65 60 115 153
## 3 0.000 406 723 0.562 0.557 256 340 0.753 137 427 564 190 80 44 148 171
## 4 0.125 47 84 0.560 0.424 20 32 0.625 33 54 87 21 6 10 16 36
## 5 0.304 238 412 0.578 0.566 89 102 0.873 73 185 258 42 14 47 44 78
## 6 0.311 148 342 0.433 0.449 81 109 0.743 37 150 187 156 46 23 93 103
## 7 0.409 96 180 0.533 0.588 64 74 0.865 32 190 222 100 46 18 43 96
## 8 0.100 368 535 0.688 0.678 165 233 0.708 192 410 602 92 44 75 94 97
## 9 0.291 99 184 0.538 0.500 36 53 0.679 25 75 100 152 71 7 40 73
## 10 0.254 21 36 0.583 0.458 15 19 0.789 4 42 46 33 8 6 8 22
## 11 0.330 173 360 0.481 0.484 67 105 0.638 69 299 368 183 77 45 71 108
## 12 0.293 618 1003 0.616 0.582 553 766 0.722 134 644 778 388 72 91 219 212
## 13 0.143 68 114 0.596 0.555 29 46 0.630 40 61 101 22 16 12 26 68
## 14 0.375 170 326 0.521 0.544 132 159 0.830 62 226 288 68 47 52 59 166
## 15 0.338 225 521 0.432 0.464 216 253 0.854 32 316 348 369 46 17 170 171
## PTS Player.additional APG PPG RPG
## 1 664 achiupr01 1.1232877 9.095890 6.479452
## 2 528 adamsst01 3.3684211 6.947368 10.000000
## 3 1068 adebaba01 3.3928571 19.071429 10.071429
## 4 132 aldamsa01 0.6562500 4.125000 2.718750
## 5 607 aldrila01 0.8936170 12.914894 5.489362
## 6 692 alexani01 2.4000000 10.646154 2.876923
## 7 733 allengr01 1.5151515 11.106061 3.363636
## 8 904 allenja01 1.6428571 16.142857 10.750000
## 9 330 alvarjo01 2.8148148 6.111111 1.851852
## 10 102 anderju01 2.0625000 6.375000 2.875000
## 11 521 anderky01 2.6521739 7.550725 5.333333
## 12 2002 antetgi01 5.7910448 29.880597 11.611940
## 13 171 antetth01 0.4583333 3.562500 2.104167
## 14 919 anthoca01 0.9855072 13.318841 4.173913
## 15 1062 anthoco01 5.6769231 16.338462 5.353846
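With the per-game columns in place, the same ranking could be done on averages rather than totals. One wrinkle is that a player who appeared in only a game or two can post an extreme average, so a minimum-games cutoff may be worth applying first. The sketch below uses made-up numbers and an arbitrary cutoff of 20 games; both are assumptions for illustration only.

```r
# Toy data (invented values); G = games played, AST = total assists
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  G = c(70, 2, 55, 30),
  AST = c(350, 20, 110, 45)
)

# Per-game average, computed the same way as the APG column above
toy$APG <- toy$AST / toy$G

min_games <- 20  # arbitrary cutoff, chosen only for this example
qualified <- toy[toy$G >= min_games, ]

# Rank the qualified players by assists per game, highest first
apg_ranked <- qualified[order(qualified$APG, decreasing = TRUE), c("Player", "G", "APG")]
apg_ranked
```

Player B averages 10 assists over just 2 games here and is filtered out, which is the kind of small-sample case the cutoff is meant to catch.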
ggplot2
library(ggplot2)
# Plotting points per game vs assists per game
ggplot(df, aes(x=APG, y=PPG))+
geom_point(size=2, shape=23)
# Would like to plot rebounds vs assists, as those are generally not produced by players of the same position
ggplot(df, aes(x=RPG, y=APG)) +
geom_point(size=2, shape=23)
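Since the point of the rebounds-vs-assists plot is that the two stats tend to come from different positions, mapping `Pos` to color might make that easier to see. This is a minimal sketch on invented values, not the plot from the original analysis; only the column names (`Pos`, `RPG`, `APG`) are taken from the data above.

```r
library(ggplot2)

# Toy data (invented values) with the same column names as the real table
toy <- data.frame(
  Pos = c("C", "C", "PG", "PG", "SF"),
  RPG = c(11.2, 9.8, 3.1, 4.0, 6.5),
  APG = c(2.0, 1.5, 8.7, 7.9, 3.3)
)

# Map position to color so clusters by role are visible
p <- ggplot(toy, aes(x = RPG, y = APG, color = Pos)) +
  geom_point(size = 2) +
  labs(x = "Rebounds per game", y = "Assists per game", color = "Position")
print(p)
```

Swapping `toy` for `df` should give the same view on the full season table.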
I had to download the dataset as a CSV from the source page, but I’d like to read the table directly from the HTML instead. Note that the Hide Partial Rows option is turned off in our setup.
# Reading from HTML produces an error (base R has no read.html() function), but reading the csv file locally works
#df = read.html("https://www.basketball-reference.com/leagues/NBA_2022_totals.html#totals_stats", sep=",", header=TRUE)
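One way the HTML route might work is with the rvest package, which the original analysis does not use. The sketch below parses a small inline HTML string so that it runs offline. The `#totals_stats` selector is guessed from the URL fragment above, and I have not checked it against the live page or whether the site permits automated requests, so the live call is left commented out.

```r
library(rvest)

# Small inline page standing in for the real one (invented values)
html <- '<table id="totals_stats">
  <tr><th>Player</th><th>AST</th></tr>
  <tr><td>A</td><td>350</td></tr>
  <tr><td>B</td><td>20</td></tr>
</table>'

page <- read_html(html)

# Select the table by its id and convert it to a data frame
tbl <- html_table(html_element(page, "#totals_stats"))
tbl

# For the live page, the same steps would start from the URL instead:
# page <- read_html("https://www.basketball-reference.com/leagues/NBA_2022_totals.html")
```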