Question

I’m a big NBA/sports fan, and I’m curious to see which players in the NBA are most productive. To keep the example simple, I’m going to use the basic counting stats of points, assists, and rebounds (season totals first, per-game averages later) to see the top 10 and bottom 10 players in each category.

Pulling Data

I pulled 2021-22 NBA season data from basketball-reference.com.

df = read.csv('./data/nba-stats-21-22.csv', sep=',', header=TRUE)

head(df, 15)
##    Rk                   Player Pos Age  Tm  G GS   MP  FG  FGA   FG. X3P X3PA
## 1   1         Precious Achiuwa   C  22 TOR 73 28 1725 265  603 0.439  56  156
## 2   2             Steven Adams   C  28 MEM 76 75 1999 210  384 0.547   0    1
## 3   3              Bam Adebayo   C  24 MIA 56 56 1825 406  729 0.557   0    6
## 4   4             Santi Aldama  PF  21 MEM 32  0  360  53  132 0.402   6   48
## 5   5        LaMarcus Aldridge   C  36 BRK 47 12 1050 252  458 0.550  14   46
## 6   6 Nickeil Alexander-Walker  SG  23 TOT 65 21 1466 253  680 0.372 105  338
## 7   7            Grayson Allen  SG  26 MIL 66 61 1805 255  569 0.448 159  389
## 8   8            Jarrett Allen   C  23 CLE 56 56 1809 369  545 0.677   1   10
## 9   9            Jose Alvarado  PG  23 NOP 54  1  834 131  294 0.446  32  110
## 10 10          Justin Anderson  SF  28 TOT 16  6  316  36   95 0.379  15   59
## 11 11            Kyle Anderson  PF  28 MEM 69 11 1484 209  469 0.446  36  109
## 12 12    Giannis Antetokounmpo  PF  27 MIL 67 67 2204 689 1245 0.553  71  242
## 13 13   Thanasis Antetokounmpo  SF  29 MIL 48  6  473  70  128 0.547   2   14
## 14 14          Carmelo Anthony  PF  37 LAL 69  3 1793 319  723 0.441 149  397
## 15 15             Cole Anthony  PG  21 ORL 65 65 2059 357  912 0.391 132  391
##     X3P. X2P X2PA  X2P.  eFG.  FT FTA   FT. ORB DRB TRB AST STL BLK TOV  PF
## 1  0.359 209  447 0.468 0.486  78 131 0.595 146 327 473  82  37  41  84 151
## 2  0.000 210  383 0.548 0.547 108 199 0.543 349 411 760 256  65  60 115 153
## 3  0.000 406  723 0.562 0.557 256 340 0.753 137 427 564 190  80  44 148 171
## 4  0.125  47   84 0.560 0.424  20  32 0.625  33  54  87  21   6  10  16  36
## 5  0.304 238  412 0.578 0.566  89 102 0.873  73 185 258  42  14  47  44  78
## 6  0.311 148  342 0.433 0.449  81 109 0.743  37 150 187 156  46  23  93 103
## 7  0.409  96  180 0.533 0.588  64  74 0.865  32 190 222 100  46  18  43  96
## 8  0.100 368  535 0.688 0.678 165 233 0.708 192 410 602  92  44  75  94  97
## 9  0.291  99  184 0.538 0.500  36  53 0.679  25  75 100 152  71   7  40  73
## 10 0.254  21   36 0.583 0.458  15  19 0.789   4  42  46  33   8   6   8  22
## 11 0.330 173  360 0.481 0.484  67 105 0.638  69 299 368 183  77  45  71 108
## 12 0.293 618 1003 0.616 0.582 553 766 0.722 134 644 778 388  72  91 219 212
## 13 0.143  68  114 0.596 0.555  29  46 0.630  40  61 101  22  16  12  26  68
## 14 0.375 170  326 0.521 0.544 132 159 0.830  62 226 288  68  47  52  59 166
## 15 0.338 225  521 0.432 0.464 216 253 0.854  32 316 348 369  46  17 170 171
##     PTS Player.additional
## 1   664         achiupr01
## 2   528         adamsst01
## 3  1068         adebaba01
## 4   132         aldamsa01
## 5   607         aldrila01
## 6   692         alexani01
## 7   733         allengr01
## 8   904         allenja01
## 9   330         alvarjo01
## 10  102         anderju01
## 11  521         anderky01
## 12 2002         antetgi01
## 13  171         antetth01
## 14  919         anthoca01
## 15 1062         anthoco01
summary(df)
##        Rk         Player              Pos                 Age       
##  Min.   :  1   Length:605         Length:605         Min.   :19.00  
##  1st Qu.:152   Class :character   Class :character   1st Qu.:23.00  
##  Median :303   Mode  :character   Mode  :character   Median :25.00  
##  Mean   :303                                         Mean   :25.75  
##  3rd Qu.:454                                         3rd Qu.:28.00  
##  Max.   :605                                         Max.   :41.00  
##                                                                     
##       Tm                  G               GS              MP        
##  Length:605         Min.   : 1.00   Min.   : 0.00   Min.   :   1.0  
##  Class :character   1st Qu.:17.00   1st Qu.: 0.00   1st Qu.: 195.0  
##  Mode  :character   Median :48.00   Median : 7.00   Median : 883.0  
##                     Mean   :43.04   Mean   :20.33   Mean   : 981.4  
##                     3rd Qu.:66.00   3rd Qu.:35.00   3rd Qu.:1669.0  
##                     Max.   :82.00   Max.   :82.00   Max.   :2854.0  
##                                                                     
##        FG             FGA              FG.              X3P        
##  Min.   :  0.0   Min.   :   0.0   Min.   :0.0000   Min.   :  0.00  
##  1st Qu.: 24.0   1st Qu.:  55.0   1st Qu.:0.3940   1st Qu.:  3.00  
##  Median :116.0   Median : 262.0   Median :0.4435   Median : 27.00  
##  Mean   :165.2   Mean   : 358.2   Mean   :0.4385   Mean   : 50.58  
##  3rd Qu.:255.0   3rd Qu.: 553.0   3rd Qu.:0.4983   3rd Qu.: 79.00  
##  Max.   :774.0   Max.   :1564.0   Max.   :1.0000   Max.   :285.00  
##                                   NA's   :9                        
##       X3PA          X3P.             X2P             X2PA       
##  Min.   :  0   Min.   :0.0000   Min.   :  0.0   Min.   :   0.0  
##  1st Qu.: 12   1st Qu.:0.2690   1st Qu.: 15.0   1st Qu.:  31.0  
##  Median : 85   Median :0.3330   Median : 74.0   Median : 142.0  
##  Mean   :143   Mean   :0.3047   Mean   :114.6   Mean   : 215.2  
##  3rd Qu.:234   3rd Qu.:0.3740   3rd Qu.:174.0   3rd Qu.: 336.0  
##  Max.   :750   Max.   :1.0000   Max.   :724.0   Max.   :1393.0  
##                NA's   :44                                       
##       X2P.             eFG.              FT              FTA        
##  Min.   :0.0000   Min.   :0.0000   Min.   :  0.00   Min.   :  0.00  
##  1st Qu.:0.4640   1st Qu.:0.4710   1st Qu.:  7.00   1st Qu.: 10.00  
##  Median :0.5230   Median :0.5220   Median : 40.00   Median : 54.00  
##  Mean   :0.5108   Mean   :0.5019   Mean   : 68.85   Mean   : 88.89  
##  3rd Qu.:0.5810   3rd Qu.:0.5630   3rd Qu.: 93.00   3rd Qu.:125.00  
##  Max.   :1.0000   Max.   :1.0000   Max.   :654.00   Max.   :803.00  
##  NA's   :16       NA's   :9                                         
##       FT.              ORB              DRB             TRB        
##  Min.   :0.0000   Min.   :  0.00   Min.   :  0.0   Min.   :   0.0  
##  1st Qu.:0.6810   1st Qu.:  7.00   1st Qu.: 24.0   1st Qu.:  33.0  
##  Median :0.7675   Median : 26.00   Median :104.0   Median : 135.0  
##  Mean   :0.7476   Mean   : 42.02   Mean   :138.7   Mean   : 180.7  
##  3rd Qu.:0.8420   3rd Qu.: 58.00   3rd Qu.:209.0   3rd Qu.: 271.0  
##  Max.   :1.0000   Max.   :349.00   Max.   :813.0   Max.   :1019.0  
##  NA's   :59                                                        
##       AST             STL              BLK              TOV        
##  Min.   :  0.0   Min.   :  0.00   Min.   :  0.00   Min.   :  0.00  
##  1st Qu.: 12.0   1st Qu.:  5.00   1st Qu.:  3.00   1st Qu.:  8.00  
##  Median : 59.0   Median : 26.00   Median : 12.00   Median : 39.00  
##  Mean   :100.2   Mean   : 31.03   Mean   : 19.16   Mean   : 53.09  
##  3rd Qu.:145.0   3rd Qu.: 49.00   3rd Qu.: 25.00   3rd Qu.: 75.00  
##  Max.   :737.0   Max.   :138.00   Max.   :177.00   Max.   :303.00  
##                                                                    
##        PF              PTS         Player.additional 
##  Min.   :  0.00   Min.   :   0.0   Length:605        
##  1st Qu.: 18.00   1st Qu.:  67.0   Class :character  
##  Median : 70.00   Median : 319.0   Mode  :character  
##  Mean   : 79.84   Mean   : 449.8                     
##  3rd Qu.:128.00   3rd Qu.: 685.0                     
##  Max.   :286.00   Max.   :2155.0                     
## 

Getting the top 10 players in assists, points, and rebounds, respectively. We only want certain columns from our data.frame, so we’ll select them by name when subsetting.

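# Drop rows with an empty team code first, then sort by total assists (descending).
# Filtering inside order() would misalign its indices with the rows of df.
df_teams <- df[df$Tm != "", ]
ast_sorted <- df_teams[order(df_teams$AST, decreasing=TRUE), c("Player", "Tm", "AST")]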

head(ast_sorted, 10)
##                Player  Tm AST
## 602        Trae Young ATL 737
## 438        Chris Paul PHO 702
## 218      James Harden TOT 667
## 214 Tyrese Haliburton TOT 628
## 400   Dejounte Murray SAS 627
## 290      Nikola Jokić DEN 584
## 184    Darius Garland CLE 583
## 25        LaMelo Ball CHO 571
## 141       Luka Dončić DAL 568
## 576 Russell Westbrook LAL 550
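
One thing to watch when a filter and a sort are combined in a single expression: order() returns positions within whatever vector it is given. The sketch below uses a small made-up data frame (the names `toy`, `kept`, `wrong`, and `right` are only for illustration) to show how filtering inside order() can pull the wrong rows, and how filtering first avoids that.

```r
# Toy data, made up for illustration: one row has an empty team code
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  Tm     = c("",  "X", "Y", "Z"),
  AST    = c(50,  10,  30,  20),
  stringsAsFactors = FALSE
)

# Pitfall: order() only sees the 3 filtered values, so the indices it
# returns are positions in the filtered vector, not row numbers of toy
wrong <- toy[order(toy$AST[toy$Tm != ""], decreasing = TRUE), ]

# Safer: filter the rows first, then order the filtered data frame
kept  <- toy[toy$Tm != "", ]
right <- kept[order(kept$AST, decreasing = TRUE), ]

wrong$Player  # "B" "C" "A": includes "A", the row we meant to drop
right$Player  # "C" "D" "B": only players with a team, sorted by assists
```

When no rows are filtered out, both forms give the same result, which is why the problem can go unnoticed.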

Check out the players who didn’t involve their teammates (zero assists all season)

tail(ast_sorted, 10)
##              Player  Tm AST
## 431    Daniel Oturu TOR   0
## 434  Trayvon Palmer DET   0
## 442    Norvel Pelle UTA   0
## 456    Micah Potter DET   0
## 487       Matt Ryan BOS   0
## 491  Jordan Schakel WAS   0
## 494       Tre Scott CLE   0
## 503 Marko Simonovic CHI   0
## 515  Jaden Springer PHI   0
## 530    Tyrell Terry MEM   0

Checking out the top 10 and bottom 10 in points

pts_sorted <- df[order(df$PTS, decreasing=TRUE), c("Player", "PTS")]

head(pts_sorted, 10)
##                    Player  PTS
## 602            Trae Young 2155
## 134         DeMar DeRozan 2118
## 162           Joel Embiid 2079
## 526          Jayson Tatum 2046
## 290          Nikola Jokić 2004
## 12  Giannis Antetokounmpo 2002
## 141           Luka Dončić 1847
## 59           Devin Booker 1822
## 546    Karl-Anthony Towns 1818
## 383      Donovan Mitchell 1733
tail(pts_sorted, 10)
##                 Player PTS
## 320       Anthony Lamb   0
## 371 JaQuori McLaughlin   0
## 378         C.J. Miles   0
## 389        Matt Mooney   0
## 398         Ade Murkey   0
## 433      Jaysean Paige   0
## 434     Trayvon Palmer   0
## 529      Emanuel Terry   0
## 531          Jon Teske   0
## 564        M.J. Walker   0

Grabbing rebounds. (Get it?)

# Using total rebounds (TRB column name) to include offensive and defensive rebs
rbs_sorted <- df[order(df$TRB, decreasing=TRUE), c("Player", "TRB")]

head(rbs_sorted, 10)
##                    Player  TRB
## 290          Nikola Jokić 1019
## 195           Rudy Gobert  968
## 93           Clint Capela  877
## 551     Jonas Valančiūnas  843
## 557        Nikola Vučević  804
## 162           Joel Embiid  796
## 12  Giannis Antetokounmpo  778
## 2            Steven Adams  760
## 488      Domantas Sabonis  752
## 546    Karl-Anthony Towns  727
tail(rbs_sorted, 10)
##                 Player TRB
## 331      Scottie Lewis   0
## 371 JaQuori McLaughlin   0
## 378         C.J. Miles   0
## 389        Matt Mooney   0
## 396    Emmanuel Mudiay   0
## 398         Ade Murkey   0
## 487          Matt Ryan   0
## 523        Craig Sword   0
## 530       Tyrell Terry   0
## 598 McKinley Wright IV   0
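
The same sort-then-slice pattern is repeated for assists, points, and rebounds, so it could be wrapped in a small function. This is only a sketch: `top_bottom` is a hypothetical helper, not something from the original analysis, and the toy data is made up so the block runs on its own.

```r
# Hypothetical helper: returns the n highest and n lowest rows of `data`
# by the column named in `stat`, keeping only the columns in `cols`
top_bottom <- function(data, stat, n = 10, cols = c("Player", stat)) {
  sorted <- data[order(data[[stat]], decreasing = TRUE), cols]
  list(top = head(sorted, n), bottom = tail(sorted, n))
}

# Toy data, made up for illustration
toy <- data.frame(
  Player = c("A", "B", "C", "D", "E"),
  PTS    = c(12, 40, 0, 25, 7),
  stringsAsFactors = FALSE
)

res <- top_bottom(toy, "PTS", n = 2)
res$top$Player     # "B" "D"
res$bottom$Player  # "E" "C"
```

On the real data this would be called as `top_bottom(df, "AST")`, `top_bottom(df, "PTS")`, and `top_bottom(df, "TRB")`.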

Getting per-game averages

df$APG <- df$AST / df$G
df$PPG <- df$PTS / df$G
df$RPG <- df$TRB / df$G


head(df, 15)
##    Rk                   Player Pos Age  Tm  G GS   MP  FG  FGA   FG. X3P X3PA
## 1   1         Precious Achiuwa   C  22 TOR 73 28 1725 265  603 0.439  56  156
## 2   2             Steven Adams   C  28 MEM 76 75 1999 210  384 0.547   0    1
## 3   3              Bam Adebayo   C  24 MIA 56 56 1825 406  729 0.557   0    6
## 4   4             Santi Aldama  PF  21 MEM 32  0  360  53  132 0.402   6   48
## 5   5        LaMarcus Aldridge   C  36 BRK 47 12 1050 252  458 0.550  14   46
## 6   6 Nickeil Alexander-Walker  SG  23 TOT 65 21 1466 253  680 0.372 105  338
## 7   7            Grayson Allen  SG  26 MIL 66 61 1805 255  569 0.448 159  389
## 8   8            Jarrett Allen   C  23 CLE 56 56 1809 369  545 0.677   1   10
## 9   9            Jose Alvarado  PG  23 NOP 54  1  834 131  294 0.446  32  110
## 10 10          Justin Anderson  SF  28 TOT 16  6  316  36   95 0.379  15   59
## 11 11            Kyle Anderson  PF  28 MEM 69 11 1484 209  469 0.446  36  109
## 12 12    Giannis Antetokounmpo  PF  27 MIL 67 67 2204 689 1245 0.553  71  242
## 13 13   Thanasis Antetokounmpo  SF  29 MIL 48  6  473  70  128 0.547   2   14
## 14 14          Carmelo Anthony  PF  37 LAL 69  3 1793 319  723 0.441 149  397
## 15 15             Cole Anthony  PG  21 ORL 65 65 2059 357  912 0.391 132  391
##     X3P. X2P X2PA  X2P.  eFG.  FT FTA   FT. ORB DRB TRB AST STL BLK TOV  PF
## 1  0.359 209  447 0.468 0.486  78 131 0.595 146 327 473  82  37  41  84 151
## 2  0.000 210  383 0.548 0.547 108 199 0.543 349 411 760 256  65  60 115 153
## 3  0.000 406  723 0.562 0.557 256 340 0.753 137 427 564 190  80  44 148 171
## 4  0.125  47   84 0.560 0.424  20  32 0.625  33  54  87  21   6  10  16  36
## 5  0.304 238  412 0.578 0.566  89 102 0.873  73 185 258  42  14  47  44  78
## 6  0.311 148  342 0.433 0.449  81 109 0.743  37 150 187 156  46  23  93 103
## 7  0.409  96  180 0.533 0.588  64  74 0.865  32 190 222 100  46  18  43  96
## 8  0.100 368  535 0.688 0.678 165 233 0.708 192 410 602  92  44  75  94  97
## 9  0.291  99  184 0.538 0.500  36  53 0.679  25  75 100 152  71   7  40  73
## 10 0.254  21   36 0.583 0.458  15  19 0.789   4  42  46  33   8   6   8  22
## 11 0.330 173  360 0.481 0.484  67 105 0.638  69 299 368 183  77  45  71 108
## 12 0.293 618 1003 0.616 0.582 553 766 0.722 134 644 778 388  72  91 219 212
## 13 0.143  68  114 0.596 0.555  29  46 0.630  40  61 101  22  16  12  26  68
## 14 0.375 170  326 0.521 0.544 132 159 0.830  62 226 288  68  47  52  59 166
## 15 0.338 225  521 0.432 0.464 216 253 0.854  32 316 348 369  46  17 170 171
##     PTS Player.additional       APG       PPG       RPG
## 1   664         achiupr01 1.1232877  9.095890  6.479452
## 2   528         adamsst01 3.3684211  6.947368 10.000000
## 3  1068         adebaba01 3.3928571 19.071429 10.071429
## 4   132         aldamsa01 0.6562500  4.125000  2.718750
## 5   607         aldrila01 0.8936170 12.914894  5.489362
## 6   692         alexani01 2.4000000 10.646154  2.876923
## 7   733         allengr01 1.5151515 11.106061  3.363636
## 8   904         allenja01 1.6428571 16.142857 10.750000
## 9   330         alvarjo01 2.8148148  6.111111  1.851852
## 10  102         anderju01 2.0625000  6.375000  2.875000
## 11  521         anderky01 2.6521739  7.550725  5.333333
## 12 2002         antetgi01 5.7910448 29.880597 11.611940
## 13  171         antetth01 0.4583333  3.562500  2.104167
## 14  919         anthoca01 0.9855072 13.318841  4.173913
## 15 1062         anthoco01 5.6769231 16.338462  5.353846
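
Per-game averages can be skewed by players who appeared in only a handful of games (the summary above shows a minimum of 1 game played). One possible way to handle that, sketched here on made-up data, is to rank by the per-game column only among players above a games-played cutoff. The `min_games` value of 20 is an arbitrary assumption for this sketch, not a threshold from the original analysis.

```r
# Toy data, made up for illustration
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  G      = c(2, 70, 65, 40),
  PTS    = c(60, 1400, 1625, 400),
  stringsAsFactors = FALSE
)
toy$PPG <- toy$PTS / toy$G

# min_games is an arbitrary cutoff chosen for this sketch
min_games <- 20
qualified <- toy[toy$G >= min_games, ]

# Rank the remaining players by points per game, highest first
ppg_sorted <- qualified[order(qualified$PPG, decreasing = TRUE),
                        c("Player", "G", "PPG")]
ppg_sorted$Player  # "C" "B" "D"; "A" averages 30 but played only 2 games
```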

Plotting our data using ggplot2

library(ggplot2)

# Plotting points per game vs assists per game 
ggplot(df, aes(x=APG, y=PPG))+
  geom_point(size=2, shape=23)

# Plotting rebounds vs assists per game, since high totals in both usually come from players at different positions
ggplot(df, aes(x=RPG, y=APG)) + 
  geom_point(size=2, shape=23)
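
Since the point of the rebounds-vs-assists plot is the difference between positions, one option is to color the points by the `Pos` column and label the axes. This is a sketch on made-up data with the same column names as `df`; the label text and the object name `p` are my own choices.

```r
library(ggplot2)

# Toy data, made up for illustration; the real df uses the same column names
toy <- data.frame(
  Pos = c("C", "C", "PG", "PG", "SF"),
  RPG = c(10.5, 9.8, 3.1, 4.0, 6.2),
  APG = c(1.5, 3.4, 8.9, 7.2, 3.0)
)

# Map position to color so positional clusters are visible in the scatter
p <- ggplot(toy, aes(x = RPG, y = APG, color = Pos)) +
  geom_point(size = 2) +
  labs(x = "Rebounds per game", y = "Assists per game", color = "Position")
p
```

Swapping `toy` for `df` should give the same plot for the full season data, assuming the APG and RPG columns have already been added.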

I had to download the dataset as a CSV from the source page, but I would rather read the table directly from the HTML. Note that the Hide Partial Rows option is turned off in our setup.

# This attempt errors (base R has no read.html() function), but reading the local CSV file works
#df = read.html("https://www.basketball-reference.com/leagues/NBA_2022_totals.html#totals_stats", sep=",", header=TRUE)