1.1 - Who is the all time great? - Exploring the Best Basketball Players of All Time

1.2 - Introduction and Posing the Question

For sports lovers who watch basketball, one of the toughest questions of all time is “who is the best basketball player of all time?” Even harder to settle is “who are the top 5 or top 10 basketball players of all time?” We all love a good debate, even one that turns into an argument over a few beverages, and this one might never be settled. Statistics alone may not answer the question, because rules have changed over time (the introduction of the three-point line, for example), but they certainly provide some insight into these questions of talent. Some of the statistics I’ll dive into are rebounds per game, blocks per game, assists per game, and field goal percentage.

In this assignment I’m using data from “Basketball Reference,” which provides the statistics mentioned above. The data table is titled “Slam 500 Greatest NBA Players of All Time.” The list was selected by SLAM Magazine in 2011, so the selection of players is slightly outdated. However, the stats themselves are up to date, as you’ll see in the column titled “To,” which runs through 2021 for current players. We will test whether SLAM’s ranking agrees with what the statistics show while asking who the best players of all time are.

2 - Packages for My Analysis

I downloaded a few packages that are necessary for the work in this project. First, tidyverse is critical for data cleaning and visual analysis. I also downloaded XML, which is critical for gathering and scraping the data. Other packages that I wanted to make sure I had include httr, dplyr, rvest, RCurl, and magrittr. httr will be useful for web authentication, and rvest has some useful tools for working with HTML and XML. These packages give us more functionality when we dig through these stats and solve our “problem.”
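As a minimal sketch of this setup step (not necessarily the exact code used for this report), the packages named above can be checked and attached as follows. The variable names `pkgs`, `installed`, and `missing_pkgs` are my own, and the install line is left commented out on the assumption that you may prefer to install manually.

```r
# Packages named in this section, spelled as they appear on CRAN.
pkgs <- c("tidyverse", "XML", "httr", "dplyr", "rvest", "RCurl", "magrittr")

# requireNamespace() returns TRUE when a package is available on this machine.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  # Attach every package for the rest of the session.
  invisible(lapply(pkgs, library, character.only = TRUE))
}
```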

3 - Getting My Data ready for Analysis

## [1] 1
##   Rank           Player From   To    G   MP  PTS  TRB  AST STL BLK  FG%  3P%
## 1    1   Michael Jordan 1985 2003 1072 38.3 30.1  6.2  5.3 2.3 0.8 .497 .327
## 2    2 Wilt Chamberlain 1960 1973 1045 45.8 30.1 22.9  4.4         .540     
## 3    3     Bill Russell 1957 1969  963 42.3 15.1 22.5  4.3         .440     
## 4    4 Shaquille O'Neal 1993 2011 1207 34.7 23.7 10.9  2.5 0.6 2.3 .582 .045
## 5    5  Oscar Robertson 1961 1974 1040 42.2 25.7  7.5  9.5 1.1 0.1 .485     
## 6    6    Magic Johnson 1980 1996  906 36.7 19.5  7.2 11.2 1.9 0.4 .520 .303
##    FT%    WS WS/48
## 1 .835 214.0  .250
## 2 .511 247.3  .248
## 3 .561 163.5  .193
## 4 .527 181.7  .208
## 5 .838 189.2  .207
## 6 .848 155.8  .225

This is the data set before any data cleaning has occurred: the raw data pulled from the website in table format. Once it has been cleaned it will be ready for further analysis.
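The cleaning step is not shown here, so the following is only a sketch of one plausible approach in base R, using a few rows copied from the table above. It assumes every column was scraped as text and that blank cells should become `NA`. The helper `to_number` is a hypothetical name of my own.

```r
# A few raw rows copied from the printed table; every column arrives as text.
raw <- data.frame(
  Player = c("Michael Jordan", "Wilt Chamberlain", "Bill Russell"),
  MP     = c("38.3", "45.8", "42.3"),
  PTS    = c("30.1", "30.1", "15.1"),
  STL    = c("2.3", "", ""),
  `FG%`  = c(".497", ".540", ".440"),
  `3P%`  = c(".327", "", ""),
  stringsAsFactors = FALSE
)

# data.frame() repairs the headers by default: "FG%" -> "FG.", "3P%" -> "X3P."
names(raw)

# Hypothetical helper: turn blanks into NA, then convert text to numbers.
to_number <- function(x) {
  x[x == ""] <- NA
  as.numeric(x)
}

clean <- raw
num_cols <- setdiff(names(clean), "Player")
clean[num_cols] <- lapply(clean[num_cols], to_number)

str(clean)
```

The repaired header names are the same `FG.` and `X3P.` spellings that appear in the tables later in this report.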

Next, I remove any player who averaged less than 30 minutes per game and any player who averaged more than 40 minutes per game over their career. Players who averaged more than 40 minutes per game might have inflated statistics in scoring, assists, rebounds, and steals. Players who averaged less than 30 minutes per game aren’t considered in my analysis because the best players of all time were needed on the court for more than 60% of the game.

The filter removed a significant number of players: those who exceeded the 40 minutes per game maximum and those who fell below the 30 minutes per game threshold. This removes players who may have inflated stats and players who are deemed insignificant to the top players analysis.
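The minutes filter can be sketched in base R as below, using the first six players from the raw table. Treating both boundaries as inclusive (exactly 30 or exactly 40 minutes stays in) is my assumption, and the data frame names `players` and `significant` are my own.

```r
# Minutes per game for the first six players in the raw table.
players <- data.frame(
  Player = c("Michael Jordan", "Wilt Chamberlain", "Bill Russell",
             "Shaquille O'Neal", "Oscar Robertson", "Magic Johnson"),
  MP = c(38.3, 45.8, 42.3, 34.7, 42.2, 36.7),
  stringsAsFactors = FALSE
)

# Keep players who averaged between 30 and 40 minutes per game.
significant <- subset(players, MP >= 30 & MP <= 40)

significant$Player
```

In this small sample, Chamberlain, Russell, and Robertson drop out because each averaged more than 40 minutes per game.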

Now that we have our players for our analysis, I will need to define a few stats as they might not be common to some people.

Column Headings and Their Meanings

G means the total games played in the player’s career

MP means the average minutes played per game in the player’s career

PTS means the average points per game for the player

TRB means the average total rebounds per game for the player

AST means the average assists per game for the player

STL means the average steals per game for the player

BLK means the average blocks per game for the player

FG stands for field goal percentage (shots made / total shots taken)

X3P stands for 3-point percentage (3-pointers made / 3-pointers taken)

FT stands for the player’s free throw percentage (free throws made / free throws taken)

We will not be focusing on the columns WS and WS.48

4 - Analysis

The first analysis we can run is a scatter plot to show the correlation between assists and points. It is generally understood that players who rack up more assists per game tend to score fewer points per game. We want to find out if this holds for the elite pool of players we are analyzing. Is a higher assists per game figure an indication of someone who scores fewer points?
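A scatter plot of this kind can be sketched in base R as follows. The six rows are copied from tables in this report and are only a small illustration, not the full filtered data set. Putting points on the x-axis and assists on the y-axis is my assumption about the layout.

```r
# Points and assists per game for six players taken from this report's tables.
sample_players <- data.frame(
  Player = c("Michael Jordan", "Larry Bird", "Kobe Bryant",
             "Jerry West", "Isiah Thomas", "Mark Jackson"),
  PTS = c(30.1, 24.3, 25.0, 27.0, 19.2, 9.6),
  AST = c(5.3, 6.3, 4.7, 6.7, 9.3, 8.0),
  stringsAsFactors = FALSE
)

# Scatter plot: one circle per player.
plot(sample_players$PTS, sample_players$AST,
     xlab = "Points per game", ylab = "Assists per game",
     main = "Assists vs. points (sample of six players)")

# Pearson correlation between the two columns (always between -1 and 1).
r <- cor(sample_players$PTS, sample_players$AST)
r
```

A value of `r` near -1 would support the idea that heavy passers score less, while a value near 0 would indicate a wide spread.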

My expectations for the scatter plot were very different from the result. I imagined the plot would be densely populated with black circles in the bottom right-hand corner, indicating that players who are good at scoring are not great at passing for assists. I also expected the top left-hand corner to be heavily populated, indicating that great passers are not as talented at scoring. This was not the case: the points showed a very large spread.

Next, one big debate among sports analysts is about the well-rounded player: is a player an offensive threat but horrendous on defense, or vice versa? I’m going to create tables that show (1) the number of players who have excellent defensive stats and excellent offensive stats, (2) the number of players who only have excellent offensive stats, and (3) the number of players who only have excellent defensive stats.
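The three groups described above can be sketched as follows. The cutoffs used here (20 points, 4.5 assists, 1.6 steals, 0.6 blocks) are hypothetical values chosen only for illustration. They are not the thresholds behind the tables in this report, which are not stated. The six sample rows are copied from those tables.

```r
# Six players copied from the tables in this report.
sample_players <- data.frame(
  Player = c("Michael Jordan", "Kobe Bryant", "Hakeem Olajuwon",
             "Mark Jackson", "Ron Harper", "Clyde Drexler"),
  PTS = c(30.1, 25.0, 21.8, 9.6, 13.8, 20.4),
  AST = c(5.3, 4.7, 2.5, 8.0, 3.9, 5.6),
  STL = c(2.3, 1.4, 1.7, 1.2, 1.7, 2.0),
  BLK = c(0.8, 0.5, 3.1, 0.1, 0.7, 0.7),
  stringsAsFactors = FALSE
)

# Hypothetical cutoffs for "excellent" offense and defense.
good_offense <- sample_players$PTS >= 20 & sample_players$AST >= 4.5
good_defense <- sample_players$STL >= 1.6 & sample_players$BLK >= 0.6

# Assign each player to exactly one group.
group <- ifelse(good_offense & good_defense, "well rounded",
         ifelse(good_offense, "offense only",
         ifelse(good_defense, "defense only", "neither")))

# Count the players in each group.
table(group)
```

On the full data set the counts depend entirely on the cutoffs chosen, so this sample will not reproduce the tables that follow.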

##   Rank         Player From   To    G   MP  PTS TRB AST STL BLK  FG. X3P.  FT.
## 1    1 Michael Jordan 1985 2003 1072 38.3 30.1 6.2 5.3 2.3 0.8 .497 .327 .835
## 2   11     Jerry West 1961 1974  932 39.2 27.0 5.8 6.7 2.6 0.7 .474      .814
## 3   21     Rick Barry 1966 1980  794 36.3 23.2 6.5 5.1 2.0 0.5 .449 .330 .900
## 4   27 Scottie Pippen 1988 2004 1178 34.9 16.1 6.4 5.2 2.0 0.8 .473 .326 .704
## 5   42  Clyde Drexler 1984 1998 1086 34.6 20.4 6.1 5.6 2.0 0.7 .472 .318 .788
##      WS WS.48
## 1 214.0  .250
## 2 162.6  .213
## 3  93.4  .156
## 4 125.1  .146
## 5 135.6  .173

Creating the table results in only 5 rows of data. That is notable because it shows how extremely hard it is to be a well-rounded player who is good at both offense and defense. I also find it interesting that all five players are in the top 50 of the 500 players, and three of them are in the top 25. This suggests that people do value well-rounded players when creating their lists of the best players of all time.

Most interesting of all of this is that not one of those players’ careers extended past 2005. This tells me that the game has changed in the last 15 years. Before 2005, there was more emphasis on defense and becoming a complete player.

##    Rank            Player From   To    G   MP  PTS  TRB AST STL BLK  FG. X3P.
## 1     1    Michael Jordan 1985 2003 1072 38.3 30.1  6.2 5.3 2.3 0.8 .497 .327
## 2     9        Larry Bird 1980 1992  897 38.4 24.3 10.0 6.3 1.7 0.8 .496 .376
## 3    10       Kobe Bryant 1997 2016 1346 36.1 25.0  5.2 4.7 1.4 0.5 .447 .329
## 4    11        Jerry West 1961 1974  932 39.2 27.0  5.8 6.7 2.6 0.7 .474     
## 5    17     John Havlicek 1963 1978 1270 36.6 20.8  6.3 4.8 1.2 0.3 .439     
## 6    19      Isiah Thomas 1982 1994  979 36.3 19.2  3.6 9.3 1.9 0.3 .452 .290
## 7    21        Rick Barry 1966 1980  794 36.3 23.2  6.5 5.1 2.0 0.5 .449 .330
## 8    24         Bob Cousy 1951 1970  924 35.3 18.4  5.2 7.5         .375     
## 9    27    Scottie Pippen 1988 2004 1178 34.9 16.1  6.4 5.2 2.0 0.8 .473 .326
## 10   31      LeBron James 2004 2021 1306 38.2 27.0  7.4 7.4 1.6 0.8 .504 .345
## 11   36      Walt Frazier 1968 1980  825 37.5 18.9  5.9 6.1 1.9 0.2 .490 .000
## 12   39       Gary Payton 1991 2007 1335 35.3 16.3  3.9 6.7 1.8 0.2 .466 .317
## 13   42     Clyde Drexler 1984 1998 1086 34.6 20.4  6.1 5.6 2.0 0.7 .472 .318
## 14   49       Dwyane Wade 2004 2019 1054 33.9 22.0  4.7 5.4 1.5 0.8 .480 .293
## 15   52    Tiny Archibald 1971 1984  876 35.6 18.8  2.3 7.4 1.1 0.1 .467 .224
## 16   57         Dave Bing 1967 1978  901 36.4 20.3  3.8 6.0 1.3 0.2 .441     
## 17   60     Pete Maravich 1971 1980  658 37.0 24.2  4.2 5.4 1.4 0.3 .441 .667
## 18   72     Lenny Wilkens 1961 1975 1077 35.3 16.5  4.7 6.7 1.3 0.2 .432     
## 19   87      Tim Hardaway 1990 2003  867 35.3 17.7  3.3 8.2 1.6 0.1 .431 .355
## 20  102     Kevin Johnson 1988 2000  735 34.1 17.9  3.3 9.1 1.5 0.2 .493 .305
## 21  107        Chris Paul 2006 2021 1080 34.7 18.3  4.5 9.4 2.1 0.1 .471 .370
## 22  113     Richie Guerin 1957 1970  848 32.4 17.3  5.0 5.0         .416     
## 23  118       Jo Jo White 1970 1981  837 35.8 17.2  4.0 4.9 1.3 0.2 .444 .167
## 24  124  Chauncey Billups 1998 2014 1043 31.6 15.2  2.9 5.4 1.0 0.2 .415 .387
## 25  128    Deron Williams 2006 2017  845 34.2 16.3  3.1 8.1 1.0 0.2 .445 .357
## 26  130 Anfernee Hardaway 1994 2008  704 33.7 15.2  4.5 5.0 1.6 0.4 .458 .316
## 27  132     Charlie Scott 1972 1980  560 34.4 17.9  3.6 4.8 1.3 0.3 .444 .182
## 28  139       Tony Parker 2002 2019 1254 30.5 15.5  2.7 5.6 0.8 0.1 .491 .324
## 29  149      Gus Williams 1976 1987  825 31.1 17.1  2.7 5.6 2.0 0.4 .461 .238
## 30  160      Mark Jackson 1988 2004 1296 30.2  9.6  3.8 8.0 1.2 0.1 .447 .332
## 31  166       Randy Smith 1972 1983  976 32.2 16.7  3.7 4.6 1.7 0.1 .470 .155
## 32  169      Andy Phillip 1948 1958  701 32.2  9.1  4.4 5.4         .368     
## 33  176      Reggie Theus 1979 1991 1026 33.7 18.5  3.3 6.3 1.2 0.2 .471 .252
## 34  183    Gilbert Arenas 2002 2012  552 35.1 20.7  3.9 5.3 1.6 0.2 .421 .351
## 35  186        Norm Nixon 1978 1989  768 35.5 15.7  2.6 8.3 1.5 0.1 .483 .294
## 36  189      Archie Clark 1967 1976  725 32.5 16.3  3.3 4.8 1.1 0.1 .480     
## 37  195       Sam Cassell 1994 2008  993 30.0 15.7  3.2 6.0 1.1 0.2 .454 .331
## 38  197   Stephon Marbury 1997 2009  846 37.7 19.3  3.0 7.6 1.2 0.1 .433 .325
## 39  199       Baron Davis 2000 2012  835 34.2 16.1  3.8 7.2 1.8 0.4 .409 .320
## 40  228     Steve Francis 2000 2008  576 37.6 18.1  5.6 6.0 1.5 0.4 .429 .341
## 41  259      Geoff Petrie 1971 1976  446 37.6 21.8  2.8 4.6 1.1 0.1 .455     
## 42  353       Monta Ellis 2006 2017  833 34.8 17.8  3.5 4.6 1.7 0.3 .451 .314
##     FT.    WS WS.48
## 1  .835 214.0  .250
## 2  .886 145.8  .203
## 3  .837 172.7  .170
## 4  .814 162.6  .213
## 5  .815 131.7  .136
## 6  .759  80.7  .109
## 7  .900  93.4  .156
## 8  .803  91.1  .139
## 9  .704 125.1  .146
## 10 .734 242.0  .233
## 11 .786 113.5  .176
## 12 .729 145.5  .148
## 13 .788 135.6  .173
## 14 .765 120.7  .162
## 15 .810  83.4  .128
## 16 .775  68.8  .101
## 17 .820  46.7  .092
## 18 .774  95.5  .120
## 19 .782  85.0  .133
## 20 .841  92.8  .178
## 21 .872 188.0  .241
## 22 .780  69.2  .121
## 23 .834  54.0  .087
## 24 .894 120.8  .176
## 25 .822  77.3  .129
## 26 .774  61.9  .125
## 27 .772  26.9  .067
## 28 .751 111.3  .140
## 29 .756  67.9  .127
## 30 .770  91.8  .113
## 31 .781  60.1  .092
## 32 .695  60.8  .101
## 33 .826  66.9  .093
## 34 .803  51.3  .127
## 35 .772  47.6  .084
## 36 .769  64.7  .132
## 37 .861  87.5  .141
## 38 .784  77.5  .117
## 39 .711  63.1  .106
## 40 .797  54.1  .120
## 41 .805  26.1  .075
## 42 .772  41.9  .069

This data table is a great display because we can draw a few conclusions from it. First, there are 42 rows, which means it is far more common for a player to be very good at offense than to be very good at both offense and defense. Second, people generally rank offensively gifted players higher than their defensively gifted counterparts.

##    Rank          Player From   To    G   MP  PTS  TRB AST STL BLK  FG. X3P.
## 1     1  Michael Jordan 1985 2003 1072 38.3 30.1  6.2 5.3 2.3 0.8 .497 .327
## 2     9      Larry Bird 1980 1992  897 38.4 24.3 10.0 6.3 1.7 0.8 .496 .376
## 3    11      Jerry West 1961 1974  932 39.2 27.0  5.8 6.7 2.6 0.7 .474     
## 4    13 Hakeem Olajuwon 1985 2002 1238 35.7 21.8 11.1 2.5 1.7 3.1 .512 .202
## 5    16   Julius Erving 1977 1987  836 34.3 22.0  6.7 3.9 1.8 1.5 .507 .261
## 6    27  Scottie Pippen 1988 2004 1178 34.9 16.1  6.4 5.2 2.0 0.8 .473 .326
## 7    31    LeBron James 2004 2021 1306 38.2 27.0  7.4 7.4 1.6 0.8 .504 .345
## 8    42   Clyde Drexler 1984 1998 1086 34.6 20.4  6.1 5.6 2.0 0.7 .472 .318
## 9    88    Chris Mullin 1986 2001  986 32.6 18.2  4.1 3.5 1.6 0.6 .509 .384
## 10  131    Phil Chenier 1972 1981  578 33.1 17.2  3.6 3.0 1.6 0.6 .444 .400
## 11  262     Eddie Jones 1995 2008  954 34.4 14.8  4.0 2.9 1.7 0.6 .437 .373
## 12  327      Ron Harper 1987 2001 1009 30.9 13.8  4.3 3.9 1.7 0.7 .446 .289
##     FT.    WS WS.48
## 1  .835 214.0  .250
## 2  .886 145.8  .203
## 3  .814 162.6  .213
## 4  .712 162.8  .177
## 5  .777 106.2  .178
## 6  .704 125.1  .146
## 7  .734 242.0  .233
## 8  .788 135.6  .173
## 9  .865  93.1  .139
## 10 .806  39.3  .099
## 11 .809 100.6  .147
## 12 .720  65.8  .101

This table is interesting to me because it has far fewer rows than the table of offensively gifted players. This suggests that the league’s top 500 players are likely weaker at defense, or simply much better at offense. Like the first table showing well-rounded players, there are hardly any players on this list who played after 2005. Again, I believe this shows that there used to be more emphasis on playing hard defense and that the sport has changed to become an offensive game.

Unsurprisingly, LeBron James is on this list, but he doesn’t have the stats to be considered a well-rounded defensive and offensive player by the earlier standard. For his era, he would be considered the most well-rounded player on offense and defense.

## # A tibble: 20 x 3
## # Groups:   Points Per Game [15]
##    `Points Per Game` Name                `Minutes Played`
##    <chr>             <chr>               <chr>           
##  1 24.2              Pete Maravich       37.0            
##  2 24.3              Larry Bird          38.4            
##  3 24.3              Adrian Dantley      35.8            
##  4 24.6              Kareem Abdul-Jabbar 36.8            
##  5 24.8              Dominique Wilkins   35.5            
##  6 25.0              Karl Malone         37.2            
##  7 25.0              Kobe Bryant         36.1            
##  8 26.2              George Gervin       33.5            
##  9 26.4              Bob Pettit          38.8            
## 10 27.0              Jerry West          39.2            
## 11 27.0              LeBron James        38.2            
## 12 30.1              Michael Jordan      38.3            
## 13 7.3               Dennis Rodman       31.7            
## 14 8.6               Shane Battier       30.7            
## 15 9.1               Andy Phillip        32.2            
## 16 9.1               P.J. Brown          31.1            
## 17 9.6               Mark Jackson        30.2            
## 18 9.7               Charles Oakley      31.4            
## 19 9.8               Slater Martin       35.9            
## 20 9.8               Dikembe Mutombo     30.8

This is a summary table of scorers from the list of significant players. It shows the points per game and minutes played for each player, grouped by points per game. Note that both columns are stored as text (<chr>), so the rows are in text order rather than numeric order: rows 1 through 12 are the high scorers (24.2 to 30.1 points per game), while rows 13 through 20 (7.3 to 9.8) are low scorers that sort after “30.1” only because their first digit is larger. The top of the table includes many household names like LeBron James, Michael Jordan, and Larry Bird. Though these players are all great scorers, I don’t think scoring alone will determine who the greatest player of all time is.
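The table above prints both columns as `<chr>`, which is why 7.3 appears after 30.1: text is sorted character by character, not by value. The sketch below shows the difference on five rows copied from that table and one way to get a numeric ranking in base R. The data frame names are my own, and this is an illustration rather than the code that produced the table.

```r
# Five rows from the summary table, with points stored as text as in the output.
scorers <- data.frame(
  Name = c("Michael Jordan", "Dennis Rodman", "Pete Maravich",
           "Dikembe Mutombo", "LeBron James"),
  PTS = c("30.1", "7.3", "24.2", "9.8", "27.0"),
  stringsAsFactors = FALSE
)

# Sorting text compares characters, so "7.3" lands after "30.1".
text_order <- scorers[order(scorers$PTS), ]
text_order$Name

# Convert to numbers first, then sort from highest to lowest.
scorers$PTS <- as.numeric(scorers$PTS)
top <- scorers[order(scorers$PTS, decreasing = TRUE), ]
head(top, 3)
```

With the numeric sort, the top three in this sample are Michael Jordan, LeBron James, and Pete Maravich.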

5 - Further Analysis Needed

I think we’ve been able to learn a lot from the data gathered today by filtering it and inspecting the most well-rounded players, the best scorers, the more efficient offensive players, and the best defensive threats. But there is even more basketball data that was not provided in this set. Minutes played in the fourth quarter and the second half are critical in analyzing who the most valuable player in basketball was. Furthermore, average “box +/-” is a statistic that was not measured in this set. The +/- value measures a player’s contribution each game in terms of points allowed and points contributed.

With that being said, I do believe that LeBron James would have a better +/- than Michael Jordan because he has played more games with teams that were built around superstars. I also believe that some of the older players would have more minutes played in the 4th quarter because they played a different, much slower style of basketball.

Perhaps these statistics alone cannot determine who the greatest basketball player of all time is because you cannot measure “clutch” timing. Even though there are stats to show game winning shots, there is no stat that determines a momentum shift when a player simply steps on the floor.

We determined that the game has changed since 2005. People used to play defense much better, or try much harder on defense. If LeBron James played defense as hard as Michael Jordan, would he score as many points or play as many minutes as Jordan? We will probably never be able to conclude who the best basketball player of all time is, but we do know that two of the best basketball players of all time played in two completely different eras of the game.