Leonardo Pegeas HW6

Rough Draft of Final Project

Leonardo Pegeas
6/30/2022

The dataset (below) I will be using for my project is titled “NBA Players stats since 1950”, which I obtained from Kaggle. The dataset only runs through 2017, so the most recent basketball seasons are missing.

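#Loading the packages used throughout (dplyr for %>% and select, ggplot2 for the plots)
library(dplyr)
library(ggplot2)

#Reading my Dataset in
Seasons_Stats = read.csv("Seasons_Stats.csv")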

Seasons_Stats = Seasons_Stats %>%
  select("Year", "Player", "Pos", "Age", "Tm", "G", "GS", "MP", "PER", "TS.", "TRB.", "AST.", "WS", "FG", "FGA", "X3P.", "X2P.", "eFG.", "FT.", "TRB", "AST", "STL", "BLK", "TOV", "PF", "PTS")

Below is the description of the variables kept for this project.
Year - The year/season the player played in
Player - The name of the player
Pos - The position the player played
Age - The age of the player during that year/season
Tm - The team the player played for
G - Total games played
GS - Total games started
MP - Number of minutes played by the player
PER - Player Efficiency Rating
TS. - True Shooting Percentage, which accounts for field goals, 3-point field goals and free throws
TRB. - Total Rebound Percentage, the share of available rebounds the player grabbed while on the floor
AST. - Assist Percentage, the share of teammate field goals the player assisted while on the floor
WS - Win Shares, an estimate of the number of wins contributed by a player
FG - Total number of field goals made
FGA - Total number of field goal attempts
X3P. - 3-point field goal percentage
X2P. - 2-point field goal percentage
eFG. - Effective Field Goal Percentage, which adjusts field goal percentage for the extra point a 3-point field goal is worth
FT. - Free throw percentage
TRB - Number of rebounds
AST - Number of assists
STL - Number of steals
BLK - Number of blocks
TOV - Number of turnovers
PF - Number of personal fouls called on the player
PTS - Number of points scored
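As a rough illustration of how the two shooting percentages are built, here is a minimal sketch using their standard definitions. The function and argument names are made up for this example, and the raw inputs (3-pointers made, free throw attempts) are not among the columns kept above, so the numbers are hypothetical.

```r
# Effective field goal percentage: a made 3-pointer counts as 1.5 field goals.
efg_pct <- function(fg, fg3, fga) {
  (fg + 0.5 * fg3) / fga
}

# True shooting percentage: points per shooting possession, where 0.44 is the
# conventional weight placed on free throw attempts.
ts_pct <- function(pts, fga, fta) {
  pts / (2 * (fga + 0.44 * fta))
}

# Hypothetical season line (made-up numbers, not taken from the dataset)
example_efg <- efg_pct(fg = 500, fg3 = 100, fga = 1000)
example_ts  <- ts_pct(pts = 1400, fga = 1000, fta = 400)
print(example_efg)
print(example_ts)
```

Neither formula uses free throw percentage directly: eFG. ignores free throws entirely, while TS. folds them in through points and free throw attempts.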

Another adjustment is combining certain positions. The primary categories they will be merged into are Center (C), Power Forward (PF), Small Forward (SF), Point Guard (PG) and Shooting Guard (SG). Some players played more than one role, which would otherwise isolate them in their own category. For example, Ed Manning at one point during his career was classified as a Small Forward-Power Forward (SF-PF). In this case, we fit him into the first position given, which is Small Forward (SF from SF-PF).

#Making appropriate changes to Player's position if they played more than one.
#Ex) Center-Forward is moved to Center in the next line

Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "C-F", "C")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "C-PF", "C")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "C-SF", "C")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "F", "PF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "F-C", "PF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "F-G", "PF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "G", "PG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "G-F", "PG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "PF-C", "PF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "PF-SF", "PF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "PG-SF", "PG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "PG-SG", "PG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SF-PF", "SF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SF-PG", "SF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SF-SG", "SF")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SG-PF", "SG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SG-PG", "SG")
Seasons_Stats$Pos = replace(Seasons_Stats$Pos, Seasons_Stats$Pos == "SG-SF", "SG")
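Every replacement above follows the same rule: keep the first position listed, and treat a generic F as PF and a generic G as PG. A minimal sketch of that rule as one helper is shown here. The name collapse_position is hypothetical, and the sketch assumes the dataset contains no position labels beyond the patterns handled above.

```r
# Collapse a combined position label to a single primary position.
collapse_position <- function(pos) {
  # Keep only the text before the first hyphen ("SF-PF" becomes "SF")
  first <- sub("-.*$", "", pos)
  # Generic labels are mapped the same way as in the replace() calls
  generic <- c(F = "PF", G = "PG")
  ifelse(first %in% names(generic), generic[first], first)
}

example_positions <- collapse_position(c("C-F", "F", "G-F", "SF-PF", "SG", "PF-C"))
print(example_positions)
```

Applied to the data this would read Seasons_Stats$Pos = collapse_position(Seasons_Stats$Pos), which is one line instead of eighteen but relies on the first-position rule holding for every label.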

After examining the dataset closely and doing some research, I concluded that more adjustments were needed. In the original data, many variables are missing for the early years of basketball. There are two simple reasons for this. One is that some statistics simply were not recorded at the time; the other is rule changes that have affected how the game is played. If you go far enough back, the three-point line did not exist. The NBA adopted the three-point line for the 1979-80 season in an effort to reward players for taking shots from far away and to disrupt the dominance of players who ruled the paint (the large rectangle under the basket). Since the rule did not exist before then, players in earlier seasons had no chance to shoot three-point shots, and their three-point values are NA. I don’t believe this will affect my process. If it does, I will remove those values and explain here what happened.

My primary goal for this dataset is to answer basketball’s most asked question: Who belongs on the Mount Rushmore of basketball? To answer it, we will rely solely on the statistics of the players’ performance, without any outside factors (such as attitude, win streaks, miracle plays, etc.). Players will be picked for consideration through a selection process. We start by looking at the overall player pool and creating a standard based on the averages of all players, so we can distinguish between above-average and below-average statistics for a given variable. We will then shift our focus to two primary variables: Points (PTS) and the Player Efficiency Rating (PER). The first round selects players who are above average, and the second round keeps those with high points and PER who appear in that group more than once. We will then take the top performers to the last round, where we select the faces for our Mount Rushmore of Basketball. We will compare the final group to each other and look at a few key aspects of their performance across their careers, such as growth, consistency and plainly being the best.

I selected the Player Efficiency Rating (PER) because it evaluates a player’s overall performance in a given amount of time. PER was created by John Hollinger, an ESPN journalist, who came up with the idea of a per-minute rating that adds up the positive aspects of a player’s game and subtracts the negative ones over the minutes played. The formula weighs each accomplishment differently and also takes league and team statistics into account. I personally believe this is a good representation of how a player is evaluated, which is why I focus on it in portions of my project.

#Standard of all players across all seasons
#Mean
Stats_Mean = Seasons_Stats %>%
  summarize_all(mean, na.rm = TRUE)
Stats_Mean
      Year Player Pos      Age Tm        G       GS      MP      PER
1 1992.595     NA  NA 26.66441 NA 50.83711 23.59337 1209.72 12.47907
        TS.    TRB.     AST.       WS       FG      FGA      X3P.
1 0.4930013 9.94921 13.00996 2.485796 195.3258 430.6458 0.2487963
       X2P.      eFG.       FT.      TRB      AST      STL      BLK
1 0.4453431 0.4506584 0.7192786 224.6374 114.8526 39.89705 24.47026
       TOV       PF      PTS
1 73.93983 116.3392 510.1163
#SD
Seasons_Stats %>%
  summarize_all(sd, na.rm = TRUE)
      Year Player Pos      Age Tm        G       GS       MP      PER
1 17.42959     NA  NA 3.841892 NA 26.49616 28.63239 941.1466 6.039014
         TS.     TRB.     AST.       WS       FG      FGA      X3P.
1 0.09446901 5.040283 9.191843 3.058638 188.1144 397.6247 0.1766832
        X2P.       eFG.      FT.      TRB      AST      STL      BLK
1 0.09980326 0.09920031 0.141824 228.1902 135.8639 38.71305 36.93508
      TOV       PF     PTS
1 67.7138 84.79187 492.923
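The NA entries for Player, Pos and Tm in the two tables above come from asking for the mean or standard deviation of text columns. As a hedged alternative, the sketch below restricts the summary to numeric columns, shown in base R on a tiny made-up data frame (toy is not part of the project data).

```r
# Made-up data with one text column and one missing value
toy <- data.frame(
  Player = c("A", "B", "C"),
  PTS    = c(100, NA, 300),
  PER    = c(10, 20, 30)
)

# Keep only the numeric columns, then summarise each one while ignoring NAs
numeric_cols <- toy[sapply(toy, is.numeric)]
col_means <- sapply(numeric_cols, mean, na.rm = TRUE)
col_sds   <- sapply(numeric_cols, sd, na.rm = TRUE)

print(col_means)
print(col_sds)
```

In dplyr 1.0 or later, the same idea can be written as summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE))).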
# Distribution of Points in all Seasons
ggplot(Seasons_Stats, aes(PTS)) + 
  geom_histogram(aes(y = ..density..), alpha = 0.5) +
  geom_density(alpha = 0.2, fill="blue") +
  theme_minimal() +
  labs(x = "Points", y= "Density", title = "Distribution of Points Scored Across All Seasons")

The graph above illustrates the distribution of points scored per player across all seasons. It suggests that the majority of players finish a season with only around 100 points. Higher point totals sit further to the right of the graph, and there are fewer and fewer players at each step. This is because it is hard to score more than 1000 points in a season, although the graph shows that it does happen. What is remarkable is that some players have crossed 2000 points in a season. I would expect these players to be candidates for the Mount Rushmore of Basketball, but we will see.

#Round 1 elimination

#Finding Number of players starting with
All_Players = Seasons_Stats %>% 
  distinct(Player)

#Displays all NBA Players
head(All_Players[order(All_Players$Player),])
[1] ""              "A.C. Green"    "A.J. Bramlett" "A.J. English" 
[5] "A.J. Guyton"   "A.J. Hammons" 
#Subtract one for a "Ghost" Player
Number_Of_Players = nrow(All_Players)-1

#Eliminating Players below all factors
#Factors not used: Year, Player, Pos, Age, Tm, GS, and TOV. (PF is included)
Round1_Players = Seasons_Stats %>% 
  filter(Stats_Mean$G < G,
         Stats_Mean$MP < MP,
         Stats_Mean$PER < PER,
         Stats_Mean$TS. < TS.,
         Stats_Mean$TRB. < TRB.,
         Stats_Mean$AST. < AST.,
         Stats_Mean$WS < WS,
         Stats_Mean$FG < FG,
         Stats_Mean$FGA < FGA,
         Stats_Mean$X3P. < X3P.,
         Stats_Mean$X2P. < X2P.,
         Stats_Mean$eFG. < eFG.,
         Stats_Mean$FT. < FT.,
         Stats_Mean$TRB < TRB,
         Stats_Mean$AST < AST,
         Stats_Mean$STL < STL,
         Stats_Mean$BLK < BLK,
         Stats_Mean$PF < PF,
         Stats_Mean$PTS < PTS)

#Displays Records of all players and years
#Round1_Players

#Displays all Players passing round 1
Round1_Players %>% 
  distinct(Player)
                  Player
1            Larry Bird*
2          Alex English*
3            Bob Lanier*
4         Julius Erving*
5            Alvan Adams
6            Albert King
7          Clark Kellogg
8        Alvin Robertson
9       Charles Barkley*
10        Brad Daugherty
11        Clyde Drexler*
12       Michael Jordan*
13            Jack Sikma
14         John Williams
15        Magic Johnson*
16          Karl Malone*
17         Rodney McCray
18           Larry Nance
19    Dominique Wilkins*
20       Derrick Coleman
21       Detlef Schrempf
22      Frank Brickowski
23         Larry Johnson
24          Robert Horry
25       David Robinson*
26        Patrick Ewing*
27    Christian Laettner
28         Tom Gugliotta
29          Juwan Howard
30      Arvydas Sabonis*
31         Kevin Garnett
32         Anthony Mason
33      Hakeem Olajuwon*
34   Shareef Abdur-Rahim
35            Grant Hill
36          Chris Webber
37         Steve Francis
38         Tracy McGrady
39        Glenn Robinson
40           Bonzi Wells
41           Brad Miller
42         Dirk Nowitzki
43           Paul Pierce
44      Andrei Kirilenko
45            Lamar Odom
46          LeBron James
47           Elton Brand
48            Boris Diaw
49           Mike Miller
50           Metta World
51       Carmelo Anthony
52             Luol Deng
53           Matt Barnes
54             Pau Gasol
55        Andray Blatche
56          Kevin Durant
57            David West
58            Al Horford
59             David Lee
60            Josh Smith
61     Amar'e Stoudemire
62           Dwyane Wade
63           Paul George
64        Gerald Wallace
65            Tim Duncan
66          Paul Millsap
67         Nicolas Batum
68         Spencer Hawes
69            Kevin Love
70      DeMarcus Cousins
71         Blake Griffin
72       Jared Sullinger
73 Giannis Antetokounmpo
74           Will Barton
75            Marc Gasol
76          Nikola Jokic
77      Patrick Beverley
78          James Harden
79          Kelly Olynyk
80         Julius Randle
81    Karl-Anthony Towns
82     Russell Westbrook

We started with a total of 3921 players, and after round 1 of the elimination process we are left with the 82 players listed above. This is a big drop, from thousands of players to fewer than a hundred. Factors that were not included in this round were the player’s name, the year, their age, the team they played for, and their Turnovers (TOV). I left out turnovers because top-performing players have more playing time and therefore more opportunities to turn the ball over than the average player, so the variable is not useful at this stage. I also decided to leave out Games Started (GS), because otherwise the majority of the remaining players would be starters (the players a team begins the game with). It makes sense for a team’s starters to be its top performers, but I wanted to avoid favoring starters alone. This is also why I kept both Games (G) and Minutes Played (MP): they are a sufficient filter for players who are critical to the game, without looking only at those who started.
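The round 1 rule (above the overall mean in every kept statistic) can be sketched on made-up data as follows. The toy data frame and its three columns are assumptions for illustration only. The sketch also shows a side effect worth knowing: a missing value in any compared column removes the row, and dplyr’s filter() behaves the same way, since it drops rows whose condition evaluates to NA. That may help explain why seasons without a recorded three-point percentage cannot pass this round.

```r
# Made-up player-seasons; player B has no three-point percentage on record
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  PTS    = c(100, 2000, 1800, 50),
  PER    = c(10, 25, 24, 8),
  X3P.   = c(0.30, NA, 0.40, 0.10)
)
stat_cols <- c("PTS", "PER", "X3P.")

# Column means with missing values ignored, as in Stats_Mean
col_means <- colMeans(toy[stat_cols], na.rm = TRUE)

# TRUE where a value beats its column mean; NA where the value is missing
above <- sweep(as.matrix(toy[stat_cols]), 2, col_means, ">")

# Keep rows that beat the mean everywhere; rows with any NA are dropped
keep <- rowSums(above) == length(stat_cols)
keep[is.na(keep)] <- FALSE
round1_toy <- toy[keep, ]
print(round1_toy)
```

Here player B clears the bar on points and PER but is still removed, purely because X3P. is NA.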

#Graph displaying PER versus Points of the players remaining after round 1
Round1_Players %>%
  filter(! is.na(PER)) %>%
  ggplot(aes(PER, PTS, color = Pos)) +
  geom_point() +
  theme_bw() +
  labs(x="Player Efficiency Rating", y="Number of Points scored by Player", title="Player Efficiency Rating versus Points")

We now begin the round 2 process. Above is a graph that plots each player’s Player Efficiency Rating against the points they scored in a season, with color indicating the player’s position. There is a clear trend: the more points a player scored, the higher that player’s efficiency rating tends to be. Centers appear to sit toward the bottom of this trend, while Power Forwards and Small Forwards occupy the same areas. There are very few Shooting Guards, and I believe four Point Guards, on this graph, which suggests that the elite players are mainly from the other three positions. To eliminate more players, I would like to focus on the portion of players that have both a high Player Efficiency Rating and a high number of points scored.

(I had a thought here: I could see who shows up multiple times in the graph. I am currently deciding whether to take the players that show up multiple times, continue with the idea of a portion of the graph, or do a combination of both.)

#Round 2 begins 
#Selecting players that scored above 1750 points a season and a PER greater than 22.5
Round2_Players = Round1_Players %>%
  filter(PER > 22.5,
         PTS > 1750)

I decided to evaluate the players that fall in the 4x4 block of grid sections in the top right corner of the graph above. The faces of Mount Rushmore should be the top basketball players, and that weighs heavily on the points scored by the player and their Player Efficiency Rating. The “Can’t win if you don’t score” mentality fits this filter perfectly. Next, we will select those who appear more than once after this filtration.

#Round 2 Elimination
#Seeing who had outstanding multiple seasons performances given the range above PTS and PER
table(Round2_Players['Player'])

    Amar'e Stoudemire       Carmelo Anthony      Charles Barkley* 
                    1                     2                     3 
         Chris Webber        Clyde Drexler*       David Robinson* 
                    1                     1                     3 
     DeMarcus Cousins         Dirk Nowitzki    Dominique Wilkins* 
                    1                     5                     1 
          Dwyane Wade           Elton Brand Giannis Antetokounmpo 
                    1                     1                     1 
           Grant Hill      Hakeem Olajuwon*          James Harden 
                    1                     1                     1 
       Julius Erving*    Karl-Anthony Towns          Karl Malone* 
                    1                     1                     7 
         Kevin Durant         Kevin Garnett            Kevin Love 
                    6                     4                     1 
          Larry Bird*          LeBron James        Magic Johnson* 
                    5                     6                     1 
      Michael Jordan*           Paul Pierce     Russell Westbrook 
                    3                     2                     1 
        Tracy McGrady 
                    2 
#Removing anyone that was displayed only once
Round2_Players = Round2_Players %>%
  filter(Round2_Players['Player'] != "Amar'e Stoudemire",
         Round2_Players['Player'] != "Chris Webber",
         Round2_Players['Player'] != "Clyde Drexler*",
         Round2_Players['Player'] != "DeMarcus Cousins",
         Round2_Players['Player'] != "Dominique Wilkins*",
         Round2_Players['Player'] != "Dwyane Wade",
         Round2_Players['Player'] != "Elton Brand",
         Round2_Players['Player'] != "Giannis Antetokounmpo",
         Round2_Players['Player'] != "Grant Hill",
         Round2_Players['Player'] != "Hakeem Olajuwon*",
         Round2_Players['Player'] != "James Harden",
         Round2_Players['Player'] != "Julius Erving*",
         Round2_Players['Player'] != "Karl-Anthony Towns",
         Round2_Players['Player'] != "Kevin Love",
         Round2_Players['Player'] != "Magic Johnson*",
         Round2_Players['Player'] != "Russell Westbrook")

#Number of records that remain
#nrow(Round2_Players)

#Those that have survived Round 2
Round2_Players %>% distinct(Player)
             Player
1       Larry Bird*
2  Charles Barkley*
3   Michael Jordan*
4      Karl Malone*
5   David Robinson*
6     Kevin Garnett
7     Tracy McGrady
8     Dirk Nowitzki
9       Paul Pierce
10     LeBron James
11     Kevin Durant
12  Carmelo Anthony
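The removal step above lists every single-appearance player by hand. A minimal sketch of the same idea driven by the counts themselves is given here, on a made-up data frame (round2_toy and its values are hypothetical).

```r
# Made-up qualifying seasons: A appears twice, B once, C three times
round2_toy <- data.frame(
  Player = c("A", "A", "B", "C", "C", "C"),
  Year   = c(1990, 1991, 1995, 2001, 2002, 2003)
)

# Count qualifying seasons per player, as table() does above
season_counts <- table(round2_toy$Player)

# Names of players with more than one qualifying season
repeaters <- names(season_counts[season_counts > 1])

# Keep only the rows belonging to those players
round2_kept <- round2_toy[round2_toy$Player %in% repeaters, ]
print(unique(round2_kept$Player))
```

With dplyr, an equivalent form would be group_by(Player) %>% filter(n() > 1) %>% ungroup(), which avoids retyping names if the thresholds change.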

Narrowing the pool down to just 12 players is significant considering the names that remain. What makes this surprising is not the names that are still here; it is the names that did not make it through the rounds. Legendary names like Bill Russell, Kareem Abdul-Jabbar, Magic Johnson, and Kobe Bryant did not make it to this point, while Michael Jordan, LeBron James and Larry Bird survived. Let’s continue the elimination process. I had originally planned to eliminate players based on their performance against others in the same position. However, with only 12 players remaining that no longer seems necessary, so I will skip to comparing the remaining players against one another.

#Final Elimination before selection

#Finding the top 5 player-seasons in each column
#Then combining them into one dataframe with rbind

top4s_list = Round2_Players %>% arrange(desc(PER)) %>% head(5)
a = Round2_Players %>% arrange(desc(TS.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(TRB.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(AST.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(WS)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(FG)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(FGA)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(X3P.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(X2P.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(eFG.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(FT.)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(TRB)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(AST)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(STL)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(BLK)) %>% head(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(TOV)) %>% tail(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(PF)) %>% tail(5)
top4s_list = rbind(top4s_list,a)

a = Round2_Players %>% arrange(desc(PTS)) %>% head(5)
top4s_list = rbind(top4s_list,a)

#Took the tails of both Turnovers (TOV) and Personal Fouls (PF) because a player would like to have the least amount of those as possible, hence them being tails instead of heads.


#Counting how many times a player name appeared in those top 5's
table(top4s_list['Player'])

 Carmelo Anthony Charles Barkley*  David Robinson*    Dirk Nowitzki 
               1                8                7                7 
    Karl Malone*     Kevin Durant    Kevin Garnett      Larry Bird* 
               4               10                7                8 
    LeBron James  Michael Jordan*    Tracy McGrady 
              18               19                1 
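The repeated arrange/rbind steps above can be sketched as a loop over column names. This is only an illustration on made-up data: the toy data frame, the helper name top_n_players, and the choice of n = 2 are assumptions, not the project’s values.

```r
# Made-up player-seasons
toy <- data.frame(
  Player = c("A", "B", "C", "D"),
  PER    = c(30, 25, 20, 15),
  PTS    = c(2000, 2500, 1800, 1900),
  TOV    = c(300, 200, 100, 250)
)

# Player names of the n best rows in one column
top_n_players <- function(df, col, n, decreasing) {
  head(df$Player[order(df[[col]], decreasing = decreasing)], n)
}

higher_is_better <- c("PER", "PTS")
lower_is_better  <- c("TOV")  # fewer turnovers is better, so sort ascending

picks <- c(
  unlist(lapply(higher_is_better, function(col) top_n_players(toy, col, 2, TRUE))),
  unlist(lapply(lower_is_better,  function(col) top_n_players(toy, col, 2, FALSE)))
)

# How often each player landed in a top-n list
tally <- table(picks)
print(tally)
```

Sorting ascending for TOV plays the same role as taking tail() after arrange(desc(TOV)) in the code above.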

After taking the top 5 records in each column and counting how often each name appears, it can be said that the 3 people who should definitely be on the Mount Rushmore of Basketball are Michael Jordan, LeBron James, and Kevin Durant. The last member would have to be either Charles Barkley or Larry Bird, who are tied. I looked at the top 5 in each column instead of the top 4 (one per head) to allow for a comeback in this counting system.

Since Larry Bird had 2 more seasons than Charles Barkley in the 4x4 section from round 2, he becomes our 4th head. Statistically, the Mount Rushmore of Basketball should be Michael Jordan, LeBron James, Kevin Durant and Larry Bird.

Here we will evaluate how a player’s position can affect the points scored.

#Made a distinction between the decades 
Seasons_Stats_1970 = Seasons_Stats %>% 
  filter(Year >= 1970, Year < 1980)
Seasons_Stats_1980 = Seasons_Stats %>% 
  filter(Year >= 1980, Year < 1990)
Seasons_Stats_1990 = Seasons_Stats %>% 
  filter(Year >= 1990, Year < 2000)
Seasons_Stats_2000 = Seasons_Stats %>% 
  filter(Year >= 2000, Year < 2010)
Seasons_Stats_2010 = Seasons_Stats %>% 
  filter(Year >= 2010, Year < 2020)

#Plotted decades into a histogram to see if they had changed over time
#Note: binwidth placed inside aes() is ignored by ggplot2, so the default of 30 bins is stated explicitly
ggplot(Seasons_Stats_1970) +
  geom_histogram(aes(x = PTS, fill = Pos), bins = 30) +
  labs(x = "Points", y= "Number of Players", title = "Distribution of Points Scored in decade of 1970 by Position") +
  facet_wrap(vars(Pos))+
  xlim(0, 2500)+
  ylim(0, 250)
ggplot(Seasons_Stats_1980) +
  geom_histogram(aes(x = PTS, fill = Pos), bins = 30) +
  labs(x = "Points", y= "Number of Players", title = "Distribution of Points Scored in decade of 1980 by Position") +
  facet_wrap(vars(Pos))+
  xlim(0, 2500)+
  ylim(0, 250)
ggplot(Seasons_Stats_1990) +
  geom_histogram(aes(x = PTS, fill = Pos), bins = 30) +
  labs(x = "Points", y= "Number of Players", title = "Distribution of Points Scored in decade of 1990 by Position") +
  facet_wrap(vars(Pos))+
  xlim(0, 2500)+
  ylim(0, 250)
ggplot(Seasons_Stats_2000) +
  geom_histogram(aes(x = PTS, fill = Pos), bins = 30) +
  labs(x = "Points", y= "Number of Players", title = "Distribution of Points Scored in decade of 2000 by Position") +
  facet_wrap(vars(Pos))+
  xlim(0, 2500)+
  ylim(0, 250)
ggplot(Seasons_Stats_2010) +
  geom_histogram(aes(x = PTS, fill = Pos), bins = 30) +
  labs(x = "Points", y= "Number of Players", title = "Distribution of Points Scored in decade of 2010 by Position") +
  facet_wrap(vars(Pos))+
  xlim(0, 2500)+
  ylim(0, 250)

I decided to look at the decades from 1970 onward because earlier decades would not provide much information, given that the three-point line only arrived with the 1979-80 season. The 1970s will stand in for the earlier years. Looking at all the graphs, every position is right skewed. The number of players must have increased over time, since the bar on the far left grows each decade from the 1980s to the 2000s. One small difference is that after the 1990s, the Center position no longer has a bar beyond 2000 points in a season, whereas a few players reached that mark before. Other than that, it is difficult to identify any more changes. It is worth pointing out that within each decade, the shape of the points distribution for one position mimics that of the other positions. So I would say that position does not have an effect on how many points a player scores, since there is little difference in distribution other than perhaps the tail for Centers, and I personally feel that is not enough evidence to claim an effect.
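The five per-decade data frames used for these plots could also be derived from a single decade label. A minimal sketch on made-up years is shown here; the toy data frame and the Decade column name are assumptions for illustration.

```r
# Made-up seasons spanning several decades
toy <- data.frame(
  Year = c(1969, 1970, 1979, 1980, 2017),
  PTS  = c(900, 1200, 300, 1500, 2100)
)

# Keep 1970 onward, matching the decades examined above
toy <- toy[toy$Year >= 1970, ]

# Round each year down to the start of its decade (1979 becomes 1970)
toy$Decade <- floor(toy$Year / 10) * 10

# One data frame per decade, named by the decade's first year
by_decade <- split(toy, toy$Decade)
print(names(by_decade))
```

With a Decade column in place, a single ggplot call using facet_grid(Decade ~ Pos) would put all decades and positions on shared axes, which may make the decade-to-decade comparison easier than reading five separate figures.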

  1. What is missing from my final project?

My wording is currently kind of sloppy, but I will adjust it into more formal and practical sentences. The graphs are my strongest part, but I still need to fix the legend to state Position instead of just Pos. With this, it might be enough to hit the 8 page requirement, but I am still not completely certain. I know this is not what a final rough draft should look like, but my final project submission will be much cleaner.

  2. What do I hope to accomplish between now and final paper submission?

As much of what I had originally planned to do as possible. I still want to add the section about the effective field goal percentage and see if it is affected by age.