Data Analysis with R Markdown

The NBA free throws dataset contains information about free throws per period for the regular season and the playoffs. It also records when each free throw was taken, who attempted it and whether it was successful.

Understanding the data

We take a quick look into the dataset’s top rows:

end_result | game | game_id | period | play | player | playoffs | score | season | shot_made | time
106 - 114 | PHX - LAL | 261031013 | 1 | Andrew Bynum makes free throw 1 of 2 | Andrew Bynum | regular | 0 - 1 | 2006 - 2007 | 1 | 11:45
106 - 114 | PHX - LAL | 261031013 | 1 | Andrew Bynum makes free throw 2 of 2 | Andrew Bynum | regular | 0 - 2 | 2006 - 2007 | 1 | 11:45
106 - 114 | PHX - LAL | 261031013 | 1 | Andrew Bynum makes free throw 1 of 2 | Andrew Bynum | regular | 18 - 12 | 2006 - 2007 | 1 | 7:26
106 - 114 | PHX - LAL | 261031013 | 1 | Andrew Bynum misses free throw 2 of 2 | Andrew Bynum | regular | 18 - 12 | 2006 - 2007 | 0 | 7:26
106 - 114 | PHX - LAL | 261031013 | 1 | Shawn Marion makes free throw 1 of 1 | Shawn Marion | regular | 21 - 12 | 2006 - 2007 | 1 | 7:18
106 - 114 | PHX - LAL | 261031013 | 1 | Amare Stoudemire makes free throw 1 of 2 | Amare Stoudemire | regular | 33 - 20 | 2006 - 2007 | 1 | 3:15

Dimension of the dataset

type | count
rows | 618019
cols | 11

The dataset has 618019 rows and 11 columns.

Number of unique values in each column

end_result | game | game_id | period | play | player | playoffs | score | season | shot_made | time
2725 | 988 | 12874 | 8 | 10482 | 1098 | 2 | 8753 | 10 | 2 | 534

We can see that:

  • There are 12874 unique games in this dataset
  • They were played across 10 seasons, with free throws attempted by 1098 players
  • The playoffs column takes 2 values (regular and playoffs)
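The counts above can be reproduced with a one-line base R idiom. The sketch below is illustrative only: it rebuilds a few columns of the six sample rows shown earlier as a data frame (the name `free_throws` is an assumption, and the full dataset is not loaded), then counts the distinct values per column.

```r
# Six sample rows copied from the top-rows table; not the full dataset.
# The data frame name `free_throws` is an assumption for this sketch.
free_throws <- data.frame(
  game_id   = rep(261031013, 6),
  period    = rep(1, 6),
  player    = c("Andrew Bynum", "Andrew Bynum", "Andrew Bynum",
                "Andrew Bynum", "Shawn Marion", "Amare Stoudemire"),
  playoffs  = rep("regular", 6),
  season    = rep("2006 - 2007", 6),
  shot_made = c(1, 1, 1, 0, 1, 1),
  time      = c("11:45", "11:45", "7:26", "7:26", "7:18", "3:15"),
  stringsAsFactors = FALSE
)

# Number of distinct values in every column, returned as a named vector
unique_counts <- sapply(free_throws, function(col) length(unique(col)))
print(unique_counts)
```

On the full data the same call would be applied to all 11 columns.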

Are there any missing values?

column | missing_values
end_result | 0
game | 0
game_id | 0
period | 0
play | 0
player | 0
playoffs | 0
score | 0
season | 0
shot_made | 0
time | 0

The dataset does not have any missing data.
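A per-column missing-value count like the one above is commonly obtained with `colSums(is.na(...))`. The sketch below uses a small made-up data frame (`toy`, with one deliberately missing player) so that the check visibly finds something. It is not the real data, which has no missing values.

```r
# Hypothetical toy data with one missing value, for illustration only
toy <- data.frame(
  player    = c("Player A", "Player B", NA),
  shot_made = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Count NA entries in each column
missing_per_column <- colSums(is.na(toy))
print(missing_per_column)

# TRUE if any column contains at least one missing value
any_missing <- any(missing_per_column > 0)
print(any_missing)
```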

Exploratory Data Analysis

How many games do we have per season / playoff?

The number of playoff and regular games seems to vary only slightly from season to season. However, 2011-2012 shows a major decline in regular games. A quick Google search shows that this is due to the NBA lockout. More details are available in this link.
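Because the data has one row per free throw, counting games means counting distinct `game_id` values within each season and playoff type. A minimal base R sketch, using a made-up `toy` data frame rather than the real data:

```r
# Hypothetical toy data: one row per free throw, several rows per game
toy <- data.frame(
  game_id  = c(1, 1, 2, 2, 3, 4, 4, 5),
  season   = c(rep("2006 - 2007", 5), rep("2007 - 2008", 3)),
  playoffs = c(rep("regular", 4), "playoffs", rep("regular", 3)),
  stringsAsFactors = FALSE
)

# Reduce to one row per game, then count games per season and playoff type
games <- unique(toy[, c("game_id", "season", "playoffs")])
games_per_group <- aggregate(game_id ~ season + playoffs, data = games, FUN = length)
names(games_per_group)[3] <- "n_games"
print(games_per_group)
```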

What is the total number of shots per season / playoffs?

The above graph shows that the total number of shots tracks the number of games, with the same obvious drop in 2011-2012. We will look more closely at successful shots later.

What is the average number of throws per season / playoffs?

The average number of throws in regular games is slightly lower than in playoff games throughout the seasons. It also appears to decline from 2006 to 2012.
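One plausible reading of "average number of throws" is the mean number of free throws per game within each season and playoff type. Under that assumption, it can be sketched in two aggregation steps. The `toy` data frame is made up for illustration.

```r
# Hypothetical toy data: one row per free throw
toy <- data.frame(
  game_id   = c(1, 1, 2, 2, 3, 4, 4, 5),
  season    = c(rep("2006 - 2007", 5), rep("2007 - 2008", 3)),
  playoffs  = c(rep("regular", 4), "playoffs", rep("regular", 3)),
  shot_made = c(1, 0, 1, 1, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Step 1: number of throws in each game
throws_per_game <- aggregate(shot_made ~ game_id + season + playoffs,
                             data = toy, FUN = length)
names(throws_per_game)[4] <- "n_throws"

# Step 2: mean throws per game for each season and playoff type
avg_throws <- aggregate(n_throws ~ season + playoffs,
                        data = throws_per_game, FUN = mean)
names(avg_throws)[3] <- "avg_throws_per_game"
print(avg_throws)
```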

What is the free throw success percentage per season / playoffs?

From the above graph we can see that the success percentage in the playoffs slowly increases from 2006 to 2010 and decreases in later years. The success percentage in regular games is higher in the earlier years and lower in the later years. The 2014-2015 playoffs have a drastically lower success percentage.

Note: The above graph is zoomed in for a closer look. Because the axis is cut off to show only roughly 70% to 80%, the differences may not be as large as they appear.
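Since `shot_made` is coded 0/1, the success percentage of a group is 100 times the mean of `shot_made` in that group. A minimal sketch on made-up data (`toy` is hypothetical):

```r
# Hypothetical toy data: one row per free throw, shot_made is 0 or 1
toy <- data.frame(
  season    = c(rep("2006 - 2007", 6), rep("2007 - 2008", 2)),
  playoffs  = c(rep("regular", 4), rep("playoffs", 2), rep("regular", 2)),
  shot_made = c(1, 1, 0, 1, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Success percentage = share of made throws, per season and playoff type
success <- aggregate(shot_made ~ season + playoffs, data = toy,
                     FUN = function(x) 100 * mean(x))
names(success)[3] <- "success_percent"
print(success)
```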

What does the distribution of success percentage look like?

The distribution is left-skewed, with a mean success percentage of around 71%.

Which players have the highest success percentage in each season?

season | player | total_shots | shots_made | success_percent
2006 - 2007 | Kyle Korver | 209 | 191 | 91.38756
2007 - 2008 | Peja Stojakovic | 167 | 155 | 92.81437
2008 - 2009 | Jose Calderon | 154 | 151 | 98.05195
2009 - 2010 | Steve Nash | 300 | 278 | 92.66667
2010 - 2011 | Stephen Curry | 227 | 212 | 93.39207
2011 - 2012 | Jamal Crawford | 206 | 191 | 92.71845
2012 - 2013 | J.J. Redick | 193 | 174 | 90.15544
2013 - 2014 | Brian Roberts | 133 | 125 | 93.98496
2014 - 2015 | Jodie Meeks | 160 | 145 | 90.62500
2015 - 2016 | Stephen Curry | 483 | 439 | 90.89027

For the above results, a threshold of 125 attempted free throws was used. This was adopted as a reasonable threshold value after referring to this link.
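The threshold matters because a player with only a handful of attempts can easily reach 100%. The sketch below shows one way to apply it in base R, not necessarily how the original report did it. The data, the helper `make_rows`, and the choice of "at least 125 attempts" (rather than "more than 125") are all assumptions made for illustration.

```r
# Hypothetical helper: builds one row per throw for a player in a season
make_rows <- function(season, player, made, missed) {
  data.frame(season = season, player = player,
             shot_made = c(rep(1, made), rep(0, missed)),
             stringsAsFactors = FALSE)
}

# Made-up data: Player Z is perfect but has far too few attempts
toy <- rbind(
  make_rows("S1", "Player X", 120, 10),
  make_rows("S1", "Player Y", 150, 50),
  make_rows("S1", "Player Z", 10, 0),
  make_rows("S2", "Player Y", 140, 10),
  make_rows("S2", "Player X", 100, 30)
)

# Attempts and makes per player and season
total <- aggregate(shot_made ~ season + player, data = toy, FUN = length)
made  <- aggregate(shot_made ~ season + player, data = toy, FUN = sum)
stats <- merge(total, made, by = c("season", "player"))
names(stats) <- c("season", "player", "total_shots", "shots_made")
stats$success_percent <- 100 * stats$shots_made / stats$total_shots

# Keep players at or above the attempt threshold (assumed to be inclusive)
min_attempts <- 125
eligible <- stats[stats$total_shots >= min_attempts, ]

# Best eligible player in each season
best <- do.call(rbind, lapply(split(eligible, eligible$season),
                              function(d) d[which.max(d$success_percent), ]))
print(best)
```

In this toy data, Player Z is filtered out despite a perfect record, which is the effect the threshold is meant to have.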

Game period and total shots

More shots are taken in the 4th period than in any other period.

But are they really successful?

However, in regular games the most successful throws happen during the third period. Success appears random across all 4 periods in the playoffs.

Does time impact success percentages?

Note: The spike in the 12th minute is due to the small number of throws in that minute, which skews the success percentage. If we ignore the 12th minute, we can see that success percentages decline after the 8th minute.
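A per-minute view needs the minute extracted from the `time` column, and it helps to keep the number of throws next to each percentage so that thinly populated minutes stand out. The sketch below assumes the minute is simply the part of `time` before the colon, which may not match how the original report defined it. It uses the six sample rows shown earlier, not the full dataset.

```r
# time and shot_made values copied from the six sample rows shown earlier
sample_rows <- data.frame(
  time      = c("11:45", "11:45", "7:26", "7:26", "7:18", "3:15"),
  shot_made = c(1, 1, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Assumption: the minute is the part of the clock reading before the colon
sample_rows$minute <- as.integer(sub(":.*", "", sample_rows$time))

# Success percentage per minute
by_minute <- aggregate(shot_made ~ minute, data = sample_rows,
                       FUN = function(x) 100 * mean(x))
names(by_minute)[2] <- "success_percent"

# Number of throws per minute, to flag percentages based on few throws
by_minute$n_throws <- as.vector(
  table(sample_rows$minute)[as.character(by_minute$minute)]
)
print(by_minute)
```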

Summary of analysis

To summarise all of the above analysis:

  • There are 12874 games across 10 seasons, with free throws attempted by 1098 players.
  • The number of regular games declines sharply in 2011-2012 due to the NBA lockout.
  • The average number of throws in regular games is slightly lower than in playoff games throughout the seasons. It also appears to decline from 2006 to 2012.
  • The success percentage in the playoffs slowly increases from 2006 to 2010 and decreases in later years. The success percentage in regular games is higher in the earlier years and lower in the later years.
  • The 2014-2015 playoffs have a drastically lower success percentage.
  • The overall free throw success percentage distribution is left-skewed, with a mean of around 71%.
  • The most successful free throw shooters of the first three seasons (2006 to 2009) were Kyle Korver, Peja Stojakovic and Jose Calderon.
  • More shots are taken in the 4th period than in any other period.
  • However, in regular games the most successful throws happen during the third period. Success appears random across all 4 periods in the playoffs.
  • Ignoring the 12th minute, success percentages decline after the 8th minute.