Basketball is one of the most-watched sports in the world, with a fan base of millions. While viewers cheer their teams on in the arena or on television, people behind the scenes record statistics from nearly every second of every game. These records feed the databases that analysts use to predict game outcomes, inform sports betting, identify trends, and prepare for interviews with NBA athletes and coaches. There is much more to the NBA than the players, coaches, and fans. The visualizations that follow offer insight into the work of sports and data analysts, the type of data recorded during basketball games, and the history of NBA free throws over 10 seasons.
The data comes from the NBA (National Basketball Association) and covers free throws, plays, players, and periods across 10 consecutive seasons, 2006-2016. Below you will find season, player, period, play, and free throw statistics, compiled into visualizations that give a clearer picture of free throws in NBA games over that span. The dataset includes the final result of each game, the teams that played, the game ID, the period, the play, the player, whether the game was a playoff game, the score at the time the play was recorded, the season, and whether the player made the shot. The records are primarily about free throws.
Much of the data turned out to be consistent across the 10 seasons, with some fluctuations. These fluctuations could have a number of causes, which is worth keeping in mind while looking through each chart. The visualizations take a closer look at free throws per period, the count of free throw plays, the top 5 players by season, shots made and missed, and shots by season.
Click “code” to view code:
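# Set the working directory and read the free throw data
setwd("/Users/shirleymayeregger/NBA")
library(data.table)
filename <- "/Users/shirleymayeregger/NBA/free_throws.csv"
df <- fread(filename)

# Dashboard, data wrangling, and plotting packages.
# plyr and dplyr both define count(), mutate(), and summarise(), so the
# calls below are qualified with plyr:: or dplyr:: where the choice matters.
library(flexdashboard)
library(dplyr)
library(ggplot2)
library(scales)
library(RColorBrewer)
library(ggthemes)
library(plyr)
library(lubridate)
library(plotly)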
This histogram shows which periods saw the most free throws over the 10 seasons. The y-axis shows the count of free throws, the x-axis shows the period of the game, and each bar gives the number of free throws in that period. Period 1 has the fewest of the four regulation periods, likely because the game is just getting started. The counts rise as the game goes on, especially in the 4th period, when play heats up and there are more chances to draw a foul and earn a free throw. The counts drop sharply in periods 5-8 because a typical basketball game lasts only 4 periods. Periods 5-8 are overtime periods, which last 5 minutes each instead of the regular 12 and are played only until a winner is decided. Overtime does not happen often, which is why these counts are so much lower than those for periods 1-4.
# Histogram of free throws per period, with the count printed above each bar
p1 <- ggplot(df, aes(x = period)) +
  geom_histogram(binwidth = 1, color = "hotpink", fill = "lightpink") +
  labs(title = "Histogram of Throws per Period", x = "Period", y = "Count of Throws") +
  scale_y_continuous(labels = comma) +
  stat_bin(binwidth = 1, geom = "text", color = "purple", vjust = -0.5,
           aes(label = scales::comma(after_stat(count))))

# Label every period on the x-axis, from the first to the last one observed
x_axis_labels <- min(df$period):max(df$period)
p1 <- p1 + scale_x_continuous(labels = x_axis_labels, breaks = x_axis_labels)
p1
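As a rough cross-check on the histogram, the per-period counts can also be tabulated directly. The sketch below is only an illustration: `toy` is a small invented data frame standing in for the real `period` column, and the `phase` column is a hypothetical helper that is not part of the dataset.

```r
# Invented stand-in for the `period` column (not real NBA data)
toy <- data.frame(period = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 6))

# Count free throws per period; these counts correspond to the bar heights
period_counts <- table(toy$period)
print(period_counts)

# Hypothetical helper column: periods 1-4 are regulation, 5 and above are overtime
toy$phase <- ifelse(toy$period <= 4, "regulation", "overtime")
phase_counts <- table(toy$phase)
print(phase_counts)
```

Running the same two `table()` calls on `df$period` should reproduce the numbers printed above the bars, and the regulation versus overtime split makes the size of the drop after period 4 explicit.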
This chart examines the most frequent free throw plays across the 10 seasons. The plays belong to LeBron James, Kevin Durant, Dwight Howard, Kobe Bryant, Dwyane Wade, Carmelo Anthony, Dirk Nowitzki, and Russell Westbrook, several of whom also appear in the "Top 5 Players By Season" chart, and each play is either free throw 1 of 2 or free throw 2 of 2. The play count shows how many times each play occurred over the 10 seasons. Each color marks a season, and the total is printed at the end of each bar. Overall, the graph shows which plays were most common and who made them. Across the 10 seasons, plays by LeBron James are the most frequent, with plays by Kevin Durant coming in second. Both are among the top 5 players.
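# Count every distinct play and keep the 15 most frequent
df_reasons <- dplyr::count(df, play)
df_reasons <- df_reasons[order(df_reasons$n, decreasing = TRUE), ]
top_reasons <- df_reasons$play[1:15]

# Counts per season for the top plays (this is the data that gets plotted)
new_df <- df %>%
  filter(play %in% top_reasons) %>%
  select(season, play) %>%
  group_by(play, season) %>%
  dplyr::summarise(n = length(play), .groups = "keep") %>%
  data.frame()

# Counts per season for all remaining plays (not used in the plot)
other_df <- df %>%
  filter(!play %in% top_reasons) %>%
  select(play, season) %>%
  group_by(play, season) %>%
  dplyr::summarise(n = length(play), .groups = "keep") %>%
  data.frame()

# Total count per play across all seasons, used for the bar-end labels
agg_tot <- new_df %>%
  select(play, n) %>%
  group_by(play) %>%
  dplyr::summarise(tot = sum(n), .groups = "keep") %>%
  data.frame()

# Not used in the plot. sum() is called with no argument here, so
# totendresult is always 0; the intended column is not stated.
endresult_df <- df %>%
  filter(play %in% top_reasons) %>%
  select(play, end_result) %>%
  group_by(play) %>%
  dplyr::summarise(totendresult = sum()) %>%
  data.frame()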
# Round the largest play total up to the next multiple of 3,000 for the axis limit
max_y <- plyr::round_any(max(agg_tot$tot), 3000, ceiling)

# Stacked horizontal bars: one bar per play, one colored segment per season
ggplot(new_df, aes(x = reorder(play, n, sum), y = n, fill = season)) +
  geom_bar(stat = "identity", position = position_stack(reverse = TRUE)) +
  coord_flip() +
  labs(title = "Free Throw Plays", x = "", y = "Play Count", fill = "Season") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_fill_brewer(palette = "Paired") +
  # Print each play's total at the end of its bar
  geom_text(data = agg_tot, aes(x = play, y = tot, label = scales::comma(tot), fill = NULL),
            hjust = -0.1, size = 3) +
  scale_y_continuous(labels = comma,
                     breaks = seq(0, max_y, by = 250),
                     limits = c(0, max_y))
This set of pie charts, one per season, gives a better understanding of how the top 5 NBA players performed in each season. The players are color coded in shades of red, as shown in the legend and in the pies themselves. Each label gives the player's share of that season's free throws (shown only if it is over 1.4%). These percentages may look small, but the "Shots by Season" graph below shows that some seasons have about 60,000 free throws.
# Rank players by total free throws; plyr::count() on a vector returns columns x and freq
top_player <- plyr::count(df$player)
top_player <- top_player[order(-top_player$freq), ]
#top_player[top_player$x %in% c("LeBron James", "Dwight Howard", "Kevin Durant", "Dwyane Wade", "Kobe Bryant"), "freq"] / sum(top_player$freq)

# Keep the five top players by name and group everyone else as "other"
top5 <- c("LeBron James", "Dwight Howard", "Kevin Durant", "Dwyane Wade", "Kobe Bryant")
player_df <- df %>%
  select(player, season) %>%
  dplyr::mutate(myplayer = ifelse(player %in% top5, player, "other")) %>%
  group_by(season, myplayer) %>%
  dplyr::summarise(n = length(myplayer), .groups = "keep") %>%
  # Each player's share of all free throws within the season
  group_by(season) %>%
  dplyr::mutate(percent_of_total = round(100 * n / sum(n), 1)) %>%
  ungroup() %>%
  data.frame()
# One pie per season; slices are labeled only when the share is above 1.4%
ggplot(data = player_df, aes(x = "", y = n, fill = myplayer)) +
  geom_bar(stat = "identity", position = "fill") +
  coord_polar(theta = "y", start = 0) +
  labs(fill = "Players", x = NULL, y = NULL,
       title = "Top 5 Players By Season",
       caption = "Slices of 1.4% or less are not labeled") +
  theme(plot.title = element_text(hjust = 0.5),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()) +
  facet_wrap(~season, ncol = 5, nrow = 5) +
  scale_fill_brewer(palette = "Reds") +
  geom_text(aes(x = 1.7, label = ifelse(percent_of_total > 1.4, paste0(percent_of_total, "%"), "")),
            size = 2.2,
            position = position_fill(vjust = .1))
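The per-season share behind each slice label is a row-wise proportion: a player's free throws divided by all free throws in that season. A minimal base R sketch of that calculation follows. The `toy` data frame and the "Player A" and "Player B" names are invented for illustration and are not taken from the dataset.

```r
# Invented sample: two seasons, five free throws each (not real NBA data)
toy <- data.frame(
  season = c(rep("2006 - 2007", 5), rep("2007 - 2008", 5)),
  player = c("LeBron James", "LeBron James", "Player A", "Player B", "Kobe Bryant",
             "LeBron James", "Player A", "Player A", "Player B", "Player B"),
  stringsAsFactors = FALSE
)

# Keep the five named players and fold everyone else into "other"
top5 <- c("LeBron James", "Dwight Howard", "Kevin Durant", "Dwyane Wade", "Kobe Bryant")
toy$myplayer <- ifelse(toy$player %in% top5, toy$player, "other")

# Counts by season and player, then each row converted to percentages
counts <- table(toy$season, toy$myplayer)
share <- round(100 * prop.table(counts, margin = 1), 1)
print(share)
```

Because `margin = 1` normalizes within each season, every row of `share` adds up to 100, which mirrors the `group_by(season)` step used to build `percent_of_total`.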
The graph below displays a pair of donut-style pie charts showing how many free throws were made and how many were missed. The pie on the left shows both the percentages and the total counts of shots made and missed over the 10 seasons. The pie on the right shows only the percentages. NBA players miss about 24.3% of the free throws they attempt and make 75.7%, and every made shot adds to their team's score. In other words, NBA players make roughly three out of every four free throws.
# Totals and share of made free throws across all seasons
shots_made <- sum(df$shot_made)
shots_missed <- nrow(df) - shots_made
pct_made <- shots_made / nrow(df)

# Left donut: counts of made and missed shots; right donut: the same split as proportions
plot_ly(df, labels = c("Shots Made", "Shots Missed"), values = c(shots_made, shots_missed),
        type = "pie", hole = 0.5,
        marker = list(colors = c("pink", "blue")),
        domain = list(x = c(0, 0.5), y = c(0, 1))) %>%
  add_trace(labels = c("Percentage Made", "Percentage Missed"), values = c(pct_made, 1 - pct_made),
            type = "pie", hole = 0.8,
            marker = list(colors = c("purple", "lightblue")),
            domain = list(x = c(0.6, 1), y = c(0.5, 1))) %>%
  layout(title = "Nested Pie Chart of Shots Made and Missed",
         # Totals printed in the center of the left donut
         annotations = list(text = paste0("Shots Made: ", shots_made, "<br>Shots Missed: ", shots_missed),
                            font = list(size = 12),
                            x = 0.25, y = 0.5,
                            showarrow = FALSE))
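The made and missed percentages follow from a 0/1 indicator column: the sum gives the number of made shots, and dividing by the number of rows gives the proportion. A small sketch, assuming `shot_made` is coded 1 for a make and 0 for a miss (the vector below is invented for illustration):

```r
# Invented 0/1 outcomes: 1 = made, 0 = missed (not real NBA data)
shot_made <- c(1, 1, 1, 0, 1, 1, 0, 1)

made <- sum(shot_made)               # number of made free throws
missed <- length(shot_made) - made   # everything else is a miss
pct_made <- made / length(shot_made) # proportion made, between 0 and 1

# Format both shares as percentages with one decimal place
made_label <- sprintf("%.1f%%", 100 * pct_made)
missed_label <- sprintf("%.1f%%", 100 * (1 - pct_made))
print(c(made = made_label, missed = missed_label))
```

With a 0/1 column, `mean(shot_made)` gives the same proportion as `made / length(shot_made)`.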
The following chart displays the number of free throws made in each season (the bars sum shot_made, so missed attempts are not counted). The y-axis shows the number of shots, giving viewers an idea of how many there were in each season, and the legend on the right shows which season each bar belongs to. The chart also puts the "Top 5 Players By Season" percentages into context by showing how large each season's total is. Most seasons come in at around 50,000 shots, and the numbers are fairly consistent across all 10 seasons. There has been a slight decline since 2006, which could have various causes, such as new coaching techniques, new NBA rules, or players simply being more careful about fouling so that the opposing team does not get a chance at a free throw.
# Made free throws per season: sum(shot_made) adds up the made shots only
shots_by_season <- df %>%
  group_by(season) %>%
  dplyr::summarise(total_shots = sum(shot_made))

# One bar per season, with the season total printed above each bar
ggplot(shots_by_season, aes(x = season, y = total_shots, fill = season)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = total_shots), vjust = -0.5, position = position_dodge(0.9)) +
  scale_fill_brewer(palette = "Paired") +
  theme_light() +
  labs(title = "Shots by Season",
       x = "Season",
       y = "Total Shots",
       fill = "Season")
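Summing `shot_made` counts made free throws, while counting rows gives all attempts, so the two per-season totals can differ. The base R sketch below computes both side by side. The `toy` data frame is invented for illustration, and it assumes `shot_made` is coded 1 for a make and 0 for a miss.

```r
# Invented sample: three attempts in one season, two in the next (not real NBA data)
toy <- data.frame(
  season = c("2006 - 2007", "2006 - 2007", "2006 - 2007", "2007 - 2008", "2007 - 2008"),
  shot_made = c(1, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Made shots per season (sum of the 0/1 column) and attempts per season (row count)
made <- tapply(toy$shot_made, toy$season, sum)
attempts <- tapply(toy$shot_made, toy$season, length)

by_season <- data.frame(
  season = names(made),
  made = as.vector(made),
  attempts = as.vector(attempts)
)
by_season$pct_made <- round(100 * by_season$made / by_season$attempts, 1)
print(by_season)
```

Plotting `attempts` instead of `made` would show every free throw taken in a season, misses included.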
From these visualizations, we can conclude that a good deal changes over 10 years in a sport. New rules and regulations are constantly being introduced or dropped in the NBA, and many of the changes could also stem from changes within the teams themselves, such as new coaches and players. The throws-per-period chart may surprise many viewers because it includes periods 5-8, which are uncommon in basketball, and many people do not know that a game can go into an eighth period. We can also see that the top players performed consistently through the years, including in the number of free throws they made, which helped lead their teams toward a (hopeful) victory. NBA players miss only about 25% of their free throws, which likely reflects the experience and practice they gain through the years of training leading up to the NBA. Overall, these visualizations offer more insight into the statistics and history behind the games. They invite deeper questions about why some numbers look the way they do, and they point toward further research into the sport of basketball and the teams that carry it.