This document walks through a post-season analysis of the 2008-09 LA Lakers. The season was a significant one for the team: the Lakers won 65 of their 82 regular-season games and went on to win the NBA Championship. The question behind this analysis is whether shooting ability contributed to the team’s success. To answer it, we look at the team’s shooting statistics and patterns, in particular shot locations, shot types, and shooting outcomes.
library(ggplot2)
library(dplyr)
library(ggthemes)
library(tidyverse)
library(hexbin)
library(ggforce)
library(rmarkdown)
library(kableExtra)
library(gridExtra)
library(ggrepel)
library(stringr)
Dimensions here are taken in feet.
# Single-row placeholder data so that each court line is drawn only once
court <- ggplot(data = data.frame(dummy = 0)) +
#NBA Court Boundaries https://www.msfsports.com.au/basketball-court-dimensions/
geom_segment(aes(x = 0, y = 0, xend = 0, yend = 47)) + #Left
geom_segment(aes(x = 50, y =47, xend = 50, yend = 0)) + #Right
geom_segment(aes(x = 0, y = 47, xend = 50, yend = 47)) + #Top/Half-court line
geom_segment(aes(x = 0, y = 0, xend = 50, yend = 0), color = 'black') + #Bottom
#Free-throw
#Line
geom_segment(aes(x = 19, y = 19, xend = 31, yend = 19)) +
#Top circle
geom_curve(aes(x = 19, y = 19, xend = 31, yend = 19), curvature = -1,
lineend = 'round', color = 'black') +
#Bottom circle
geom_curve(aes(x = 19, y = 19, xend = 31, yend = 19), curvature = 1,
linetype = 2, lineend = 'round') +
#Key
geom_segment(aes(x = 17, y = 19, xend = 33, yend = 19)) + #Top
geom_segment(aes(x = 17, y = 0, xend = 17, yend = 19)) + #Left
geom_segment(aes(x = 33, y = 0, xend = 33, yend = 19)) + #Right
#3-point Line
geom_segment(aes(x = 3, y = 0, xend = 3, yend = 14)) + #Left
geom_segment(aes(x = 47, y = 0, xend = 47, yend = 14)) + #Right
geom_curve(aes(x = 3, y = 14, xend = 47, yend = 14), curvature = -.75,
angle = 90) + #Curve
#Backboard
geom_segment(aes(x = 22, y = 4, xend = 28, yend = 4)) +
#hoop
geom_circle(aes(x0 = 25, y0 = 5.25, r = 0.75), inherit.aes = FALSE) +
#restricted area
geom_curve(aes(x =21, y = 5.25, xend = 29, yend = 5.25), curvature = -1,
angle = 90) +
#Themes
coord_fixed() +
theme_bw()
plot(court)
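The three-point line above is approximated with geom_curve(). As a rough alternative, the arc could be traced from computed points. The sketch below assumes an arc radius of 23.75 ft centred on the hoop at (25, 5.25) and corner lines 22 ft either side of the hoop, which matches the x = 3 and x = 47 segments drawn above. The helper name three_point_arc is hypothetical and not part of the original analysis.

```r
# Hypothetical helper: points along a circular three-point arc.
# Assumes a 23.75 ft radius centred on the hoop at (25, 5.25) and
# corner lines 22 ft either side of the hoop (x = 3 and x = 47).
three_point_arc <- function(n = 100, hoop_x = 25, hoop_y = 5.25,
                            radius = 23.75, corner_offset = 22) {
  # Angle (measured from the x-axis) at which the arc meets the right corner line
  theta_start <- acos(corner_offset / radius)
  theta <- seq(theta_start, pi - theta_start, length.out = n)
  data.frame(x = hoop_x + radius * cos(theta),
             y = hoop_y + radius * sin(theta))
}

arc <- three_point_arc()
head(arc, 3)
```

Under these assumptions the arc meets the corner lines at roughly y = 14.2, close to the y = 14 used above. The points could be drawn with geom_path(data = arc, aes(x = x, y = y)) in place of the geom_curve() layer.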
Data for the LA Lakers team was extracted from the lakers.csv dataset. Outliers that exceeded the limits of the half court were removed, and events were narrowed down to field goal attempts and free throws. Additionally, shot distance was calculated as the Euclidean distance between each shot’s (x, y) coordinates and the centre point of the hoop (25, 5.25), and further categorised into shot range as follows:
Less than 8 ft
8-16 ft
16-24 ft
More than 24 ft
Shots were further categorised into eight categories as follows:
Jump Shot
Three Pointers
Hook Shot
Dunks
Bank Shot
Layups
Tips
Free Throws
Streamlining the data in this way simplifies the analysis and lets us focus on the relevant information. Filtering out irrelevant outliers allows patterns and trends in the data to be identified more accurately.
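The distance and range calculation described above can be sketched on a handful of made-up coordinates. This is a minimal illustration, not the pipeline itself: the toy data frame shots is invented, and cut() with right = FALSE is an assumed way of binning, so a distance of exactly 16 ft lands in the 16-24 ft bin here.

```r
# Toy coordinates in feet (made up for illustration, not from lakers.csv)
shots <- data.frame(x = c(25, 25, 10, 3, 25),
                    y = c(5.25, 19, 5.25, 5.25, 30))

hoop_x <- 25    # hoop centre, matching the court drawing
hoop_y <- 5.25

# Euclidean distance from each shot location to the hoop centre
shots$shot_distance <- sqrt((shots$x - hoop_x)^2 + (shots$y - hoop_y)^2)

# Bin distances into four ranges; right = FALSE gives intervals [a, b),
# so every boundary value belongs to exactly one range
shots$shot_range <- cut(shots$shot_distance,
                        breaks = c(0, 8, 16, 24, Inf),
                        labels = c("Less than 8 ft", "8-16 ft",
                                   "16-24 ft", "More than 24 ft"),
                        right = FALSE)
shots
```

The second row sits on the free-throw line (y = 19), which works out to 13.75 ft from the hoop centre and so falls in the 8-16 ft range.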
#import data
nbadata <- read.csv('data/lakers.csv')
#data manipulation
LAL <- nbadata %>%
filter(team == 'LAL') %>%
filter(etype == 'shot' | etype == 'free throw' )%>%
mutate(player = replace(player, player == "Yue Sun", "Sun Yue")) %>% # clean up names
mutate(y = case_when(is.na(y) & etype == 'free throw' ~ 19,
TRUE ~ y)) %>%
mutate(x = case_when(is.na(x) & etype == 'free throw' ~ 25,
TRUE ~ x)) %>%
filter(y < 45) %>% # filter out outliers exceeding court boundaries
filter(x > 0 & x <50) %>%
mutate(shot_distance = sqrt((x-50/2)^2 + (y-5.25)^2)) %>% # Euclidean Distance
mutate(shot_type = case_when(
str_detect(type, "jump") ~ "jump",
str_detect(type, "layup") ~ "layup",
str_detect(type, "dunk") ~ "dunk",
str_detect(type, "bank") ~ "bank",
str_detect(type, "hook") ~ "hook",
str_detect(type, "fade away") ~ "jump",
str_detect(etype, "free throw") ~ "free throw",
TRUE ~ type)) %>% # categorise shots into broader categories
mutate(shot_made_numeric = as.integer(case_when(
str_detect(result, "missed") ~ "1",
str_detect(result, "made") ~ "2"))) %>% # convert chr categories to numerical
mutate(shot_range = case_when(
shot_distance < 8 ~ "Less than 8 ft",
shot_distance >= 8 & shot_distance <= 16 ~ "8-16 ft",
shot_distance > 16 & shot_distance <= 24 ~ "16-24 ft",
shot_distance > 24 ~ "More than 24 ft")) %>% # get shot range
mutate(shot_side = case_when(
x < 17 ~ "right",
x >= 17 & x <= 33 ~ "center",
x > 33 ~ "left")) %>% # get shot side
mutate(shot_zone = paste(shot_range, shot_side, sep = ", ")) # get shot zone
Here, we filter the field goal attempts and calculate the field goal percentage. To visualise the location of each shot, we overlay the shots on the court we saved previously.
# filter field shots
LAL_fieldshots <- LAL %>%
filter(etype == 'shot')
# FG Percent
percent <- round(sum(LAL_fieldshots$result == "made") / length(LAL_fieldshots$result) * 100, 1)
court +
geom_jitter(data =subset(LAL_fieldshots, result == 'missed'),
aes(x = x, y = y, color = 'tomato3'),
alpha = 0.55, size = 0.8) + # plot missed shots in red, slightly transparent
geom_jitter(data=subset(LAL_fieldshots, result == 'made'),
aes(x = x, y = y, color = 'springgreen4'),
size = 0.8) + # plot made shot in green
coord_equal() +
scale_colour_manual(name = '',
values =c('springgreen4'='springgreen4','tomato3'='tomato3'),
labels = c('made','missed')) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
legend.text=element_text(size=10)) + # clean up graph
annotate("label", x = 38, y = 44, size = 4,
label = paste("field goal perc:", percent, "%")) # add a single FG% label
The shot chart visualises the team’s made and missed shots across the entire season. Attempts and makes appear to be spread fairly evenly across the court, with some clustering, particularly within the paint and along the three-point line. In terms of shooting percentage, the team ranked third highest in the league, with a field goal percentage of 47.6%.
Having gained a rough overview of shot locations, we now investigate the density of shots taken to see whether there are any hot spots.
#density overview (made and missed)
Density_total <- court+
stat_density2d(data = LAL_fieldshots, aes(x = x, y = y, fill = after_stat(level)),
contour_var = "ndensity",
bins = 30, geom = 'polygon') +
scale_fill_viridis_c() +
guides(fill = "none") + # hide the fill legend
labs(title = 'Total Attempted') +
coord_equal() +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
plot.title = element_text(face='bold', size=13, vjust =-3, hjust = 0.05))
#density (made only)
Density_total_made <- court +
stat_density2d(data = subset(LAL_fieldshots, result == 'made'),
aes(x = x, y = y, fill = after_stat(level)),
contour_var = "ndensity",
bins = 60, geom = 'polygon') +
scale_fill_viridis_c() +
guides(fill = "none") + # hide the fill legend
labs(title = 'Made shots') +
coord_equal() +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
plot.title = element_text(face='bold', size=13, vjust =-3, hjust = 0.065))
grid.arrange(Density_total, Density_total_made, nrow = 1)
The high concentration of shots attempted and made in the paint, closest to the hoop, indicates that the team was successful in driving to the basket and making the most of high-percentage shots near the rim. Shots made less than 8 feet from the hoop accounted for 40.9% of all shots, suggesting a strong offensive presence in the post. The clusters at both corners and wings of the three-point line, and at the right elbow, suggest that the team had a good distribution of mid- and long-range shots as well. Comparing the two wings, the team was more accurate on the left side of the court.
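To make the idea of a hot spot concrete, the sketch below estimates a two-dimensional kernel density on simulated shot locations and reports the grid cell with the highest density. It uses MASS::kde2d(), which the ggplot2 documentation names as the estimator behind stat_density2d(). The simulated coordinates and the 50 x 50 grid are assumptions chosen for illustration only.

```r
library(MASS)  # ships with standard R installations; provides kde2d()

set.seed(1)
# Simulated shot locations (illustrative only): a tight cluster near the hoop
# plus scattered shots across the half court
x <- c(rnorm(400, mean = 25, sd = 2), runif(100, min = 0, max = 50))
y <- c(rnorm(400, mean = 6, sd = 2), runif(100, min = 0, max = 47))

# Kernel density estimate on a 50 x 50 grid covering the half court
dens <- kde2d(x, y, n = 50, lims = c(0, 50, 0, 47))

# Grid cell with the highest estimated density, i.e. the main hot spot
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)[1, ]
hot_spot <- c(x = dens$x[peak[["row"]]], y = dens$y[peak[["col"]]])
hot_spot
```

With the simulated cluster centred near (25, 6), the reported hot spot should land close to the hoop, which is the same pattern the density plots show for the real data.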
We can also analyse shooting patterns together with FG% of each shot type.
# field goal shots by type and calculate FG%
table1 <- LAL_fieldshots %>%
group_by(shot_type) %>%
summarise(FGA = sum(etype == "shot"),
FGM = sum(result == "made"),
FGP = round(FGM / FGA * 100, 1)) %>%
mutate(freq = round(FGA / sum(FGA)*100,1)) %>%
arrange(desc(freq))
court +
geom_jitter(data =subset(LAL_fieldshots, result == 'missed'), aes(x = x, y = y, color = 'tomato3'), size = 0.5, alpha = 0.6)+
geom_jitter(data=subset(LAL_fieldshots, result == 'made'), aes(x = x, y = y, color = 'forestgreen'), size = 0.5,alpha = 0.6) +
coord_equal() +
scale_colour_manual(name = '',
values =c('forestgreen'='forestgreen','tomato3'='tomato3'), labels = c('made','missed')) +
facet_wrap(~factor(shot_type, levels = c('jump', '3pt', 'hook', 'bank', 'layup', 'dunk', 'tip')), nrow = 2) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
legend.text=element_text(size=11),
strip.text.x = element_text(size = 11)) +
geom_label(data = table1,aes(label = paste(freq, "%")), x = 40, y = 43, size = 2, fill = "steelblue1") +
geom_label(data = table1,aes(label = paste(FGP, "%")), x = 40, y = 37, size = 2, fill = "tan1")
The chart above splits attempted shots by shot type, with the share of attempts and the accuracy highlighted in blue and orange respectively. Notably, jump shots were the most common type of shot attempted by the team, accounting for 41.9% of all attempted shots. The team converted those attempts at a rate of 41.4%, an impressive figure considering that jump shots are lower-percentage shots. Other popular shots include layups and three-pointers, accounting for 23.3% and 21.3% of attempts respectively.
The next portion of the analysis focuses on jump shots and three-pointers. We are interested in identifying where the team took these two shot types most frequently.
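The two percentages shown on each panel can be reproduced on a small scale. The sketch below uses an invented shot log (toy_shots is hypothetical, not taken from lakers.csv) to show how the share of attempts and the field goal percentage are derived for each shot type.

```r
library(dplyr)

# Made-up shot log for illustration only (not taken from lakers.csv)
toy_shots <- data.frame(
  shot_type = c("jump", "jump", "jump", "jump", "layup", "layup", "3pt", "3pt"),
  result    = c("made", "missed", "missed", "made", "made", "made", "missed", "made")
)

toy_summary <- toy_shots %>%
  group_by(shot_type) %>%
  summarise(FGA = n(),                               # attempts per shot type
            FGM = sum(result == "made"),             # makes per shot type
            FGP = round(FGM / FGA * 100, 1)) %>%     # accuracy within the type
  mutate(freq = round(FGA / sum(FGA) * 100, 1)) %>%  # share of all attempts
  arrange(desc(freq))

toy_summary
```

Here freq is a share of all attempts, while FGP is accuracy within a single shot type, so a type can be attempted often and still convert at a modest rate.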
#jumps and three-points
court+
stat_density2d(data = subset(LAL_fieldshots, shot_type == 'jump' | shot_type == "3pt"),
aes(x = x, y = y, fill = after_stat(level), alpha = after_stat(level)),
contour_var = "ndensity",
bins = 25, geom = 'polygon') +
scale_fill_viridis_c() +
guides(fill = "none", alpha = "none") + # hide both legends
coord_equal() +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
plot.title = element_text(face='bold', size=13, vjust =-3, hjust = 0.05))
We can see a high concentration of shots attempted from the centre at close range and on the right of the court at mid-range. We can also see triangles forming within the clusters. This is likely due to the team’s use of the “triangle offense” as one of their game strategies. This play emphasises constant movement of players and the ball to create open pockets of space for jump shots, particularly from mid-range and the elbow. The higher number of attempts in these areas suggests that the team was successful in executing their strategy, which might have given them an advantage over other teams. Being able to convert shot attempts into points is an important success factor in any sport.
density_jump_made <- court+
stat_density2d(data = subset(LAL_fieldshots, shot_type == 'jump' & result == "made"),
aes(x = x, y = y, fill = after_stat(level), alpha = after_stat(level)),
contour_var = "ndensity",
bins = 10, geom = 'polygon') +
scale_fill_viridis_c() +
guides(fill = "none", alpha = "none") + # hide both legends
labs(title = 'Jump shots (Made)') +
coord_equal() +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
plot.title = element_text(face='bold', size=13, vjust =-3, hjust = 0.05))
density_3pt_made <- court+
stat_density2d(data = subset(LAL_fieldshots, shot_type == '3pt' & result == "made"),
aes(x = x, y = y, fill = after_stat(level), alpha = after_stat(level)),
contour_var = "ndensity",
bins = 3, geom = 'polygon') +
scale_fill_viridis_c() +
guides(fill = "none", alpha = "none") + # hide both legends
labs(title = '3 pt shots (Made)') +
coord_equal() +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
plot.title = element_text(face='bold', size=13, vjust =-3, hjust = 0.05))
density_type_made <- grid.arrange(density_jump_made, density_3pt_made, ncol = 2)
The team’s ability to get the ball into these zones and convert attempts into points might have added to its success. For jump shots, the team scored more at the right elbow and just above the restricted area. For three-pointers, the highest concentration of shots was at the wings, followed by the corners. Made shots mostly occurred at the wings, with a bigger spread on the right side of the court.