Importance of Context in Data Analysis

When analysing play data, an analyst is often handed only the raw data. While this may be useful and of interest to an expert in the field, it offers limited value to coaches and performance staff with minimal experience in interpreting data. A fundamental part of a sports data analyst's role is to build context around the data and to present it in a way that is both meaningful and comprehensible to coaching and performance staff, as well as key stakeholders.

Using plots to provide context

Visualising data through plots provides context and gives a visual understanding of the situation the data describes. This is particularly useful for coaches, as it lets an analyst convey a lot of information in a form that is easily digestible and aligns with how coaching staff view the game. Drawing the playing field in the plot is one way to provide a contextual backdrop for the data.

Creating fields using R

With ggplot2 we can replicate a playing surface to scale. The first step is to source accurate surface and boundary dimensions. The following example uses a basketball court and aims to replicate a half court, with all plot coordinates given in feet. The dimensions can be found here: https://www.msfsports.com.au/basketball-court-dimensions/

First we load ggplot2 and create the outline of the court:

library(ggplot2)

# Court boundaries in feet: 50 ft wide x 47 ft long (15.24 m x 14.325 m)
court <- ggplot(data = data.frame(0, 0)) +
  geom_segment(aes(x = 0, y = 0, xend = 50, yend = 0)) +
  geom_segment(aes(x = 50, y = 0, xend = 50, yend = 47)) +
  geom_segment(aes(x = 50, y = 47, xend = 0, yend = 47)) +
  geom_segment(aes(x = 0, y = 47, xend = 0, yend = 0))

court
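The court is quoted as 15.24 m x 14.325 m, while the boundary segments run from 0 to 50 and from 0 to 47, so the plot coordinates are in feet. A minimal check of that conversion, assuming the standard factor of 0.3048 m per foot (`m_to_ft` is a hypothetical helper written for this example, not part of ggplot2):

```r
# Hypothetical helper: convert metres to feet (1 ft = 0.3048 m)
m_to_ft <- function(m) m / 0.3048

court_width_ft  <- m_to_ft(15.24)   # half-court width
court_length_ft <- m_to_ft(14.325)  # half-court length

# Rounded to one decimal place for comparison with the plot coordinates
round(c(width = court_width_ft, length = court_length_ft), 1)
```

Keeping every dimension in one unit matters later on, because the shot coordinates have to share the court's scale.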

Next we will add the three-point line:

court <- court +
  # Three-point line: straight sections 3 ft in from each sideline, 14 ft long
  geom_segment(aes(x = 47, y = 0, xend = 47, yend = 14)) +
  geom_segment(aes(x = 3, y = 0, xend = 3, yend = 14)) +
  # Three-point arc joining the two straight sections
  geom_curve(aes(x = 47, y = 14, xend = 3, yend = 14), curvature = 0.7)

court
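geom_curve() shapes its line through the curvature argument rather than a radius, so the arc above is an approximation. If an exact arc is wanted, one option is to compute points on a circle and draw them with geom_path(). The sketch below assumes NBA-style geometry that is not stated in the source: a hoop centre at (25, 5.25) and an arc radius of 23.75 ft. The names `hoop_x`, `hoop_y`, `arc_radius` and `three_pt_arc` are introduced only for this example.

```r
library(ggplot2)

# Assumed geometry (not taken from the source): hoop centre 5.25 ft from the
# baseline on the court's centre line, three-point arc radius 23.75 ft.
hoop_x <- 25
hoop_y <- 5.25
arc_radius <- 23.75

# The arc meets the straight corner lines at x = 3 and x = 47, which is
# 22 ft either side of the hoop; find the matching angles on the circle.
theta_start <- acos(22 / arc_radius)
theta <- seq(theta_start, pi - theta_start, length.out = 100)

three_pt_arc <- data.frame(
  x = hoop_x + arc_radius * cos(theta),
  y = hoop_y + arc_radius * sin(theta)
)

# Draw the arc on its own; the same geom_path() layer could be added to a court plot
ggplot(three_pt_arc, aes(x = x, y = y)) +
  geom_path() +
  coord_equal()
```

Under these assumptions the arc ends at roughly y = 14.2, close to the 14 ft used for the straight sections.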

Now the key and the free throw circle:

court <- court +
  # Key: 16 ft wide (x = 17 to 33) and 19 ft long
  geom_segment(aes(x = 33, y = 0, xend = 33, yend = 19)) +
  geom_segment(aes(x = 17, y = 0, xend = 17, yend = 19)) +
  geom_segment(aes(x = 17, y = 19, xend = 33, yend = 19)) +
  # Free throw circle: two half-circles between x = 19 and x = 31
  geom_curve(aes(x = 19, y = 19, xend = 31, yend = 19), curvature = -1) +
  geom_curve(aes(x = 19, y = 19, xend = 31, yend = 19), curvature = 1)

court

Finally, we set the theme to black and white and use coord_equal() so that the x and y scales are equal.

court <- court +
  theme_bw()

# coord_equal() draws one unit on the x axis the same length as one unit on the y axis
court +
  coord_equal()

Adding data to created fields using R

Now that we have created our field, we can add data to the plot.

First of all, let's create a data frame with some data points to trial.

lakers <- data.frame(
  player = c(
    "Vladimir Radmanovic", "Lamar Odom", "Trevor Ariza", "Sasha Vujacic", "Trevor Ariza", "Vladimir Radmanovic", "Kobe Bryant", "Vladimir Radmanovic", "Jordan Farmar", "Vladimir Radmanovic", "Vladimir Radmanovic", "Lamar Odom", "Derek Fisher", "Kobe Bryant", "Kobe Bryant", "Pau Gasol", "Andrew Bynum", "Kobe Bryant", "Kobe Bryant", "Kobe Bryant"),
  type = c("3pt", "3pt", "3pt", "3pt", "3pt", "3pt", "3pt", "3pt", "3pt", "3pt", "jump", "jump", "jump", "jump", "jump", "jump", "jump", "jump", "jump", "jump"),
  x = c(1, 41, 46, 37, 43, 8, 16, 1, 41, 9, 4, 6, 17, 20, 44, 26, 27, 6, 14, 23),
  y = c(6, 26, 17, 30,  24, 28, 29, 5, 29, 28, 9, 5, 15, 27, 9, 12, 15, 7, 21, 13),
  shotoutcome = c("missed", "missed", "made", "missed", "made", "made", "made", "made", "made", "missed", "missed", "missed", "missed", "made", "made", "made", "missed", "made", "missed", "missed"))
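Before plotting, it can help to confirm that the shot coordinates sit on the same scale as the court. The sketch below copies the x and y values from the lakers data frame into a hypothetical `shots` data frame and checks them against the 50 x 47 court outline. It is a simple sanity check, not a full validation.

```r
# Shot coordinates copied from the lakers data frame
shots <- data.frame(
  x = c(1, 41, 46, 37, 43, 8, 16, 1, 41, 9, 4, 6, 17, 20, 44, 26, 27, 6, 14, 23),
  y = c(6, 26, 17, 30, 24, 28, 29, 5, 29, 28, 9, 5, 15, 27, 9, 12, 15, 7, 21, 13)
)

# Court extent in plot units (feet): 50 wide, 47 long
inside_court <- shots$x >= 0 & shots$x <= 50 &
  shots$y >= 0 & shots$y <= 47

all(inside_court)       # TRUE only when every shot falls inside the outline
shots[!inside_court, ]  # rows that would be drawn outside the court, if any
```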

Now let's plot 3 point shots and jump shots on two separate courts, coloured by shot outcome, with a legend for the outcome.

court +
  # alpha is a fixed value, so it is set outside aes() and does not get a legend
  geom_point(data = lakers, aes(x = x, y = y, colour = shotoutcome), alpha = 0.8) +
  facet_wrap(~ type) +
  ggtitle("LAKERS SHOT LOCATION - BY TYPE") +
  coord_equal()
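A plot by shot type reads more easily next to a simple number. As a rough sketch, the proportion of made shots per type can be computed with base R. The `shots` data frame below is a hypothetical trimmed copy of the type and shotoutcome columns of the lakers data.

```r
# Type and outcome columns copied from the lakers data frame
shots <- data.frame(
  type = rep(c("3pt", "jump"), each = 10),
  shotoutcome = c("missed", "missed", "made", "missed", "made",
                  "made", "made", "made", "made", "missed",
                  "missed", "missed", "missed", "made", "made",
                  "made", "missed", "made", "missed", "missed")
)

# Proportion of made shots within each shot type
made_rate <- tapply(shots$shotoutcome == "made", shots$type, mean)
made_rate
```

With only ten attempts per type, these proportions are illustrative rather than meaningful.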

Let's now plot all shot types for each player.

court +
  # alpha is a fixed value, so it is set outside aes() and does not get a legend
  geom_point(data = lakers, aes(x = x, y = y, colour = shotoutcome), alpha = 0.8) +
  facet_wrap(~ player) +
  ggtitle("LAKERS SHOT LOCATION - BY PLAYER") +
  coord_equal()
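When faceting by player, some panels hold very few points, so it is worth checking how many attempts sit behind each panel. The sketch below cross-tabulates player against outcome with base R. The `shots` data frame is a hypothetical copy of the player and shotoutcome columns of the lakers data.

```r
# Player and outcome columns copied from the lakers data frame
shots <- data.frame(
  player = c("Vladimir Radmanovic", "Lamar Odom", "Trevor Ariza",
             "Sasha Vujacic", "Trevor Ariza", "Vladimir Radmanovic",
             "Kobe Bryant", "Vladimir Radmanovic", "Jordan Farmar",
             "Vladimir Radmanovic", "Vladimir Radmanovic", "Lamar Odom",
             "Derek Fisher", "Kobe Bryant", "Kobe Bryant", "Pau Gasol",
             "Andrew Bynum", "Kobe Bryant", "Kobe Bryant", "Kobe Bryant"),
  shotoutcome = c("missed", "missed", "made", "missed", "made",
                  "made", "made", "made", "made", "missed",
                  "missed", "missed", "missed", "made", "made",
                  "made", "missed", "made", "missed", "missed")
)

# Made and missed counts per player; row totals give attempts per panel
outcome_by_player <- table(shots$player, shots$shotoutcome)
addmargins(outcome_by_player, margin = 2)
```

Several players have a single attempt in this trial data, so their panels show one point each.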