Alfredo Martinez Jr
9/1/2020
In the past few years, interest has grown in advanced stats for soccer that describe and summarize player actions far more comprehensively than traditional box scores do. Advanced soccer stats are still at a relatively early stage compared to those of other sports such as baseball or American football. The gap is even wider for women’s soccer, which until a few years ago had virtually no advanced stats available at almost any level.
The WoSo Stats project filled in some of these gaps by logging comprehensive advanced stats for each match of the National Women’s Soccer League (NWSL) 2016 season. This presentation proposes a Shiny app that visualizes some of the data logged by this project.
The WoSo Stats dataset includes data for each player in each logged match, covering every match of the 2016 NWSL season as well as other matches. For this presentation, I’ll filter down to only matches from the NWSL 2016 season, and then sum all columns that represent match metrics, such as passes and shots, for each player on each team. This reduces the dataset to a table with 226 rows and 93 columns.
suppressMessages(library(tidyverse))
m_stats <- read_csv("https://raw.githubusercontent.com/alfredomartinezjr/wosostats/master/data/summary/tbl_all.csv", col_types = cols())
m_stats <- filter(m_stats, competition_slug == "nwsl-2016") %>%
  select(-match_id, -competition_slug, -date,
         -matchup, -position, -lineup_player_id) %>%
  group_by(team, player) %>%
  summarise_all(sum)
dim(m_stats)
## [1] 226 93
An example of one of these stats is op_pass_att, which represents the number of times a player attempted a pass in open play (excluding free kicks, throw-ins, and goal kicks). To adjust for differences in playing time, I can calculate a “per 90” stat by dividing by total minutes played, MP, and then multiplying by 90. The resulting histogram shows a largely Gaussian distribution.
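Before plotting, the per-90 arithmetic can be checked on a toy table. This is only a sketch: the two players and their numbers below are made up for illustration and are not taken from the WoSo Stats data, and the column name op_pass_att_p90 is a hypothetical one.

```r
library(dplyr)

# Made-up values for illustration only (not from the WoSo Stats dataset)
toy <- tibble(
  player      = c("Player A", "Player B"),
  op_pass_att = c(120, 300),   # open play pass attempts over the season
  MP          = c(450, 1800)   # total minutes played
)

# Per-90 rate: attempts per minute, scaled to a 90-minute match
toy <- mutate(toy, op_pass_att_p90 = 90 * op_pass_att / MP)
toy
```

Player B attempts more passes overall, but Player A has the higher rate per 90 minutes (24 versus 15), which is the kind of difference in playing time the adjustment is meant to account for.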
qplot(x = 90*op_pass_att/MP, data = m_stats, binwidth = 2.5,
      xlab = "Open play pass attempts per 90 minutes")
Plotting the previous stat against another stat, such as the pass completion percentage, would add even more clarity to how certain players pass the ball. A Shiny app incorporating these visualizations would illuminate these player attributes even further. The code below represents the server.R file of such a Shiny app.
suppressMessages(library(shiny))
suppressMessages(library(tidyverse))
suppressMessages(library(plotly))
m_stats <- read_csv("https://raw.githubusercontent.com/alfredomartinezjr/wosostats/master/data/summary/tbl_all.csv", col_types = cols())
m_stats <- filter(m_stats, competition_slug == "nwsl-2016") %>%
  group_by(team, player) %>%
  summarise(MP = sum(MP),
            pass_att = sum(pass_att),
            op_pass_att = sum(op_pass_att),
            op_pass_comp = sum(op_pass_comp),
            interceptions = sum(interceptions),
            blocks = sum(blocks),
            clearances = sum(clearances),
            key_passes = sum(key_passes))
shinyServer(function(input, output) {
  # Formula for the x-axis stat, chosen by input$selection
  x <- reactive({
    if (input$selection == 1) {
      ~op_pass_att/MP*90
    }
    else if (input$selection == 2) {
      ~pass_att/MP*90
    }
    else if (input$selection == 3) {
      ~(interceptions+blocks+clearances)/MP*90
    }
  })
  # Formula for the y-axis stat paired with each x-axis stat
  y <- reactive({
    if (input$selection == 1) {
      ~op_pass_comp/op_pass_att
    }
    else if (input$selection == 2) {
      ~key_passes/MP*90
    }
    else if (input$selection == 3) {
      ~interceptions/(interceptions+blocks+clearances)
    }
  })
  # Axis titles matching the formulas above
  x_label <- reactive({
    if (input$selection == 1) {
      "Open Play Pass Attempts per 90 minutes"
    }
    else if (input$selection == 2) {
      "Pass Attempts per 90 minutes"
    }
    else if (input$selection == 3) {
      "Ball disruptions (interceptions, blocks, or clearances) per 90 minutes"
    }
  })
  y_label <- reactive({
    if (input$selection == 1) {
      "Open Play Pass Completion Pct"
    }
    else if (input$selection == 2) {
      "Key Passes per 90 minutes"
    }
    else if (input$selection == 3) {
      "Interceptions as a Pct of all ball disruptions"
    }
  })
  # Scatter plot of the selected stats for players with more than 360 minutes played
  output$selectedPlot <- renderPlotly({
    plot_ly(filter(m_stats, MP > 360),
            x = x(),
            y = y(),
            color = ~team,
            text = ~paste("Player: ", player, '<br>Team:', team),
            type = "scatter",
            mode = "markers") %>%
      layout(yaxis = list(title = y_label()),
             xaxis = list(title = x_label()))
  })
})
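The server code above reads input$selection and writes output$selectedPlot, so it needs a matching ui.R. The original does not show one; the following is a minimal sketch of what it might look like, assuming a drop-down whose values are 1, 2, and 3. The page title, layout, and menu labels are illustrative choices, not taken from the project.

```r
# ui.R: hypothetical companion to the server code (not from the original project)
library(shiny)
library(plotly)

ui <- fluidPage(
  titlePanel("NWSL 2016 player stats"),
  sidebarLayout(
    sidebarPanel(
      # The values 1, 2, and 3 match the input$selection comparisons in server.R;
      # the display labels are illustrative.
      selectInput(
        inputId = "selection",
        label = "Choose a comparison",
        choices = c(
          "Open play passing: volume vs. completion" = 1,
          "Passing volume vs. key passes" = 2,
          "Ball disruptions vs. interception share" = 3
        )
      )
    ),
    mainPanel(
      # The output ID matches output$selectedPlot in server.R
      plotlyOutput("selectedPlot")
    )
  )
)

shinyUI(ui)
```

selectInput passes the chosen value to the server as a character string, and R coerces the number to a string when comparing, so checks such as input$selection == 1 in the server code still evaluate as intended.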