Network science has been extending its reach into many fields, including sports. Football (soccer in American English) is still considered difficult to analyze, because an individual player's ability does not translate directly into team performance. Advances in data-collection technology now make it possible to record every event in a match, which has allowed network science to enter the field and apply passing networks to more detailed analyses of football matches (Gudmundsson and Horton, 2017). To examine whether the passing network is a reliable basis for interpreting a team's strategy, I relate two variables: closeness centrality and player ratings. Using one specific match between FC Barcelona and AC Milan, I show that closeness centrality and player ratings are positively correlated for both teams. This indicates that the passing network is connected with the key players of a team, and that subjective player ratings are related to closeness centrality scores. The research will be developed further with additional measures that could quantify the importance of each player.
Unlike other team sports, football has long troubled sports analysts, because team organization and performance involve aspects that classical analyses of individual players' ability do not capture. The reason is the complexity of the game: data collected on individuals cannot be interpreted directly as the performance of the team. Football cannot be analyzed by looking only at its players one by one; the team must be considered as a whole at the same time.
Given this nature of football, the recent ability to obtain datasets of all events occurring during a match allows analysts to quantify the behavior of a team as a whole, together with the role of each single player (Gudmundsson and Horton, 2017). Under this framework, a football team can be considered the result of the interactions between its players. These interactions form passing networks, which are directed (a pass goes from one player to another), weighted (the weight of an edge is the number of passes between the two players) and time-evolving (the network continuously changes its structure).
A passing network shows how all players contribute to their team's strategy. Using closeness centrality, computed from the passes between players, as the measure of how important each player is in the game, I build two passing networks from one particular Champions League match played on March 12th, 2013, between FC Barcelona of La Liga (Spain) and AC Milan of Serie A (Italy). With closeness centrality as one variable and the player ratings given by football analysts as the other, I examine the correlation between the two for this match.
The dataset is the complete passing distribution of each player in one game between FC Barcelona and AC Milan, the Champions League match held on March 12th, 2013. Barcelona won 4:0 and advanced to the quarter-finals. Since the passing network differs from match to match, owing to changes in team strategy and in the players used, the dataset is based on only one match. The passing distributions of every Champions League match are published as PDF files in the press kits on UEFA.com, so I re-typed the data into an Excel table for network analysis.
Since this is a comparison between two teams competing in one game, there are two datasets, one for FC Barcelona and one for AC Milan. Each team has 11 starting players and 3 substitutes, giving 14 players in total. Each player is a node in the team network, and each node is labeled with the player's shirt (back) number. The edges represent the successful passes made between players on the pitch. Both datasets are directed, weighted networks: directed because a pass goes from one player to another, and weighted because each edge carries the number of successful passes.
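As a minimal sketch of this representation, the base R snippet below turns a pass-count matrix into a directed, weighted edge list. The 4x4 matrix is the sub-block for players 6, 8, 10 and 16 copied from the FC Barcelona matrix printed later in this report; the names `passes` and `edges` are illustrative and not part of the original analysis.

```r
# Sub-block of the FC Barcelona pass matrix (row player passes to column player).
players <- c("6", "8", "10", "16")
passes <- matrix(c(
   0, 19, 16, 21,
  16,  0,  5, 16,
  18,  9,  0,  7,
  26, 11, 16,  0
), nrow = 4, byrow = TRUE, dimnames = list(players, players))

# Every non-zero cell is one directed edge; the cell value is the edge weight.
idx <- which(passes > 0, arr.ind = TRUE)
edges <- data.frame(
  from   = rownames(passes)[idx[, "row"]],
  to     = colnames(passes)[idx[, "col"]],
  passes = passes[idx]
)

# Most frequent passing links first.
edges[order(-edges$passes), ]
```

Keeping the direction matters here: player 16 passed to player 6 more often (26 times) than player 6 passed to player 16 (21 times), which an undirected edge would hide.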
The two passing networks are compared with the player ratings from www.whoscored.com, a widely used football statistics site. Ratings are out of 10 and rounded to two decimal places. The higher the rating, the greater the player's influence on the play.
I drew the passing networks of the two teams, FC Barcelona and AC Milan, for the match of March 12th, 2013. In Figures 1 and 2, the size of each node is proportional to its closeness centrality score: the higher the closeness centrality, the bigger the node. Labels show each player's shirt number within his own team.
library(igraph)
# Read FC Barcelona's pass distribution; the CSV files are expected in the working directory
bcpass <- read.csv("BCpass.csv")
# Player shirt numbers, used later as node names
bclabel <- read.csv("bclabel.csv")
# Use the first column (the passer) as row names, then drop it
rownames(bcpass) <- bcpass$X
bcpass <- bcpass[, -1]
# Pass matrix: rows are passers, columns are receivers
as.matrix(bcpass)
#> X1 X2 X3 X6 X7 X8 X10 X14 X16 X17 X18 X5 X9 X21
#> X1 0 2 0 2 2 1 0 2 1 0 0 0 0 1
#> X2 0 0 5 11 1 4 12 0 6 2 0 0 2 0
#> X3 3 6 0 7 1 7 1 12 5 1 4 0 1 0
#> X6 0 9 13 0 5 19 16 3 21 6 6 0 1 0
#> X7 0 1 0 2 0 0 3 0 0 0 0 0 0 0
#> X8 0 6 4 16 1 0 5 5 16 10 13 0 2 1
#> X10 0 10 1 18 2 9 0 1 7 2 3 1 0 0
#> X14 2 2 3 13 0 6 4 0 11 2 6 0 0 0
#> X16 0 8 11 26 1 11 16 8 0 5 3 0 0 0
#> X17 0 0 0 3 2 4 3 2 5 0 12 0 1 0
#> X18 0 0 4 5 0 14 2 7 10 12 0 0 1 1
#> X5 1 0 0 0 0 0 0 0 0 0 0 0 0 0
#> X9 0 2 1 0 0 0 0 0 0 0 2 0 0 0
#> X21 0 0 0 0 0 1 0 0 0 0 0 0 0 0
# Build a directed graph; a cell value of k becomes k parallel edges
barcelona <- graph_from_adjacency_matrix(as.matrix(bcpass), mode = "directed")
V(barcelona)$name <- bclabel$NUMBER
# Out-closeness over unweighted hops, scaled by 600 so it can be used as a node size
V(barcelona)$cls <- closeness(barcelona, mode = "out", weights = NULL, normalized = FALSE) * 600
lb <- layout_with_drl(barcelona) # the layout still needs tuning
plot(barcelona, layout = lb, vertex.color = "blue", vertex.size = V(barcelona)$cls,
     vertex.label.color = "black", vertex.label.cex = 0.7, edge.color = "gray80",
     edge.arrow.size = 0.2, edge.arrow.width = 0.5,
     main = "Passing Network of FC Barcelona")
Fig.1. Passing Network of FC Barcelona
# Same preparation for AC Milan
acmpass <- read.csv("ACMpass.csv")
acmlabel <- read.csv("acmlabel.csv")
rownames(acmpass) <- acmpass$X
acmpass <- acmpass[, -1] # drop the passer column, now stored as row names
as.matrix(acmpass) # AC Milan pass matrix: rows are passers, columns are receivers
#> X32 X5 X10 X16 X17 X18 X19 X20 X21 X23 X92 X4 X7 X22
#> X32 0 1 0 0 1 2 1 0 1 2 0 0 0 0
#> X5 0 0 1 2 7 2 0 2 4 0 1 3 0 0
#> X10 0 2 0 1 1 2 3 5 1 1 3 0 2 0
#> X16 0 0 3 0 1 2 1 3 0 0 1 0 1 0
#> X17 2 1 0 2 0 5 2 8 2 0 2 0 0 1
#> X18 0 3 2 0 3 0 1 5 3 1 4 3 4 1
#> X19 0 0 1 1 0 3 0 0 1 0 1 0 0 0
#> X20 0 1 3 3 4 2 4 0 1 2 0 0 1 0
#> X21 1 1 0 0 2 5 2 1 0 2 8 2 2 1
#> X23 0 2 6 1 1 1 4 2 0 0 1 0 0 0
#> X92 0 1 3 0 2 3 2 0 5 1 0 1 2 0
#> X4 0 1 0 1 1 1 0 0 3 0 1 0 1 3
#> X7 0 1 2 0 0 4 0 1 1 0 1 0 0 0
#> X22 0 1 0 0 1 0 0 0 1 0 0 2 0 0
acmilan <- graph_from_adjacency_matrix(as.matrix(acmpass), mode = "directed")
V(acmilan)$name <- acmlabel$NUMBER # label nodes with players' shirt numbers
# Out-closeness of each AC Milan node (player), scaled by 600
V(acmilan)$cls <- closeness(acmilan, mode = "out", weights = NULL, normalized = FALSE) * 600
l <- layout_with_lgl(acmilan)
plot(acmilan, layout = l, vertex.color = "red", vertex.size = V(acmilan)$cls,
     vertex.label.color = "black", vertex.label.cex = 0.7, edge.color = "gray80",
     edge.arrow.size = 0.2, edge.arrow.width = 0.3,
     main = "Passing Network of AC Milan")
Fig.2. Passing Network of AC Milan
Closeness centrality is used to analyze the passing networks. It identifies nodes that can spread information efficiently through a graph (network). In a passing network it reflects how the passes a player plays and receives connect him to the rest of the team: the measure is greater when those passes are distributed more evenly across teammates. Players with a large closeness centrality therefore have a higher impact on the passing network through the movement of the ball among teammates (Amir Rahnamai Barghi, 2015).
Closeness centrality is the inverse of the average length of the shortest paths between a node and all other nodes in the graph (Newman, 2018). The average shortest distance from player (node) \(i\) to every other player in the network is \[l_i=\frac{1}{n}\sum_{j}d_{ij}\] where \(d_{ij}\) is the shortest distance from player \(i\) to player \(j\) and \(n\) is the number of players. The closeness centrality \(C_i\) is therefore \[C_i=\frac{1}{l_i}=\frac{n}{\sum_jd_{ij}}\] The importance of a player is related not only to his degree, which is the number of passes he makes, but also to his closeness centrality, since it measures the minimum number of steps the ball needs to travel from him to any other player in the team (López-Peña and Touchette, 2012).
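The definition can be checked by hand in base R. The sketch below is not the code used for the figures. It assumes that distance is counted in hops (one hop wherever at least one pass was made, ignoring pass counts) and that distances follow the direction of the passes. The pass counts are copied from the FC Barcelona matrix above; `d`, `total_dist` and the other names are illustrative.

```r
# Pass counts copied from the FC Barcelona matrix (row player passes to column player).
players <- c("1", "2", "3", "6", "7", "8", "10", "14", "16", "17", "18", "5", "9", "21")
passes <- matrix(c(
  0,  2,  0,  2, 2,  1,  0,  2,  1,  0,  0, 0, 0, 1,
  0,  0,  5, 11, 1,  4, 12,  0,  6,  2,  0, 0, 2, 0,
  3,  6,  0,  7, 1,  7,  1, 12,  5,  1,  4, 0, 1, 0,
  0,  9, 13,  0, 5, 19, 16,  3, 21,  6,  6, 0, 1, 0,
  0,  1,  0,  2, 0,  0,  3,  0,  0,  0,  0, 0, 0, 0,
  0,  6,  4, 16, 1,  0,  5,  5, 16, 10, 13, 0, 2, 1,
  0, 10,  1, 18, 2,  9,  0,  1,  7,  2,  3, 1, 0, 0,
  2,  2,  3, 13, 0,  6,  4,  0, 11,  2,  6, 0, 0, 0,
  0,  8, 11, 26, 1, 11, 16,  8,  0,  5,  3, 0, 0, 0,
  0,  0,  0,  3, 2,  4,  3,  2,  5,  0, 12, 0, 1, 0,
  0,  0,  4,  5, 0, 14,  2,  7, 10, 12,  0, 0, 1, 1,
  1,  0,  0,  0, 0,  0,  0,  0,  0,  0,  0, 0, 0, 0,
  0,  2,  1,  0, 0,  0,  0,  0,  0,  0,  2, 0, 0, 0,
  0,  0,  0,  0, 0,  1,  0,  0,  0,  0,  0, 0, 0, 0
), nrow = 14, byrow = TRUE, dimnames = list(players, players))

# Hop distance: 1 if at least one pass i -> j was made, Inf otherwise.
d <- ifelse(passes > 0, 1, Inf)
diag(d) <- 0

# Floyd-Warshall: allow each player k in turn as an intermediate stop.
for (k in seq_along(players)) {
  d <- pmin(d, outer(d[, k], d[k, ], "+"))
}

total_dist        <- rowSums(d)                    # sum_j d_ij for each player i
closeness_formula <- length(players) / total_dist  # C_i = n / sum_j d_ij
closeness_scaled  <- 600 / total_dist              # 1 / sum_j d_ij, multiplied by 600
round(sort(closeness_scaled, decreasing = TRUE), 2)
```

Under these assumptions the scaled values agree with Table 1: for instance, player 3 reaches eleven teammates in one pass and the remaining two in two passes, so the distances sum to 15 and the scaled score is 600 / 15 = 40. `closeness_formula` differs from the scaled score only by a constant factor, so the ranking of players is the same.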
# Unnormalized out-closeness, 1 / sum_j d_ij, scaled by 600 as in the figures
close_bc <- closeness(barcelona, mode = "out", weights = NULL, normalized = FALSE) * 600
sort(close_bc, decreasing = TRUE)
#> 3 8 6 10 14 16 18 2
#> 40.00000 40.00000 37.50000 37.50000 35.29412 35.29412 35.29412 33.33333
#> 17 1 9 7 21 5
#> 33.33333 30.00000 25.00000 24.00000 22.22222 20.00000
Table.1. Closeness Centrality of FC Barcelona
Table.2. Player ratings of FC Barcelona
Table.3. Correlation Table of FC Barcelona
# Unnormalized out-closeness, 1 / sum_j d_ij, scaled by 600 as in the figures
close_acm <- closeness(acmilan, mode = "out", weights = NULL, normalized = FALSE) * 600
sort(close_acm, decreasing = TRUE)
#> 18 21 10 17 20 92 5 23
#> 40.00000 40.00000 37.50000 35.29412 35.29412 35.29412 33.33333 33.33333
#> 4 16 32 7 19 22
#> 33.33333 31.57895 30.00000 30.00000 28.57143 27.27273
Table.4. Closeness Centrality of AC Milan
Table.5. Player ratings of AC Milan
Table.6. Correlation table of AC Milan
FC Barcelona shows a strong positive correlation (0.74) between closeness centrality score and player rating. AC Milan shows a relatively strong positive correlation (0.45). The correlation is weaker for AC Milan than for the winning team, FC Barcelona, but in both cases closeness centrality and player rating are clearly positively correlated, with a coefficient above 0.4.
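A correlation of this kind could be computed along the following lines. This is only a sketch: the report does not say which coefficient was used, so Pearson's r is an assumption, and `rating_correlation` is a hypothetical helper. The toy vectors are made up for illustration and are not the WhoScored ratings of this match.

```r
# Hypothetical helper: correlate closeness scores with match ratings,
# matching the two named vectors by player number.
rating_correlation <- function(closeness, ratings, method = "pearson") {
  shared <- intersect(names(closeness), names(ratings))
  stopifnot(length(shared) >= 3)
  cor(closeness[shared], ratings[shared], method = method)
}

# Made-up numbers for illustration only (NOT data from the match).
toy_closeness <- c("1" = 20, "2" = 30, "3" = 40, "4" = 35)
toy_ratings   <- c("4" = 7.1, "3" = 7.8, "2" = 6.9, "1" = 6.2)
rating_correlation(toy_closeness, toy_ratings)
```

Matching by name rather than by position guards against the two tables listing the players in a different order. With only 14 players per team, a rank-based coefficient (`method = "spearman"`) may be worth comparing, since a single very high rating can dominate Pearson's r.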
Because Barcelona won 4:0, the winning team's average player rating was much higher than the losing team's: 7.59 for FC Barcelona against 6.26 for AC Milan. Barcelona's ratings ranged from 6.02 to 9.66, while AC Milan's ranged from 5.76 to 7.37. The median rating in AC Milan was 6.15, close to the second-lowest rating in FC Barcelona (6.14; number 21, Adriano). Since the two teams' mean closeness centralities differ only slightly (32.05 for FC Barcelona and 33.62 for AC Milan) while Barcelona's ratings are spread over a much wider range, it is not surprising that Barcelona shows the stronger correlation.
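The two closeness means can be re-derived from the scores printed in Table 1 and Table 4. The short sketch below copies those printed values; the vector names are illustrative.

```r
# Scaled closeness scores as printed in Table 1 (FC Barcelona) and Table 4 (AC Milan).
close_bc_tab <- c(40, 40, 37.5, 37.5, 35.29412, 35.29412, 35.29412, 33.33333,
                  33.33333, 30, 25, 24, 22.22222, 20)
close_acm_tab <- c(40, 40, 37.5, 35.29412, 35.29412, 35.29412, 33.33333, 33.33333,
                   33.33333, 31.57895, 30, 30, 28.57143, 27.27273)

# Team means and ranges of the closeness scores.
c(barcelona = mean(close_bc_tab), milan = mean(close_acm_tab))
rbind(barcelona = range(close_bc_tab), milan = range(close_acm_tab))
```

The means agree with the quoted 32.05 and 33.62 up to rounding in the last digit. Both teams share the same maximum (40), but Barcelona's scores reach lower (20 against 27.27 for AC Milan).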
According to Tables 1 and 2, Barcelona's key players were number 3 (Gerard Piqué), 8 (Andrés Iniesta), 6 (Xavi Hernández) and 10 (Lionel Messi). According to Tables 4 and 5, AC Milan's key players were number 18 (Riccardo Montolivo) and 10 (Kevin-Prince Boateng).
A higher closeness centrality shows that a player is well connected within his own team, which leads to a higher contribution to the result of the game; players who are well connected through successful passes are therefore more likely to be highly rated.
First, the distance of each pass was not calculated in this analysis, so I could not single out which passes were more relevant and contributed directly to the result of the game. Short and inconsequential passes were all included in the passing network. Moreover, even though football relies heavily on passing strategy, successful passes do not always represent the strength of a team. FC Barcelona is well known for its well-structured passing game, also known as tiqui-taca. However, there are also strong teams that aim for quick counter-attacks with long and fast passes, so the interpretation of a high number of passes depends on the team's strategy. Because players are used flexibly, each player takes on different roles and positions from match to match. There are therefore limits to generalizing the passing network analysis and to pointing out a few key players.
Second, this analysis does not take each player's position into account. In football there are four main positions: goalkeeper, defender, midfielder and forward. Passes usually go through the midfielders, whose main role is to find new spaces and move the ball up to the forwards. Closeness centrality based on the passing network might therefore give higher scores to midfielders than to other positions, even though a higher number of passes does not always indicate that a player is contributing well to the play.
Third, player ratings are strongly affected by the result of the game. The MOTM (Man of the Match) award mostly goes to a player who scored, so forwards have a higher probability of receiving better ratings than midfielders even when they made few passes. Contribution to the team has various meanings in football: it could be a decisive goal, crucial tackles, or collaboration with teammates. Because ratings are subjective, there are limits to obtaining an accurate correlation between the two variables.
My findings suggest that using more measures of the passing network (betweenness, PageRank, eigenvector centrality, etc.) would better capture each player's contribution to the team and yield a more accurate correlation.
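As a rough sketch of how such measures might be obtained with igraph, the snippet below computes the three centralities on the sub-block for players 6, 8, 10 and 16 of the FC Barcelona pass matrix. This was not part of the original analysis. Treating the inverse pass count as an edge length for betweenness is an assumption made here, and the four-player subset is used only to keep the example short.

```r
library(igraph)

# Sub-block of the FC Barcelona pass matrix (row player passes to column player).
players <- c("6", "8", "10", "16")
passes <- matrix(c(
   0, 19, 16, 21,
  16,  0,  5, 16,
  18,  9,  0,  7,
  26, 11, 16,  0
), nrow = 4, byrow = TRUE, dimnames = list(players, players))

# weighted = TRUE stores the pass counts in the edge attribute "weight".
g <- graph_from_adjacency_matrix(passes, mode = "directed", weighted = TRUE)

# Assumption: for a path-based measure, more passes means a "shorter" link,
# so the inverse pass count is used as the edge length.
btw <- betweenness(g, directed = TRUE, weights = 1 / E(g)$weight)

# PageRank and eigenvector centrality treat weights as connection strength.
pr  <- page_rank(g, directed = TRUE, weights = E(g)$weight)$vector
eig <- eigen_centrality(g, directed = TRUE, weights = E(g)$weight)$vector

data.frame(player = V(g)$name, betweenness = btw, pagerank = pr, eigenvector = eig)
```

Unlike the closeness scores used in this report, these calls use the pass counts, so two players linked by 26 passes are treated differently from two players linked by a single pass.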
Emerging Technology from the arXiv. “PageRank Algorithm Reveals Soccer Teams’ Strategies.” MIT Technology Review, 22 Oct. 2012, www.technologyreview.com/s/428399/pagerank-algorithm-reveals-soccer-teams-strategies/.
“Barcelona 4-0 Milan: Villa Plays Centrally and Allows Messi Space between the Lines.” Zonal Marking, www.zonalmarking.net/2013/03/13/barcelona-4-0-milan-tactics/.
Barghi, Amir Rahnamai. “Analyzing Dynamic Football Passing Network.” 2015, doi:10.3897/bdj.3.e4541.figure2f.
Fewell, Jennifer H., et al. “Basketball Teams as Strategic Networks.” PLOS ONE, Public Library of Science, journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0047445.
Gurpinar-Morgan, Will. “Network Analysis.” 2+2=11, 4 Sept. 2015, 2plus2equals11.com/category/network-analysis/.
Mclean, Scott, et al. “Integrating Communication and Passing Networks in Football Using Social Network Analysis.” Science and Medicine in Football, vol. 3, no. 1, 2018, pp. 29–35., doi:10.1080/24733938.2018.1478122.
Newman, Mark. Networks. Oxford University Press, 2018.
Whoscored.com, www.whoscored.com/Matches/684964/PlayerStatistics/Europe-Champions-League-2012-2013-Barcelona-AC-Milan.
WillTGM. “WBA Vs Liverpool Stats: Passing Network Analysis for Liverpool.” EPLindex.com, EPL Index, eplindex.com/17902/wba-liverpool-stats-passing-network-analysis-liverpool.html.