This is a calculation of the probability that a goal is scored from a given moment in play while a player is in possession of the ball.
With a measured value for this, coaches can design tactics and actions around these positional plays and ultimately create better scoring chances.
The output below shows the data being processed, focusing on the x and y coordinates used to create the metrics that influence the dangerousity score. The coordinates were taken from each player's GPS unit for each play.
##   id_play player_id team clock        x         y         v on_ball   play_phase
## 1    2240    250267  att  89.6 51.24058  2.879112 3.5332719       0 GENERAL_PLAY
## 2    2240    260113  att  89.6 72.13534 10.935754 6.2456039       0 GENERAL_PLAY
## 3    2240    290073  def  89.6 14.11606 -7.475015 3.8921803       0 GENERAL_PLAY
## 4    2240    290641  att  89.6 19.88332 38.113330 2.2281301       0 GENERAL_PLAY
## 5    2240    290797  def  89.6 66.94921 17.038943 0.7671663       0 GENERAL_PLAY
## 6    2240    291492  def  89.6 23.31737 37.820513 1.4287466       0 GENERAL_PLAY
Values such as Pressure and Control need to be calculated from the data after some wrangling and cleaning. For this example I have looked at two metrics derived from the GPS data, distance from goal and number of defenders ahead, both of which influence the overall score.
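As a rough illustration of how the two metrics named above could be derived per play, here is a minimal sketch using dplyr. It is not the code behind the results in this write-up. The toy rows are made up, and the goal position (`goal_x`, `goal_y`) and the assumption that the attacking team plays towards increasing `x` are hypothetical, since the pitch coordinate system is not stated here.

```r
library(dplyr)

# Toy rows using the same column names as the tracking data (values are made up).
tracking <- data.frame(
  id_play   = c(1, 1, 1, 1, 2, 2, 2),
  player_id = c(11, 12, 21, 22, 11, 21, 22),
  team      = c("att", "att", "def", "def", "att", "def", "def"),
  x         = c(70, 60, 80, 65, 90, 95, 92),
  y         = c(0, 10, 5, -5, 3, 0, 10),
  on_ball   = c(1, 0, 0, 0, 1, 0, 0)
)

# ASSUMPTION: goal centre at (100, 0) and attack direction towards larger x.
goal_x <- 100
goal_y <- 0

play_metrics <- tracking %>%
  group_by(id_play) %>%
  summarise(
    # Position of the player in possession (first on-ball row in the play).
    ball_x = x[on_ball == 1][1],
    ball_y = y[on_ball == 1][1],
    # Straight-line distance from the ball carrier to the assumed goal centre.
    distance_from_goal = sqrt((goal_x - ball_x)^2 + (goal_y - ball_y)^2),
    # Defenders positioned between the ball carrier and the goal line.
    defenders_ahead = sum(team == "def" & x > ball_x),
    .groups = "drop"
  )

print(play_metrics)
```

How these two values are weighted into the final score is a separate modelling choice that this sketch does not attempt to reproduce.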
Plays ranked from the highest dangerousity score down:
## # A tibble: 102 × 2
## # Groups:   id_play [102]
##    id_play each_player_dangerousity_score
##      <int>                          <dbl>
##  1    2050                           5.83
##  2    8929                           5.70
##  3   12995                           5.24
##  4    6970                           5.21
##  5    6253                           5.08
##  6    1263                           4.97
##  7   23336                           4.95
##  8   19967                           4.90
##  9   20556                           4.72
## 10   25659                           4.65
## # ℹ 92 more rows
Plays ranked from the lowest dangerousity score up:
## # A tibble: 102 × 2
## # Groups:   id_play [102]
##    id_play each_player_dangerousity_score
##      <int>                          <dbl>
##  1   22365                           2.12
##  2   33302                           2.24
##  3   34417                           2.37
##  4     400                           2.45
##  5    1275                           2.48
##  6   30905                           2.48
##  7   19594                           2.53
##  8     572                           2.54
##  9     525                           2.56
## 10    8431                           2.66
## # ℹ 92 more rows
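The two tables above list the same 102 plays in opposite orders. A minimal sketch of how such rankings could be produced with dplyr's `arrange()` follows. The data frame name `scores` is hypothetical, and only four of the rows shown above are reused here for illustration.

```r
library(dplyr)

# Four of the plays from the tables above, deliberately out of order.
scores <- data.frame(
  id_play = c(8929, 22365, 2050, 33302),
  each_player_dangerousity_score = c(5.70, 2.12, 5.83, 2.24)
)

# Highest score first.
most_dangerous <- scores %>%
  arrange(desc(each_player_dangerousity_score))

# Lowest score first.
least_dangerous <- scores %>%
  arrange(each_player_dangerousity_score)

print(most_dangerous)
print(least_dangerous)
```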