The dataset covers the seasons from 2015-16 through 2018-19 but does not include goalkeepers or players with fewer than 10 appearances in a season. Using this information, I’ve created a separate dataframe for each season. Given that I follow the Premier League and am a huge soccer fan, I will set out expectations and questions based on my own knowledge of the game. They are the following:
1. I’d expect the league to be made up mostly of players who do not originate from the UK.
2. I’d expect the ages of players in the league to be normally distributed and centered around an average of 25.
3. Are the market values of players right-skewed?
4. Forwards (FW) are generally the most expensive players in the league.
5. Are important metrics such as goals, assists, and tackles normally distributed?
6. Who were the most important players from each season?
## [1] "Player" "Season"
## [3] "Born" "Age"
## [5] "Squad" "Nation"
## [7] "Previous.Market.Value" "Market.Value"
## [9] "Position" "App"
## [11] "Minutes" "Goals"
## [13] "Passes" "Assists"
## [15] "Yellow" "Red"
## [17] "SubOn" "SubOff"
## [19] "Shots" "SOT"
## [21] "HitPost" "HeadClear"
## [23] "HeadGoal" "PKScored"
## [25] "FKGoal" "Offsides"
## [27] "ThrBall" "Misses"
## [29] "Corners" "Crosses"
## [31] "Blocks" "Interceptions"
## [33] "Fouls" "Last.man"
## [35] "Tackles" "ELG"
## [37] "OwnGoal" "Clears"
## [39] "ABW" "ABL"
## [41] "ggratio"
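The per-season dataframes mentioned earlier could be built along these lines. This is only a minimal sketch: the toy rows are invented, and the assumption is that the `Season` column holds labels such as "2015-2016".

```r
# Toy stand-in for the full dataset; these rows are invented for illustration.
data <- data.frame(
  Player = c("A", "B", "C", "D"),
  Season = c("2015-2016", "2015-2016", "2016-2017", "2018-2019"),
  App    = c(30, 12, 25, 38),
  stringsAsFactors = FALSE
)

# split() returns a named list holding one data frame per season label.
seasons <- split(data, data$Season)

names(seasons)                 # the season labels found in the data
nrow(seasons[["2015-2016"]])   # number of player rows in that season
```

Keeping the seasons in one named list, rather than in separately named variables, makes it easy to apply the same summary to every season with `lapply()`.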
The Premier League is the most popular league in football and attracts players from all over the world. Even though the league is based in the UK, I would expect most of the players to be foreign because the biggest teams sign foreign talent and global superstars. Visually, the breakdown of players is on average about 70% foreign and 30% domestic talent.
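A foreign versus domestic split like this can be computed per season with a proportion table. The sketch below uses invented rows, and the country codes in `uk_codes` are an assumption about how the `Nation` column is encoded, which may differ in the real data.

```r
# Invented rows; the Nation codes are assumed, not taken from the dataset.
toy <- data.frame(
  Player = c("A", "B", "C", "D", "E"),
  Season = c(rep("2015-2016", 3), rep("2016-2017", 2)),
  Nation = c("ENG", "FRA", "ESP", "WAL", "BRA"),
  stringsAsFactors = FALSE
)

# Hypothetical codes for the UK home nations.
uk_codes <- c("ENG", "SCO", "WAL", "NIR")
toy$Origin <- ifelse(toy$Nation %in% uk_codes, "Domestic", "Foreign")

# Row-wise proportions: each season's shares sum to 1.
share <- prop.table(table(toy$Season, toy$Origin), margin = 1)
round(100 * share, 1)
```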
The median age across all players is about 26, ranging from 25.9 to 26.5 over the four seasons. Professional players generally hit their prime around 26 and retire by 35. It is rare to see players over 35 starting consistently every week, and just as rare to see a teenager who is new to the league doing so. The median is consistent across the 4 seasons, which suggests that teams keep bringing in new talent to reinforce their squads.
## Season Median Age
## 1 2015-2016 26.13564
## 2 2016-2017 26.31421
## 3 2017-2018 26.49861
## 4 2018-2019 25.93460
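A per-season summary like the one above can be produced with `aggregate()`. This is a sketch on invented ages, not the code that generated the table.

```r
# Invented ages for two seasons.
toy <- data.frame(
  Season = rep(c("2015-2016", "2016-2017"), each = 3),
  Age    = c(22, 26, 31, 24, 27, 29)
)

# One row per season with the median of Age.
median_age <- aggregate(Age ~ Season, data = toy, FUN = median)
median_age
```

Swapping `FUN = median` for `FUN = mean` gives the per-season average instead, which is a quick way to check how far the two measures differ.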
As expected, the market values are right-skewed. Most players are not worth more than 220k (euros). This could be because 19 of the 28 teams listed below have an average value (Avg_Team_Value) under 10 million euros.
## Team Avg_Team_Value
## 1 AFC Bournemouth 48.71284
## 3 Aston Villa 54.75000
## 26 West Bromwich Albion 54.93617
## 5 Burnley 55.51000
## 22 Sunderland 56.18182
## 6 Cardiff City 56.47368
## 4 Brighton and Hove Albion 59.75143
## 11 Huddersfield Town 59.86842
## 25 Watford 68.69178
## 21 Stoke City 69.21071
## 23 Swansea City 71.88898
## 19 Norwich City 72.97500
## 18 Newcastle United 79.54182
## 12 Hull City 82.70000
## 8 Crystal Palace 93.41901
## 17 Middlesbrough 94.08333
## 27 West Ham United 94.87179
## 10 Fulham 96.05000
## 28 Wolves 98.91250
## 13 Leicester City 102.88406
## 20 Southampton 106.29795
## 9 Everton 119.92123
## 2 Arsenal 215.76351
## 14 Liverpool 223.58219
## 16 Manchester United 223.76299
## 24 Tottenham 235.36957
## 7 Chelsea 266.63571
## 15 Manchester City 289.52078
## [1] "Values in 100,000s (Euros)"
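A table of this shape can be sketched as follows. The rows are invented, and the sketch assumes that `Market.Value` is stored in euros and divided by 100,000 for display. Comparing the mean with the median is only a rough hint of right skew, not a formal test.

```r
# Invented player values in euros (assumed unit).
toy <- data.frame(
  Squad        = c("Z", "Z", "X", "X", "Y", "Y"),
  Market.Value = c(20e6, 30e6, 4e6, 6e6, 8e6, 10e6)
)

# Average value per squad, rescaled to units of 100,000 euros.
team_avg <- aggregate(Market.Value ~ Squad, data = toy, FUN = mean)
team_avg$Avg_Team_Value <- team_avg$Market.Value / 1e5
team_avg <- team_avg[order(team_avg$Avg_Team_Value), c("Squad", "Avg_Team_Value")]

# Share of squads whose average is below 10 million euros (100 in these units).
share_below_10m <- mean(team_avg$Avg_Team_Value < 100)

# A mean well above the median hints at a long right tail.
right_skew_hint <- mean(toy$Market.Value) > median(toy$Market.Value)
```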
As we see above, the average market value for defenders (DF) and forwards (FW) steadily increased over the years. For some reason, the average value of midfielders (MF) dropped drastically in the 2018-19 season. Although these 4 seasons do not show that forwards are always the clear-cut most expensive position in the league, it is generally MF or FW that is on average more expensive than the other positions.
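The comparison by position and season can be sketched with `tapply()`. The values below are invented and only illustrate the shape of the calculation.

```r
# Invented values in euros for three positions over two seasons.
toy <- data.frame(
  Season       = rep(c("2017-2018", "2018-2019"), each = 3),
  Position     = rep(c("DF", "MF", "FW"), times = 2),
  Market.Value = c(10e6, 20e6, 30e6, 12e6, 15e6, 33e6)
)

# Matrix of mean values: one row per season, one column per position.
avg_value <- tapply(toy$Market.Value, list(toy$Season, toy$Position), mean)

# Most expensive position in each season.
top_position <- colnames(avg_value)[apply(avg_value, 1, which.max)]
```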
Players from each position are judged on different metrics such as tackles, passes, goals, etc. We will take a look at metrics specific to each position and analyze my hypothesis.

### Forwards
Forwards are typically judged on the number of goals they score. The distribution of goals scored is heavily skewed to the right. It might be unfair to judge forwards solely on goal totals, since some forwards play more games than others. A popular metric for a forward’s efficiency in front of goal is the goal-to-game ratio. Judging by the histograms, this ratio does not follow any obvious standard distribution. The maximum ratio in the league tends to fall under 80%. During the 2016-2017 and 2017-2018 seasons, however, some players reached a ratio of 80% or more. The output below shows three such players, two of whom hit this mark in both seasons.
# Print each player-season with a goal-to-game ratio of at least 80%
for (i in seq_len(nrow(data))) {
  if (data$ggratio[i] >= 80) print(data$Player[i])
}
## [1] Harry Kane
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Harry Kane
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Mohamed Salah
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Sergio Agüero
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Sergio Agüero
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
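The same lookup can be written without a loop. In this sketch the rows are invented, and the formula for `ggratio` (goals per appearance, as a percentage) is an assumption about how that column was derived.

```r
# Invented rows; Player is a factor, as in the printed output above.
toy <- data.frame(
  Player = c("P1", "P2", "P3"),
  Goals  = c(25, 10, 30),
  App    = c(30, 35, 36),
  stringsAsFactors = TRUE
)

# Assumed definition: goals per appearance, expressed as a percentage.
toy$ggratio <- 100 * toy$Goals / toy$App

# Logical indexing replaces the loop; as.character() drops the factor levels
# so the result prints as plain names.
top_forwards <- as.character(toy$Player[toy$ggratio >= 80])
unique(top_forwards)
```

Calling `unique()` collapses players who appear in more than one season into a single entry.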
### Midfielders

A midfielder’s assist count carries the same weight as a forward’s goal count. Using the 2015-16 season as a preliminary analysis, we see that the data is extremely right-skewed. Perhaps it would make more sense to judge midfielders on the number of passes they complete. I would still expect the data to be skewed to the right, but less so than assists.
During the 2015-2016 and 2016-2017 seasons, the data was distributed a bit more evenly than in the 2017-18 and 2018-19 seasons. In 2017-18 and 2018-19, there were midfielders who completed more than 3,000 passes. Notice that when we extract players with more than 3,000 passes, a couple of defenders appear in the result. So it appears that even defensive players can put up passing figures similar to midfielders.
# Players with more than 3,000 passes (the filter is not restricted to midfielders)
midfielders <- which(season18$Passes > 3000)
season18$Player[midfielders]
## [1] Granit Xhaka Nicolás Otamendi
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
# Same filter applied to the next season's data frame
midfielders <- which(season19$Passes > 3000)
season19$Player[midfielders]
## [1] Jorginho Virgil van Dijk
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
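To see at a glance which positions the high-volume passers play in, the filter can return the `Position` column alongside the name. This is a sketch on invented rows.

```r
# Invented rows for illustration.
toy <- data.frame(
  Player   = c("M1", "D1", "F1"),
  Position = c("MF", "DF", "FW"),
  Passes   = c(3100, 3050, 900),
  stringsAsFactors = FALSE
)

# Keep the position next to the name so defenders are easy to spot.
heavy_passers <- toy[toy$Passes > 3000, c("Player", "Position", "Passes")]
heavy_passers

# Count of heavy passers per position.
table(heavy_passers$Position)
```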
### Defenders

Defenders are typically judged on the total number of interceptions, tackles, and clearances they make. I will analyze tackles and interceptions. Given that tackles and interceptions occur more frequently than goals or assists, I would expect the distribution of these stats to resemble a uniform distribution.
## [1] Wilfred Ndidi
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Aaron Wan-Bissaka Idrissa Gueye Wilfred Ndidi
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
## [1] Idrissa Gueye Laurent Koscielny N'Golo Kanté
## 711 Levels: Aaron Cresswell Aaron Lennon Aaron Mooy ... Zlatan Ibrahimovic
The tackles data appears to be more strongly right-skewed than the interceptions data. Wilfred Ndidi consistently impressed during the 2017-18 and 2018-19 seasons, racking up a large number of tackles. Interceptions appear to be more evenly distributed, but it is still apparent that most players struggle to complete more than 80 interceptions in a season. Idrissa Gueye (DF) was the only player across all 4 seasons to complete more than 120 interceptions.

### The League’s Most Important Players

Now that we’ve analyzed the different metrics used to judge players, let’s extract the most important players from the last 4 seasons.
## Most Goals Player Most Assists Player MVP
## season1516 25 Harry Kane 19 Mesut Özil 88000000
## season1617 29 Harry Kane 18 Kevin De Bruyne 165000000
## season1718 32 Mohamed Salah 16 Kevin De Bruyne 143000000
## season1819 22 Mohamed Salah 15 Eden Hazard 165000000
## Player
## season1516 N'Golo Kanté
## season1617 Kevin De Bruyne
## season1718 Kevin De Bruyne
## season1819 Mohamed Salah
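A summary like this can be assembled by taking the row with the maximum of each metric within every season. The sketch below uses invented rows and hypothetical output column names. Note that `which.max()` returns only the first player when there is a tie.

```r
# Invented rows for two seasons.
toy <- data.frame(
  Season       = c("2015-2016", "2015-2016", "2016-2017", "2016-2017"),
  Player       = c("A", "B", "C", "D"),
  Goals        = c(25, 5, 10, 29),
  Assists      = c(1, 19, 18, 2),
  Market.Value = c(50e6, 88e6, 165e6, 60e6),
  stringsAsFactors = FALSE
)

# Build one summary row per season, then stack the rows.
leaders <- do.call(rbind, lapply(split(toy, toy$Season), function(s) {
  data.frame(
    Most.Goals    = max(s$Goals),
    Top.Scorer    = s$Player[which.max(s$Goals)],
    Most.Assists  = max(s$Assists),
    Top.Assister  = s$Player[which.max(s$Assists)],
    MVP.Value     = max(s$Market.Value),
    Most.Valuable = s$Player[which.max(s$Market.Value)],
    stringsAsFactors = FALSE
  )
}))
leaders
```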
It appears Harry Kane and Mohamed Salah have been going head to head in the goal charts. Despite this, neither won the league during these four seasons. Mesut Özil had an impressive run in 2015-16 with 19 assists. As for the league’s MVP (the most valuable player by market value), N’Golo Kanté held that title after winning the league in 2015-16, but his value was overtaken by Kevin De Bruyne and Mohamed Salah in the following seasons.