The Premier League is the most-watched sports league in the world, broadcast in 212 territories to 643 million homes and a potential TV audience of 4.7 billion people. In the 2014-15 season, the average Premier League match attendance exceeded 36,000, the second highest of any professional football league behind the Bundesliga's 43,500. Most stadium occupancies are near capacity. The Premier League ranks third in the UEFA coefficients of leagues based on performances in European competitions over the past five seasons.
There are 20 clubs in the Premier League. During the course of a season (from August to May) each club plays the others twice (a double round-robin system), once at its home stadium and once at that of its opponents, for a total of 38 games. Teams receive three points for a win and one point for a draw. No points are awarded for a loss. Teams are ranked by total points, then goal difference, and then goals scored. If still equal, teams are deemed to occupy the same position. If there is a tie for the championship, for relegation, or for qualification to other competitions, a play-off match at a neutral venue decides rank. The three lowest placed teams are relegated into the EFL Championship, and the top two teams from the Championship, together with the winner of play-offs involving the third to sixth placed Championship clubs, are promoted in their place.
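The points and ranking rules described above can be sketched in a few lines of R. This is only an illustration: the four clubs and their results are made up, and the column names (`wins`, `draws`, `goals_for`, `goals_against`) are assumptions rather than part of the dataset used later.

```r
# Toy results for four hypothetical clubs after a single round-robin
# (illustrative numbers, not real Premier League data).
standings <- data.frame(
  club          = c("A", "B", "C", "D"),
  wins          = c(2, 2, 1, 0),
  draws         = c(1, 1, 0, 0),
  goals_for     = c(7, 5, 3, 2),
  goals_against = c(3, 1, 6, 7),
  stringsAsFactors = FALSE
)

# Three points for a win, one for a draw, none for a loss.
standings$points    <- 3 * standings$wins + standings$draws
standings$goal_diff <- standings$goals_for - standings$goals_against

# Rank by points, then goal difference, then goals scored (all descending).
ranked <- standings[order(-standings$points,
                          -standings$goal_diff,
                          -standings$goals_for), ]
rownames(ranked) <- NULL
ranked

# Matches per club in a double round-robin with 20 clubs.
n_clubs <- 20
matches_per_club <- 2 * (n_clubs - 1)
matches_per_club
```

In this toy table, clubs A and B are level on points and goal difference, so goals scored is what separates them.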
Our field of study is the market value of players in the English Premier League (EPL). The competition was formed as the FA Premier League on 20 February 1992 following the decision of clubs in the Football League First Division to break away from the Football League, founded in 1888, and take advantage of a lucrative television rights deal. The deal was worth £1 billion a year domestically as of 2013-14, with BSkyB and BT Group securing the domestic rights to broadcast 116 and 38 games respectively. The league generates €2.2 billion per year in domestic and international television rights. In 2014-15, teams were apportioned revenues of £1,600 million, rising sharply to £2,400 million in 2016-17.
The Premier League's production arm, Premier League Productions, is operated by IMG Productions and produces all content for its international television partners.
The Premier League is particularly popular in Asia, where it is the most widely distributed sports programme. In Australia, Optus telecommunications holds exclusive rights to the Premier League, providing live broadcasts and online access (Fox Sports formerly held rights). In India, the matches are broadcast live on STAR Sports. In China, the broadcast rights were awarded to Super Sports in a six-year agreement that began in the 2013-14 season. As of the 2013-14 season, Canadian broadcast rights to the Premier League are jointly owned by Sportsnet and TSN, with both rival networks holding rights to 190 matches per season.
The specific objective of this study was to investigate how market value varies across players and which factors affect it. Our goal was to compare the market values of players. The rationale behind this is summarized next. The Premier League is the top level of the English football league system. Contested by twenty clubs, it operates on a system of promotion and relegation with the English Football League (EFL).
The Premier League is a corporation in which the member clubs act as shareholders. Seasons run from August to May with each team playing 38 matches (playing each other home and away). Most games are played on Saturday and Sunday afternoons. It is known outside the UK as the English Premier League (EPL).
For this study we took the data from Kaggle (https://www.kaggle.com/mauryashubham/english-premier-league-players-dataset). The dataset contains the following variables:
–name: Name of the player
–club: Club of the player
–age : Age of the player
–position : The usual position on the pitch
–position_cat :
1 for attackers
2 for midfielders
3 for defenders
4 for goalkeepers
–market_value : As on transfermarkt.com on July 20th, 2017
–page_views : Average daily Wikipedia page views from September 1, 2016 to May 1, 2017
–fpl_value : Value in Fantasy Premier League as on July 20th, 2017
–fpl_sel : % of FPL players who have selected that player in their team
–fpl_points : FPL points accumulated over the previous season
–region:
1 for England
2 for EU
3 for Americas
4 for Rest of World
–nationality
–new_foreign : Whether a new signing from a different league, for 2017/18 (till 20th July)
–age_cat
–club_id
–big_club: Whether one of the Top 6 clubs
–new_signing: Whether a new signing for 2017/18 (till 20th July)
To test our hypothesis, we propose the following model:
#Reading the dataset
epl <- read.csv("epldata_final.csv", stringsAsFactors = TRUE)  # text columns are read as factors, as in the output shown in this report
attach(epl)
fit1<-lm(market_value~age+page_views+new_foreign+big_club+new_signing,data=epl)
summary(fit1)
##
## Call:
## lm(formula = market_value ~ age + page_views + new_foreign +
##     big_club + new_signing, data = epl)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -46.766  -3.238  -0.530   2.896  37.532
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  6.6328483  2.5026212   2.650 0.008321 **
## age         -0.1619876  0.0899644  -1.801 0.072432 .
## page_views   0.0078490  0.0004401  17.833  < 2e-16 ***
## new_foreign  6.8644448  1.9458643   3.528 0.000462 ***
## big_club     7.3710543  0.8912677   8.270 1.49e-15 ***
## new_signing  1.7176598  1.0067450   1.706 0.088662 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.506 on 455 degrees of freedom
## Multiple R-squared:  0.6291, Adjusted R-squared:  0.625
## F-statistic: 154.4 on 5 and 455 DF,  p-value: < 2.2e-16
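We estimated the effect of age, page views, new foreign signing status, big-club status and new signing status on the market value of a player with the simplest model: a linear regression of market_value on these fields.
Three variables have p < 0.05, namely page_views, new_foreign and big_club, so each of them is significantly associated with market_value once the other predictors are held constant. The coefficients of age (p = 0.072) and new_signing (p = 0.089) are not significant at the 5% level; this means the model gives no clear evidence of an effect, not that market value is independent of age. Among the predictors, big_club has the largest coefficient (7.37), so playing for a big club is associated with a market value about 7.37 units higher, all else equal. Note that the coefficients are on different scales (page_views is a count, big_club is a 0/1 indicator), so their raw sizes are not directly comparable; page_views has by far the largest t value (17.8). The model explains about 63% of the variance in market value (adjusted R-squared 0.625).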
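Because the predictors are measured in different units, one way to compare their effect sizes is to refit the model on standardised variables. The sketch below is not part of the original analysis: the helper name `standardised_coefs` is ours, and the demonstration uses simulated toy data rather than the EPL file.

```r
# Refit a linear model with the response and all predictors rescaled to
# mean 0 and standard deviation 1, so the coefficients share one scale.
# Assumes every variable in the formula is numeric.
standardised_coefs <- function(formula, data) {
  vars   <- all.vars(formula)
  scaled <- as.data.frame(scale(data[, vars]))
  coef(lm(formula, data = scaled))[-1]  # drop the intercept
}

# Simulated toy data (NOT the EPL data): x1 has a large scale, x2 is 0/1.
set.seed(1)
toy <- data.frame(x1 = rnorm(200, mean = 800, sd = 900),
                  x2 = rbinom(200, size = 1, prob = 0.3))
toy$y <- 0.008 * toy$x1 + 7 * toy$x2 + rnorm(200, sd = 5)

coef(lm(y ~ x1 + x2, data = toy))      # raw coefficients, different units
standardised_coefs(y ~ x1 + x2, toy)   # coefficients on a common scale
```

With the real data, the equivalent call would be `standardised_coefs(market_value ~ age + page_views + new_foreign + big_club + new_signing, epl)`.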
This paper was motivated by the need for research that improves our understanding of what affects the market price of players in the English Premier League (EPL). Its contribution is an investigation of the variables associated with a player's value. We found that Wikipedia page views, being a new foreign signing and playing for a big club are all associated with a higher market value, and that playing for a big club in particular matters a great deal. This is how we statistically analysed the Beautiful Game.
epl <- read.csv("epldata_final.csv", stringsAsFactors = TRUE)  # text columns are read as factors, as in the output below
View(epl)
dim(epl)
## [1] 461 17
There are 461 rows (the players to be analysed) and 17 columns (the variables recorded for each player).
summary(epl)
##                  name                    club          age
##  Łukasz Fabiański  :  1   Arsenal          : 28   Min.   :17.0
##  Aaron Cresswell   :  1   Everton          : 28   1st Qu.:24.0
##  Aaron Lennon      :  1   Huddersfield     : 28   Median :27.0
##  Aaron Mooy        :  1   Liverpool        : 27   Mean   :26.8
##  Aaron Ramsey      :  1   Manchester+United: 25   3rd Qu.:30.0
##  Abdoulaye Doucoure:  1   Swansea          : 25   Max.   :38.0
##  (Other)           :455   (Other)          :300
##     position    position_cat   market_value     page_views
##  CB     : 85   Min.   :1.00   Min.   : 0.05   Min.   :   3.0
##  CM     : 63   1st Qu.:1.00   1st Qu.: 3.00   1st Qu.: 220.0
##  CF     : 61   Median :2.00   Median : 7.00   Median : 460.0
##  GK     : 42   Mean   :2.18   Mean   :11.01   Mean   : 763.8
##  DM     : 36   3rd Qu.:3.00   3rd Qu.:15.00   3rd Qu.: 896.0
##  LW     : 36   Max.   :4.00   Max.   :75.00   Max.   :7664.0
##  (Other):138
##    fpl_value         fpl_sel      fpl_points         region
##  Min.   : 4.000   0.10%  : 64   Min.   :  0.00   Min.   :1.000
##  1st Qu.: 4.500   0.20%  : 41   1st Qu.:  5.00   1st Qu.:1.000
##  Median : 5.000   0.40%  : 30   Median : 51.00   Median :2.000
##  Mean   : 5.448   0.30%  : 25   Mean   : 57.31   Mean   :1.993
##  3rd Qu.: 5.500   0.60%  : 15   3rd Qu.: 94.00   3rd Qu.:2.000
##  Max.   :12.500   0.00%  : 14   Max.   :264.00   Max.   :4.000
##                   (Other):272                    NA's   :1
##       nationality   new_foreign         age_cat         club_id
##  England    :156   Min.   :0.00000   Min.   :1.000   Min.   : 1.00
##  Spain      : 28   1st Qu.:0.00000   1st Qu.:2.000   1st Qu.: 6.00
##  France     : 25   Median :0.00000   Median :3.000   Median :10.00
##  Netherlands: 20   Mean   :0.03471   Mean   :3.206   Mean   :10.33
##  Belgium    : 18   3rd Qu.:0.00000   3rd Qu.:4.000   3rd Qu.:15.00
##  Argentina  : 17   Max.   :1.00000   Max.   :6.000   Max.   :20.00
##  (Other)    :197
##     big_club       new_signing
##  Min.   :0.0000   Min.   :0.0000
##  1st Qu.:0.0000   1st Qu.:0.0000
##  Median :0.0000   Median :0.0000
##  Mean   :0.3037   Mean   :0.1453
##  3rd Qu.:1.0000   3rd Qu.:0.0000
##  Max.   :1.0000   Max.   :1.0000
##
library(psych)
describe(epl)[,c(1:5)]
##              vars   n   mean     sd median
## name*           1 461 231.00 133.22    231
## club*           2 461  10.33   5.73     10
## age             3 461  26.80   3.96     27
## position*       4 461   5.55   3.32      5
## position_cat    5 461   2.18   1.00      2
## market_value    6 461  11.01  12.26      7
## page_views      7 461 763.78 931.81    460
## fpl_value       8 461   5.45   1.35      5
## fpl_sel*        9 461  27.70  32.14     11
## fpl_points     10 461  57.31  53.11     51
## region         11 460   1.99   0.96      2
## nationality*   12 461  28.38  14.86     23
## new_foreign    13 461   0.03   0.18      0
## age_cat        14 461   3.21   1.28      3
## club_id        15 461  10.33   5.73     10
## big_club       16 461   0.30   0.46      0
## new_signing    17 461   0.15   0.35      0
str(epl)
## 'data.frame': 461 obs. of 17 variables:
## $ name : Factor w/ 461 levels "Łukasz Fabiański",..: 23 306 352 422 254 163 340 326 403 20 ...
## $ club : Factor w/ 20 levels "Arsenal","Bournemouth",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ age : int 28 28 35 28 31 22 30 31 25 21 ...
## $ position : Factor w/ 13 levels "AM","CB","CF",..: 9 1 6 12 2 10 3 7 2 9 ...
## $ position_cat: int 1 1 4 1 3 3 1 3 3 1 ...
## $ market_value: num 65 50 7 20 22 30 22 13 30 10 ...
## $ page_views : int 4329 4395 1529 2393 912 1675 2230 555 1877 1812 ...
## $ fpl_value : num 12 9.5 5.5 7.5 6 6 8.5 5.5 5.5 5.5 ...
## $ fpl_sel : Factor w/ 113 levels "0.00%","0.10%",..: 44 93 94 16 8 37 56 83 77 11 ...
## $ fpl_points : int 264 167 134 122 121 119 116 115 90 89 ...
## $ region : int 3 2 2 1 2 2 2 2 2 4 ...
## $ nationality : Factor w/ 61 levels "Algeria","Argentina",..: 13 27 19 23 26 52 26 52 27 41 ...
## $ new_foreign : int 0 0 0 0 0 0 0 0 0 0 ...
## $ age_cat : int 4 4 6 4 4 2 4 4 3 1 ...
## $ club_id : int 1 1 1 1 1 1 1 1 1 1 ...
## $ big_club : int 1 1 1 1 1 1 1 1 1 1 ...
## $ new_signing : int 0 0 0 0 0 0 0 0 1 0 ...
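Five variables are stored as factors: name, club, position, fpl_sel and nationality. Of these, name simply identifies the player, and fpl_sel is really a percentage that was read as text because of its trailing % sign, so it must be converted before it can be used as a number.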
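Since `fpl_sel` holds values such as "0.10%", it cannot be used numerically until the % sign is removed. A minimal sketch of that conversion follows; the helper name `percent_to_numeric` is ours, and the example vector is made up in the same format as the column.

```r
# Turn strings (or factor levels) such as "12.50%" into the number 12.5.
percent_to_numeric <- function(x) {
  as.numeric(sub("%", "", as.character(x), fixed = TRUE))
}

# Small made-up vector in the same format as fpl_sel.
example_sel <- factor(c("0.10%", "12.50%", "3.00%"))
percent_to_numeric(example_sel)

# With the real data this would be:
# epl$fpl_sel <- percent_to_numeric(epl$fpl_sel)
```

Going through `as.character()` first matters: calling `as.numeric()` directly on a factor would return the internal level codes rather than the percentages.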
mytable1<-with(epl,table(club))
mytable1
## club
## Arsenal Bournemouth Brighton+and+Hove Burnley
## 28 24 22 18
## Chelsea Crystal+Palace Everton Huddersfield
## 20 21 28 28
## Leicester+City Liverpool Manchester+City Manchester+United
## 24 27 20 25
## Newcastle+United Southampton Stoke+City Swansea
## 21 23 22 25
## Tottenham Watford West+Brom West+Ham
## 20 24 19 22
Arsenal, Everton and Huddersfield have the most players in the dataset (28 each).
mytable2<-with(epl,table(position))
mytable2
## position
## AM CB CF CM DM GK LB LM LW RB RM RW SS
## 17 85 61 63 36 42 35 8 36 34 5 32 7
Centre-backs (CB) are the most common position in the dataset, with 85 players.
mytable3<-with(epl,table(nationality))
mytable3
## nationality
## Algeria Argentina Armenia
## 3 17 1
## Australia Austria Belgium
## 4 4 18
## Benin Bermuda Bosnia
## 1 1 2
## Brazil Cameroon Canada
## 12 3 1
## Chile Colombia Congo DR
## 2 1 4
## Cote d'Ivoire Croatia Curacao
## 4 1 1
## Czech Republic Denmark Ecuador
## 2 6 2
## Egypt England Estonia
## 4 156 1
## Finland France Germany
## 1 25 16
## Ghana Greece Iceland
## 5 1 2
## Ireland Israel Italy
## 17 2 4
## Jamaica Japan Kenya
## 2 2 1
## Mali Morocco Netherlands
## 2 2 20
## New Zealand Nigeria Northern Ireland
## 1 6 6
## Norway Poland Portugal
## 1 3 6
## Romania Scotland Senegal
## 1 14 7
## Serbia Slovenia South Korea
## 5 1 3
## Spain Sweden Switzerland
## 28 3 4
## The Gambia Trinidad and Tobago Tunisia
## 1 1 1
## United States Uruguay Venezuela
## 2 1 1
## Wales
## 12
As expected, England is the most common nationality, with 156 players.
attach(epl)
## The following objects are masked from epl (pos = 4):
##
## age, age_cat, big_club, club, club_id, fpl_points, fpl_sel,
## fpl_value, market_value, name, nationality, new_foreign,
## new_signing, page_views, position, position_cat, region
mytable5<-aggregate(market_value,list(club),mean)
mytable5
## Group.1 x
## 1 Arsenal 19.642857
## 2 Bournemouth 4.895833
## 3 Brighton+and+Hove 2.522727
## 4 Burnley 3.958333
## 5 Chelsea 27.677500
## 6 Crystal+Palace 7.726190
## 7 Everton 10.098214
## 8 Huddersfield 1.791071
## 9 Leicester+City 8.645833
## 10 Liverpool 16.314815
## 11 Manchester+City 28.200000
## 12 Manchester+United 20.564000
## 13 Newcastle+United 5.202381
## 14 Southampton 10.000000
## 15 Stoke+City 6.818182
## 16 Swansea 5.560000
## 17 Tottenham 23.000000
## 18 Watford 5.166667
## 19 West+Brom 5.328947
## 20 West+Ham 8.818182
On average, Manchester City has the highest market value per player (28.20), followed by Chelsea (27.68) and Tottenham (23.00).
mytable6<-aggregate(market_value,list(position),mean)
mytable6
## Group.1 x
## 1 AM 26.161765
## 2 CB 8.972353
## 3 CF 13.823770
## 4 CM 10.960317
## 5 DM 12.347222
## 6 GK 7.234524
## 7 LB 8.300000
## 8 LM 4.000000
## 9 LW 13.293056
## 10 RB 7.742647
## 11 RM 10.600000
## 12 RW 12.195312
## 13 SS 11.357143
Attacking midfielders (AM) have the highest mean market value (26.16) of any position.
mytable7<-aggregate(market_value,list(nationality),mean)
mytable7
## Group.1 x
## 1 Algeria 22.333333
## 2 Argentina 12.779412
## 3 Armenia 35.000000
## 4 Australia 2.875000
## 5 Austria 4.750000
## 6 Belgium 25.805556
## 7 Benin 5.500000
## 8 Bermuda 5.000000
## 9 Bosnia 11.500000
## 10 Brazil 21.750000
## 11 Cameroon 10.333333
## 12 Canada 3.000000
## 13 Chile 36.500000
## 14 Colombia 7.000000
## 15 Congo DR 10.125000
## 16 Cote d'Ivoire 15.750000
## 17 Croatia 17.000000
## 18 Curacao 2.000000
## 19 Czech Republic 4.250000
## 20 Denmark 10.416667
## 21 Ecuador 7.250000
## 22 Egypt 12.750000
## 23 England 8.328526
## 24 Estonia 3.500000
## 25 Finland 0.250000
## 26 France 17.980000
## 27 Germany 13.781250
## 28 Ghana 8.700000
## 29 Greece 2.000000
## 30 Iceland 13.750000
## 31 Ireland 4.911765
## 32 Israel 1.875000
## 33 Italy 10.500000
## 34 Jamaica 2.000000
## 35 Japan 6.000000
## 36 Kenya 25.000000
## 37 Mali 3.750000
## 38 Morocco 9.750000
## 39 Netherlands 9.575000
## 40 New Zealand 10.000000
## 41 Nigeria 13.000000
## 42 Northern Ireland 4.083333
## 43 Norway 8.000000
## 44 Poland 3.500000
## 45 Portugal 13.191667
## 46 Romania 3.000000
## 47 Scotland 4.196429
## 48 Senegal 14.142857
## 49 Serbia 14.000000
## 50 Slovenia 0.500000
## 51 South Korea 12.833333
## 52 Spain 16.803571
## 53 Sweden 9.500000
## 54 Switzerland 14.375000
## 55 The Gambia 3.000000
## 56 Trinidad and Tobago 0.650000
## 57 Tunisia 1.000000
## 58 United States 4.000000
## 59 Uruguay 3.500000
## 60 Venezuela 15.000000
## 61 Wales 7.708333
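Chilean players have the highest mean market value (36.5), followed by Armenia (35.0). These averages rest on only two players and one player respectively, so they say more about those individuals than about the nationality. Among nationalities with more players, Belgium (25.81, 18 players) stands out.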
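Group means are easier to interpret when shown next to the group size. The sketch below reports both and drops groups below a minimum count; the helper `mean_with_n`, its `min_n` threshold of 5 and the toy values are all assumptions made for illustration.

```r
# Mean and count per group, keeping only groups with at least min_n rows,
# sorted from highest to lowest mean.
mean_with_n <- function(value, group, min_n = 5) {
  group <- as.character(group)
  means <- tapply(value, group, mean)
  n     <- tapply(value, group, length)
  out <- data.frame(group = names(means),
                    mean  = as.numeric(means),
                    n     = as.integer(n),
                    stringsAsFactors = FALSE)
  out <- out[out$n >= min_n, ]
  out <- out[order(-out$mean), ]
  rownames(out) <- NULL
  out
}

# Made-up values: group X has a high mean but only two members.
toy_value <- c(36, 37, 8, 9, 7, 10, 8, 20, 18, 22, 19, 21)
toy_group <- c("X", "X", rep("Y", 5), rep("Z", 5))
mean_with_n(toy_value, toy_group, min_n = 5)
```

With the real data, the call would be `mean_with_n(epl$market_value, epl$nationality)`.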
boxplot(age,horizontal = TRUE,col="light blue",xlab="Age(years)")
The middle half of the players are aged between 24 and 30 (the interquartile range). The median age is 27 and the mean is 26.8.
plot(club,col="light blue")
plot(position,col="light blue")
hist(position_cat,col="light blue")
There are roughly 150 attackers (1) and 150 defenders (3), a little over 100 midfielders (2) and a little under 50 goalkeepers (4), consistent with the 42 players listed as GK in the position table.
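Because position_cat is a category code rather than a measurement, exact counts with readable labels may be clearer than a histogram. The sketch below uses only the first ten position_cat values printed by str(epl); the label vector follows the coding given in the variable list.

```r
# Labels for the codes 1 to 4, as described in the variable list.
position_labels <- c("Attacker", "Midfielder", "Defender", "Goalkeeper")

# First ten position_cat values as printed by str(epl).
first_ten <- c(1, 1, 4, 1, 3, 3, 1, 3, 3, 1)

# Convert the codes to a labelled factor and count each category.
labelled <- factor(first_ten, levels = 1:4, labels = position_labels)
counts <- table(labelled)
counts

# A bar plot is the usual display for counts of a categorical variable.
barplot(counts, col = "light blue", ylab = "Number of players")
```

With the full data, replacing `first_ten` with `epl$position_cat` gives the exact count for each category.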
hist(big_club,col="light blue")
Most players do not play for a big club: the mean of big_club is 0.30, so about 30% of the players belong to one of the top 6 clubs.
boxplot(market_value,col="light blue",main="Box plot for market value distribution")
hist(new_signing,col="light blue")
boxplot(page_views,col="light blue")
y<-epl[,6]
x<-epl[,3]
cor(x,y)
## [1] -0.1323962
y<-epl[,6]
x<-epl[,7:8]
cor(x,y)
## [,1]
## page_views 0.7396565
## fpl_value 0.7886534
x<-epl[,13:17]
cor(x,y)
## [,1]
## new_foreign 0.09805600
## age_cat -0.11768198
## club_id -0.04606806
## big_club 0.59348296
## new_signing 0.13132060
library(corrgram)
corrgram(epl, order=TRUE, lower.panel=panel.shade,
upper.panel=panel.pie, text.panel=panel.txt,
main="Corrgram of EPL Variable intercorrelations")
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
scatterplotMatrix(formula=~age+market_value+page_views+fpl_points,diagonal="histogram",cex=0.6)
t.test(market_value,page_views)
##
## Welch Two Sample t-test
##
## data: market_value and page_views
## t = -17.344, df = 460.16, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -838.0558 -667.4733
## sample estimates:
## mean of x mean of y
## 11.01204 763.77657
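Since the p-value is below 0.05, we reject the null hypothesis that market_value and page_views have equal means. Note that this test compares the averages of two variables measured in different units (11.01 against 763.78), so on its own it does not show that page views and market value are related. The evidence for that relationship is the correlation of 0.74 computed earlier and the significant page_views coefficient in the regression.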
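The two-sample t-test above compares the means of two different variables. To test whether two numeric variables move together, a correlation test is a closer fit. The sketch below was not run in the original analysis and uses simulated data, so the numbers it prints say nothing about the EPL dataset.

```r
# Simulated toy data (NOT the EPL data): value rises with views plus noise.
set.seed(42)
toy_views <- rexp(100, rate = 1 / 700)
toy_value <- 3 + 0.008 * toy_views + rnorm(100, sd = 4)

# Pearson correlation test: the null hypothesis is zero correlation.
res <- cor.test(toy_value, toy_views)
res$estimate   # sample correlation
res$p.value    # p-value for the null hypothesis of no linear association

# With the real data this would be:
# cor.test(epl$market_value, epl$page_views)
```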
t.test(market_value,age)
##
## Welch Two Sample t-test
##
## data: market_value and age
## t = -26.323, df = 555.08, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -16.97121 -14.61425
## sample estimates:
## mean of x mean of y
## 11.01204 26.80477
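Since the p-value is below 0.05, we reject the null hypothesis that market_value and age have equal means. As the two variables are measured in different units, this result does not show that age affects market value. The correlation between them is weak (-0.13), and the age coefficient in the regression is not significant at the 5% level.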
t.test(market_value,fpl_value)
##
## Welch Two Sample t-test
##
## data: market_value and fpl_value
## t = 9.6882, df = 471.1, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 4.435555 6.692644
## sample estimates:
## mean of x mean of y
## 11.012039 5.447939
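Since the p-value is below 0.05, we reject the null hypothesis that market_value and fpl_value have equal means. Again, this only says that the two averages differ (11.01 against 5.45). The strength of the relationship is better described by the correlation of 0.79 computed earlier.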
t.test(market_value,new_signing)
##
## Welch Two Sample t-test
##
## data: market_value and new_signing
## t = 19.027, df = 460.76, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 9.744379 11.989027
## sample estimates:
## mean of x mean of y
## 11.0120390 0.1453362
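Since the p-value is below 0.05, we reject the null hypothesis that market_value and new_signing have equal means. Because new_signing is a 0/1 indicator, its mean (0.145) is just the share of new signings, so this comparison does not tell us whether new signings are valued differently. That question calls for comparing the market values of new signings with those of the other players. The correlation between the two variables is small (0.13), and the new_signing coefficient in the regression is not significant at the 5% level.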
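A comparison of market value between new signings and other players can be written with the formula interface of t.test, which splits one variable by a two-level grouping variable. This sketch was not part of the original analysis and runs on simulated data, so its output is illustrative only.

```r
# Simulated toy data (NOT the EPL data): 60 existing players, 20 new signings.
set.seed(7)
toy <- data.frame(new_signing = rep(c(0, 1), times = c(60, 20)))
toy$market_value <- 10 + 2 * toy$new_signing + rnorm(80, sd = 6)

# Welch two-sample t-test of market_value between the two groups.
by_group <- t.test(market_value ~ new_signing, data = toy)
by_group$estimate   # mean market value in group 0 and in group 1
by_group$p.value

# With the real data this would be:
# t.test(market_value ~ new_signing, data = epl)
```

The same pattern applies to the other 0/1 indicators in the dataset, such as big_club and new_foreign.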
This concludes our report.