Football players need many different skills, and the skills that matter most differ from position to position. For example, the most important skills for a defender are quite different from those required by a striker. For that reason, I want to analyze which skills, or key attributes, are most needed for each position. For this analysis I use the FIFA 19 complete player dataset from Kaggle, which contains detailed attributes for every player registered in the latest edition of the FIFA 19 database.
First, we import the dataset downloaded from Kaggle and take a glimpse at the data.
#> Observations: 18,207
#> Variables: 89
#> $ X1 <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12...
#> $ ID <dbl> 158023, 20801, 190871, 193080, 192985, 1...
#> $ Name <chr> "L. Messi", "Cristiano Ronaldo", "Neymar...
#> $ Age <dbl> 31, 33, 26, 27, 27, 27, 32, 31, 32, 25, ...
#> $ Photo <chr> "https://cdn.sofifa.org/players/4/19/158...
#> $ Nationality <chr> "Argentina", "Portugal", "Brazil", "Spai...
#> $ Flag <chr> "https://cdn.sofifa.org/flags/52.png", "...
#> $ Overall <dbl> 94, 94, 92, 91, 91, 91, 91, 91, 91, 90, ...
#> $ Potential <dbl> 94, 94, 93, 93, 92, 91, 91, 91, 91, 93, ...
#> $ Club <chr> "FC Barcelona", "Juventus", "Paris Saint...
#> $ `Club Logo` <chr> "https://cdn.sofifa.org/teams/2/light/24...
#> $ Value <chr> "€110.5M", "€77M", "€118.5M", "€72M", "€...
#> $ Wage <chr> "€565K", "€405K", "€290K", "€260K", "€35...
#> $ Special <dbl> 2202, 2228, 2143, 1471, 2281, 2142, 2280...
#> $ `Preferred Foot` <chr> "Left", "Right", "Right", "Right", "Righ...
#> $ `International Reputation` <dbl> 5, 5, 5, 4, 4, 4, 4, 5, 4, 3, 4, 4, 3, 4...
#> $ `Weak Foot` <dbl> 4, 4, 5, 3, 5, 4, 4, 4, 3, 3, 4, 5, 3, 2...
#> $ `Skill Moves` <dbl> 4, 5, 5, 1, 4, 4, 4, 3, 3, 1, 4, 3, 2, 4...
#> $ `Work Rate` <chr> "Medium/ Medium", "High/ Low", "High/ Me...
#> $ `Body Type` <chr> "Messi", "C. Ronaldo", "Neymar", "Lean",...
#> $ `Real Face` <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"...
#> $ Position <chr> "RF", "ST", "LW", "GK", "RCM", "LF", "RC...
#> $ `Jersey Number` <dbl> 10, 7, 10, 1, 7, 10, 10, 9, 15, 1, 9, 8,...
#> $ Joined <chr> "Jul 1, 2004", "Jul 10, 2018", "Aug 3, 2...
#> $ `Loaned From` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
#> $ `Contract Valid Until` <chr> "2021", "2022", "2022", "2020", "2023", ...
#> $ Height <chr> "5'7", "6'2", "5'9", "6'4", "5'11", "5'8...
#> $ Weight <chr> "159lbs", "183lbs", "150lbs", "168lbs", ...
#> $ LS <chr> "88+2", "91+3", "84+3", NA, "82+3", "83+...
#> $ ST <chr> "88+2", "91+3", "84+3", NA, "82+3", "83+...
#> $ RS <chr> "88+2", "91+3", "84+3", NA, "82+3", "83+...
#> $ LW <chr> "92+2", "89+3", "89+3", NA, "87+3", "89+...
#> $ LF <chr> "93+2", "90+3", "89+3", NA, "87+3", "88+...
#> $ CF <chr> "93+2", "90+3", "89+3", NA, "87+3", "88+...
#> $ RF <chr> "93+2", "90+3", "89+3", NA, "87+3", "88+...
#> $ RW <chr> "92+2", "89+3", "89+3", NA, "87+3", "89+...
#> $ LAM <chr> "93+2", "88+3", "89+3", NA, "88+3", "89+...
#> $ CAM <chr> "93+2", "88+3", "89+3", NA, "88+3", "89+...
#> $ RAM <chr> "93+2", "88+3", "89+3", NA, "88+3", "89+...
#> $ LM <chr> "91+2", "88+3", "88+3", NA, "88+3", "89+...
#> $ LCM <chr> "84+2", "81+3", "81+3", NA, "87+3", "82+...
#> $ CM <chr> "84+2", "81+3", "81+3", NA, "87+3", "82+...
#> $ RCM <chr> "84+2", "81+3", "81+3", NA, "87+3", "82+...
#> $ RM <chr> "91+2", "88+3", "88+3", NA, "88+3", "89+...
#> $ LWB <chr> "64+2", "65+3", "65+3", NA, "77+3", "66+...
#> $ LDM <chr> "61+2", "61+3", "60+3", NA, "77+3", "63+...
#> $ CDM <chr> "61+2", "61+3", "60+3", NA, "77+3", "63+...
#> $ RDM <chr> "61+2", "61+3", "60+3", NA, "77+3", "63+...
#> $ RWB <chr> "64+2", "65+3", "65+3", NA, "77+3", "66+...
#> $ LB <chr> "59+2", "61+3", "60+3", NA, "73+3", "60+...
#> $ LCB <chr> "47+2", "53+3", "47+3", NA, "66+3", "49+...
#> $ CB <chr> "47+2", "53+3", "47+3", NA, "66+3", "49+...
#> $ RCB <chr> "47+2", "53+3", "47+3", NA, "66+3", "49+...
#> $ RB <chr> "59+2", "61+3", "60+3", NA, "73+3", "60+...
#> $ Crossing <dbl> 84, 84, 79, 17, 93, 81, 86, 77, 66, 13, ...
#> $ Finishing <dbl> 95, 94, 87, 13, 82, 84, 72, 93, 60, 11, ...
#> $ HeadingAccuracy <dbl> 70, 89, 62, 21, 55, 61, 55, 77, 91, 15, ...
#> $ ShortPassing <dbl> 90, 81, 84, 50, 92, 89, 93, 82, 78, 29, ...
#> $ Volleys <dbl> 86, 87, 84, 13, 82, 80, 76, 88, 66, 13, ...
#> $ Dribbling <dbl> 97, 88, 96, 18, 86, 95, 90, 87, 63, 12, ...
#> $ Curve <dbl> 93, 81, 88, 21, 85, 83, 85, 86, 74, 13, ...
#> $ FKAccuracy <dbl> 94, 76, 87, 19, 83, 79, 78, 84, 72, 14, ...
#> $ LongPassing <dbl> 87, 77, 78, 51, 91, 83, 88, 64, 77, 26, ...
#> $ BallControl <dbl> 96, 94, 95, 42, 91, 94, 93, 90, 84, 16, ...
#> $ Acceleration <dbl> 91, 89, 94, 57, 78, 94, 80, 86, 76, 43, ...
#> $ SprintSpeed <dbl> 86, 91, 90, 58, 76, 88, 72, 75, 75, 60, ...
#> $ Agility <dbl> 91, 87, 96, 60, 79, 95, 93, 82, 78, 67, ...
#> $ Reactions <dbl> 95, 96, 94, 90, 91, 90, 90, 92, 85, 86, ...
#> $ Balance <dbl> 95, 70, 84, 43, 77, 94, 94, 83, 66, 49, ...
#> $ ShotPower <dbl> 85, 95, 80, 31, 91, 82, 79, 86, 79, 22, ...
#> $ Jumping <dbl> 68, 95, 61, 67, 63, 56, 68, 69, 93, 76, ...
#> $ Stamina <dbl> 72, 88, 81, 43, 90, 83, 89, 90, 84, 41, ...
#> $ Strength <dbl> 59, 79, 49, 64, 75, 66, 58, 83, 83, 78, ...
#> $ LongShots <dbl> 94, 93, 82, 12, 91, 80, 82, 85, 59, 12, ...
#> $ Aggression <dbl> 48, 63, 56, 38, 76, 54, 62, 87, 88, 34, ...
#> $ Interceptions <dbl> 22, 29, 36, 30, 61, 41, 83, 41, 90, 19, ...
#> $ Positioning <dbl> 94, 95, 89, 12, 87, 87, 79, 92, 60, 11, ...
#> $ Vision <dbl> 94, 82, 87, 68, 94, 89, 92, 84, 63, 70, ...
#> $ Penalties <dbl> 75, 85, 81, 40, 79, 86, 82, 85, 75, 11, ...
#> $ Composure <dbl> 96, 95, 94, 68, 88, 91, 84, 85, 82, 70, ...
#> $ Marking <dbl> 33, 28, 27, 15, 68, 34, 60, 62, 87, 27, ...
#> $ StandingTackle <dbl> 28, 31, 24, 21, 58, 27, 76, 45, 92, 12, ...
#> $ SlidingTackle <dbl> 26, 23, 33, 13, 51, 22, 73, 38, 91, 18, ...
#> $ GKDiving <dbl> 6, 7, 9, 90, 15, 11, 13, 27, 11, 86, 15,...
#> $ GKHandling <dbl> 11, 11, 9, 85, 13, 12, 9, 25, 8, 92, 6, ...
#> $ GKKicking <dbl> 15, 15, 15, 87, 5, 6, 7, 31, 9, 78, 12, ...
#> $ GKPositioning <dbl> 14, 14, 15, 88, 10, 8, 14, 33, 7, 88, 8,...
#> $ GKReflexes <dbl> 8, 11, 11, 94, 13, 8, 9, 37, 11, 89, 10,...
#> $ `Release Clause` <chr> "€226.5M", "€127.1M", "€228.1M", "€138.6...
Description:
X1 : row number
ID : unique id for every player
Name : name
Age : age
Photo : url to the player’s photo
Nationality : nationality
Flag : url to the player’s country flag
Overall : overall rating
Potential : potential rating
Club : current club
Club Logo : url to club logo
Value : current market value
Wage : current wage
Special : special
Preferred Foot : left/right
International Reputation : rating on scale of 5
Weak Foot : rating on scale of 5
Skill Moves : rating on scale of 5
Work Rate : attack work rate/defence work rate
Body Type : body type of player
Real Face : true or false
Position : position on the pitch
Jersey Number : jersey number
Joined : joined date
Loaned From : club name if applicable
Contract Valid Until : contract end date
Height : height of the player
Weight : weight of the player
LS - RB : position rating on scale of 100
Crossing - GKReflexes : skill/attribute rating on scale of 100
Release Clause : release clause value
We can eliminate variables that we don’t need and modify the value of certain columns such as Value, Wage, Position, Height, Weight and Release Clause.
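fifa <- fifa %>%
  # select() clashes with MASS::select(); call dplyr::select() if MASS is loaded
  select(-c(1:3, 5:7, 9:11, 14:15, 19:21, 23:26)) %>%
  mutate(
    # value in € 1k: "€110.5M" -> 110500, "€600K" -> 600 (parse_number() is from readr)
    Value = parse_number(Value) * ifelse(grepl("M", Value), 1000, 1),
    Wage = gsub("€", "", Wage),
    Wage = as.numeric(gsub("K", "", Wage)), # € 1k wage
    Position = as.factor(Position),
    # height in inches: "5'11" -> 71 (replacing ' with . would rank 5'11 below 5'7)
    Height = sapply(strsplit(Height, "'"), function(h) 12 * as.numeric(h[1]) + as.numeric(h[2])),
    Weight = as.numeric(gsub("lbs", "", Weight)), # lbs
    # release clause in € 1k, parsed the same way as Value
    `Release Clause` = parse_number(`Release Clause`) * ifelse(grepl("M", `Release Clause`), 1000, 1)
  )
Removing “+” on Position Ratings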
To simplify further analysis, we can remove the “+ …” suffix from the position ratings, which are in columns 11 to 36.
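The code for this step is not shown, so here is a minimal sketch of one way to do it, run on a toy data frame rather than the real data. The helper name `strip_bonus` and the toy columns are illustrative assumptions: the idea is simply to drop everything from the plus sign onward and convert what remains to a number.

```r
# Hypothetical helper: "88+2" -> 88; NA stays NA.
strip_bonus <- function(x) {
  as.numeric(sub("\\+.*$", "", x))
}

# Toy stand-in for the position rating columns (columns 11 to 36 in fifa).
ratings <- data.frame(
  LS = c("88+2", "91+3", NA),
  ST = c("88+2", "91+3", NA),
  stringsAsFactors = FALSE
)

# Apply the helper to every column, keeping the data frame structure.
ratings[] <- lapply(ratings, strip_bonus)
str(ratings)
```

On the real data, the same helper could presumably be applied with something like `fifa[, 11:36] <- lapply(fifa[, 11:36], strip_bonus)`, assuming those columns are still character vectors at that point.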
Adjust Position
Before modeling, it is better to combine several positions into broader groups, because these positions play similar roles on the pitch.
#> [1] RF ST LW GK RCM LF RS RCB LCM CB LDM CAM CDM LS LCB
#> [16] RM LAM LM LB RDM RW CM RB RAM CF RWB LWB <NA>
#> 27 Levels: CAM CB CDM CF CM GK LAM LB LCB LCM LDM LF LM LS LW LWB RAM ... ST
fifa$Position <- ifelse(fifa$Position == "GK", "GK",
                 ifelse(fifa$Position %in% c("LCB", "CB", "RCB"), "CB",
                 ifelse(fifa$Position %in% c("LB", "LWB", "RWB", "RB"), "LB/RB",
                 ifelse(fifa$Position %in% c("LDM", "CDM", "RDM"), "DM",
                 ifelse(fifa$Position %in% c("LCM", "CM", "RCM"), "CM",
                 ifelse(fifa$Position %in% c("LM", "RM"), "LM/RM",
                 ifelse(fifa$Position %in% c("LAM", "CAM", "RAM"), "AM",
                 ifelse(fifa$Position %in% c("LW", "RW"), "LW/RW",
                 ifelse(fifa$Position %in% c("LF", "CF", "RF"), "CF",
                        "ST")))))))))
fifa$Position <- factor(fifa$Position, levels = c("GK", "CB", "LB/RB", "DM", "CM",
                                                  "LM/RM", "AM", "LW/RW", "CF", "ST"))
levels(fifa$Position)
#> [1] "GK" "CB" "LB/RB" "DM" "CM" "LM/RM" "AM" "LW/RW" "CF"
#> [10] "ST"
#>
#> GK CB LB/RB DM CM LM/RM AM LW/RW CF ST
#> 2025 3088 2778 1439 2180 2219 1000 751 105 2562
#> [1] "Age" "Overall"
#> [3] "Value" "Wage"
#> [5] "International Reputation" "Weak Foot"
#> [7] "Skill Moves" "Position"
#> [9] "Height" "Weight"
#> [11] "LS" "ST"
#> [13] "RS" "LW"
#> [15] "LF" "CF"
#> [17] "RF" "RW"
#> [19] "LAM" "CAM"
#> [21] "RAM" "LM"
#> [23] "LCM" "CM"
#> [25] "RCM" "RM"
#> [27] "LWB" "LDM"
#> [29] "CDM" "RDM"
#> [31] "RWB" "LB"
#> [33] "LCB" "CB"
#> [35] "RCB" "RB"
#> [37] "Crossing" "Finishing"
#> [39] "HeadingAccuracy" "ShortPassing"
#> [41] "Volleys" "Dribbling"
#> [43] "Curve" "FKAccuracy"
#> [45] "LongPassing" "BallControl"
#> [47] "Acceleration" "SprintSpeed"
#> [49] "Agility" "Reactions"
#> [51] "Balance" "ShotPower"
#> [53] "Jumping" "Stamina"
#> [55] "Strength" "LongShots"
#> [57] "Aggression" "Interceptions"
#> [59] "Positioning" "Vision"
#> [61] "Penalties" "Composure"
#> [63] "Marking" "StandingTackle"
#> [65] "SlidingTackle" "GKDiving"
#> [67] "GKHandling" "GKKicking"
#> [69] "GKPositioning" "GKReflexes"
#> [71] "Release Clause"
Next, we collect the names of the position rating variables in position_names, and the names of the attribute (skill) variables in skill_set.
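Judging by the column listing above, the position ratings occupy columns 11 to 36 (LS to RB) and the skill attributes occupy columns 37 to 70 (Crossing to GKReflexes). The sketch below builds the two name vectors from a shortened stand-in for `names(fifa)` so that it runs on its own; the helper `names_between` is an illustrative assumption, not part of the original code.

```r
# Shortened stand-in for names(fifa); the real data frame has 71 columns.
col_names <- c("Age", "Overall", "Position", "LS", "ST", "RB",
               "Crossing", "Finishing", "GKReflexes", "Release Clause")

# Illustrative helper: all names from `first` to `last`, inclusive.
names_between <- function(nms, first, last) {
  nms[match(first, nms):match(last, nms)]
}

position_names <- names_between(col_names, "LS", "RB")
skill_set      <- names_between(col_names, "Crossing", "GKReflexes")

position_names
skill_set
```

On the full data the equivalent would be `names(fifa)[11:36]` and `names(fifa)[37:70]`, if the column order matches the listing.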
NA value Handling
To replace NA values, we can use the mean of each variable.
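# replace the NA values of a numeric vector with the mean of that vector
rNA <- function(x){
  replace_na(data = x, replace = mean(x, na.rm = TRUE))
}
# `range` is defined outside the code shown here; it selects the columns to impute
fifa[,range] <- lapply(fifa[,range], rNA)
colSums(is.na(fifa))
#> Age Overall Value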
#> 0 0 0
#> Wage International Reputation Weak Foot
#> 0 0 0
#> Skill Moves Position Height
#> 0 60 0
#> Weight LS ST
#> 0 0 0
#> RS LW LF
#> 0 0 0
#> CF RF RW
#> 0 0 0
#> LAM CAM RAM
#> 0 0 0
#> LM LCM CM
#> 0 0 0
#> RCM RM LWB
#> 0 0 0
#> LDM CDM RDM
#> 0 0 0
#> RWB LB LCB
#> 0 0 0
#> CB RCB RB
#> 0 0 0
#> Crossing Finishing HeadingAccuracy
#> 0 0 0
#> ShortPassing Volleys Dribbling
#> 0 0 0
#> Curve FKAccuracy LongPassing
#> 0 0 0
#> BallControl Acceleration SprintSpeed
#> 0 0 0
#> Agility Reactions Balance
#> 0 0 0
#> ShotPower Jumping Stamina
#> 0 0 0
#> Strength LongShots Aggression
#> 0 0 0
#> Interceptions Positioning Vision
#> 0 0 0
#> Penalties Composure Marking
#> 0 0 0
#> StandingTackle SlidingTackle GKDiving
#> 0 0 0
#> GKHandling GKKicking GKPositioning
#> 0 0 0
#> GKReflexes Release Clause
#> 0 0
After we are finished with the data preparation, we can create a model for one of the positions first (GK). Filter players based on this position for our model and then get the summary.
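GK_model <- fifa %>%
  filter(Position == "GK") %>%
  select(Overall, all_of(skill_set)) %>%
  # caution: the unnamed -1 is matched to lm()'s `subset` argument, so it drops the
  # first row; a fit without an intercept would need formula = Overall ~ . - 1
  lm(formula = Overall~., -1) %>%
  summary()
GK_model$r.squared
#> [1] 0.9981933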
A very good result: the model has an R-squared of 0.9982.
We can rank the skills in skill_set by their p-values to find the most important ones.
GK_model$coefficients %>%
  as.data.frame() %>%
  rownames_to_column("Skill set") %>%
  arrange(`Pr(>|t|)`)
fifa %>%
  filter(Position == "GK") %>%
  select(Overall, all_of(skill_set)) %>%
  lm(formula = Overall~., -1) %>%
  plot()
First, linear regression needs the relationship between the independent and dependent variables to be linear. It is also important to check for outliers, since linear regression is sensitive to outlier effects. The linearity assumption is best checked with scatter plots; for our GK model, the first diagnostic plot (residuals vs. fitted values) serves this purpose.
From plot 1 we can conclude that the relationship between the predictors and the response in our model is approximately linear.
Secondly, linear regression requires the residuals to be normally distributed. This assumption can be checked with a histogram, a Q-Q plot, or shapiro.test. The Q-Q plot is the second plot of our model.
#>
#> Shapiro-Wilk normality test
#>
#> data: GK_model$residuals
#> W = 0.96115, p-value < 0.00000000000000022
Because the p-value (< 0.00000000000000022) is below 0.05, we reject H0 and conclude that the residuals of the model are not normally distributed.
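The test output above names `GK_model$residuals` as its data, so the call was presumably `shapiro.test()` on the model residuals. The sketch below shows the same kind of check on simulated data (not the FIFA data), with deliberately skewed noise, so that it can be run on its own; all object names are illustrative.

```r
set.seed(42)

# Simulated stand-in for the GK data: a response driven by two predictors
# plus deliberately skewed (non-normal) noise.
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rexp(n)

fit <- lm(y ~ x1 + x2)

# H0: the residuals come from a normal distribution.
sw <- shapiro.test(residuals(fit))
sw$p.value < 0.05   # TRUE means H0 is rejected at the 5% level
```

Note that `shapiro.test()` only accepts between 3 and 5000 observations, which is enough for a single position group here.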
When the data or the residuals are not normally distributed, a non-linear transformation (e.g., a log transformation) might fix the issue.
GK_model_NR <- fifa_NR %>%
  filter(Position == "GK") %>%
  select(Overall, all_of(skill_set)) %>%
  lm(formula = Overall~., -1) %>%
  summary()
GK_model_NR$r.squared
#> [1] 0.9917265
Thirdly, linear regression assumes that there is little or no multicollinearity in the data. Multicollinearity occurs when the independent variables are too highly correlated with each other.
Multicollinearity may be tested with three central criteria:
Correlation matrix – the pairwise correlations between the variables, plotted here with ggcorr().
library(GGally)
data <- fifa %>%
  filter(Position == "GK") %>%
  select(Overall, all_of(skill_set))
ggcorr(data = data, label = TRUE, label_size = 1)
Tolerance – the tolerance measures how much of one independent variable is not explained by the other independent variables; it is calculated from an initial regression of that variable on all the others. Tolerance is defined as T = 1 – R² of this first-step regression. With T < 0.1 there might be multicollinearity in the data and with T < 0.01 there certainly is.
Variance Inflation Factor (VIF) – the variance inflation factor of the linear regression is defined as VIF = 1/T. With VIF > 5 there is an indication that multicollinearity may be present; with VIF > 10 there is certainly multicollinearity among the variables.
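To make the two definitions concrete, the sketch below computes tolerance and VIF by hand: each predictor is regressed on all the others, and the R² of that auxiliary regression gives T = 1 - R² and VIF = 1/T. The data are simulated and the function name `manual_vif` is an illustrative assumption; this is not the internal code of any package.

```r
# Tolerance and VIF from auxiliary regressions (illustrative helper).
manual_vif <- function(predictors) {
  sapply(names(predictors), function(v) {
    # Regress predictor v on all remaining predictors.
    others <- setdiff(names(predictors), v)
    aux <- lm(reformulate(others, response = v), data = predictors)
    tolerance <- 1 - summary(aux)$r.squared
    1 / tolerance
  })
}

set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$x3 <- toy$x1 + rnorm(n, sd = 0.1)   # x3 is almost a copy of x1

vifs <- manual_vif(toy)
round(vifs, 1)
```

In this toy example x1 and x3 should get very large VIFs because they nearly duplicate each other, while x2 should stay close to 1.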
library(car)
fifa %>%
  filter(Position == "GK") %>%
  select(Overall, all_of(skill_set)) %>%
  lm(formula = Overall~., -1) %>%
  vif()
#> Crossing Finishing HeadingAccuracy ShortPassing Volleys
#> 1.523633 2.452322 1.304925 2.025296 2.388864
#> Dribbling Curve FKAccuracy LongPassing BallControl
#> 1.642408 1.658819 1.483780 2.044370 1.612153
#> Acceleration SprintSpeed Agility Reactions Balance
#> 3.845908 3.695664 1.720606 3.307530 1.653543
#> ShotPower Jumping Stamina Strength LongShots
#> 1.271619 1.824729 1.502421 1.279791 2.422697
#> Aggression Interceptions Positioning Vision Penalties
#> 1.279117 2.031089 2.287187 1.406484 1.447428
#> Composure Marking StandingTackle SlidingTackle GKDiving
#> 1.592778 1.118773 1.647257 1.611256 6.655822
#> GKHandling GKKicking GKPositioning GKReflexes
#> 4.791979 2.504367 5.473787 6.705475
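All vif() values are below 10; the largest ones (GKDiving, GKPositioning and GKReflexes) lie between 5 and 7. We can therefore conclude that there is no severe multicollinearity in our linear regression model, although the goalkeeping attributes are moderately correlated with each other.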
The last assumption of the linear regression analysis is homoscedasticity. A scatter plot of the residuals is a good way to check whether the data are homoscedastic (meaning the residual variance is constant along the regression line). We want a model whose residuals have constant variance and do not form a pattern (such as a trumpet shape). If heteroscedasticity is present, a non-linear correction might fix the problem.
To test whether our errors have such a pattern, we can use the bptest function from the lmtest package.
#>
#> studentized Breusch-Pagan test
#>
#> data: GK_model
#> BP = 77.053, df = 34, p-value = 0.00003474
Because the p-value (0.00003474) is below 0.05, we reject H0 and conclude that the residual variance is not constant: the residuals of the model form a trumpet pattern (heteroscedasticity).
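For reference, the studentized form of the Breusch-Pagan statistic can be computed by hand: regress the squared residuals on the model's predictors and multiply the R² of that auxiliary regression by the number of observations. Under H0 the result follows a chi-squared distribution with as many degrees of freedom as there are predictors, which is why the test above reports df = 34. The sketch below does this in base R on simulated data; `manual_bp` is an illustrative name, and in practice `lmtest::bptest()` does the work.

```r
# Studentized Breusch-Pagan statistic by hand (illustrative helper).
manual_bp <- function(fit) {
  X   <- model.matrix(fit)           # predictors, including the intercept
  u2  <- residuals(fit)^2            # squared residuals
  aux <- lm.fit(X, u2)               # auxiliary regression of u^2 on X
  r2  <- 1 - sum(aux$residuals^2) / sum((u2 - mean(u2))^2)
  stat <- length(u2) * r2            # n * R^2
  df   <- ncol(X) - 1                # number of predictors
  c(BP = stat, df = df, p.value = pchisq(stat, df, lower.tail = FALSE))
}

set.seed(7)
n <- 400
x <- runif(n, 1, 10)
y <- 3 + 2 * x + rnorm(n, sd = x)    # noise spread grows with x (trumpet shape)

bp <- manual_bp(lm(y ~ x))
bp
```

A small p-value here, as in the GK model, points to residual variance that changes with the predictors.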
After building a model for the GK position, we can apply the same procedure to all positions. To check the models, we can look at the R-squared value from each model’s summary.
All_models <- fifa$Position %>%
  levels() %>%
  lapply(function(x){
    fifa %>%
      filter(Position == x) %>%
      select(Overall, all_of(skill_set)) %>%
      lm(formula = Overall~., -1) %>%
      summary()
  })
names(All_models) <- levels(fifa$Position)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.9581 0.9747 0.9847 0.9825 0.9921 0.9982
As we can see, the lowest R-squared across all position models is 0.9581, which is close to 1, so the models fit well for every position.
Then we can take the top 5 most important skills for each position to answer the question posed at the beginning.
Most_important_skillset <- All_models %>%
  lapply(
    function(x){
      x$coefficients %>%
        as.data.frame() %>%
        rownames_to_column("Skill set") %>%
        arrange(`Pr(>|t|)`) %>%
        head(5)
    }
  )
names(Most_important_skillset) <- names(All_models)
#> $GK
#> Skill set Estimate Std. Error t value Pr(>|t|)
#> 1 Reactions 0.1092742 0.001277095 85.56460 0
#> 2 GKDiving 0.2124102 0.002403204 88.38623 0
#> 3 GKHandling 0.2149334 0.002115248 101.61148 0
#> 4 GKPositioning 0.2062769 0.001951578 105.69752 0
#> 5 GKReflexes 0.2101123 0.002282277 92.06259 0
#>
#> $CB
#> Skill set Estimate Std. Error t value
#> 1 Strength 0.09642416 0.001860911 51.81557
#> 2 Aggression 0.06614512 0.001493870 44.27770
#> 3 Marking 0.13968667 0.002428166 57.52765
#> 4 HeadingAccuracy 0.09697610 0.002244990 43.19668
#> 5 StandingTackle 0.17727924 0.004220490 42.00442
#> Pr(>|t|)
#> 1 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#> 2 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#> 3 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#> 4 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000152361
#> 5 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010537745239349629179
#>
#> $`LB/RB`
#> Skill set Estimate Std. Error t value
#> 1 Stamina 0.07478285 0.002561499 29.19495
#> 2 Reactions 0.10751807 0.004558713 23.58518
#> 3 Interceptions 0.09757275 0.004714205 20.69761
#> 4 Marking 0.08298898 0.004029750 20.59408
#> 5 Crossing 0.06953957 0.003662402 18.98742
#> Pr(>|t|)
#> 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002140909
#> 2 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003893653615398228991142243193834815428999718278646469116211
#> 3 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000001518099134733602694879300853969539275567512959241867065429687500000000000000000000
#> 4 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000009695138100692928704964212993644423477235250174999237060546875000000000000000000000
#> 5 0.0000000000000000000000000000000000000000000000000000000000000000000000000012543991950466931381297924774642638112709391862154006958007812500000000000000000000000000000000
#>
#> $DM
#> Skill set Estimate Std. Error t value
#> 1 Reactions 0.14306689 0.008655330 16.529341
#> 2 Stamina 0.06647317 0.004138496 16.062157
#> 3 ShortPassing 0.18624239 0.012151630 15.326536
#> 4 BallControl 0.12642063 0.011209752 11.277737
#> 5 Interceptions 0.07927926 0.008213798 9.651961
#> Pr(>|t|)
#> 1 0.00000000000000000000000000000000000000000000000000000003262731
#> 2 0.00000000000000000000000000000000000000000000000000002013782641
#> 3 0.00000000000000000000000000000000000000000000000038553414944030
#> 4 0.00000000000000000000000000026776467958860721042813057835019208
#> 5 0.00000000000000000000219009085877532501266187758837133969791466
#>
#> $CM
#> Skill set Estimate Std. Error t value
#> 1 Stamina 0.06474126 0.003360926 19.26292
#> 2 ShortPassing 0.20343390 0.011185294 18.18762
#> 3 Reactions 0.11569548 0.006466956 17.89025
#> 4 BallControl 0.17690054 0.010461058 16.91039
#> 5 Vision 0.08876844 0.006915003 12.83708
#> Pr(>|t|)
#> 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000002166549
#> 2 0.0000000000000000000000000000000000000000000000000000000000000000000074455910847965
#> 3 0.0000000000000000000000000000000000000000000000000000000000000000007956465472105367
#> 4 0.0000000000000000000000000000000000000000000000000000000000025697485693995304831955
#> 5 0.0000000000000000000000000000000000021460554401539527987798072761194134727702476084
#>
#> $`LM/RM`
#> Skill set Estimate Std. Error t value
#> 1 BallControl 0.16393503 0.006772513 24.20594
#> 2 Stamina 0.05098566 0.002208673 23.08429
#> 3 Crossing 0.07921829 0.003463752 22.87066
#> 4 ShortPassing 0.13347797 0.005859468 22.77988
#> 5 Reactions 0.08148669 0.003773038 21.59711
#> Pr(>|t|)
#> 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007293611
#> 2 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011172735224165716
#> 3 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000586214150410233968
#> 4 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003132883883634506623
#> 5 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000006516053349940478705074053956
#>
#> $AM
#> Skill set Estimate Std. Error t value
#> 1 Reactions 0.1120849 0.007739838 14.481560
#> 2 Vision 0.1163064 0.009023450 12.889347
#> 3 ShortPassing 0.1526780 0.011927551 12.800447
#> 4 Dribbling 0.1217874 0.011888405 10.244215
#> 5 Positioning 0.0741270 0.007824427 9.473794
#> Pr(>|t|)
#> 1 0.0000000000000000000000000000000000000000003780439
#> 2 0.0000000000000000000000000000000000346590126367709
#> 3 0.0000000000000000000000000000000000924272885256413
#> 4 0.0000000000000000000000189936098305544582565129730
#> 5 0.0000000000000000000201835073670912089793989807607
#>
#> $`LW/RW`
#> Skill set Estimate Std. Error t value
#> 1 BallControl 0.17195084 0.010985689 15.65226
#> 2 Finishing 0.10213599 0.006882860 14.83918
#> 3 Positioning 0.10477914 0.007337489 14.27997
#> 4 Reactions 0.08575898 0.006243005 13.73681
#> 5 ShortPassing 0.12773946 0.009414014 13.56907
#> Pr(>|t|)
#> 1 0.00000000000000000000000000000000000000000000001054368
#> 2 0.00000000000000000000000000000000000000000012664176306
#> 3 0.00000000000000000000000000000000000000006957978885332
#> 4 0.00000000000000000000000000000000000002806918100562673
#> 5 0.00000000000000000000000000000000000017430490075182905
#>
#> $CF
#> Skill set Estimate Std. Error t value Pr(>|t|)
#> 1 Positioning 0.15957269 0.02663231 5.991695 0.00000008437078
#> 2 Strength 0.04595170 0.01257264 3.654898 0.00049770594943
#> 3 BallControl 0.17151889 0.04841962 3.542343 0.00071610639909
#> 4 Reactions 0.09439768 0.02701076 3.494817 0.00083337038872
#> 5 SprintSpeed 0.07997774 0.02378144 3.363032 0.00126105749835
#>
#> $ST
#> Skill set Estimate Std. Error t value
#> 1 Finishing 0.17795207 0.003834533 46.40775
#> 2 Positioning 0.13287327 0.003223029 41.22621
#> 3 ShotPower 0.10132496 0.002908205 34.84106
#> 4 HeadingAccuracy 0.07810085 0.002305287 33.87901
#> 5 Reactions 0.08637564 0.002878151 30.01081
#> Pr(>|t|)
#> 1 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#> 2 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001494216
#> 3 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000015698962394855998146812187110654690513911191374063491821289062500000000000
#> 4 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000097102277411803981565716070711857810238143429160118103027343750000000000000000000000
#> 5 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000016724219068233278782248385496700393559876829385757446289062500000000000000000000000000000000000000000000000000000000000000
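When several p-values underflow to exactly 0, as in the GK table, sorting by p-value cannot separate them. Within one model, a smaller two-sided p-value corresponds to a larger absolute t value, so sorting by |t| gives the same ranking without ties. The sketch below shows this on three GK rows copied by hand from the output above; the object names are illustrative.

```r
# Three rows copied from the GK coefficient table above.
coefs <- data.frame(
  `Skill set` = c("Reactions", "GKDiving", "GKPositioning"),
  `t value`   = c(85.56460, 88.38623, 105.69752),
  `Pr(>|t|)`  = c(0, 0, 0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Largest absolute t value first.
ranked <- coefs[order(-abs(coefs$`t value`)), ]
ranked$`Skill set`
```

In the dplyr pipeline, the equivalent change would be `arrange(desc(abs(`t value`)))` in place of the p-value sort.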
To make the results easier to compare, I will store them in a new data frame.
# one row per position, holding the names of its top 5 skills
b <- do.call(rbind, lapply(Most_important_skillset, function(x) x$`Skill set`))
b <- as.data.frame(b)
# `a` is defined outside the code shown here; it presumably holds the position names
top5 <- as.data.frame(cbind(a, b))
names(top5) <- c("Position", "Attribute1", "Attribute2",
                 "Attribute3", "Attribute4", "Attribute5")
Now we have the top 5 most important attributes/skills for each position in the FIFA 19 dataset.