First, I load the libraries needed for the analysis.
library(knitr)
library(readxl)
library(readr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(FactoMineR)
## Warning: package 'FactoMineR' was built under R version 4.4.3
library(factoextra)
## Warning: package 'factoextra' was built under R version 4.4.3
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.4.3
## corrplot 0.95 loaded
Next, I state the objective of the analysis.
The objective is to determine how much dragon and baron kills matter for the gold earned by JUNGLE players, compared with players in all positions taken together. The comparison is useful because JUNGLE is the only position that focuses on killing dragons and barons, so it should be distinguishable from the rest.
To do so, I perform two PCAs that show how the different game variables relate to the gold a player earns. The key variable is therefore ‘gold_earned’, and the other variables of main interest are ‘baron_kills’ and ‘dragon_kills’.
The first step is to choose the variables to keep for the analysis. From the data_games dataset I keep ‘kills’, ‘deaths’, ‘assists’, ‘baron_kills’, ‘dragon_kills’, ‘gold_earned’, ‘total_damage_dealt’ and ‘total_damage_taken’, which describe the player’s performance in the game. From the data_champions dataset I keep ‘damage’, ‘toughness’, ‘control’, ‘mobility’ and ‘utility’, which give the stats of the character used by the player.
df_full_cleaned <- read.csv("D:/UPV/2º/Proyecto II/lol_dataset.csv")
View(df_full_cleaned)
variables_a_conservar <- c(
  # Player performance in the game
  "kills", "deaths", "assists", "baron_kills", "dragon_kills", "gold_earned",
  "total_damage_dealt", "total_damage_taken",
  # Stats of the character used by the player
  "damage", "toughness", "control", "mobility", "utility"
)
df_useful <- df_full_cleaned %>% select(all_of(variables_a_conservar))
df_scaled <- as.data.frame(scale(df_useful))
View(df_scaled)
df_jungle <- df_full_cleaned[df_full_cleaned$team_position == "JUNGLE", ]
df_jungle <- df_jungle %>% select(all_of(variables_a_conservar))
df_jungle_scaled <- as.data.frame(scale(df_jungle))
View(df_jungle_scaled)
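The standardization step above can be sanity-checked on a small example. The sketch below uses a made-up data frame (`toy` and the names derived from it are hypothetical, and the values are not taken from lol_dataset.csv). It only illustrates that `scale()` centers each column to mean 0 and rescales it to standard deviation 1, and that a subset scaled on its own is standardized against its own mean and spread rather than those of the full data.

```r
# Toy data: made-up values for illustration, not taken from lol_dataset.csv
toy <- data.frame(
  team_position = c("JUNGLE", "OTHER", "JUNGLE", "OTHER", "JUNGLE", "OTHER"),
  gold_earned   = c(9000, 11000, 12500, 10000, 8000, 13000),
  dragon_kills  = c(2, 0, 3, 0, 1, 0)
)

num_cols <- c("gold_earned", "dragon_kills")

# Standardize all rows together
toy_scaled <- as.data.frame(scale(toy[, num_cols]))

# Standardize the JUNGLE rows on their own
toy_jungle <- toy[toy$team_position == "JUNGLE", num_cols]
toy_jungle_scaled <- as.data.frame(scale(toy_jungle))

# Each column now has mean 0 and standard deviation 1 in both versions
round(colMeans(toy_scaled), 10)
sapply(toy_scaled, sd)
round(colMeans(toy_jungle_scaled), 10)
sapply(toy_jungle_scaled, sd)
```

Because each data frame is scaled separately, a standardized value in the JUNGLE version is measured relative to other JUNGLE rows only.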
The first PCA uses the data frame df_scaled, which covers all positions.
pca_result <- PCA(df_scaled, graph = FALSE)
eig.val <- get_eigenvalue(pca_result)
fviz_eig(pca_result, addlabels = TRUE, ylim = c(0, 100))
As the scree plot shows, the first two principal components stand out in terms of explained variability, but together they do not explain a large enough percentage to discard the others. I therefore keep the first 5 principal components, which explain 74% of the variability.
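The choice of how many components to keep can also be read off the cumulative explained variance. The following is a minimal sketch using base R's `prcomp()` on the built-in `mtcars` dataset as a stand-in for the game data; the 70% `target` is a hypothetical threshold chosen for illustration, not the criterion used in this report.

```r
# Stand-in data: the built-in mtcars dataset, not the League of Legends data
pca_demo <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
eigenvalues <- pca_demo$sdev^2
pct_var <- 100 * eigenvalues / sum(eigenvalues)
cum_var <- cumsum(pct_var)

# Smallest number of components whose cumulative variance reaches the target
target <- 70  # hypothetical threshold, in percent
n_comp <- which(cum_var >= target)[1]

data.frame(
  component = seq_along(pct_var),
  pct_var   = round(pct_var, 1),
  cum_var   = round(cum_var, 1)
)
n_comp
```

For the real data, the table returned by `get_eigenvalue()` holds the same kind of information (eigenvalue, percentage of variance, and cumulative percentage per component).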
fviz_pca_var(pca_result,
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
fviz_pca_var(pca_result, axes = c(3,4),
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
fviz_pca_var(pca_result, axes = c(4,5),
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
var_cargas <- pca_result$var$coord
var_contrib <- pca_result$var$contrib
fviz_contrib(pca_result, choice = "var", axes = 1)
fviz_contrib(pca_result, choice = "var", axes = 2)
fviz_contrib(pca_result, choice = "var", axes = 3)
fviz_contrib(pca_result, choice = "var", axes = 4)
fviz_contrib(pca_result, choice = "var", axes = 5)
matriz_covarianza <- cov(df_scaled)
corrplot(matriz_covarianza, method = "color", tl.col = "black")
View(matriz_covarianza)
As both the variable contributions and the covariance matrix show, the relationship of gold_earned with baron_kills and with dragon_kills is weak when all positions are considered. None of the components in which gold_earned stands out also features baron_kills or dragon_kills prominently. Because the variables were standardized, the entries of the covariance matrix are Pearson correlations: the value between baron_kills and gold_earned is 0.26340664, and between dragon_kills and gold_earned it is 0.22341106.
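The covariance values quoted here can be read as correlations because the data were standardized first. A minimal sketch of that equivalence, using three columns of the built-in `mtcars` dataset as a stand-in for the game data:

```r
# Stand-in data: built-in mtcars columns, not the game data
x <- mtcars[, c("mpg", "hp", "wt")]

cov_of_scaled <- cov(scale(x))  # covariance after standardizing
cor_of_raw    <- cor(x)         # Pearson correlation of the raw columns

# The two matrices agree up to floating-point error
all.equal(cov_of_scaled, cor_of_raw)

# The diagonal of the covariance matrix of standardized data is all ones
diag(cov_of_scaled)
```

The same reasoning suggests that calling `cor()` on the unscaled data frame would give the same matrix as `cov()` on the scaled one.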
We now perform the PCA on the data frame df_jungle_scaled, which contains only the observations of JUNGLE players.
pca_result2 <- PCA(df_jungle_scaled, graph = FALSE)
eig.val2 <- get_eigenvalue(pca_result2)
fviz_eig(pca_result2, addlabels = TRUE, ylim = c(0, 100))
As this scree plot shows, the first principal components explain a slightly larger share of the variance and the fourth and fifth a slightly smaller one, so I again keep 5 principal components, which explain 76.3% of the total variability.
fviz_pca_var(pca_result2,
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
fviz_pca_var(pca_result2, axes = c(3,4),
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
fviz_pca_var(pca_result2, axes = c(4,5), # JUNGLE PCA, not the all-positions one
             col.var = "contrib", # Color by contribution
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE)
var_cargas2 <- pca_result2$var$coord
var_contrib2 <- pca_result2$var$contrib
fviz_contrib(pca_result2, choice = "var", axes = 1)
fviz_contrib(pca_result2, choice = "var", axes = 2)
fviz_contrib(pca_result2, choice = "var", axes = 3)
fviz_contrib(pca_result2, choice = "var", axes = 4)
fviz_contrib(pca_result2, choice = "var", axes = 5)
matriz_covarianza2 <- cov(df_jungle_scaled)
corrplot(matriz_covarianza2, method = "color", tl.col = "black")
View(matriz_covarianza2)
As both the variable contributions and the covariance matrix show, the association of gold_earned with baron_kills and with dragon_kills is visibly stronger for JUNGLE players. In the first component, gold_earned stands out with 22% of the contribution, while baron_kills and dragon_kills each contribute around 8%. The corresponding entries of the covariance matrix have also increased: the value between baron_kills and gold_earned is now 0.45857103, and between dragon_kills and gold_earned it is 0.43702390.
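To put contribution percentages like these in context, it helps to see how they are computed and what a uniform contribution would be. The sketch below is an illustration with base R's `prcomp()` on the built-in `mtcars` dataset, not the FactoMineR output of this report. It computes each variable's contribution as its squared coordinate divided by the component's eigenvalue. With the 13 variables used in this analysis, a uniform contribution would be 100 / 13, about 7.7%.

```r
# Stand-in data: the built-in mtcars dataset, not the game data
pca_demo <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Variable coordinates: loadings scaled by the component standard deviations
coord <- sweep(pca_demo$rotation, 2, pca_demo$sdev, FUN = "*")

# Contribution (%) of each variable to each component:
# squared coordinate divided by the component's eigenvalue
contrib <- 100 * sweep(coord^2, 2, pca_demo$sdev^2, FUN = "/")

# Contributions to any one component add up to 100
colSums(contrib)

# With p variables, a uniform contribution would be 100 / p
uniform <- 100 / ncol(mtcars)
sort(contrib[, "PC1"], decreasing = TRUE)
uniform
```

Variables whose contribution sits well above the uniform level are the ones that characterize a component most.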
These results show that dragon and baron kills are more strongly associated with gold earned for players in the JUNGLE position than for players overall, which is consistent with securing dragons and barons being one of the jungler's main tasks in a game.
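The comparison between all positions and JUNGLE only can be laid out side by side. The sketch below is one possible way to do it: `cor_with_target()` is a hypothetical helper, and the data are simulated purely so the example runs on its own, so the printed numbers say nothing about the real games.

```r
# Hypothetical helper: correlation of selected columns with a target column
cor_with_target <- function(df, target, vars) {
  sapply(vars, function(v) cor(df[[v]], df[[target]]))
}

# Synthetic data for illustration only; values are simulated, not real results
set.seed(1)
n <- 200
toy <- data.frame(
  team_position = rep(c("JUNGLE", "OTHER"), each = n / 2),
  dragon_kills  = rpois(n, lambda = 1),
  baron_kills   = rpois(n, lambda = 0.5)
)
toy$gold_earned <- 10000 + 500 * toy$dragon_kills + 800 * toy$baron_kills +
  rnorm(n, sd = 1500)

vars <- c("baron_kills", "dragon_kills")

# One row per group, one column per variable
comparison <- rbind(
  all_positions = cor_with_target(toy, "gold_earned", vars),
  jungle_only   = cor_with_target(toy[toy$team_position == "JUNGLE", ],
                                  "gold_earned", vars)
)
round(comparison, 3)
```

With the real data, the same helper could be applied to df_useful and df_jungle to reproduce the four values quoted in the text in a single table.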