LPL 2021 Spring - Player Stats (regular season)

This dataset collects statistics from the LPL 2021 Spring regular season, broken down by professional player. It covers players who appeared on stage at least five times, which excludes substitute players. For this project, I also bring in a second dataset covering the playoffs in order to answer my question.

Import the data

#set up the environment and load the libraries we might use
library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(tinytex)
library(treemap)
library(RColorBrewer)


setwd("~/Documents/DATA 110/data")
regular = read_csv("LPL 2021 Spring - Player Stats - OraclesElixir.csv")

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   Player = col_character(),
##   Team = col_character(),
##   Pos = col_character(),
##   `W%` = col_character(),
##   `CTR%` = col_character(),
##   KP = col_character(),
##   `KS%` = col_character(),
##   `DTH%` = col_character(),
##   `FB%` = col_character(),
##   `CS%P15` = col_character(),
##   `DMG%` = col_character(),
##   `GOLD%` = col_character()
## )
## ℹ Use `spec()` for the full column specifications.

playoff = read_csv("LPL 2021 Spring Playoffs - Player Stats - OraclesElixir.csv")

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   Player = col_character(),
##   Team = col_character(),
##   Pos = col_character(),
##   `W%` = col_character(),
##   `CTR%` = col_character(),
##   KP = col_character(),
##   `KS%` = col_character(),
##   `DTH%` = col_character(),
##   `FB%` = col_character(),
##   `CS%P15` = col_character(),
##   `DMG%` = col_character(),
##   `GOLD%` = col_character()
## )
## ℹ Use `spec()` for the full column specifications.

Review the dataset and make some appropriate adjustments.

#display the structure of the dataset 
str(regular)

## spec_tbl_df [113 × 25] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Player: chr [1:113] "Baolan" "Crisp" "Zhuo" "Lucas" ...
##  $ Team  : chr [1:113] "Invictus Gaming" "FunPlus Phoenix" "Top Esports" "Invictus Gaming" ...
##  $ Pos   : chr [1:113] "Support" "Support" "Support" "Support" ...
##  $ GP    : num [1:113] 13 35 35 23 17 25 36 35 37 21 ...
##  $ W%    : chr [1:113] "54%" "69%" "71%" "61%" ...
##  $ CTR%  : chr [1:113] "62%" "46%" "51%" "43%" ...
##  $ K     : num [1:113] 12 28 24 13 7 19 39 17 29 9 ...
##  $ D     : num [1:113] 45 103 77 62 54 95 95 81 93 95 ...
##  $ A     : num [1:113] 112 352 346 206 137 177 281 270 386 130 ...
##  $ KDA   : num [1:113] 2.8 3.7 4.8 3.5 2.7 2.1 3.4 3.5 4.5 1.5 ...
##  $ KP    : chr [1:113] "61.4%" "66.9%" "63.5%" "65.0%" ...
##  $ KS%   : chr [1:113] "5.9%" "4.9%" "4.1%" "3.9%" ...
##  $ DTH%  : chr [1:113] "25.1%" "23.6%" "20.9%" "21.8%" ...
##  $ FB%   : chr [1:113] "38%" "37%" "34%" "17%" ...
##  $ GD10  : num [1:113] -108 42 87 77 -21 -64 83 59 37 -131 ...
##  $ XPD10 : num [1:113] -97 -101 18 129 -23 2 -34 34 21 -108 ...
##  $ CSD10 : num [1:113] -0.8 -2.9 0.5 0.9 -0.4 1.2 0.3 -1.7 0.6 -0.9 ...
##  $ CSPM  : num [1:113] 1.2 1.1 1.3 1.1 1.2 1.2 1.2 1.1 1.2 1.1 ...
##  $ CS%P15: chr [1:113] "2.7%" "2.6%" "3.3%" "2.6%" ...
##  $ DPM   : num [1:113] 125 142 139 143 133 122 150 130 149 139 ...
##  $ DMG%  : chr [1:113] "6.2%" "6.3%" "6.4%" "6.8%" ...
##  $ EGPM  : num [1:113] 108 116 113 107 101 99 104 99 114 89 ...
##  $ GOLD% : chr [1:113] "9.1%" "9.0%" "8.9%" "9.0%" ...
##  $ WPM   : num [1:113] 1.55 1.59 1.65 1.64 1.56 1.56 1.76 1.96 1.71 1.62 ...
##  $ WCPM  : num [1:113] 0.35 0.32 0.5 0.32 0.45 0.38 0.4 0.44 0.37 0.23 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Player = col_character(),
##   ..   Team = col_character(),
##   ..   Pos = col_character(),
##   ..   GP = col_double(),
##   ..   `W%` = col_character(),
##   ..   `CTR%` = col_character(),
##   ..   K = col_double(),
##   ..   D = col_double(),
##   ..   A = col_double(),
##   ..   KDA = col_double(),
##   ..   KP = col_character(),
##   ..   `KS%` = col_character(),
##   ..   `DTH%` = col_character(),
##   ..   `FB%` = col_character(),
##   ..   GD10 = col_double(),
##   ..   XPD10 = col_double(),
##   ..   CSD10 = col_double(),
##   ..   CSPM = col_double(),
##   ..   `CS%P15` = col_character(),
##   ..   DPM = col_double(),
##   ..   `DMG%` = col_character(),
##   ..   EGPM = col_double(),
##   ..   `GOLD%` = col_character(),
##   ..   WPM = col_double(),
##   ..   WCPM = col_double()
##   .. )

str(playoff)

## spec_tbl_df [50 × 25] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Player: chr [1:50] "369" "Angel" "beishang" "Bin" ...
##  $ Team  : chr [1:50] "Top Esports" "Suning" "Team WE" "Suning" ...
##  $ Pos   : chr [1:50] "Top" "Middle" "Jungle" "Top" ...
##  $ GP    : num [1:50] 12 10 3 10 3 20 17 9 20 3 ...
##  $ W%    : chr [1:50] "42%" "70%" "0%" "70%" ...
##  $ CTR%  : chr [1:50] "50%" "60%" "0%" "70%" ...
##  $ K     : num [1:50] 47 31 2 37 0 13 56 16 53 3 ...
##  $ D     : num [1:50] 35 13 9 15 8 56 49 22 42 7 ...
##  $ A     : num [1:50] 72 86 5 43 4 157 119 38 140 2 ...
##  $ KDA   : num [1:50] 3.4 9 0.8 5.3 0.5 3 3.6 2.5 4.6 0.7 ...
##  $ KP    : chr [1:50] "57.8%" "71.8%" "100.0%" "49.1%" ...
##  $ KS%   : chr [1:50] "22.8%" "19.0%" "28.6%" "22.7%" ...
##  $ DTH%  : chr [1:50] "17.8%" "14.0%" "22.0%" "16.1%" ...
##  $ FB%   : chr [1:50] "8%" "40%" "33%" "10%" ...
##  $ GD10  : num [1:50] 351 -78 -19 -19 79 -41 50 -301 -97 -582 ...
##  $ XPD10 : num [1:50] 504 -54 -446 -139 48 -52 78 -105 -115 -124 ...
##  $ CSD10 : num [1:50] 10.9 -4.5 -9.3 -2.3 5 4.5 -0.2 -2.2 -2.2 -11.7 ...
##  $ CSPM  : num [1:50] 8 7.2 5.7 8 8.7 1.7 8.2 7.9 9.1 9.5 ...
##  $ CS%P15: chr [1:50] "24.1%" "16.9%" "13.6%" "25.6%" ...
##  $ DPM   : num [1:50] 570 413 160 427 175 119 486 448 473 384 ...
##  $ DMG%  : chr [1:50] "24.5%" "19.9%" "12.3%" "20.6%" ...
##  $ EGPM  : num [1:50] 283 234 156 265 201 112 256 233 282 235 ...
##  $ GOLD% : chr [1:50] "23.8%" "19.9%" "17.6%" "22.2%" ...
##  $ WPM   : num [1:50] 0.36 0.4 0.23 0.52 0.42 1.51 0.41 0.47 0.49 0.48 ...
##  $ WCPM  : num [1:50] 0.21 0.18 0.63 0.14 0.21 0.28 0.19 0.15 0.18 0.17 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Player = col_character(),
##   ..   Team = col_character(),
##   ..   Pos = col_character(),
##   ..   GP = col_double(),
##   ..   `W%` = col_character(),
##   ..   `CTR%` = col_character(),
##   ..   K = col_double(),
##   ..   D = col_double(),
##   ..   A = col_double(),
##   ..   KDA = col_double(),
##   ..   KP = col_character(),
##   ..   `KS%` = col_character(),
##   ..   `DTH%` = col_character(),
##   ..   `FB%` = col_character(),
##   ..   GD10 = col_double(),
##   ..   XPD10 = col_double(),
##   ..   CSD10 = col_double(),
##   ..   CSPM = col_double(),
##   ..   `CS%P15` = col_character(),
##   ..   DPM = col_double(),
##   ..   `DMG%` = col_character(),
##   ..   EGPM = col_double(),
##   ..   `GOLD%` = col_character(),
##   ..   WPM = col_double(),
##   ..   WCPM = col_double()
##   .. )

#clean the dataset
names(regular) <- tolower(names(regular))
names(regular) <- gsub(" ","",names(regular))
str(regular)

## spec_tbl_df [113 × 25] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ player: chr [1:113] "Baolan" "Crisp" "Zhuo" "Lucas" ...
##  $ team  : chr [1:113] "Invictus Gaming" "FunPlus Phoenix" "Top Esports" "Invictus Gaming" ...
##  $ pos   : chr [1:113] "Support" "Support" "Support" "Support" ...
##  $ gp    : num [1:113] 13 35 35 23 17 25 36 35 37 21 ...
##  $ w%    : chr [1:113] "54%" "69%" "71%" "61%" ...
##  $ ctr%  : chr [1:113] "62%" "46%" "51%" "43%" ...
##  $ k     : num [1:113] 12 28 24 13 7 19 39 17 29 9 ...
##  $ d     : num [1:113] 45 103 77 62 54 95 95 81 93 95 ...
##  $ a     : num [1:113] 112 352 346 206 137 177 281 270 386 130 ...
##  $ kda   : num [1:113] 2.8 3.7 4.8 3.5 2.7 2.1 3.4 3.5 4.5 1.5 ...
##  $ kp    : chr [1:113] "61.4%" "66.9%" "63.5%" "65.0%" ...
##  $ ks%   : chr [1:113] "5.9%" "4.9%" "4.1%" "3.9%" ...
##  $ dth%  : chr [1:113] "25.1%" "23.6%" "20.9%" "21.8%" ...
##  $ fb%   : chr [1:113] "38%" "37%" "34%" "17%" ...
##  $ gd10  : num [1:113] -108 42 87 77 -21 -64 83 59 37 -131 ...
##  $ xpd10 : num [1:113] -97 -101 18 129 -23 2 -34 34 21 -108 ...
##  $ csd10 : num [1:113] -0.8 -2.9 0.5 0.9 -0.4 1.2 0.3 -1.7 0.6 -0.9 ...
##  $ cspm  : num [1:113] 1.2 1.1 1.3 1.1 1.2 1.2 1.2 1.1 1.2 1.1 ...
##  $ cs%p15: chr [1:113] "2.7%" "2.6%" "3.3%" "2.6%" ...
##  $ dpm   : num [1:113] 125 142 139 143 133 122 150 130 149 139 ...
##  $ dmg%  : chr [1:113] "6.2%" "6.3%" "6.4%" "6.8%" ...
##  $ egpm  : num [1:113] 108 116 113 107 101 99 104 99 114 89 ...
##  $ gold% : chr [1:113] "9.1%" "9.0%" "8.9%" "9.0%" ...
##  $ wpm   : num [1:113] 1.55 1.59 1.65 1.64 1.56 1.56 1.76 1.96 1.71 1.62 ...
##  $ wcpm  : num [1:113] 0.35 0.32 0.5 0.32 0.45 0.38 0.4 0.44 0.37 0.23 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Player = col_character(),
##   ..   Team = col_character(),
##   ..   Pos = col_character(),
##   ..   GP = col_double(),
##   ..   `W%` = col_character(),
##   ..   `CTR%` = col_character(),
##   ..   K = col_double(),
##   ..   D = col_double(),
##   ..   A = col_double(),
##   ..   KDA = col_double(),
##   ..   KP = col_character(),
##   ..   `KS%` = col_character(),
##   ..   `DTH%` = col_character(),
##   ..   `FB%` = col_character(),
##   ..   GD10 = col_double(),
##   ..   XPD10 = col_double(),
##   ..   CSD10 = col_double(),
##   ..   CSPM = col_double(),
##   ..   `CS%P15` = col_character(),
##   ..   DPM = col_double(),
##   ..   `DMG%` = col_character(),
##   ..   EGPM = col_double(),
##   ..   `GOLD%` = col_character(),
##   ..   WPM = col_double(),
##   ..   WCPM = col_double()
##   .. )

#clean the dataset
names(playoff) <- tolower(names(playoff))
names(playoff) <- gsub(" ","",names(playoff))
str(playoff)

## spec_tbl_df [50 × 25] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ player: chr [1:50] "369" "Angel" "beishang" "Bin" ...
##  $ team  : chr [1:50] "Top Esports" "Suning" "Team WE" "Suning" ...
##  $ pos   : chr [1:50] "Top" "Middle" "Jungle" "Top" ...
##  $ gp    : num [1:50] 12 10 3 10 3 20 17 9 20 3 ...
##  $ w%    : chr [1:50] "42%" "70%" "0%" "70%" ...
##  $ ctr%  : chr [1:50] "50%" "60%" "0%" "70%" ...
##  $ k     : num [1:50] 47 31 2 37 0 13 56 16 53 3 ...
##  $ d     : num [1:50] 35 13 9 15 8 56 49 22 42 7 ...
##  $ a     : num [1:50] 72 86 5 43 4 157 119 38 140 2 ...
##  $ kda   : num [1:50] 3.4 9 0.8 5.3 0.5 3 3.6 2.5 4.6 0.7 ...
##  $ kp    : chr [1:50] "57.8%" "71.8%" "100.0%" "49.1%" ...
##  $ ks%   : chr [1:50] "22.8%" "19.0%" "28.6%" "22.7%" ...
##  $ dth%  : chr [1:50] "17.8%" "14.0%" "22.0%" "16.1%" ...
##  $ fb%   : chr [1:50] "8%" "40%" "33%" "10%" ...
##  $ gd10  : num [1:50] 351 -78 -19 -19 79 -41 50 -301 -97 -582 ...
##  $ xpd10 : num [1:50] 504 -54 -446 -139 48 -52 78 -105 -115 -124 ...
##  $ csd10 : num [1:50] 10.9 -4.5 -9.3 -2.3 5 4.5 -0.2 -2.2 -2.2 -11.7 ...
##  $ cspm  : num [1:50] 8 7.2 5.7 8 8.7 1.7 8.2 7.9 9.1 9.5 ...
##  $ cs%p15: chr [1:50] "24.1%" "16.9%" "13.6%" "25.6%" ...
##  $ dpm   : num [1:50] 570 413 160 427 175 119 486 448 473 384 ...
##  $ dmg%  : chr [1:50] "24.5%" "19.9%" "12.3%" "20.6%" ...
##  $ egpm  : num [1:50] 283 234 156 265 201 112 256 233 282 235 ...
##  $ gold% : chr [1:50] "23.8%" "19.9%" "17.6%" "22.2%" ...
##  $ wpm   : num [1:50] 0.36 0.4 0.23 0.52 0.42 1.51 0.41 0.47 0.49 0.48 ...
##  $ wcpm  : num [1:50] 0.21 0.18 0.63 0.14 0.21 0.28 0.19 0.15 0.18 0.17 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Player = col_character(),
##   ..   Team = col_character(),
##   ..   Pos = col_character(),
##   ..   GP = col_double(),
##   ..   `W%` = col_character(),
##   ..   `CTR%` = col_character(),
##   ..   K = col_double(),
##   ..   D = col_double(),
##   ..   A = col_double(),
##   ..   KDA = col_double(),
##   ..   KP = col_character(),
##   ..   `KS%` = col_character(),
##   ..   `DTH%` = col_character(),
##   ..   `FB%` = col_character(),
##   ..   GD10 = col_double(),
##   ..   XPD10 = col_double(),
##   ..   CSD10 = col_double(),
##   ..   CSPM = col_double(),
##   ..   `CS%P15` = col_character(),
##   ..   DPM = col_double(),
##   ..   `DMG%` = col_character(),
##   ..   EGPM = col_double(),
##   ..   `GOLD%` = col_character(),
##   ..   WPM = col_double(),
##   ..   WCPM = col_double()
##   .. )

#convert the data from char -> numeric
regular$`gold%` = as.numeric(sub("%","",regular$`gold%`))
regular$`dmg%` = as.numeric(sub("%","",regular$`dmg%`))

playoff$`gold%` = as.numeric(sub("%","",playoff$`gold%`))
playoff$`dmg%` = as.numeric(sub("%","",playoff$`dmg%`))
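
The four conversions above follow the same pattern, so they could be folded into one helper. The following is a minimal sketch in base R and is not part of the original analysis: `percent_to_number` and the toy data frame `demo` are hypothetical names, and the values in `demo` are only stand-ins for the real tables.

```r
# Hypothetical helper: strip a trailing "%" and convert to numeric.
percent_to_number <- function(x) {
  as.numeric(sub("%$", "", x))
}

# Toy data standing in for the real tables (illustrative values only).
demo <- data.frame(
  player  = c("A", "B"),
  `gold%` = c("23.8%", "9.1%"),
  `dmg%`  = c("24.5%", "6.2%"),
  kda     = c(3.4, 2.8),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Find the character columns in which every value ends in "%".
is_percent <- vapply(
  demo,
  function(col) is.character(col) && all(grepl("%$", col)),
  logical(1)
)

# Convert those columns in place; all other columns are left untouched.
demo[is_percent] <- lapply(demo[is_percent], percent_to_number)
str(demo)
```

Applied to the real tables, this approach would also convert the other percentage columns (such as `kp` or `fb%`), which may or may not be wanted. A column containing missing values would be skipped by the `all(grepl(...))` check.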

Narrow the data down to the relevant columns

Since I want to find the conversion rate between gold (a player's share of the team's gold) and damage (a player's share of the team's damage dealt to the enemy), the columns we need are player, team, pos, gold%, dmg%, and kda.
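
One way to make the "conversion rate" concrete is to divide each player's damage share by their gold share. This is a sketch under that assumption, since the analysis itself compares the two columns directly rather than computing a ratio. The column names `conversion` and `converts` and the data frame `toy` are hypothetical; the three rows are copied from the regular-season table printed below.

```r
# Three regular-season ADC rows copied from the printed table.
toy <- data.frame(
  player  = c("huanfeng", "Viper", "JackeyLove"),
  `gold%` = c(28.8, 28.0, 27.2),
  `dmg%`  = c(26.9, 28.6, 27.7),
  check.names = FALSE
)

# Assumed definition: damage share divided by gold share.
# A value of 1 or more means the player deals at least as large a share
# of the team's damage as the share of gold they receive.
toy$conversion <- toy$`dmg%` / toy$`gold%`
toy$converts   <- toy$conversion >= 1

print(toy)
```

With this definition, the `dmg% >= gold%` condition used to filter the MVP plots is the same as keeping rows where `conversion >= 1`.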

#extract the data we might use
regular2 = regular %>%
  select(player,team,pos,`gold%`,`dmg%`,kda)%>%
  group_by(team)%>%
  arrange(desc(`gold%`))
regular2

## # A tibble: 113 x 6
## # Groups:   team [17]
##    player     team                pos   `gold%` `dmg%`   kda
##    <chr>      <chr>               <chr>   <dbl>  <dbl> <dbl>
##  1 huanfeng   Suning              ADC      28.8   26.9   8.5
##  2 Light      LNG Esports         ADC      28.5   24.9   4.4
##  3 Kramer     LGD Gaming          ADC      28     25.9   2.8
##  4 Eric       Oh My God           ADC      28     26.6   2.7
##  5 Viper      EDward Gaming       ADC      28     28.6   6.2
##  6 GALA       Royal Never Give Up ADC      27.7   25.3   7.2
##  7 kelin      Rogue Warriors      ADC      27.3   25.3   2.5
##  8 JackeyLove Top Esports         ADC      27.2   27.7   4.7
##  9 SamD       ThunderTalk Gaming  ADC      27.1   24.3   2.4
## 10 Puff       Invictus Gaming     ADC      26.8   20.6   4  
## # … with 103 more rows

Use the treemap to display the relationship between dmg and gold. Also identify which team has the highest conversion rate between dmg and gold.

#use the treemap to display the relationship between dmg and gold. Also identify which team has the highest conversion rate between dmg and gold.
treemap(regular2, index="team", vSize="dmg%", 
        vColor="gold%", type="manual", 
        palette="RdYlBu")

Use the treemap to display the relationship between dmg and gold.

Also identify which player has the highest conversion rate between dmg and gold.

#use the treemap to display the relationship between dmg and gold. Also identify which player has the highest conversion rate between dmg and gold.
treemap(regular2, index="player", vSize="dmg%", 
        vColor="gold%", type="manual", 
        palette="RdYlBu")

Conversion rate of each Position

Display the distribution of the conversion between gold and damage in each position

# Display the distribution of the conversion between gold and damage in each position
a <-regular2 %>% 
  ggplot(., aes(`gold%`, `dmg%`))+
  geom_point()+
  aes(color = pos)+
  facet_wrap(~pos)+
  ggtitle("Conversion rate of each Position ") +
  xlab("Gold")+
  ylab("Damage")
a

Most valuable player

# display the MVP candidates as points
regular3 = regular2 %>%
  filter(pos == "ADC" & `dmg%` >= `gold%`) %>%
  # backticks refer to the column; a quoted "dmg%" would sort by a constant string
  arrange(desc(`dmg%`)) %>%
  ggplot(., aes(`gold%`, `dmg%`)) +
  geom_point(aes(color = player, size = kda), shape = 19, alpha = 0.5) +
  ggtitle("MVP") +
  xlab("The proportion of the gold") +
  ylab("The damage to the enemy")
regular3

What is the difference between the regular season and the playoff season?

Having watched the entire playoff season as an avid spectator, I noticed that many teams shifted their strategy from the bottom lane to the top lane because of a change in the game version. I therefore expect the conversion rate in the top position to be higher than in the bottom and middle lanes.

#extract the data we might use
playoff2 = playoff %>%
  select(player,team,pos,`gold%`,`dmg%`,kda)%>%
  group_by(team)%>%
  arrange(desc(`gold%`))
playoff2

## # A tibble: 50 x 6
## # Groups:   team [10]
##    player     team                pos   `gold%` `dmg%`   kda
##    <chr>      <chr>               <chr>   <dbl>  <dbl> <dbl>
##  1 Light      LNG Esports         ADC      29.8   23.6   1  
##  2 LokeN      JD Gaming           ADC      29.4   31.5   3.2
##  3 iBoy       Rare Atom           ADC      28.6   26.5   4  
##  4 huanfeng   Suning              ADC      28.2   33.4   7.3
##  5 Viper      EDward Gaming       ADC      28     26.9   5  
##  6 GALA       Royal Never Give Up ADC      26.8   26.3   4.3
##  7 Elk        Team WE             ADC      26.4   30.9   0.7
##  8 Lwx        FunPlus Phoenix     ADC      26.1   24.9   3.3
##  9 Wink       Invictus Gaming     ADC      25.8   21.5   2.7
## 10 JackeyLove Top Esports         ADC      25.5   30.6   3.3
## # … with 40 more rows

Conversion rate of each Position (playoff)

# use facet_wrap to do the same thing for the playoff season
b <-playoff2 %>% 
  ggplot(., aes(`gold%`, `dmg%`))+
  geom_point()+
  aes(color = pos)+
  facet_wrap(~pos)+
  ggtitle("Conversion rate of each Position (playoff)") +
  xlab("Gold")+
  ylab("Damage")
b
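
The two faceted charts have to be compared by eye. A numeric comparison could look roughly like the sketch below, which averages an assumed conversion ratio (damage share divided by gold share) per position in each season and then joins the two summaries. The helper `mean_conversion`, the simplified column names `gold` and `dmg`, and all of the numbers are hypothetical; none of them come from the real data.

```r
# Illustrative numbers only; they are NOT taken from the real data.
regular_toy <- data.frame(
  pos  = c("Top", "Top", "ADC", "ADC"),
  gold = c(20, 25, 25, 20),
  dmg  = c(20, 30, 30, 26),
  stringsAsFactors = FALSE
)
playoff_toy <- data.frame(
  pos  = c("Top", "Top", "ADC", "ADC"),
  gold = c(20, 25, 25, 20),
  dmg  = c(24, 30, 25, 24),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of dmg / gold within each position.
mean_conversion <- function(df) {
  df$conversion <- df$dmg / df$gold
  aggregate(conversion ~ pos, data = df, FUN = mean)
}

reg <- mean_conversion(regular_toy)
ply <- mean_conversion(playoff_toy)

# Join the two summaries on position and compute the change between seasons.
both <- merge(reg, ply, by = "pos", suffixes = c("_regular", "_playoff"))
both$change <- both$conversion_playoff - both$conversion_regular
print(both)
```

A positive `change` for a position would mean its average ratio rose in the playoffs. With the real tables, the same steps would apply after renaming or backtick-quoting the `gold%` and `dmg%` columns.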

MVP of the playoff season

# use a scatterplot to find the MVP of the playoff season
playoff3 = playoff2 %>%
  filter(pos == "ADC" & `dmg%` >= `gold%`) %>%
  # backticks refer to the column; a quoted "dmg%" would sort by a constant string
  arrange(desc(`dmg%`)) %>%
  ggplot(., aes(`gold%`, `dmg%`)) +
  geom_point(aes(color = player, size = kda), shape = 19, alpha = 0.5) +
  ggtitle("MVP") +
  xlab("The proportion of the gold") +
  ylab("The damage to the enemy")
playoff3

Summary

My data visualization is about League of Legends, one of the most famous and prominent video games. Before getting to the main content, I will briefly introduce the game and its primary strategy. League of Legends is a multiplayer online battle arena (MOBA), a subgenre of strategy video games. Two teams of players compete against each other on a predefined battlefield. Each player controls a character (“champion”) with a set of unique abilities from an isometric perspective. Generally speaking, the battlefield is divided into three lanes, which give rise to five positions in a game: top, jungle, mid, ADC (Attack Damage Carry), and support.

My data visualization aims to determine which position has the best conversion rate between gold and damage. For instance, if a player receives 30% of the team's gold, does he also deal 30% or more of the team's damage to the enemy? Furthermore, who is the most valuable player in a specific position, in both the regular season and the playoff season? Additionally, having watched the entire playoff season as an avid spectator, I noticed that many teams shifted their strategy from the bottom lane to the top lane because of a change in the game version. I therefore expected the conversion rate in the top position to be higher than in the bottom and middle lanes.

I found the dataset on Oracle's Elixir, a prominent website that covers many esports competitions; the data is compiled by official recorders. The dataset contains many variables, but we only need to be familiar with a few of them: player, team, pos (position), gold% (share of the team's gold), dmg% (share of the team's damage to the enemy), and KDA (Kills, Deaths, Assists). To avoid syntax errors, I cleaned the column names with the gsub and tolower functions, removing spaces and converting capital letters to lower case. Moreover, I noticed that the ‘gold%’ and ‘dmg%’ columns were read as col_character(), so I combined as.numeric with sub to strip the percent signs (%) and convert the columns from chr to dbl. I also narrowed the dataset with select, keeping only the six columns mentioned above.

After plotting the cleaned dataset with treemaps and facet_wrap, we can see that the position with the higher conversion rate between gold and damage is, at least in part, ADC, in both the regular season and the playoff season. One interesting fact is that the MVP plot for the regular season does not help us identify the most valuable player, because many of the players' performances are too close together. However, the playoff chart immediately shows one player, huanfeng, whose performance is significantly above the others. Therefore, I nominate him as MVP even though his team did not win the championship; his standout performance deserves recognition from a statistical point of view. Finally, from my experience as a spectator of the playoff season, many teams seemed to shift the center of their strategy from the bottom lane to the top lane; nevertheless, the playoff facet_wrap chart does not show a significant increase in damage from the top lane compared with the regular season. Thus, we can conclude that even though teams seemed to emphasize the top player, the practical effect fell short of expectations.

In the end, I wish I could present my results more precisely and with more official information, but some information cannot be captured in a data frame. For instance, on-the-spot performance, psychology, and internal team management issues cannot be collected as data. I also wonder whether there is a way to display the proportion between gold and damage directly. If I add a new column that divides one by the other, the values are so concentrated in the visualization that individual players cannot be told apart. Therefore, for the rest of the course, I will focus on further improvements and try to build on what I have already learned.
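
One possible way to keep players distinguishable when a ratio column is used is to rank them and give each player a labelled row. The sketch below assumes the ratio is damage share divided by gold share, which is only one possible definition. The names `adc`, `ratio`, and `ranked` are hypothetical, and the five rows are copied from the first playoff ADC rows printed earlier, so this is not the full playoff table.

```r
# First five playoff ADC rows copied from the printed table.
adc <- data.frame(
  player = c("Light", "LokeN", "iBoy", "huanfeng", "Viper"),
  gold   = c(29.8, 29.4, 28.6, 28.2, 28.0),
  dmg    = c(23.6, 31.5, 26.5, 33.4, 26.9),
  stringsAsFactors = FALSE
)

# Assumed ratio: damage share divided by gold share.
adc$ratio <- round(adc$dmg / adc$gold, 3)

# Sort from the highest ratio to the lowest and reset the row names.
ranked <- adc[order(adc$ratio, decreasing = TRUE), ]
rownames(ranked) <- NULL
print(ranked)

# A dot chart gives each player a labelled row, so close values stay readable.
dotchart(ranked$ratio, labels = ranked$player,
         xlab = "dmg / gold (assumed ratio)")
```

Because each player gets a separate row, values that would overlap on a scatterplot remain identifiable by name.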

Project 1

Jiayuan Shen

6/18/2021