Introduction

This notebook documents the construction of charts in the lead-up to and during the 2020 League of Legends Mid-Season Cup between the LCK and LPL. Because of the COVID-19 pandemic, the regular League of Legends Mid-Season Invitational (MSI) has been cancelled. This new tournament between the League of Legends Pro League (LPL) and League of Legends Champions Korea (LCK) was created to take the place of a mid-season tournament for players and fans.

Installing Packages

Going into the analysis, we’ll need some packages. These packages are recommended by Thomas Mock’s plotting cookbook (https://jthomasmock.github.io/nfl_plotting_cookbook/), which was originally written for nflscrapR data. Re-running installs every time the R Markdown document is knit is slow, so the install lines are commented out; copy the part of each line after the first # sign and run it once in the console. Once installed, the packages stay in your R library, and you can use them in any session as long as you load them with the library function.

#install.packages('tidyverse') # Data Cleaning, manipulation, summarization, plotting
#install.packages('gt') # beautiful tables
#install.packages('DT') # beautiful interactive tables
#install.packages('ggthemes') # custom pre-built themes
#install.packages('ggimage') # for the charts
#install.packages('ggforce') # better annotations
#install.packages('ggridges') # many distributions at once
#install.packages('ggrepel') # better labels
#install.packages('ggbeeswarm') # beeswarm plot
#install.packages('extrafont') # for extra fonts
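
If you would rather not uncomment the install lines by hand, one option is to install only the packages that are missing. This is a sketch of an alternative workflow, not something the cookbook prescribes; `missing_packages` is a hypothetical helper name introduced here for illustration.

```r
# Packages used in this notebook (same list as the install lines above)
pkgs <- c("tidyverse", "gt", "DT", "ggthemes", "ggimage",
          "ggforce", "ggridges", "ggrepel", "ggbeeswarm", "extrafont")

# Hypothetical helper: returns the packages that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(pkgs)
to_install

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Because the check runs before any install, this chunk is safe to leave in the document when knitting.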

Now that we’ve installed the packages, we’re going to load them into our R session.

library(tidyverse) # Data Cleaning, manipulation, summarization, plotting
library(gt) # beautiful tables
library(DT) # beautiful interactive tables
library(ggthemes) # custom pre-built themes
library(ggimage) # for the charts
library(ggforce) # better annotations
library(ggridges) # many distributions at once
library(ggrepel) # better labels
library(ggbeeswarm) # beeswarm plots
library(extrafont) # for extra fonts

Reading in the Playoff Data

For this analysis we’re going to be using the most recent playoff data for the teams competing.

LPL:

1. JD Gaming (LPL Spring 2020 Champions)
2. Top Esports
3. FunPlus Phoenix (Reigning 2019 World Champions)
4. Invictus Gaming (2018 World Champions)

LCK:

T1 (LCK Spring 2020 Champions, 3x World Champions, 2x MSI Champions)
Gen.G (2017 World Champions)
DragonX
DAMWON Gaming

To read in this data, we’re going to pull it from an Oracle’s Elixir (https://oracleselixir.com/) workbook, which has been split into separate sheets for the playoffs of the LPL, LCK, LCS, and LEC. The LCS and LEC data will be useful for further exploration and ‘what-if’ discussion.

library(readxl)
LCK_playoff_data <- read_excel("2020 spring match data OraclesElixir 2020-05-11.xlsx", sheet = "LCK-Playoffs")
View(LCK_playoff_data)
LPL_playoff_data <- read_excel("2020 spring match data OraclesElixir 2020-05-11.xlsx", sheet = "LPL-Playoffs")
View(LPL_playoff_data)
LCS_playoff_data <- read_excel("2020 spring match data OraclesElixir 2020-05-11.xlsx", sheet = "LCS-Playoffs")
View(LCS_playoff_data)
LEC_playoff_data <- read_excel("2020 spring match data OraclesElixir 2020-05-11.xlsx", sheet = "LEC-Playoffs")
View(LEC_playoff_data)
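
Since the four reads differ only in the sheet name, they could also be done in a loop. The sketch below assumes the workbook sits in the working directory under the file name used above; `playoff_sheet` and `playoff_data` are hypothetical names introduced for illustration.

```r
# Workbook name as used above; assumed to be in the working directory
path <- "2020 spring match data OraclesElixir 2020-05-11.xlsx"
leagues <- c("LCK", "LPL", "LCS", "LEC")

# Hypothetical helper: builds the sheet name, e.g. "LCK-Playoffs"
playoff_sheet <- function(league) paste0(league, "-Playoffs")

# Read every sheet into a named list; skipped if the workbook is not present
if (file.exists(path)) {
  playoff_data <- lapply(setNames(leagues, leagues), function(lg) {
    readxl::read_excel(path, sheet = playoff_sheet(lg))
  })
  names(playoff_data)  # one data frame per league
}
```

With this layout, `playoff_data$LCK` would play the role of `LCK_playoff_data`.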

Cleaning the Data

Now that we have the playoff data for each region, we’re going to narrow it down to the teams competing, as well as the top four teams from the LCS and LEC.
The top four teams in the LEC and LCS for the 2020 Spring Split were:

LEC:

1. G2 Esports (LEC 2020 Spring Champions, 2019 MSI Champions)
2. Fnatic (2011 World Champions)
3. Origen
4. MAD Lions

LCS:

1. Cloud9
2. FlyQuest
3. Evil Geniuses
4. Team SoloMid

Now that we have the teams specified, we’ll extract their matches from each playoff data frame.

# Breaking down Data for LPL Teams #
JDG.data <- LPL_playoff_data %>% 
  filter(team == "JD Gaming")
TES.data <- LPL_playoff_data %>%
  filter(team == "Top Esports")
FPX.data <- LPL_playoff_data %>%
  filter(team == "FunPlus Phoenix")
IG.data <- LPL_playoff_data %>%
  filter(team == "Invictus Gaming")
#Breaking down data for LCK Teams#
T1.data <- LCK_playoff_data %>%
  filter(team == "T1")
Gen.G.data <- LCK_playoff_data %>%
  filter(team == "Gen.G")
DragonX.data <- LCK_playoff_data %>%
  filter(team == "DragonX")
DAMWON.data <- LCK_playoff_data %>%
  filter(team == "DAMWON Gaming")
# Breaking down for LCS teams #
Cloud9.data <- LCS_playoff_data %>%
  filter(team == "Cloud9")
Flyquest.data <- LCS_playoff_data %>%
  filter(team == "FlyQuest")
EG.data <- LCS_playoff_data %>%
  filter(team == "Evil Geniuses")
TSM.data <- LCS_playoff_data %>%
  filter(team == "Team SoloMid")
#Breaking down for LEC teams #
G2.data <- LEC_playoff_data %>%
  filter(team == "G2 Esports")
FNC.data <- LEC_playoff_data %>%
  filter(team == "Fnatic")
Origen.data <- LEC_playoff_data %>%
  filter(team == "Origen")
MAD.data <- LEC_playoff_data %>%
  filter(team == "MAD Lions")
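
The per-team filters above can also be collapsed into one filter per league with `%in%`. This is a minimal sketch on a toy table standing in for `LPL_playoff_data`; the player values and the extra team row are placeholders, not real data.

```r
library(dplyr)

# Toy stand-in for LPL_playoff_data; the real sheet has many more columns
toy_lpl <- tibble(
  team   = c("JD Gaming", "Top Esports", "FunPlus Phoenix",
             "Invictus Gaming", "Some Other Team"),
  player = c("p1", "p2", "p3", "p4", "p5")
)

lpl_teams <- c("JD Gaming", "Top Esports", "FunPlus Phoenix", "Invictus Gaming")

# Keep every row whose team is in the list, in one step
lpl_top4 <- toy_lpl %>%
  filter(team %in% lpl_teams)
lpl_top4
```

The trade-off is that the teams end up in one table per league rather than one table per team, which is all the later steps need.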

Combining the Regions

Since we have four separate region files that have the data of their top four teams, we should combine them so we can compare the players and teams against one another for their roles. To do this we will use the rbind function.

MSC.data <- rbind(JDG.data, TES.data, FPX.data, IG.data, T1.data, Gen.G.data, DragonX.data, DAMWON.data) # for the actual midseason cup #
WSC.data <- rbind(G2.data, FNC.data, Origen.data, MAD.data, Cloud9.data, Flyquest.data, EG.data, TSM.data) # for the western teams i.e. 'western cup' #
top4.data <- rbind(MSC.data, WSC.data) # for the full data set of the top 4 teams #
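
If it is useful to remember which cup each row belongs to after combining, dplyr’s `bind_rows` can record it in a new column through its `.id` argument. The sketch below uses toy tables with made-up numbers; the `cup` column name is an assumption for illustration.

```r
library(dplyr)

# Toy stand-ins for MSC.data and WSC.data (made-up numbers)
toy_msc <- tibble(team = c("T1", "JD Gaming"), dpm = c(400, 450))
toy_wsc <- tibble(team = c("G2 Esports", "Cloud9"), dpm = c(420, 430))

# The argument names become the values of the new "cup" column
toy_top4 <- bind_rows(MSC = toy_msc, WSC = toy_wsc, .id = "cup")
toy_top4
```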

Breaking into Positions

Now that we have the top 4 teams in their respective datasets, we can break them down further by position so we can compare players in the same role.

top.data <- top4.data %>%
  filter(position == "top")
jg.data <- top4.data %>%
  filter(position == "jng")
mid.data <- top4.data %>%
  filter(position == "mid")
bot.data <- top4.data %>%
  filter(position == "bot")
sup.data <- top4.data %>%
  filter(position == "sup")

Selecting Metrics

From here, we’ll further narrow down the data into a table we can use to generate a chart. Since there aren’t any defined metrics to measure each position’s effectiveness throughout the game, we’ll have to estimate. For every role except support, I chose to use each player’s mean damage per minute (mean.dpm) and mean earned gold per minute (mean.earned.gold, computed from the ‘earned gpm’ column). For support, the LPL data is lacking the vision score per minute ‘vspm’ column, so I chose to use the wards per minute ‘wpm’ column instead and compare it against mean earned gold per minute. Using only wpm and earned gold to measure supports has many faults, but it’s the best option available with this dataset.

# Selecting the relevant metrics #
top.data <- select(top.data, "player", "team", "champion", "dpm", "earned gpm") %>% group_by(player)
mean.top.data <- summarise(top.data, mean.dpm = mean(dpm), mean.earned.gold = mean(`earned gpm`)) #gives the mean for each player of the top 4 teams in each region

jg.data <- select(jg.data, "player", "team", "champion", "dpm", "earned gpm") %>% group_by(player)
mean.jg.data <- summarise(jg.data, mean.dpm = mean(dpm), mean.earned.gold = mean(`earned gpm`)) 

mid.data <- select(mid.data, "player", "team", "champion", "dpm", "earned gpm") %>% group_by(player)
mean.mid.data <- summarise(mid.data, mean.dpm = mean(dpm), mean.earned.gold = mean(`earned gpm`))

bot.data <- select(bot.data, "player", "team", "champion", "dpm", "earned gpm") %>% group_by(player)
mean.bot.data <- summarise(bot.data, mean.dpm = mean(dpm), mean.earned.gold = mean(`earned gpm`))

sup.data <- select(sup.data, "player", "team", "champion", "wpm", "earned gpm") %>% group_by(player) 
mean.sup.data <- summarise(sup.data, mean.wpm = mean(wpm), mean.earned.gold = mean(`earned gpm`))
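
The same per-player averages can be computed for every position in one pass by grouping on both position and player. This is a sketch on toy rows with made-up numbers; the column names follow the ones used above, and `toy_games` and `toy_means` are hypothetical names.

```r
library(dplyr)

# Toy rows with made-up numbers; `earned gpm` needs backticks because of the space
toy_games <- tibble(
  player       = c("A", "A", "B", "B"),
  position     = c("top", "top", "mid", "mid"),
  dpm          = c(400, 500, 600, 700),
  `earned gpm` = c(250, 270, 300, 320)
)

# One row per player, with position kept as a column
toy_means <- toy_games %>%
  group_by(position, player) %>%
  summarise(mean.dpm = mean(dpm),
            mean.earned.gold = mean(`earned gpm`),
            .groups = "drop")
toy_means
```

Keeping position in the grouping means the summary table still has a position column to filter on later.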

Creating our charts

To create our charts we’ll need some images of the players. I already made an Excel file that contains the player image URLs from Gamepedia. To import it we’ll use the read_excel function from the readxl package, which we loaded earlier.

library(readxl)
player_images <- read_excel("player.images.xlsx")

Now we’ll join the player images to the summary data frames we made earlier, one for each position, starting with the top lane.

top.chart <- mean.top.data %>% left_join(player_images, by = c("player" = "player")) #to put the player image urls with the data#
mid.chart <- mean.mid.data %>% left_join(player_images, by = c("player" = "player"))
jg.chart <- mean.jg.data %>% left_join(player_images, by = c("player" = "player"))
bot.chart <- mean.bot.data %>% left_join(player_images, by = c("player" = "player"))
sup.chart <- mean.sup.data %>% left_join(player_images, by = c("player" = "player"))
View(top.chart)
View(mid.chart)
View(jg.chart)
View(bot.chart)
View(sup.chart)

We should view the new tables to make sure there aren’t any ‘NA’ values. If there are, we need to go back and make sure the player fields of each table match.
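
Instead of scanning the tables by eye, the mismatches can be listed directly. The sketch below uses toy tables with placeholder URLs; it assumes the image table has a `url` column, as the plotting code later in this notebook does.

```r
library(dplyr)

# Toy stand-ins: player "C" has no entry in the image table
toy_means  <- tibble(player = c("A", "B", "C"), mean.dpm = c(450, 650, 500))
toy_images <- tibble(player = c("A", "B"),
                     url = c("https://example.com/a.png",
                             "https://example.com/b.png"))

# Rows of toy_means with no matching player in toy_images
unmatched <- toy_means %>%
  anti_join(toy_images, by = "player")
unmatched$player

# The same check after the join: rows where url came back as NA
toy_chart <- toy_means %>%
  left_join(toy_images, by = "player")
toy_chart %>%
  filter(is.na(url))
```

Any player listed here usually points to a spelling or capitalisation difference between the two files.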

Let’s create our first plots. Taking inspiration from Ben Baldwin’s guide to nflscrapR, we’ll start with this code.

top.chart %>%
  ggplot(aes(x = mean.earned.gold, y = mean.dpm)) +
  geom_image(aes(image = url), size = 0.12) +
  labs(x = "Mean Earned Gold",
       y = "Mean Damage Per Minute",
       caption = "Data from Oracle's Elixir",
       title = "Top Lane Average DPM and Earned Gold",
       subtitle = "2020 Spring Playoffs") +
  theme_bw() +
  theme(axis.title = element_text(size = 12),
        axis.text = element_text(size = 10),
        plot.title = element_text(size = 16),
        plot.subtitle = element_text(size = 14),
        plot.caption = element_text(size = 12))

This makes a decent chart, but it’s very crowded and imprecise. This can be solved in two ways. First, let’s break down the samples into LPL/LCK and LEC/LCS matchups, since the LPL/LCK matchup is the one that will actually take place at the Mid-Season Cup.

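Breaking into LPL/LCK and LEC/LCS

By making one data frame that holds every player’s metrics and images, we can break it down by region and then combine each pair of regions into the actual Mid-Season Cup and the fictional Western Cup. The summarised tables keep only the player name and the averaged metrics, so the league and position columns used below have to come from the player images file.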

player.image.chart <- bind_rows(top.chart, jg.chart, mid.chart, bot.chart, sup.chart)
LPL.chart <- player.image.chart %>%
  filter(league == 'LPL')
LCK.chart <- player.image.chart %>%
  filter(league == 'LCK')
MSC.chart <- bind_rows(LPL.chart, LCK.chart)
LEC.chart <- player.image.chart %>%
  filter(league == 'LEC')
LCS.chart <- player.image.chart %>%
  filter(league == 'LCS')
WSC.chart <- bind_rows(LEC.chart, LCS.chart)
View(MSC.chart)
View(WSC.chart)

Now let’s break it into positions again.

MSC.top <- MSC.chart %>%
  filter(position == 'top')
MSC.jg <- MSC.chart %>%
  filter(position == 'jng')
MSC.mid <- MSC.chart %>%
  filter(position == 'mid')
MSC.bot <- MSC.chart %>%
  filter(position == 'bot')
MSC.sup <- MSC.chart %>%
  filter(position == 'sup')
WSC.top <- WSC.chart %>%
  filter(position == 'top')
WSC.jg <- WSC.chart %>%
  filter(position == 'jng')
WSC.mid <- WSC.chart %>%
  filter(position == 'mid')
WSC.bot <- WSC.chart %>%
  filter(position == 'bot')
WSC.sup <- WSC.chart %>%
  filter(position == 'sup')
View(MSC.top)
View(MSC.jg)
View(MSC.mid)
View(MSC.bot)
View(MSC.sup)
View(WSC.top)
View(WSC.jg)
View(WSC.mid)
View(WSC.bot)
View(WSC.sup)

From here we can create better charts.

Generating Useful Charts

Now that we have our positions broken into their actual groups, let’s try the top lane chart again.

MSC.top %>%
  ggplot(aes(x = mean.earned.gold, y = mean.dpm)) +
  geom_image(aes(image = url), size = 0.12) +
  labs(x = "Mean Earned Gold",
       y = "Mean Damage Per Minute",
       caption = "Data from Oracle's Elixir",
       title = "Top Lane Average DPM and Earned Gold",
       subtitle = "2020 Spring Playoffs") +
  theme_bw() +
  theme(axis.title = element_text(size = 12),
        axis.text = element_text(size = 10),
        plot.title = element_text(size = 16),
        plot.subtitle = element_text(size = 14),
        plot.caption = element_text(size = 12))

This is still imprecise, but it is much less crowded and still gives a comparative view of the top-laners in the LPL and LCK going into the tournament.

Let’s try to add a point in addition to the players’ images to be more precise.
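
One way to do this is to draw a small point at each player’s exact coordinates and label it, so the position no longer depends on how large the image is. The sketch below uses a toy table with made-up numbers and text labels in place of images so that it runs without the image URLs; with the real data, the geom_image layer from the chart above could be kept alongside the point layer.

```r
library(ggplot2)
library(ggrepel)

# Toy stand-in for MSC.top (made-up numbers)
toy_top <- data.frame(
  player = c("A", "B", "C"),
  mean.earned.gold = c(250, 270, 300),
  mean.dpm = c(400, 450, 520)
)

p <- ggplot(toy_top, aes(x = mean.earned.gold, y = mean.dpm)) +
  geom_point(size = 2) +                    # exact position of each player
  geom_text_repel(aes(label = player)) +    # label nudged away from the point
  labs(x = "Mean Earned Gold",
       y = "Mean Damage Per Minute",
       caption = "Data from Oracle's Elixir",
       title = "Top Lane Average DPM and Earned Gold",
       subtitle = "2020 Spring Playoffs") +
  theme_bw()
p
```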

2020 Mid-Season Cup- LCK vs LPL

Jay Shearrow