Introduction
Discussion Board Title: Analyzing Champion Stats in League of Legends
Dataset: https://ddragon.leagueoflegends.com/cdn/11.19.1/data/en_US/champion.json
Provided by: Santiago Torres
Suggested Prompt: “For all champions, figure out who has the highest starting hp for each tag category.”
Load Libraries
Load all required libraries into the environment. For this dataset, we will be using the JSON packages rjson and jsonlite for JSON parsing.
library(rjson)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(stringr)
library(jsonlite)
##
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
##
## flatten
## The following objects are masked from 'package:rjson':
##
## fromJSON, toJSON
Load Data
Data is hosted on ddragon.leagueoflegends.com. We can use rjson::fromJSON to read and parse directly from the URL.
fromJSON will return a nested list structure containing all of the available JSON data.
# The rjson:: prefix matters here: jsonlite was attached last and masks fromJSON()
champion_json <- rjson::fromJSON(file = "https://ddragon.leagueoflegends.com/cdn/11.19.1/data/en_US/champion.json")
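To get a feel for the nested list that fromJSON returns without downloading the full file, here is a minimal sketch on a made-up two-champion payload. The champion keys ChampA and ChampB, their tags, and their numbers are illustrative assumptions, not values taken from champion.json.

```r
library(rjson)

# Hypothetical payload that only mimics the general shape of champion.json
toy_json <- '{
  "type": "champion",
  "data": {
    "ChampA": {"name": "Champ A", "tags": ["Fighter", "Tank"], "stats": {"hp": 500}},
    "ChampB": {"name": "Champ B", "tags": ["Mage"],            "stats": {"hp": 450}}
  }
}'

# json_str parses a string; the file argument used above reads from a path or URL
toy_list <- rjson::fromJSON(json_str = toy_json)

names(toy_list)                 # top-level keys
names(toy_list$data)            # one entry per champion
toy_list$data$ChampA$stats$hp   # drill into a nested value
str(toy_list, max.level = 3)    # compact view of the nesting
```

Each level of JSON nesting becomes one level of list nesting, which is what the tidying steps below have to flatten.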
Data Tidying
The first step in my tidy process is to convert the champion_json list object into a “long” data frame containing the key-value pairs for all JSON elements.
unlisted_json <- enframe(unlist(champion_json))
As you can see below, the path to every nested JSON element is written into the name column, with each level of nesting separated by a “.”.
head(unlisted_json, 15)
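The naming scheme that unlist() produces is easier to see on a tiny example. The sketch below uses a hypothetical nested list (ChampA and ChampB are made-up keys) and shows two side effects worth knowing: multi-element vectors get a numeric suffix per element, and every value is coerced to character because the list mixes strings and numbers.

```r
library(tibble)

# Hypothetical nested list shaped like a small slice of champion_json
toy_list <- list(
  data = list(
    ChampA = list(name = "Champ A", tags = c("Fighter", "Tank"), stats = list(hp = 500)),
    ChampB = list(name = "Champ B", tags = "Mage",               stats = list(hp = 450))
  )
)

toy_long <- enframe(unlist(toy_list))
toy_long$name
# "data.ChampA.name" "data.ChampA.tags1" "data.ChampA.tags2" "data.ChampA.stats.hp"
# "data.ChampB.name" "data.ChampB.tags"  "data.ChampB.stats.hp"

class(toy_long$value)  # "character", even for hp
```

Note that the two-tag champion yields tags1 and tags2 while the single-tag champion yields plain tags, so an exact match on "tags" would miss the former.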
What is the longest json object?
In order to create a “Hadley approved” long dataset, we will need to create new columns to capture all of the nested JSON elements. To know how many columns to create, we need to know how many levels the most deeply nested item has.
dot_split_regex <- "\\."

# Split every name on "." and keep the largest number of pieces
n_cols_max <-
  unlisted_json %>%
  pull(name) %>%
  str_split(dot_split_regex) %>%
  map_dbl(~length(.)) %>%
  max()
n_cols_max
## [1] 4
Use separate() function to split out the name column
separate() will split the name column into multiple columns at each “.”. For items that have fewer pieces than n_cols_max, NAs are filled in on the right for the remaining new columns.
# sep is set explicitly; the default would split on any non-alphanumeric character
split_champion_list <- unlisted_json %>%
  separate(name, into = paste0("x", 1:n_cols_max), sep = "\\.", fill = "right")
head(split_champion_list,10)
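The depth count and the split can be checked together on a few hand-written rows. This is a sketch under assumptions: toy_long and its values are hypothetical, and str_count() is used here as a shorter alternative to the str_split() and map_dbl() combination above, not as the author's method.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(tibble)

# Hypothetical key-value rows with one, three and four levels of nesting
toy_long <- tibble(
  name  = c("version", "data.ChampA.name", "data.ChampA.stats.hp"),
  value = c("0.0.0", "Champ A", "500")
)

# Depth of a name = number of dots + 1
n_cols_toy <- max(str_count(toy_long$name, fixed(".")) + 1)

toy_split <- toy_long %>%
  separate(name, into = paste0("x", seq_len(n_cols_toy)), sep = "\\.", fill = "right")

toy_split
# Row 1 has x1 = "version" and NA in x2..x4; row 3 fills all four columns
```

With fill = "right" the short names are padded with NA on the right and no warning is raised, which is why the later filters can rely on x3 and x4 always existing.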
Get vectors for all interesting data points.
Below are all of the data points that we want to capture into vectors. The vectors should all be the same length, assuming that all champions have the same set of nested data points.
Checking this assumption revealed that not all data points are included for all champions! “tags” are not available for every champion, so for those items we will need a different approach.
- name
- title
- blurb
- tags***
- hp
- hpperlevel
- mp
- mpperlevel
- movespeed
- armor
- armorperlevel
- spellblock
- spellblockperlevel
- attackrange
- hpregen
- hpregenperlevel
- mpregen
- mpregenperlevel
- crit
- critperlevel
- attackdamage
# Champion-level fields sit at the third level of the name path (x3)
champ_names <- split_champion_list %>%
  filter(x3 == "name") %>%
  select(value)
champ_title <- split_champion_list %>%
  filter(x3 == "title") %>%
  select(value)
champ_blurb <- split_champion_list %>%
  filter(x3 == "blurb") %>%
  select(value)
# Stats sit one level deeper (x4)
champ_hp <- split_champion_list %>%
  filter(x4 == "hp") %>%
  select(value)
champ_hpperlevel <- split_champion_list %>%
  filter(x4 == "hpperlevel") %>%
  select(value)
champ_mp <- split_champion_list %>%
  filter(x4 == "mp") %>%
  select(value)
champ_mpperlevel <- split_champion_list %>%
  filter(x4 == "mpperlevel") %>%
  select(value)
champ_movespeed <- split_champion_list %>%
  filter(x4 == "movespeed") %>%
  select(value)
champ_armor <- split_champion_list %>%
  filter(x4 == "armor") %>%
  select(value)
champ_armorperlevel <- split_champion_list %>%
  filter(x4 == "armorperlevel") %>%
  select(value)
champ_spellblock <- split_champion_list %>%
  filter(x4 == "spellblock") %>%
  select(value)
champ_spellblockperlevel <- split_champion_list %>%
  filter(x4 == "spellblockperlevel") %>%
  select(value)
champ_attackrange <- split_champion_list %>%
  filter(x4 == "attackrange") %>%
  select(value)
champ_hpregen <- split_champion_list %>%
  filter(x4 == "hpregen") %>%
  select(value)
champ_hpregenperlevel <- split_champion_list %>%
  filter(x4 == "hpregenperlevel") %>%
  select(value)
champ_mpregen <- split_champion_list %>%
  filter(x4 == "mpregen") %>%
  select(value)
champ_mpregenperlevel <- split_champion_list %>%
  filter(x4 == "mpregenperlevel") %>%
  select(value)
champ_crit <- split_champion_list %>%
  filter(x4 == "crit") %>%
  select(value)
champ_critperlevel <- split_champion_list %>%
  filter(x4 == "critperlevel") %>%
  select(value)
champ_attackdamage <- split_champion_list %>%
  filter(x4 == "attackdamage") %>%
  select(value)
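The repeated filter-and-select pattern above could also be collapsed into a single reshape. The following is only a sketch of that alternative, assuming a table with the same x1 to x4 and value columns; toy_split, wanted_stats, champion_id and all numbers are hypothetical. It uses pivot_wider() from tidyr, which keeps each stat aligned to its champion key instead of relying on every vector having the same length and order.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical stand-in for split_champion_list
toy_split <- tribble(
  ~x1,    ~x2,      ~x3,     ~x4,            ~value,
  "data", "ChampA", "name",  NA_character_,  "Champ A",
  "data", "ChampA", "stats", "hp",           "500",
  "data", "ChampA", "stats", "armor",        "30",
  "data", "ChampB", "name",  NA_character_,  "Champ B",
  "data", "ChampB", "stats", "hp",           "450",
  "data", "ChampB", "stats", "armor",        "25"
)

wanted_stats <- c("hp", "armor")

toy_stats <- toy_split %>%
  filter(x3 == "stats", x4 %in% wanted_stats) %>%
  select(champion_id = x2, stat = x4, value) %>%
  pivot_wider(names_from = stat, values_from = value) %>%  # one column per stat
  mutate(across(all_of(wanted_stats), as.numeric))         # unlist() left them as strings

toy_stats
```

Because the champion key travels with each row, a champion that is missing a stat would show up as NA in that column rather than silently shifting the values of the champions after it.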
Combine into one dataframe
We will combine all of the above generated vectors into the champ_names tibble. Note that we have still not addressed the problem with “tags”.
champs_tidy <- champ_names %>%
  rename(name = value) %>%  # champ_names stores the champion name in a column called "value"
  mutate(
title = pull(champ_title, value),
blurb = pull(champ_blurb, value),
hp = pull(champ_hp, value),
hpperlevel = pull(champ_hpperlevel, value),
mp = pull(champ_mp, value),
mpperlevel = pull(champ_mpperlevel, value),
movespeed = pull(champ_movespeed, value),
armor = pull(champ_armor, value),
armorperlevel = pull(champ_armorperlevel, value),
spellblock = pull(champ_spellblock, value),
spellblockperlevel = pull(champ_spellblockperlevel, value),
attackrange = pull(champ_attackrange, value),
hpregen = pull(champ_hpregen, value),
hpregenperlevel = pull(champ_hpregenperlevel, value),
mpregen = pull(champ_mpregen, value),
mpregenperlevel = pull(champ_mpregenperlevel, value),
crit = pull(champ_crit, value),
critperlevel = pull(champ_critperlevel, value),
attackdamage = pull(champ_attackdamage, value)
)
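For the outstanding “tags” problem, one possible approach is to keep tags in their own long table, one row per champion and tag, keyed by the champion identifier in x2. This is a sketch under assumptions, not the author's solution: it presumes the tag entries appear in x3 as tags, tags1, tags2 and so on (the naming that unlist() gives to vectors of different lengths), and toy_split, toy_tags, champion_id and the tag values are hypothetical.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical stand-in for split_champion_list
toy_split <- tribble(
  ~x1,    ~x2,      ~x3,     ~x4,            ~value,
  "data", "ChampA", "tags1", NA_character_,  "Fighter",
  "data", "ChampA", "tags2", NA_character_,  "Tank",
  "data", "ChampA", "stats", "hp",           "500",
  "data", "ChampB", "tags",  NA_character_,  "Mage",
  "data", "ChampB", "stats", "hp",           "450"
)

# Match "tags" with an optional numeric suffix instead of an exact "tags"
toy_tags <- toy_split %>%
  filter(str_detect(x3, "^tags\\d*$")) %>%
  select(champion_id = x2, tag_category = value)

toy_tags
# ChampA appears twice (Fighter, Tank), ChampB once (Mage)
```

A table like this can later be joined back to the per-champion stats, provided those stats also carry the champion identifier.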
View Tidy Data!
After all of the above steps, we now have a tidy dataset with one row per champion and one column per attribute.
head(champs_tidy, 5)
Answer the Original Prompt:
Here we will be answering the original prompt, which is to determine which champions have the highest starting HP per category.
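# Assumes champs_tidy already has a tag_category column (one row per
# champion-tag pair); the steps above do not create it yet.
champs_tidy %>%
  mutate(hp = as.numeric(hp)) %>%  # values produced by unlist() are character strings
  group_by(tag_category) %>%
  top_n(1, hp) %>%
  select(tag_category, name, hp)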
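To illustrate the whole last step on data small enough to check by hand, here is a minimal sketch with hypothetical tables: toy_champs, toy_tags, champion_id and every value are made up. It joins a long tag table to the stats, converts hp to numeric first, and uses slice_max() (available in dplyr 1.0.0 and later) in place of top_n().

```r
library(dplyr)
library(tibble)

# Hypothetical per-champion stats; hp is character, as it is after unlist()
toy_champs <- tibble(
  champion_id = c("ChampA", "ChampB", "ChampC"),
  hp          = c("500", "450", "95")
)

# Hypothetical long tag table: one row per champion-tag pair
toy_tags <- tibble(
  champion_id  = c("ChampA", "ChampB", "ChampB", "ChampC"),
  tag_category = c("Fighter", "Fighter", "Tank", "Tank")
)

toy_top <- toy_champs %>%
  mutate(hp = as.numeric(hp)) %>%            # compare numbers, not strings
  inner_join(toy_tags, by = "champion_id") %>%
  group_by(tag_category) %>%
  slice_max(hp, n = 1) %>%                   # keeps ties, like top_n()
  ungroup() %>%
  select(tag_category, champion_id, hp)

toy_top
# Fighter -> ChampA (500), Tank -> ChampB (450)
```

The numeric conversion is not cosmetic: compared as strings, "95" sorts above "450", so the Tank category would wrongly report ChampC.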