League Q
League of Legends (Background)
League of Legends (LoL) is a multiplayer online battle arena (MOBA) game developed and published by Riot Games in 2009. Inspired by the popular Warcraft III mod Defense of the Ancients (DotA), LoL became a global phenomenon, offering a competitive and strategic gaming experience. Players assume the role of “champions,” each with unique abilities, and compete in team-based matches to destroy the enemy’s Nexus, the heart of their base. Known for its dynamic gameplay, frequent updates, and a vast roster of champions, LoL fosters a thriving esports scene, with tournaments like the League of Legends World Championship drawing millions of viewers worldwide. Its rich lore and engaging gameplay have cemented its place as one of the most iconic and influential games in the industry.
Purpose
The purpose of this analysis is to explore League of Legends statistics from a set of CSV files I obtained from Kaggle.
In all honesty, I was motivated by boredom.
Rows: 1855 Columns: 29
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): team, region, matchType
dbl (26): Baron, Dra, Turts, kills, deaths, assists, CS, gold, damage, tanki...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 21076 Columns: 25
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): player_name, team, region, matchType, position
dbl (20): kills, deaths, assists, CS, gold, damage, tanking, matches_played,...
Rows: 0 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): 03_hero.csv
Rows: 7044 Columns: 29
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): Team1_name, Team1_region, Team2_name, Team2_region, matchType
dbl (24): matches_played, minutes_played, matches_won_1, matches_won_2, Team...
Rows: 7044 Columns: 97
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (25): Team1_name, Team1_region, Team2_name, Team2_region, matchType, Tea...
dbl (72): matches_played, minutes_played, Team1_player1_kills, Team1_player1...
Rows: 11458 Columns: 17
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (14): player_name, player_team, player_region, matchType, hero_chosen_1,...
dbl (3): matches_played, minutes_played, matches_won
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
dat <- vroom(...)
problems(dat)
Rows: 21147 Columns: 116
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (37): matchType, MatchDate, win, Team1, Team1_region, Team1_ban1, Team1...
dbl (78): MatchID, gameset, Team1_Baron, Team1_Dra, Team1_Turts, Team2_Baro...
time (1): Duration
Rows: 27270 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): hero_name, position, matchType
dbl (7): pick_count, ban_count, win_count, matches_played, pick_rate, ban_ra...
Rows: 201253 Columns: 23
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): hero1_name, hero1_position, hero2_name, hero2_position, matchType
dbl (18): matches_played, minutes_played, matches_won_1, matches_won_2, hero...
Exploratory Data Analysis and Visualizations
`01_team`$team <- abbreviate(`01_team`$team)
Warning in abbreviate(`01_team`$team): abbreviate used with non-ASCII chars
team_win <- `01_team` %>%
  select(team, matches_won, matches_lose) %>%
  group_by(team) %>%
  mutate(win_percentage = matches_won / (matches_won + matches_lose) * 100) %>%
  arrange(desc(matches_won))
datatable(team_win)
Regional Wins
region_win <- `01_team` %>%
  select(region, matches_won, matches_lose) %>%
  group_by(region) %>%
  summarise(total_wins = sum(matches_won),
            total_losses = sum(matches_lose),
            total_win_percentage = total_wins / (total_wins + total_losses) * 100) %>%
  arrange(desc(total_wins))
datatable(region_win)
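The regional win percentage pools wins and losses before dividing, so teams that played more matches carry more weight. A minimal sketch on made-up numbers (the `toy` data frame and its values are hypothetical, not taken from the dataset) shows how this can differ from averaging each team's own percentage:

```r
# Hypothetical toy data: two teams from one region with very different match counts
toy <- data.frame(
  region = c("X", "X"),
  matches_won = c(90, 1),
  matches_lose = c(10, 9)
)

# Pooled percentage, as in the regional summary: total wins / total matches
pooled <- sum(toy$matches_won) / sum(toy$matches_won + toy$matches_lose) * 100

# Unweighted alternative: average of each team's own win percentage
per_team <- toy$matches_won / (toy$matches_won + toy$matches_lose) * 100
unweighted <- mean(per_team)

print(c(pooled = pooled, unweighted = unweighted))
```

Neither number is wrong; they answer slightly different questions, and the pooled version is the one used in this report.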
Head to Head Record
# Build an order-independent pairing so that "A vs B" and "B vs A" rows are pooled.
# The win counts are swapped together with the names, so team1_wins always belongs
# to Team_A and team2_wins to Team_B. Keeping the teams in separate columns also
# avoids splitting a pasted key on "-", which breaks for names containing a hyphen.
head_to_head <- `04_team_vs` %>%
  mutate(
    swapped = Team1_name > Team2_name,
    Team_A = ifelse(swapped, Team2_name, Team1_name),
    Team_B = ifelse(swapped, Team1_name, Team2_name),
    wins_A = ifelse(swapped, matches_won_2, matches_won_1),
    wins_B = ifelse(swapped, matches_won_1, matches_won_2)
  ) %>%
  group_by(Team_A, Team_B) %>%
  summarise(
    matches_played = sum(matches_played, na.rm = TRUE),
    team1_wins = sum(wins_A, na.rm = TRUE),
    team2_wins = sum(wins_B, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  select(Team_A, Team_B, matches_played, team1_wins, team2_wins)
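One thing worth checking in an order-independent pairing is that the win counts follow the teams when a row is flipped. The sketch below uses base R and two made-up rows (the team names "Alpha" and "Beta" and all counts are hypothetical) to show the idea:

```r
# Hypothetical toy rows: the same two teams listed in both orders
toy <- data.frame(
  Team1_name = c("Alpha", "Beta"),
  Team2_name = c("Beta", "Alpha"),
  matches_won_1 = c(3, 1),
  matches_won_2 = c(2, 4),
  stringsAsFactors = FALSE
)

# Put the alphabetically smaller name first so both rows share one key
swapped <- toy$Team1_name > toy$Team2_name
toy$Team_A <- ifelse(swapped, toy$Team2_name, toy$Team1_name)
toy$Team_B <- ifelse(swapped, toy$Team1_name, toy$Team2_name)

# Swap the win counts together with the names
toy$wins_A <- ifelse(swapped, toy$matches_won_2, toy$matches_won_1)
toy$wins_B <- ifelse(swapped, toy$matches_won_1, toy$matches_won_2)

# Pool the rows per pair
pooled <- aggregate(cbind(wins_A, wins_B) ~ Team_A + Team_B, data = toy, FUN = sum)
print(pooled)
```

Here Alpha ends up with 3 + 4 = 7 wins and Beta with 2 + 1 = 3, which is what a head-to-head record should report regardless of which side each team was listed on.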
server <- function(input, output, session) {
  # Reactive data filtering
  filtered_data <- reactive({
    req(input$team1, input$team2) # Ensure inputs are available
    head_to_head %>%
      filter((Team_A == input$team1 & Team_B == input$team2) |
               (Team_A == input$team2 & Team_B == input$team1))
  })
  # Render the Highcharter plot
  output$plot <- renderHighchart({
    data <- filtered_data()
    if (nrow(data) == 0) {
      return(
        highchart() %>%
          hc_title(text = paste("No Data for Selected Teams:", input$team1, "vs", input$team2)) %>%
          hc_chart(type = "line")
      )
    }
    # Prepare data for Highcharter
    categories <- c("Team 1 Wins", "Team 2 Wins")
    values <- c(sum(data$team1_wins, na.rm = TRUE), sum(data$team2_wins, na.rm = TRUE))
    highchart() %>%
      hc_chart(type = "column") %>%
      hc_title(text = paste("Head-to-Head: ", input$team1, " vs ", input$team2)) %>%
      hc_xAxis(categories = categories, title = list(text = "Result")) %>%
      hc_yAxis(title = list(text = "Count")) %>%
      hc_series(
        list(
          name = "Wins",
          data = values
        )
      ) %>%
      hc_tooltip(pointFormat = "<b>{point.category}:</b> {point.y}")
  })
}
Player Stats and Performance
# Average each per-game statistic over all rows for a player
player_stats <- `02_player` %>%
  group_by(player_name) %>%
  summarise(mean_kills_per_game = mean(kills_per_game),
            mean_CS_per_minute = mean(CS_per_minute),
            mean_damage_per_game = mean(damage_per_game),
            mean_deaths_per_game = mean(deaths_per_game),
            mean_tanking_per_game = mean(tanking_per_game),
            mean_KDA = mean(KDA),
            mean_assists_per_game = mean(assists_per_game))
# Rescale every numeric column to the 0-1 range so the stats share one radar axis
numeric_columns <- sapply(player_stats, is.numeric)
preprocessor <- preProcess(as.data.frame(player_stats[, numeric_columns]), method = "range")
normalized_data <- predict(preprocessor, player_stats[, numeric_columns])
non_numeric_data <- player_stats[, !numeric_columns, drop = FALSE]
normalized_stats <- cbind(normalized_data, non_numeric_data)
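With its default bounds, caret's `preProcess(method = "range")` rescales each column to the interval from 0 to 1. As a rough illustration of what that does, here is the same min-max formula written by hand in base R on a made-up table (the `toy` values and the helper `range_scale` are hypothetical and are not part of the analysis):

```r
# Hypothetical per-player averages
toy <- data.frame(
  mean_KDA = c(2, 4, 6),
  mean_kills_per_game = c(1, 5, 3)
)

# Min-max scaling: (x - min) / (max - min), applied column by column
range_scale <- function(x) (x - min(x)) / (max(x) - min(x))
scaled <- as.data.frame(lapply(toy, range_scale))

print(scaled)
```

Each column's smallest value maps to 0 and its largest to 1, which is why the radar chart's y-axis can be fixed to that range. Note that this hand-written version would divide by zero on a constant column.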
server <- function(input, output, session) {
  # Reactive data filtering
  filtered_data_player <- reactive({
    req(input$Player) # Ensure inputs are available
    normalized_stats %>%
      filter(player_name == input$Player)
  })
  # Render the Highcharter plot
  output$plot <- renderHighchart({
    data <- filtered_data_player()
    if (nrow(data) == 0) {
      return(
        highchart() %>%
          hc_title(text = paste("No Data for Selected Player:", input$Player)) %>%
          hc_chart(type = "line")
      )
    }
    # Define categories (excluding player column)
    categories <- setdiff(colnames(data), "player_name") # Adjust column name as needed
    # Ensure categories are numeric
    performance_data <- as.numeric(data[1, categories, drop = TRUE])
    if (any(is.na(performance_data))) {
      return(
        highchart() %>%
          hc_title(text = paste("Invalid Data for Player:", input$Player)) %>%
          hc_chart(type = "line")
      )
    }
    highchart() %>%
      hc_chart(type = "line", polar = TRUE) %>%
      hc_title(text = paste("Performance of", input$Player)) %>%
      hc_xAxis(categories = categories, tickmarkPlacement = "on", lineWidth = 0) %>%
      hc_yAxis(gridLineInterpolation = "polygon", lineWidth = 0, min = 0, max = 1) %>%
      hc_series(
        list(
          name = input$Player,
          data = performance_data
        )
      ) %>%
      hc_tooltip(pointFormat = "<b>{point.y}</b>")
  })
}
shinyApp(
  ui = fluidPage(
    titlePanel("Player Performance"),
    sidebarLayout(
      sidebarPanel(
        selectInput("Player", "Select Player", choices = unique(player_stats$player_name))
      ),
      mainPanel(
        highchartOutput("plot")
      )
    )
  ),
  server = server
)
Warning: The select input "Player" contains a large number of options; consider
using server-side selectize for massively improved performance. See the Details
section of the ?selectizeInput help topic.
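The warning above points at server-side selectize. A minimal sketch of that pattern follows, assuming the input keeps the id "Player"; the vector `player_names` is a hypothetical stand-in for the real list of player names, and the app is not launched here:

```r
library(shiny)

# Hypothetical stand-in for unique(player_stats$player_name)
player_names <- sprintf("player_%04d", 1:5000)

ui <- fluidPage(
  # Start with no choices; they are supplied from the server instead
  selectizeInput("Player", "Select Player", choices = NULL)
)

server <- function(input, output, session) {
  # server = TRUE keeps the full list on the server and sends matches as the user types
  updateSelectizeInput(session, "Player", choices = player_names, server = TRUE)
}

# shinyApp(ui = ui, server = server)  # uncomment to launch the app
```

The rest of the server logic (the reactive filter and the plot) would stay as it is; only the way the choices reach the browser changes.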
Hero Stats
hero$year <- gsub(".*?(\\d{4}).*", "\\1", hero$matchType)
hero <- hero[grepl("^\\d+$", hero$year), ]
hero$year <- ymd(hero$year, truncated = 2L)
hero_stats <- hero %>%
  group_by(hero_name, year) %>%
  summarise(mean_pick_rate = mean(pick_rate),
            mean_ban_rate = mean(ban_rate))
`summarise()` has grouped output by 'hero_name'. You can override using the
`.groups` argument.
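The year is pulled out of `matchType` with a regular expression, and rows without a four-digit year are dropped. A small base R sketch of those two steps is below; the example strings are hypothetical, since the real `matchType` values may be formatted differently:

```r
# Hypothetical matchType strings; the real values in the dataset may look different
match_types <- c("LPL 2021 Spring", "Worlds2019", "All Stars")

# Lazily skip to the first run of four digits and keep only that run
year_chr <- gsub(".*?(\\d{4}).*", "\\1", match_types)

# Strings with no four-digit run come back unchanged, so filter them out
has_year <- grepl("^\\d+$", year_chr)
years <- as.integer(year_chr[has_year])

print(years)
```

The filter matters because `gsub()` returns the input untouched when the pattern does not match, so a string such as "All Stars" would otherwise be passed on as if it were a year.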
server <- function(input, output, session) {
  # Reactive data filtering
  filtered_data_hero <- reactive({
    req(input$Hero) # Ensure the Hero input is available
    hero_stats %>%
      filter(hero_name == input$Hero)
  })
  # Render the Highcharter plot
  output$plot <- renderHighchart({
    # Retrieve the filtered data
    data <- filtered_data_hero()
    # Define numeric categories excluding non-relevant columns
    categories <- colnames(data)[sapply(data, is.numeric) & !(colnames(data) %in% c("player", "year"))]
    # Ensure categories are numeric
    hero_data <- suppressWarnings(as.numeric(data[1, categories, drop = TRUE]))
    # Check for invalid data
    if (any(is.na(hero_data))) {
      return(
        highchart() %>%
          hc_title(text = paste("Invalid Data for Hero:", input$Hero)) %>%
          hc_chart(type = "line")
      )
    }
    # Convert year data to milliseconds for Highcharter
    year_data <- as.numeric(as.POSIXct(data$year)) * 1000
    print(year_data)
    # Create the Highchart line graph with two series
    highchart() %>%
      hc_chart(type = "line") %>%
      hc_title(text = "Mean Pick and Ban Rates") %>%
      hc_xAxis(
        type = "datetime",
        title = list(text = "Year"),
        labels = list(format = "{value:%Y}") # Display only the year
      ) %>%
      hc_yAxis(title = list(text = "Rate")) %>%
      hc_series(
        list(
          name = "Mean Pick Rate",
          data = mapply(function(x, y) list(x, y), year_data, data$mean_pick_rate, SIMPLIFY = FALSE)
        ),
        list(
          name = "Mean Ban Rate",
          data = mapply(function(x, y) list(x, y), year_data, data$mean_ban_rate, SIMPLIFY = FALSE)
        )
      )
  })
}
shinyApp(
  ui = fluidPage(
    titlePanel("Hero Pick and Ban Rate"),
    sidebarLayout(
      sidebarPanel(
        selectInput("Hero", "Select Hero", choices = unique(hero_stats$hero_name))
      ),
      mainPanel(
        highchartOutput("plot")
      )
    )
  ),
  server = server
)
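The datetime axis in the chart above expects x values in milliseconds since 1970-01-01, which is why the years are converted before plotting. A minimal base R sketch of that conversion and of the point format follows; the dates and rates are made-up values, not results from the dataset:

```r
# Hypothetical years, stored as the first day of each year
years <- as.Date(c("2019-01-01", "2020-01-01"))

# Seconds since 1970-01-01 UTC, times 1000, gives JavaScript-style milliseconds
ms <- as.numeric(as.POSIXct(years, tz = "UTC")) * 1000

# Pair each timestamp with a value, one list(x, y) per point
rates <- c(0.12, 0.18)
points <- mapply(function(x, y) list(x, y), ms, rates, SIMPLIFY = FALSE)

print(ms)
```

Each element of `points` is a two-item list holding the timestamp and the rate, which matches the shape passed to the two series in the server function.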
Modeling Time!!!!
To start, we will fit a simple logistic regression that predicts whether a team has a winning record (more matches won than lost) from its per-game statistics, and go from there. After that, we will either build a convolutional neural net or a random forest model. We shall see.
Fitting model
# Load necessary libraries
library(tidymodels)
# Read the dataset
data <- read.csv("01_team.csv")
# Create the binary target variable based on win/loss
data <- data %>%
  mutate(win_binary = ifelse(matches_won > matches_lose, 1, 0)) %>%
  select(win_binary, Baron_per_game, Dra_per_game, Turts_per_game, kills_per_game,
         deaths_per_game, assists_per_game, CS_per_minute, gold_per_minute,
         damage_per_game, tanking_per_game)
data$win_binary <- as.factor(data$win_binary)
# Split the data into training and testing sets
set.seed(42)
data_split <- initial_split(data, prop = 0.8, strata = win_binary)
train_data <- training(data_split)
test_data <- testing(data_split)
# Define the logistic regression model
logistic_model <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")
# Create a recipe for preprocessing
logistic_recipe <- recipe(win_binary ~ ., data = train_data) %>%
  step_normalize(all_predictors())
# Create a workflow
logistic_workflow <- workflow() %>%
  add_model(logistic_model) %>%
  add_recipe(logistic_recipe)
# Train the model
logistic_fit <- logistic_workflow %>%
  fit(data = train_data)
# Make predictions on the test set
test_predictions <- logistic_fit %>%
  predict(new_data = test_data) %>%
  bind_cols(test_data)
# Evaluate the model
metrics <- test_predictions %>%
  metrics(truth = win_binary, estimate = .pred_class)
conf_mat <- test_predictions %>%
  conf_mat(truth = win_binary, estimate = .pred_class)
# Print the results
print(metrics)
# A tibble: 2 × 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy binary 0.900
2 kap binary 0.797
print(conf_mat)
Truth
Prediction 0 1
0 192 15
1 22 142
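The two reported metrics can be recomputed by hand from the confusion matrix, which is a useful sanity check. The sketch below copies the four counts from the output above into a base R matrix; the variable names are illustrative only:

```r
# Counts copied from the confusion matrix above (rows = prediction, columns = truth)
cm <- matrix(c(192, 22, 15, 142), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Truth = c("0", "1")))

n <- sum(cm)

# Accuracy: share of test rows on the diagonal
accuracy <- sum(diag(cm)) / n

# Cohen's kappa: agreement beyond what the row and column totals would give by chance
expected <- sum(rowSums(cm) * colSums(cm)) / n^2
cohen_kappa <- (accuracy - expected) / (1 - expected)

print(round(c(accuracy = accuracy, kappa = cohen_kappa), 3))
```

This gives an accuracy of 0.900 and a kappa of 0.797, matching the `metrics()` output: 334 of the 371 test rows are classified correctly.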