Assignment 7

Author

Nick Mallow

Introduction

For this assignment I wanted to look at the hit points of enemies in Silksong. I find this interesting because Silksong is a hard game, so I want to compare its enemy hit points to those of Hollow Knight. More specifically, I look at the enemies featured in a Silksong section called the High Halls Gauntlet, as well as those in the hardest arena in Hollow Knight, the Trial of the Fool.

I scraped this data from the Hollow Knight wiki, and all information found here comes from that wiki.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)

Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test

Importing the Data

Bugs <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/mallown_xavier_edu/IQCJGRP97ifkTqhqcEoNWRScAceNZGV4YAMvZ8BuVIYRDns?download=1")
Rows: 344 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (9): MAIN GAME...1, MAIN GAME...2, MAIN GAME...3, ...4, ...5, ...6, ...7...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Data Wrangling

A

glimpse(Bugs)
Rows: 344
Columns: 9
$ `MAIN GAME...1` <chr> "HJ ID", "HJ ID", "5", "137", "96", "11", "11", "82", …
$ `MAIN GAME...2` <chr> "Enemy", "Enemy", "Aknid", "Alita", "Barnak", "Beastfl…
$ `MAIN GAME...3` <chr> "HP / BT", "HP / BT", "15 / 60", "80", "35", "15 / 60"…
$ ...4            <chr> "Damage Modifiers by Needle Level", "Lvl. 0", "1", "1.…
$ ...5            <chr> "Damage Modifiers by Needle Level", "Lvl. 1", "1", "1.…
$ ...6            <chr> "Damage Modifiers by Needle Level", "Lvl. 2", "1", "1.…
$ ...7            <chr> "Damage Modifiers by Needle Level", "Lvl. 3", "1", "1"…
$ ...8            <chr> "Damage Modifiers by Needle Level", "Lvl. 4", "1", "1"…
$ ...9            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

As you can see, this data is not very tidy. Everything is stored as a character, and the column names are stored as data instead of in the header. The hit points column sometimes contains a “/” to show the enemy's health in Act 3, instead of that value being stored in its own column. There are also rows that are just titles for the rows below them. So I will promote the column names to the header, split BT (the HP of voided enemies) into its own column, and convert the columns to their appropriate types. I will also make variables for both games' hardest arenas, separate out the two games, and get rid of columns that are not needed.

Getting Rid of Unneeded Rows

#using base R because the syntax is easier than doing this with filter()
Bugs <- Bugs[-c(204, 323, 327, 331, 334), ]

Fixing Column Names

##Use the janitor library to move the header row up to the column names.
Bugs <- Bugs %>% 
  row_to_names(row_number = 2)
##cleaning names for my own sanity
Bugs <- clean_names(Bugs)
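To make the two janitor steps concrete, here is a minimal sketch on a made-up three-row table (the `toy` tibble and its values are assumptions for illustration, not the real wiki data). It promotes the second row to the header and then converts the names to snake_case, which is how a column like `HP / BT` ends up being called `hp_bt`.

```r
library(tibble)
library(janitor)

# Hypothetical toy table: the real column names sit in the second row.
toy <- tibble(
  x1 = c("MAIN GAME", "HJ ID", "5"),
  x2 = c("MAIN GAME", "Enemy", "Aknid"),
  x3 = c("MAIN GAME", "HP / BT", "15 / 60")
)

# Promote row 2 to the header. By default that row and the rows above it
# are dropped from the data.
toy_named <- row_to_names(toy, row_number = 2)

# Convert the promoted names to snake_case.
toy_clean <- clean_names(toy_named)
names(toy_clean)
```

Only the single data row for Aknid should remain after the header row is promoted.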

Separating the Games

##use the fact that Silksong enemies have an hj_id and Hollow Knight enemies do not to make a game variable
Bugs <- Bugs %>%
  mutate(from_SilkSong = !is.na(hj_id))
##we don't need the Hunter's Journal ID anymore, so it's going to the byebye zone
Bugs <- Bugs %>%
  select(!hj_id)

Separating HP and BT

##first we get rid of the weird extra text in the Hollow Knight values by keeping only the leading digits
Bugs <- Bugs %>%
  mutate(hp_bt = if_else(from_SilkSong,hp_bt,str_extract(hp_bt, "^[0-9]+")))

##now on to what we set out to do
Bugs <- Bugs %>%
  separate(hp_bt,
           into = c("HP", "BT"),
           sep = "/",
           fill = "right",
           convert = TRUE)

This was easier than I expected
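As a small illustration of the split, the sketch below runs `separate()` on two made-up rows, one with a “/” and one without. The `toy` tibble is hypothetical, and the separator is an assumption that differs slightly from the code above: it is written as a regular expression that also swallows the spaces around the slash, so the pieces are clean before conversion.

```r
library(tibble)
library(tidyr)

# Hypothetical toy rows: one enemy with a BT value, one without.
toy <- tibble(enemy = c("Aknid", "Alita"), hp_bt = c("15 / 60", "80"))

toy_split <- separate(
  toy, hp_bt,
  into = c("HP", "BT"),
  sep = "\\s*/\\s*",  # assumption: also remove the spaces around the slash
  fill = "right",     # rows without a "/" get NA in BT, with no warning
  convert = TRUE      # try to convert the new columns to numbers
)
toy_split
```

With `fill = "right"`, the row that has no “/” keeps its HP and gets a missing BT instead of raising a warning.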

Misc Things in HP

It turns out one enemy, the Garpid, has 999999 (1) as its HP. When I looked into it, I found out that each individual Garpid has 1 HP, but killing one does not kill the swarm. So I am going to change that value to 1.

Bugs <- Bugs %>%
  mutate(HP = replace(HP, row_number() == 64, 1))
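Replacing by row position works as long as the row order never changes. As an alternative sketch, the value could be matched by enemy name instead. This assumes the `enemy` column holds exactly "Garpid" for that row, and the `toy` tibble below is made up for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data where HP is still stored as text.
toy <- tibble(
  enemy = c("Aknid", "Garpid"),
  HP = c("15", "999999 (1)")
)

# Replace the value by matching the enemy name rather than the row number.
toy_fixed <- toy %>%
  mutate(HP = if_else(enemy == "Garpid", "1", HP))
toy_fixed
```

The replacement is written as the string "1" because `if_else()` needs both branches to have the same type while HP is still a character column.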

Change the data types

Bugs <- Bugs %>% 
  mutate(HP = as.numeric(HP),
         BT = as.numeric(BT),
         lvl_0 = as.numeric(lvl_0),
         lvl_1 = as.numeric(lvl_1),
         lvl_2 = as.numeric(lvl_2),
         lvl_3 = as.numeric(lvl_3),
         lvl_4 = as.numeric(lvl_4))
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `HP = as.numeric(HP)`.
Caused by warning:
! NAs introduced by coercion
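The warning means at least one HP value was text that could not be read as a number. One way to find such values, run before the conversion, is sketched below on made-up data (the `toy` tibble and the "Mystery Bug" row are hypothetical).

```r
library(dplyr)
library(tibble)

# Hypothetical toy data with one HP value that is not a number.
toy <- tibble(
  enemy = c("Aknid", "Mystery Bug", "Alita"),
  HP = c("15", "?", "80")
)

# Keep rows whose HP is present as text but does not parse as a number.
not_numeric <- toy %>%
  filter(!is.na(HP), is.na(suppressWarnings(as.numeric(HP))))
not_numeric
```

Rows that were already missing are excluded, so only the values that the conversion would turn into new NAs are listed.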

Make the Arena Variables

We are making a TRUE/FALSE variable for whether an enemy is in the High Halls Gauntlet. We are also making one for the Trial of the Fool, the third trial of the Colosseum of Fools and the hardest arena in Hollow Knight.

high_halls <- c("Choristor", "Reed", "Envoy", "Choir Bellbearer", "Clawmaiden", "Minister", "Maestro")

fool <- c("Heavy Fool", "Sturdy Fool", "Armoured Squit", "Shielded Fool", "Primal Aspid", "Winged Fool", "Sharp Baldur", "Battle Obble", "Furious Vengefly", "Belfly", "Death Loodle", "Garpede", "Mantis Petra", "Mantis Traitor", "Soul Twister", "Mistake", "Soul Warrior", "Folly", "Volt Twister", "Lesser Mawlek")

Bugs <- Bugs %>% 
  mutate(HighHallsG = enemy %in% high_halls) %>% 
  mutate(Trial_of_fool = enemy %in% fool)
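Because `%in%` compares names exactly, a misspelling or a stray space in a lookup vector silently gives FALSE for that enemy. A quick check is to list the names that never appear in the data. The sketch below uses made-up vectors (`toy` and `wanted` are hypothetical) with one deliberately padded name.

```r
library(tibble)

# Hypothetical toy data and a lookup vector with one padded name.
toy <- tibble(enemy = c("Heavy Fool", "Shielded Fool", "Primal Aspid"))
wanted <- c("Heavy Fool", " Shielded Fool", "Primal Aspid")

# Names in the lookup vector that never appear in the data.
unmatched <- setdiff(wanted, toy$enemy)

# Trimming whitespace on both sides removes this kind of mismatch.
unmatched_trimmed <- setdiff(trimws(wanted), trimws(toy$enemy))
```

An empty result from the second check suggests that every name in the vector matched at least one row once whitespace is ignored.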

Visualization Time!!!

Distribution of health

Bugs %>%  
  ggplot(aes(x = HP)) +
  geom_histogram() +
  ylim(0, 50) +
  xlim(0,200) +
  labs(title = "Distribution of Enemy Hit Points in Hollow Knight and Silksong")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 12 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_bar()`).

Bugs %>%  
  filter(from_SilkSong == TRUE) %>% 
  ggplot(aes(x = HP)) +
  geom_histogram(fill = "darkred") +
  xlim(0, 200) +
  ylim(0,50) +
  labs(title = "Distribution of Enemy Hit Points in Silksong")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 5 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_bar()`).

Bugs %>%  
  filter(from_SilkSong == FALSE) %>% 
  ggplot(aes(x = HP)) +
  geom_histogram(fill = "darkblue") +
  xlim(0, 200) +
  ylim(0,50) +
  labs(title = "Distribution of Enemy Hit Points in Hollow Knight")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 7 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_bar()`).

Bugs %>%
  ggplot(aes(x = HP, fill = from_SilkSong)) +
  geom_histogram(alpha = 0.6, position = "identity", binwidth = 10) +
  scale_fill_manual(values = c("TRUE" = "darkred", "FALSE" = "darkblue"),
                    labels = c("Hollow Knight", "Silksong")) +
  labs(title = "Enemy Health Distributions Hollow Knight Vs Silksong",
       x = "Hit Points (HP)",
       y = "Count",
       fill = "Game") +
  xlim(0, 200) +
  ylim(0,50) +
  theme_minimal()
Warning: Removed 12 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_bar()`).

From these graphs, Hollow Knight's distribution looks fairly similar to Silksong's, but it has more enemies in the high-health range while still being skewed toward low-health enemies overall.
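The "Removed rows" warnings under the histograms most likely come from `xlim()` and `ylim()`, which drop values outside the limits before the bins are counted (missing HP values are dropped either way). A possible alternative is `coord_cartesian()`, which only zooms the view. The sketch below uses made-up HP values, and the `binwidth` and `boundary` settings are assumptions rather than the choices used above.

```r
library(ggplot2)

# Hypothetical HP values, including two far outside the plotted range.
toy <- data.frame(HP = c(10, 20, 35, 50, 50, 80, 120, 400, 1500))

p <- ggplot(toy, aes(x = HP)) +
  geom_histogram(binwidth = 10, boundary = 0) +
  # Zoom in on 0 to 200 without discarding the rows beyond it.
  coord_cartesian(xlim = c(0, 200))

# Inspect the computed bins: every row is still counted somewhere.
built <- ggplot_build(p)
sum(built$data[[1]]$count)
```

Setting `binwidth` explicitly also avoids the `bins = 30` message.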

Comparing Gauntlets

Bugs %>%
  filter(from_SilkSong == TRUE) %>% 
  ggplot(aes(x = HP, fill = HighHallsG)) +
  geom_histogram(alpha = 0.6, position = "identity", binwidth = 5) +
  scale_fill_manual(values = c("TRUE" = "gold", "FALSE" = "darkred"),
                    labels = c("Not in High Halls Gauntlet", "High Halls Gauntlet")) +
  labs(title = "Silksong Enemy Health: High Halls Gauntlet vs. Other Enemies",
       x = "Hit Points (HP)",
       y = "Count",
       fill = "Arena") +
  xlim(0, 200) +
  ylim(0,25) +
  theme_minimal()
Warning: Removed 5 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_bar()`).

We can see that the High Halls enemies are mostly around the spike near 50 HP, meaning that they are tankier than most enemies but are not the game's tankiest.

Bugs %>%
  filter(from_SilkSong == FALSE) %>% 
  ggplot(aes(x = HP, fill = Trial_of_fool)) +
  geom_histogram(alpha = 0.6, position = "identity", binwidth = 5) +
  scale_fill_manual(values = c("TRUE" = "#C1D366", "FALSE" = "darkblue"),
                    labels = c("Not in Trial of the Fool", "Trial of the Fool")) +
  labs(title = "Hollow Knight Enemy Health: Trial of the Fool vs. Other Enemies",
       x = "Hit Points (HP)",
       y = "Count",
       fill = "Arena") +
  xlim(0, 200) +
  ylim(0,25) +
  theme_minimal()
Warning: Removed 7 rows containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_bar()`).

We can see here that the Trial of the Fool is more spread out, with enemies around the 50 HP range as well as enemies beyond it.

Bugs %>%
  filter(Trial_of_fool == TRUE | HighHallsG == TRUE) %>% 
  ggplot(aes(x = HP, fill = Trial_of_fool)) +
  geom_histogram(alpha = 0.6, position = "identity", binwidth = 5) +
  scale_fill_manual(values = c("TRUE" = "forestgreen", "FALSE" = "gold"),
                    labels = c("High Halls Gauntlet", "Trial of the Fool")) +
  labs(title = "Enemy Health Distributions: High Halls Gauntlet vs. Trial of the Fool",
       x = "Hit Points (HP)",
       y = "Count",
       fill = "Arena") +
  xlim(0, 100) +
  ylim(0,10) +
  theme_minimal()
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_bar()`).

The Trial of the Fool seems to have more spread, as well as groups of stronger enemies, when compared to the High Halls Gauntlet, which suggests that the Trial of the Fool may be more difficult.
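The visual impression of "more spread" could be backed up with a few summary numbers per arena. The sketch below shows one way to compute them. The HP values in `toy` are invented purely for illustration and are not results from the wiki data. The column names follow the ones created earlier.

```r
library(dplyr)
library(tibble)

# Hypothetical HP values; NOT the real data.
toy <- tibble(
  HP = c(40, 50, 55, 20, 60, 95),
  Trial_of_fool = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE),
  HighHallsG = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

arena_summary <- toy %>%
  filter(Trial_of_fool | HighHallsG) %>%
  mutate(arena = if_else(Trial_of_fool, "Trial of the Fool", "High Halls Gauntlet")) %>%
  group_by(arena) %>%
  summarise(
    n = n(),
    median_hp = median(HP, na.rm = TRUE),  # typical enemy health
    iqr_hp = IQR(HP, na.rm = TRUE),        # spread of the middle half
    max_hp = max(HP, na.rm = TRUE)         # tankiest enemy in the arena
  )
arena_summary
```

A larger interquartile range and maximum for one arena would support the reading that its enemies are more varied and include tankier ones.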