Introduction
The Digimon franchise, like Pokémon, revolves around capturing, caring for, and training monsters for combat. This dataset describes the Digimon in “Digimon Story: Cyber Sleuth,” a video game released for PlayStation Vita in 2015 and PlayStation 4 in 2016. It consists of three files: a list of all the Digimon that can be captured or fought in Cyber Sleuth, a list of all the moves that Digimon can perform, and a list of all the Support Skills. The dataset was created by Mark Korsak and is used with permission. An interactive version of the database can be found at http://digidb.io/.
In this analysis, we explore the dataset to understand how Digimon differ from one another. We look at attack and defense capabilities, Digimon types, stages of development, and other attributes.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.1 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
Loading data
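We start by loading the Digimon, move, and Support Skill data into our workspace with the read.csv function. Although the janitor library is loaded, the column names are cleaned below with base R: gsub replaces each period with an underscore and tolower converts the result to lowercase.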
digimon <- read.csv("DigiDB_digimonlist.csv")
move <- read.csv("DigiDB_movelist.csv")
support <- read.csv("DigiDB_supportlist.csv")
head(digimon)
## Number Digimon Stage Type Attribute Memory Equip.Slots Lv.50.HP Lv50.SP
## 1 1 Kuramon Baby Free Neutral 2 0 590 77
## 2 2 Pabumon Baby Free Neutral 2 0 950 62
## 3 3 Punimon Baby Free Neutral 2 0 870 50
## 4 4 Botamon Baby Free Neutral 2 0 690 68
## 5 5 Poyomon Baby Free Neutral 2 0 540 98
## 6 6 Koromon In-Training Free Fire 3 0 940 52
## Lv50.Atk Lv50.Def Lv50.Int Lv50.Spd
## 1 79 69 68 95
## 2 76 76 69 68
## 3 97 87 50 75
## 4 77 95 76 61
## 5 54 59 95 86
## 6 109 93 52 76
str(digimon)
## 'data.frame': 249 obs. of 13 variables:
## $ Number : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Digimon : chr "Kuramon" "Pabumon" "Punimon" "Botamon" ...
## $ Stage : chr "Baby" "Baby" "Baby" "Baby" ...
## $ Type : chr "Free" "Free" "Free" "Free" ...
## $ Attribute : chr "Neutral" "Neutral" "Neutral" "Neutral" ...
## $ Memory : int 2 2 2 2 2 3 3 3 3 3 ...
## $ Equip.Slots: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Lv.50.HP : int 590 950 870 690 540 940 1030 930 930 640 ...
## $ Lv50.SP : int 77 62 50 68 98 52 64 54 64 86 ...
## $ Lv50.Atk : int 79 76 97 77 54 109 85 107 108 76 ...
## $ Lv50.Def : int 69 76 87 95 59 93 82 92 64 74 ...
## $ Lv50.Int : int 68 69 50 76 95 52 73 54 54 74 ...
## $ Lv50.Spd : int 95 68 75 61 86 76 69 76 93 103 ...
summary(digimon)
## Number Digimon Stage Type
## Min. : 1 Length:249 Length:249 Length:249
## 1st Qu.: 63 Class :character Class :character Class :character
## Median :125 Mode :character Mode :character Mode :character
## Mean :125
## 3rd Qu.:187
## Max. :249
## Attribute Memory Equip.Slots Lv.50.HP
## Length:249 Min. : 2.00 Min. :0.000 Min. : 530
## Class :character 1st Qu.: 6.00 1st Qu.:1.000 1st Qu.: 990
## Mode :character Median :12.00 Median :1.000 Median :1180
## Mean :11.99 Mean :1.574 Mean :1211
## 3rd Qu.:18.00 3rd Qu.:2.000 3rd Qu.:1480
## Max. :25.00 Max. :3.000 Max. :2080
## Lv50.SP Lv50.Atk Lv50.Def Lv50.Int
## Min. : 50.0 Min. : 52.0 Min. : 59.0 Min. : 50.0
## 1st Qu.: 84.0 1st Qu.: 89.0 1st Qu.: 93.0 1st Qu.: 79.0
## Median :104.0 Median :119.0 Median :113.0 Median :104.0
## Mean :109.8 Mean :124.5 Mean :116.4 Mean :112.6
## 3rd Qu.:132.0 3rd Qu.:153.0 3rd Qu.:138.0 3rd Qu.:138.0
## Max. :203.0 Max. :318.0 Max. :213.0 Max. :233.0
## Lv50.Spd
## Min. : 61.0
## 1st Qu.: 92.0
## Median :119.0
## Mean :120.4
## 3rd Qu.:143.0
## Max. :218.0
colnames(digimon) <- tolower(gsub("\\.", "_", colnames(digimon)))
head(move)
## Move SP.Cost Type Power Attribute Inheritable
## 1 Wolkenapalm I 3 Physical 65 Fire Yes
## 2 Wolkenapalm II 6 Physical 85 Fire Yes
## 3 Wolkenapalm III 9 Physical 105 Fire Yes
## 4 Burst Flame I 3 Magic 55 Fire Yes
## 5 Burst Flame II 6 Magic 75 Fire Yes
## 6 Burst Flame III 9 Magic 95 Fire Yes
## Description
## 1 Physical attack, 65 Fire damage to one foe. 95% accuracy.
## 2 Physical attack, 85 Fire damage to one foe. 95% accuracy.
## 3 Physical attack, 105 Fire damage to one foe. 95% accuracy.
## 4 Magic attack, 55 Fire damage to one foe. 95% accuracy.
## 5 Magic attack, 75 Fire damage to one foe. 95% accuracy.
## 6 Magic attack, 95 Fire damage to one foe. 95% accuracy.
str(move)
## 'data.frame': 387 obs. of 7 variables:
## $ Move : chr "Wolkenapalm I" "Wolkenapalm II" "Wolkenapalm III" "Burst Flame I" ...
## $ SP.Cost : int 3 6 9 3 6 9 4 7 10 10 ...
## $ Type : chr "Physical" "Physical" "Physical" "Magic" ...
## $ Power : int 65 85 105 55 75 95 30 45 75 30 ...
## $ Attribute : chr "Fire" "Fire" "Fire" "Fire" ...
## $ Inheritable: chr "Yes" "Yes" "Yes" "Yes" ...
## $ Description: chr "Physical attack, 65 Fire damage to one foe. 95% accuracy." "Physical attack, 85 Fire damage to one foe. 95% accuracy." "Physical attack, 105 Fire damage to one foe. 95% accuracy." "Magic attack, 55 Fire damage to one foe. 95% accuracy." ...
summary(move)
## Move SP.Cost Type Power
## Length:387 Min. : 0.00 Length:387 Min. : 0.00
## Class :character 1st Qu.: 6.00 Class :character 1st Qu.: 20.00
## Mode :character Median :10.00 Mode :character Median : 65.00
## Mean :14.03 Mean : 60.18
## 3rd Qu.:20.00 3rd Qu.: 95.00
## Max. :60.00 Max. :250.00
## Attribute Inheritable Description
## Length:387 Length:387 Length:387
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
colnames(move) <- tolower(gsub("\\.", "_", colnames(move)))
head(support)
## Name
## 1 Adroit Wisdom
## 2 All-Rounder
## 3 Analyze
## 4 Animal Colosseum
## 5 Aus Generics
## 6 Backwater Camp
## Description
## 1 Increases INT by 15%.
## 2 Increases ATK, DEF, INT and SPD by 5%.
## 3 Increases scan values by 10%.
## 4 Increases damage from Earth skills by 15%.
## 5 Increases SPD and EVA by 25% when HP drops below 25%.
## 6 Increases damage given by 20%, but also increases damage received by 20%.
str(support)
## 'data.frame': 86 obs. of 2 variables:
## $ Name : chr "Adroit Wisdom" "All-Rounder" "Analyze" "Animal Colosseum" ...
## $ Description: chr "Increases INT by 15%." "Increases ATK, DEF, INT and SPD by 5%." "Increases scan values by 10%." "Increases damage from Earth skills by 15%." ...
summary(support)
## Name Description
## Length:86 Length:86
## Class :character Class :character
## Mode :character Mode :character
colnames(support) <- tolower(gsub("\\.", "_", colnames(support)))
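The same renaming is applied to all three data frames, so it could be wrapped in a small helper. The sketch below is one way to do that, not part of the original analysis: clean_colnames is a hypothetical name, and the one-row data frame only mimics three of the columns shown in the str(digimon) output.

```r
# Hypothetical helper: turn periods into underscores and lowercase the names,
# the same transformation as the three colnames() calls above.
clean_colnames <- function(df) {
  colnames(df) <- tolower(gsub("\\.", "_", colnames(df)))
  df
}

# One-row stand-in built from the first row of head(digimon)
example <- data.frame(Equip.Slots = 0, Lv.50.HP = 590, Lv50.SP = 77)
example <- clean_colnames(example)
colnames(example)
# [1] "equip_slots" "lv_50_hp"    "lv50_sp"
```

Note that lv_50_hp and lv50_sp end up with different prefixes because the source columns are spelled Lv.50.HP and Lv50.SP. The scatter plot later in the analysis relies on exactly these names.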
Exploring Attack Ratios
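In the first visualization, we explore the attack ratio of each move, defined as its power divided by its SP cost, using a bar graph. We use the dplyr package to calculate the ratios and the ggplot2 package to create the graph. The summary above shows a minimum SP cost of 0, and dividing by zero yields Inf or NaN in R, so zero-cost moves need care when ranking.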
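# Attack ratio = power per point of SP spent
move_ratios <- move %>%
  filter(sp_cost > 0) %>%  # a cost of 0 would give Inf (or NaN for 0/0)
  mutate(attack_ratio = power / sp_cost)
move_ratios_sorted <- move_ratios %>%
  arrange(desc(attack_ratio))
top_moves <- move_ratios_sorted %>%
  head(10)
# reorder() draws the bars by descending ratio instead of alphabetically
ggplot(top_moves, aes(x = reorder(move, -attack_ratio), y = attack_ratio)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle("Top 10 Moves with Highest Attack Ratio") +
  xlab("Move") +
  ylab("Attack Power to SP Ratio")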
The graph shows the top 10 moves with the highest attack power to SP ratio.
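Two details are easy to miss when ranking by a ratio: a zero SP cost makes the ratio infinite, and a bar chart orders its x-axis alphabetically unless the factor levels say otherwise. The sketch below illustrates both in base R. The first six rows are copied from head(move); the seventh, "Free Strike", is a hypothetical zero-cost move added only to trigger the division by zero.

```r
moves <- data.frame(
  move = c("Wolkenapalm I", "Wolkenapalm II", "Wolkenapalm III",
           "Burst Flame I", "Burst Flame II", "Burst Flame III",
           "Free Strike"),                      # hypothetical, not in the dataset
  sp_cost = c(3, 6, 9, 3, 6, 9, 0),
  power   = c(65, 85, 105, 55, 75, 95, 40)
)

moves$attack_ratio <- moves$power / moves$sp_cost
# 40 / 0 is Inf in R, so the hypothetical move would sort to the top
is.infinite(moves$attack_ratio[7])

# Keep only moves with a positive SP cost, then rank by ratio
ranked <- moves[moves$sp_cost > 0, ]
ranked <- ranked[order(-ranked$attack_ratio), ]

# reorder() sets the factor levels by ratio, so a plot follows the ranking
ranked$move <- reorder(ranked$move, -ranked$attack_ratio)
levels(ranked$move)
```

Whether zero-cost moves should be dropped or reported separately is a judgment call; filtering them out is just the simplest option.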
Exploring Power-to-SP Ratios
In the second visualization, we explore the power-to-SP ratios of the top 20 moves using a bar graph in which each bar is colored by move type. This time the ratio is computed and sorted with base R, and ggplot2 draws the graph.
move$power_to_sp <- move$power / move$sp_cost
# Rank on the stored column; order() places NaN (0/0) last
top_moves <- move[order(-move$power_to_sp), ][1:20, ]
ggplot(top_moves, aes(x = move, y = power_to_sp, fill = type)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Top 20 moves by attack power to SP ratio") +
  xlab("Move") + ylab("Attack Power to SP Ratio") +
  scale_fill_brewer(palette = "Set1")
The graph shows the top 20 moves with the highest attack power to SP ratio, with each bar colored by move type.
Exploring Digimon Types
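In this section, we explore the distribution of Digimon by type, stage, and attribute using bar charts, again with the dplyr and ggplot2 packages.
First, we generate a per-Digimon summary of attack and defense. Because the data is grouped by Digimon name and each row describes a single Digimon, each sum is simply that Digimon’s own level-50 value (assuming names are unique). This summary is not used in the charts that follow.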
team_summary <- digimon %>%
  group_by(digimon) %>%
  summarise(attack = sum(lv50_atk), defense = sum(lv50_def))
Next, we will count the number of Digimon per type.
type_count <- digimon %>%
  count(type)
We will then create a bar chart that shows the number of Digimon per type.
ggplot(type_count, aes(x = type, y = n, fill = type)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  xlab("Digimon Type") +
  ylab("Number of Digimon") +
  ggtitle("Number of Digimon per Type")
Next, we look at the distribution of Digimon by stage, first as a table of counts and then as a bar chart.
stage_summary <- digimon %>%
  count(stage)
stage_summary
## stage n
## 1 Armor 3
## 2 Baby 5
## 3 Champion 54
## 4 In-Training 11
## 5 Mega 74
## 6 Rookie 38
## 7 Ultimate 58
## 8 Ultra 6
ggplot(stage_summary, aes(x = stage, y = n)) +
  geom_bar(stat = "identity", fill = "green", alpha = 0.7) +
  labs(title = "Distribution of Digimon by Stage",
       x = "Stage",
       y = "Number of Digimon") +
  theme_minimal()
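The table and chart above list the stages alphabetically, which scatters the progression (Mega sits between In-Training and Rookie). A minimal sketch of one fix is to turn stage into a factor with explicit levels. The counts are copied from the stage_summary output; the order in stage_order is an assumption about the game's progression, with Armor placed last as a side branch.

```r
# Counts copied from the stage_summary output above
stage_summary <- data.frame(
  stage = c("Armor", "Baby", "Champion", "In-Training",
            "Mega", "Rookie", "Ultimate", "Ultra"),
  n = c(3, 5, 54, 11, 74, 38, 58, 6)
)

# Assumed progression order (not stated in the dataset itself)
stage_order <- c("Baby", "In-Training", "Rookie", "Champion",
                 "Ultimate", "Mega", "Ultra", "Armor")

# Factor levels control both the sort order and the axis order in ggplot2
stage_summary$stage <- factor(stage_summary$stage, levels = stage_order)
stage_summary <- stage_summary[order(stage_summary$stage), ]
stage_summary
```

Applying the same factor() call to digimon$stage before plotting would give every stage chart in the analysis this order.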
We can also explore the distribution of Digimon by attribute. To do this, we will use a bar chart.
table(digimon$attribute)
##
## Dark Earth Electric Fire Light Neutral Plant Water
## 37 24 25 33 29 28 25 24
## Wind
## 24
ggplot(digimon, aes(x = attribute)) +
  geom_bar(fill = "blue", alpha = 0.7) +
  labs(title = "Distribution of Digimon by Attribute",
       x = "Attribute",
       y = "Number of Digimon") +
  theme_minimal()
Next, we will explore the tradeoff between health (HP) and energy (SP) using a scatter plot.
ggplot(digimon, aes(x = lv_50_hp, y = lv50_sp)) +
  geom_point() +
  labs(title = "Tradeoff between HP and SP",
       x = "HP",
       y = "SP")
The plot shows each Digimon as one point, with its level-50 HP on the x-axis and its level-50 SP on the y-axis. A downward trend would indicate a tradeoff, meaning that Digimon with more HP tend to have less SP, while a shapeless cloud would indicate that the two stats vary independently.
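A scatter plot suggests a relationship but does not quantify it. One simple check, sketched below, is the Pearson correlation between the two columns. To keep the example self-contained it uses only the six rows printed by head(digimon), so the result says nothing about the full set of 249 Digimon; first_six is a name introduced here for illustration.

```r
# Level-50 HP and SP of the first six Digimon, copied from head(digimon)
first_six <- data.frame(
  lv_50_hp = c(590, 950, 870, 690, 540, 940),
  lv50_sp  = c(77, 62, 50, 68, 98, 52)
)

# Pearson correlation: values below 0 point toward a tradeoff
hp_sp_cor <- cor(first_six$lv_50_hp, first_six$lv50_sp)
hp_sp_cor < 0   # TRUE for these six rows

# On the full data the same call would be:
# cor(digimon$lv_50_hp, digimon$lv50_sp)
```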
Finally, we can explore the distribution of Digimon types across different stages. To do this, we generate a summary of the number of Digimon per type and stage, and create a stacked bar chart.
type_stage_summary <- digimon %>%
  group_by(stage, type) %>%
  # .groups is an argument of summarise(), not mutate()
  summarise(n = n(), .groups = "drop") %>%
  # prop is the share of all Digimon, not the share within each stage
  mutate(prop = n / sum(n))
type_stage_summary2 <- as.data.frame(type_stage_summary)
ggplot(type_stage_summary, aes(x = stage, y = prop, fill = type)) +
  geom_bar(stat = "identity") +
  labs(title = "Distribution of Digimon Types by Stage",
       x = "Stage",
       y = "Proportion of Digimon")
This graph shows how Digimon types are distributed across stages. The horizontal axis is the stage, and each bar is split into colored segments, one per type. Because prop is computed over all Digimon rather than within each stage, the total height of a bar is that stage’s share of the whole roster, and the height of each segment is the share of one stage and type combination.
The counts come from the group_by and summarise functions of the dplyr package, and ggplot2 draws the stacked bars, assigning a color to each type automatically.
In summary, this graph lets us see how the different types of Digimon are distributed across the stages of the game, and that some types are more common in certain stages than in others. For example, Vaccine-type Digimon are more common in stages 3 and 4, while Virus-type Digimon are more common in stage 5 (the dataset itself names its stages, such as Rookie and Champion, rather than numbering them). This can be useful for those who want to build a balanced and strategic Digimon team based on the stages of the game.
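Comparing the type mix between stages is easier when each bar is normalized to its own stage, because stages with few Digimon are otherwise hard to read. The sketch below shows that calculation in base R. The counts are hypothetical and chosen only for illustration; they are not taken from the dataset.

```r
# Hypothetical counts for illustration only; not taken from the dataset
counts <- data.frame(
  stage = c("Rookie", "Rookie", "Rookie", "Mega", "Mega"),
  type  = c("Vaccine", "Virus", "Data", "Vaccine", "Virus"),
  n     = c(10, 5, 5, 30, 10)
)

# ave() returns the per-stage total aligned with each row
counts$prop_within_stage <- counts$n / ave(counts$n, counts$stage, FUN = sum)

# Within each stage the proportions now sum to 1
tapply(counts$prop_within_stage, counts$stage, sum)
```

With dplyr, the equivalent is to group by stage before dividing: group_by(stage) %>% mutate(prop = n / sum(n)). Plotting that column makes every bar the same height, so the segments can be compared directly between stages.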