Introduction

The Digimon franchise, similar to Pokémon, revolves around capturing, caring for, and training monsters for combat. This dataset contains information on Digimon from “Digimon Story: Cyber Sleuth,” a video game released for PlayStation Vita in 2015 and PlayStation 4 in 2016. The dataset includes three files: a list of all the Digimon that can be captured or fought in Cyber Sleuth, all the moves that Digimon can perform, and all the Support Skills. The dataset was created by Mark Korsak and is used with permission. An interactive version of the database can be found at http://digidb.io/.

In this analysis, we explore the Digimon dataset to understand how Digimon differ from one another. We look at their attack and defense capabilities, their types, their stages of development, and other attributes.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.1     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

Loading data

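We will start by loading the Digimon, Move, and Support data into our workspace using the read.csv function. We then clean up the column names with base R, replacing periods with underscores and converting everything to lowercase; janitor's clean_names function would be an alternative.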

digimon <- read.csv("DigiDB_digimonlist.csv")
move <- read.csv("DigiDB_movelist.csv")
support <- read.csv("DigiDB_supportlist.csv")
head(digimon)
##   Number Digimon       Stage Type Attribute Memory Equip.Slots Lv.50.HP Lv50.SP
## 1      1 Kuramon        Baby Free   Neutral      2           0      590      77
## 2      2 Pabumon        Baby Free   Neutral      2           0      950      62
## 3      3 Punimon        Baby Free   Neutral      2           0      870      50
## 4      4 Botamon        Baby Free   Neutral      2           0      690      68
## 5      5 Poyomon        Baby Free   Neutral      2           0      540      98
## 6      6 Koromon In-Training Free      Fire      3           0      940      52
##   Lv50.Atk Lv50.Def Lv50.Int Lv50.Spd
## 1       79       69       68       95
## 2       76       76       69       68
## 3       97       87       50       75
## 4       77       95       76       61
## 5       54       59       95       86
## 6      109       93       52       76
str(digimon)
## 'data.frame':    249 obs. of  13 variables:
##  $ Number     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Digimon    : chr  "Kuramon" "Pabumon" "Punimon" "Botamon" ...
##  $ Stage      : chr  "Baby" "Baby" "Baby" "Baby" ...
##  $ Type       : chr  "Free" "Free" "Free" "Free" ...
##  $ Attribute  : chr  "Neutral" "Neutral" "Neutral" "Neutral" ...
##  $ Memory     : int  2 2 2 2 2 3 3 3 3 3 ...
##  $ Equip.Slots: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Lv.50.HP   : int  590 950 870 690 540 940 1030 930 930 640 ...
##  $ Lv50.SP    : int  77 62 50 68 98 52 64 54 64 86 ...
##  $ Lv50.Atk   : int  79 76 97 77 54 109 85 107 108 76 ...
##  $ Lv50.Def   : int  69 76 87 95 59 93 82 92 64 74 ...
##  $ Lv50.Int   : int  68 69 50 76 95 52 73 54 54 74 ...
##  $ Lv50.Spd   : int  95 68 75 61 86 76 69 76 93 103 ...
summary(digimon)
##      Number      Digimon             Stage               Type          
##  Min.   :  1   Length:249         Length:249         Length:249        
##  1st Qu.: 63   Class :character   Class :character   Class :character  
##  Median :125   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :125                                                           
##  3rd Qu.:187                                                           
##  Max.   :249                                                           
##   Attribute             Memory       Equip.Slots       Lv.50.HP   
##  Length:249         Min.   : 2.00   Min.   :0.000   Min.   : 530  
##  Class :character   1st Qu.: 6.00   1st Qu.:1.000   1st Qu.: 990  
##  Mode  :character   Median :12.00   Median :1.000   Median :1180  
##                     Mean   :11.99   Mean   :1.574   Mean   :1211  
##                     3rd Qu.:18.00   3rd Qu.:2.000   3rd Qu.:1480  
##                     Max.   :25.00   Max.   :3.000   Max.   :2080  
##     Lv50.SP         Lv50.Atk        Lv50.Def        Lv50.Int    
##  Min.   : 50.0   Min.   : 52.0   Min.   : 59.0   Min.   : 50.0  
##  1st Qu.: 84.0   1st Qu.: 89.0   1st Qu.: 93.0   1st Qu.: 79.0  
##  Median :104.0   Median :119.0   Median :113.0   Median :104.0  
##  Mean   :109.8   Mean   :124.5   Mean   :116.4   Mean   :112.6  
##  3rd Qu.:132.0   3rd Qu.:153.0   3rd Qu.:138.0   3rd Qu.:138.0  
##  Max.   :203.0   Max.   :318.0   Max.   :213.0   Max.   :233.0  
##     Lv50.Spd    
##  Min.   : 61.0  
##  1st Qu.: 92.0  
##  Median :119.0  
##  Mean   :120.4  
##  3rd Qu.:143.0  
##  Max.   :218.0
colnames(digimon) <- tolower(gsub("\\.", "_", colnames(digimon)))

head(move)
##              Move SP.Cost     Type Power Attribute Inheritable
## 1   Wolkenapalm I       3 Physical    65      Fire         Yes
## 2  Wolkenapalm II       6 Physical    85      Fire         Yes
## 3 Wolkenapalm III       9 Physical   105      Fire         Yes
## 4   Burst Flame I       3    Magic    55      Fire         Yes
## 5  Burst Flame II       6    Magic    75      Fire         Yes
## 6 Burst Flame III       9    Magic    95      Fire         Yes
##                                                  Description
## 1  Physical attack, 65 Fire damage to one foe. 95% accuracy.
## 2  Physical attack, 85 Fire damage to one foe. 95% accuracy.
## 3 Physical attack, 105 Fire damage to one foe. 95% accuracy.
## 4     Magic attack, 55 Fire damage to one foe. 95% accuracy.
## 5     Magic attack, 75 Fire damage to one foe. 95% accuracy.
## 6     Magic attack, 95 Fire damage to one foe. 95% accuracy.
str(move)
## 'data.frame':    387 obs. of  7 variables:
##  $ Move       : chr  "Wolkenapalm I" "Wolkenapalm II" "Wolkenapalm III" "Burst Flame I" ...
##  $ SP.Cost    : int  3 6 9 3 6 9 4 7 10 10 ...
##  $ Type       : chr  "Physical" "Physical" "Physical" "Magic" ...
##  $ Power      : int  65 85 105 55 75 95 30 45 75 30 ...
##  $ Attribute  : chr  "Fire" "Fire" "Fire" "Fire" ...
##  $ Inheritable: chr  "Yes" "Yes" "Yes" "Yes" ...
##  $ Description: chr  "Physical attack, 65 Fire damage to one foe. 95% accuracy." "Physical attack, 85 Fire damage to one foe. 95% accuracy." "Physical attack, 105 Fire damage to one foe. 95% accuracy." "Magic attack, 55 Fire damage to one foe. 95% accuracy." ...
summary(move)
##      Move              SP.Cost          Type               Power       
##  Length:387         Min.   : 0.00   Length:387         Min.   :  0.00  
##  Class :character   1st Qu.: 6.00   Class :character   1st Qu.: 20.00  
##  Mode  :character   Median :10.00   Mode  :character   Median : 65.00  
##                     Mean   :14.03                      Mean   : 60.18  
##                     3rd Qu.:20.00                      3rd Qu.: 95.00  
##                     Max.   :60.00                      Max.   :250.00  
##   Attribute         Inheritable        Description       
##  Length:387         Length:387         Length:387        
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
## 
colnames(move) <- tolower(gsub("\\.", "_", colnames(move)))

head(support)
##               Name
## 1    Adroit Wisdom
## 2      All-Rounder
## 3          Analyze
## 4 Animal Colosseum
## 5     Aus Generics
## 6   Backwater Camp
##                                                                 Description
## 1                                                     Increases INT by 15%.
## 2                                    Increases ATK, DEF, INT and SPD by 5%.
## 3                                             Increases scan values by 10%.
## 4                                Increases damage from Earth skills by 15%.
## 5                     Increases SPD and EVA by 25% when HP drops below 25%.
## 6 Increases damage given by 20%, but also increases damage received by 20%.
str(support)
## 'data.frame':    86 obs. of  2 variables:
##  $ Name       : chr  "Adroit Wisdom" "All-Rounder" "Analyze" "Animal Colosseum" ...
##  $ Description: chr  "Increases INT by 15%." "Increases ATK, DEF, INT and SPD by 5%." "Increases scan values by 10%." "Increases damage from Earth skills by 15%." ...
summary(support)
##      Name           Description       
##  Length:86          Length:86         
##  Class :character   Class :character  
##  Mode  :character   Mode  :character
colnames(support) <- tolower(gsub("\\.", "_", colnames(support)))
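The renaming rule used above only swaps periods for underscores and lowercases the result, so it preserves an inconsistency in the source headers: Lv.50.HP becomes lv_50_hp, while Lv50.SP becomes lv50_sp. A minimal sketch of the rule applied to a few of the original column names (the vector original_names exists only for illustration):

```r
# A few column names exactly as read.csv produced them
original_names <- c("Equip.Slots", "Lv.50.HP", "Lv50.SP", "SP.Cost")

# Same rule as above: periods become underscores, then lowercase everything
cleaned_names <- tolower(gsub("\\.", "_", original_names))

print(cleaned_names)
```

This is why later code refers to lv_50_hp for HP but lv50_sp for SP.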

Exploring Attack Ratios

In the first visualization, we will explore the attack ratios of different moves using a bar graph. We will use the dplyr package to calculate the attack ratios and the ggplot2 package to create the graph.

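move_ratios <- move %>%
  # sp_cost has a minimum of 0, so drop zero-cost moves to avoid Inf/NaN ratios
  filter(sp_cost > 0) %>%
  mutate(attack_ratio = power / sp_cost)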

move_ratios_sorted <- move_ratios %>%
  arrange(desc(attack_ratio))

top_moves <- move_ratios_sorted %>%
  head(10)

ggplot(top_moves, aes(x = move, y = attack_ratio)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle("Top 10 Moves with Highest Attack Ratio") +
  xlab("Move") +
  ylab("Attack Power to SP Ratio")

The graph shows the top 10 moves with the highest attack power to SP ratio.
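One caveat when ranking by this ratio: the summary of the move data shows that SP cost has a minimum of 0, and dividing by zero in R gives Inf (or NaN for 0/0) rather than an error. The sketch below uses a small made-up data frame (the move names and values are hypothetical) to show the effect and one way to exclude such rows before ranking:

```r
# Hypothetical moves; two of them cost no SP
moves <- data.frame(
  move    = c("A", "B", "C", "D"),
  power   = c(65, 0, 30, 95),
  sp_cost = c(3, 0, 0, 9),
  stringsAsFactors = FALSE
)

# Division by zero does not fail: 0/0 is NaN and 30/0 is Inf
moves$ratio <- moves$power / moves$sp_cost

# Keep only moves with a positive SP cost, then rank by ratio (highest first)
valid  <- moves[moves$sp_cost > 0, ]
ranked <- valid[order(-valid$ratio), ]
print(ranked)
```

Whether zero-cost moves should be dropped or reported separately is a judgment call; the point is only that they should not silently top the ranking.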

Exploring Power-to-SP Ratios

In the second visualization, we extend the ranking to the top 20 moves and color each bar by move type. This time the ratio is computed with base R, and the plot is again drawn with ggplot2.

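move$power_to_sp <- move$power / move$sp_cost

# Rank only moves with a positive SP cost; zero-cost moves give Inf or NaN ratios
ranked_moves <- move[move$sp_cost > 0, ]
top_moves <- ranked_moves[order(-ranked_moves$power_to_sp), ][1:20, ]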

ggplot(top_moves, aes(x=move, y=power_to_sp, fill=type)) +
  geom_bar(stat="identity", position="dodge") +
  theme(axis.text.x = element_text(angle=90, hjust=1)) +
  ggtitle("Top 20 moves by attack power to SP ratio") +
  xlab("Move") + ylab("Attack Power to SP Ratio") +
  scale_fill_brewer(palette="Set1")

The graph shows the top 20 moves with the highest attack power to SP ratio, with each bar colored by move type.
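Note that ggplot2 places a character x variable in alphabetical order, so sorting the data frame beforehand does not change the order of the bars. One common approach is to turn the move name into a factor whose levels follow the ratio, for example with reorder() from base R. A minimal sketch on made-up values (the data frame toy is hypothetical):

```r
# Hypothetical moves with already-computed ratios
toy <- data.frame(
  move  = c("Beta", "Alpha", "Gamma"),
  ratio = c(10, 30, 20),
  stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by the second argument;
# negating the ratio puts the largest value first
toy$move <- reorder(toy$move, -toy$ratio)

print(levels(toy$move))
```

In the plots above, the same idea could be applied inline, for instance aes(x = reorder(move, -power_to_sp), y = power_to_sp).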

Exploring Digimon Types

In this section, we will explore the distribution of Digimon by type using a bar chart. To do this, we will be using the dplyr and ggplot2 packages.

First, we will generate a per-Digimon summary of level-50 attack and defense. Since the dataset lists one row per Digimon, each sum is simply that Digimon’s own value.

team_summary <- digimon %>%
  group_by(digimon) %>%
  summarise(attack = sum(`lv50_atk`), defense = sum(`lv50_def`))

Next, we will count the number of Digimon per type.

type_count <- digimon %>% 
  count(type)

We will then create a bar chart that shows the number of Digimon per type.

ggplot(type_count, aes(x = type, y = n, fill = type)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  xlab("Digimon Type") +
  ylab("Number of Digimon") +
  ggtitle("Number of Digimon per Type")

Next, we count the Digimon in each stage and show the distribution as a bar chart.

stage_summary <- digimon %>%
  count(stage)

stage_summary
##         stage  n
## 1       Armor  3
## 2        Baby  5
## 3    Champion 54
## 4 In-Training 11
## 5        Mega 74
## 6      Rookie 38
## 7    Ultimate 58
## 8       Ultra  6
ggplot(stage_summary, aes(x = stage, y = n)) +
  geom_bar(stat = "identity", fill = "green", alpha = 0.7) +
  labs(title = "Distribution of Digimon by Stage",
       x = "Stage",
       y = "Number of Digimon") +
  theme_minimal()
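The stages in the table and chart above appear in alphabetical order, which does not follow how Digimon develop. If a progression order is preferred, the stage column can be converted to a factor with explicit levels. The sketch below re-enters the counts from the output above; the order in stage_order is an assumption based on the usual Digimon progression and is not stated in the dataset, with Armor placed last arbitrarily:

```r
# Counts copied from the stage_summary output above
stage_summary <- data.frame(
  stage = c("Armor", "Baby", "Champion", "In-Training",
            "Mega", "Rookie", "Ultimate", "Ultra"),
  n = c(3, 5, 54, 11, 74, 38, 58, 6),
  stringsAsFactors = FALSE
)

# Assumed progression order (an assumption, not part of the dataset)
stage_order <- c("Baby", "In-Training", "Rookie", "Champion",
                 "Ultimate", "Mega", "Ultra", "Armor")

# Factor levels control both sort order and the axis order in ggplot2
stage_summary$stage <- factor(stage_summary$stage, levels = stage_order)
stage_summary <- stage_summary[order(stage_summary$stage), ]
print(stage_summary)
```

Passing this version of stage_summary to the same ggplot call would draw the bars in the order given by stage_order.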

We can also explore the distribution of Digimon by attribute. To do this, we will use a bar chart.

table(digimon$attribute)
## 
##     Dark    Earth Electric     Fire    Light  Neutral    Plant    Water 
##       37       24       25       33       29       28       25       24 
##     Wind 
##       24
ggplot(digimon, aes(x = attribute)) +
  geom_bar(fill = "blue", alpha = 0.7) +
  labs(title = "Distribution of Digimon by Attribute",
       x = "Attribute",
       y = "Number of Digimon") +
  theme_minimal()

Next, we will explore the tradeoff between health (HP) and energy (SP) using a scatter plot.

ggplot(digimon, aes(x = lv_50_hp, y = lv50_sp)) +
  geom_point() +
  labs(title = "Tradeoff between HP and SP",
       x = "HP",
       y = "SP")

Each point in the plot represents a single Digimon, positioned by its HP (x-axis) and SP (y-axis) at level 50. The plot helps to identify patterns in how HP and SP relate across the Digimon in the game.
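To put a number on the relationship shown in the scatter plot, a Pearson correlation is a simple starting point. The sketch below uses only the six rows printed by head(digimon) earlier, so it illustrates the call rather than giving a result for the full dataset:

```r
# HP and SP at level 50 for the six Digimon shown by head(digimon)
hp <- c(590, 950, 870, 690, 540, 940)
sp <- c(77, 62, 50, 68, 98, 52)

# Pearson correlation: values near -1 suggest a tradeoff, values near +1 the opposite
r <- cor(hp, sp)
print(r)
```

On the full data, the equivalent call would be cor(digimon$lv_50_hp, digimon$lv50_sp).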

Finally, we can explore the distribution of Digimon types across different stages. To do this, we generate a summary of the number of Digimon per type and stage, and create a stacked bar chart.

type_stage_summary <- digimon %>%
  group_by(stage, type) %>%
  # Keep the stage grouping so that prop is computed within each stage
  summarise(n = n(), .groups = "drop_last") %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()
type_stage_summary2 <- as.data.frame(type_stage_summary)

ggplot(type_stage_summary, aes(x = stage, y = prop, fill = type)) +
  geom_bar(stat = "identity") +
  labs(title = "Distribution of Digimon Types by Stage",
       x = "Stage",
       y = "Proportion of Digimon")

This graph shows the distribution of Digimon types across different stages. The horizontal axis represents the stage of the Digimon, while the vertical axis shows the proportion of each type of Digimon in each stage. The height of each bar represents the proportion of a specific type of Digimon in a particular stage.

To create this graph, a summary of the number of Digimon by type and stage was first generated using the group_by and summarise functions of the dplyr package. Then, ggplot2 was used to create the stacked bar chart. Each bar is composed of different colors that represent the different types of Digimon. The colors are automatically assigned to each type by ggplot2.
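Whether the proportions are relative to each stage or to the whole dataset depends on how the grouping is handled before dividing. As a cross-check in base R, prop.table() with margin = 1 computes within-row shares of a contingency table. A minimal sketch on made-up rows (the data frame toy and its values are hypothetical):

```r
# Hypothetical mini-dataset; stage and type values are illustrative only
toy <- data.frame(
  stage = c("Rookie", "Rookie", "Rookie", "Mega", "Mega"),
  type  = c("Vaccine", "Virus", "Virus", "Data", "Vaccine"),
  stringsAsFactors = FALSE
)

# Rows are stages, columns are types
counts <- table(toy$stage, toy$type)

# margin = 1 divides each row (stage) by its own total
within_stage <- prop.table(counts, margin = 1)
print(within_stage)
```

With within-stage proportions, every bar in a stacked chart reaches a height of 1, which makes stages with very different counts directly comparable.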

In summary, this graph allows us to visualize how the different types of Digimon are distributed across the stages of the game. We can see that some types are more common in certain stages than in others. For example, Vaccine-type Digimon are more common in stages 3 and 4, while Virus-type Digimon are more common in stage 5. This can be useful for those who want to build a balanced Digimon team based on the stages of the game.