Data 110 Project 2

Author

Shadeja Fuentes

Digimon 2 Adventures; 2000

Digimon : Digital Monsters

The Digimon database is a database of Digimon and their moves. The information it contains is based on “Digimon Story: Cyber Sleuth”, a video game released for PlayStation 4 in 2016.

First, set the working directory, then load the CSV file from the Digimon database.

knitr::opts_knit$set(root.dir ="C:/Users/funne/OneDrive/Desktop/DataSets/Digimon")
library(readr)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.1     ✔ purrr     1.0.1
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
library(plotly)

library(circlize)
digimon_list <- read_csv("DigiDB_digimonlist.csv")
Rows: 249 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): Digimon, Stage, Type, Attribute
dbl (9): Number, Memory, Equip Slots, Lv 50 HP, Lv50 SP, Lv50 Atk, Lv50 Def,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let's investigate digimon_list. It provides the name, stage, type, and attribute of each Digimon, along with its memory, equip slots, and level 50 stats.

head(digimon_list)
# A tibble: 6 × 13
  Number Digimon Stage Type  Attribute Memory `Equip Slots` `Lv 50 HP` `Lv50 SP`
   <dbl> <chr>   <chr> <chr> <chr>      <dbl>         <dbl>      <dbl>     <dbl>
1      1 Kuramon Baby  Free  Neutral        2             0        590        77
2      2 Pabumon Baby  Free  Neutral        2             0        950        62
3      3 Punimon Baby  Free  Neutral        2             0        870        50
4      4 Botamon Baby  Free  Neutral        2             0        690        68
5      5 Poyomon Baby  Free  Neutral        2             0        540        98
6      6 Koromon In-T… Free  Fire           3             0        940        52
# ℹ 4 more variables: `Lv50 Atk` <dbl>, `Lv50 Def` <dbl>, `Lv50 Int` <dbl>,
#   `Lv50 Spd` <dbl>

Rename the columns to short, lowercase names.

colnames(digimon_list) <- c("id", "name", "stage", "type", "attribute", "memory", "equip_slots", "hp", "sp", "atk", "def", "int", "spd")
colnames(digimon_list)
 [1] "id"          "name"        "stage"       "type"        "attribute"  
 [6] "memory"      "equip_slots" "hp"          "sp"          "atk"        
[11] "def"         "int"         "spd"        

Digimon evolution

Exploring memory, hp, and attribute:

“Type” is one of four categories (Vaccine, Virus, Data, or Free), while “attribute” is one of 9 different elements across 3 cycles (Dark, Earth, Electric, Fire, Light, Neutral, Plant, Water, or Wind). Together, type and attribute determine how effective an attack will be against a target Digimon.

Design a custom color palette to identify the attributes by color.

attribute<-c('Dark','Earth','Electric','Fire','Light','Neutral','Plant','Water','Wind')
color<-c("Black", "#B15928", "Yellow", "#E41A1C", "#FFD92F", "#E0E0E0" ,"#33A02C", "#377EB8", "#80B1D3")
my_colors<-data.frame(attribute,color)
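
Because scale_color_manual() and scale_fill_manual() match an unnamed vector to factor levels by position, the palette above only works while color stays in the same alphabetical order as the attribute levels. A minimal sketch of a name-keyed alternative follows; the name attribute_palette is an illustrative assumption, not part of the original analysis.

```r
# Attribute names and colors as defined above.
attribute <- c("Dark", "Earth", "Electric", "Fire", "Light",
               "Neutral", "Plant", "Water", "Wind")
color <- c("Black", "#B15928", "Yellow", "#E41A1C", "#FFD92F",
           "#E0E0E0", "#33A02C", "#377EB8", "#80B1D3")

# Hypothetical helper object: a named character vector keyed by attribute.
attribute_palette <- setNames(color, attribute)

# Lookup by name does not depend on the order of the vector.
attribute_palette[["Fire"]]
attribute_palette[c("Water", "Dark")]
```

Passing values = attribute_palette to scale_color_manual() would then match colors to attributes by name rather than by position.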

Convert attribute, stage, and type to factors.

digimon_list$attribute <- factor(digimon_list$attribute)
digimon_list$type <- factor(digimon_list$type)
digimon_list$stage <- factor(digimon_list$stage)
head(digimon_list)
# A tibble: 6 × 13
     id name    stage type  attribute memory equip_slots    hp    sp   atk   def
  <dbl> <chr>   <fct> <fct> <fct>      <dbl>       <dbl> <dbl> <dbl> <dbl> <dbl>
1     1 Kuramon Baby  Free  Neutral        2           0   590    77    79    69
2     2 Pabumon Baby  Free  Neutral        2           0   950    62    76    76
3     3 Punimon Baby  Free  Neutral        2           0   870    50    97    87
4     4 Botamon Baby  Free  Neutral        2           0   690    68    77    95
5     5 Poyomon Baby  Free  Neutral        2           0   540    98    54    59
6     6 Koromon In-T… Free  Fire           3           0   940    52   109    93
# ℹ 2 more variables: int <dbl>, spd <dbl>

For an attack to be effective, the attribute of the move must have an advantage over the attribute of the target Digimon. Type matters as well: choosing a Digimon whose type has an advantage over the target's type yields greater damage to your opponent. To determine the ideal move to carry (assuming just one move), we can look at how attributes are distributed across hit points (hp) and memory.

Create a scatterplot from the digimon_list data frame with hp on the x-axis and memory on the y-axis, coloring each point by attribute. Use the colors saved in the my_colors data frame.

plot1 <- digimon_list %>%
  ggplot(aes(hp, memory, color = attribute)) +
  geom_point(alpha = 0.5) +
  # Name the palette so colors are matched to attributes by name, not position
  scale_color_manual(values = setNames(my_colors$color, my_colors$attribute)) +
  xlim(500, 2100) + ylim(0, 25) +
  # Memory is not a level 50 stat in the source data, so only HP is labeled that way
  xlab("Level 50 HP") + ylab("Memory") +
  ggtitle("Attribute by Memory and Hit Points")
plot1<- ggplotly(plot1)
plot1

Create a second scatterplot from digimon_list with the same axes, this time coloring each point by type. The four types get their own four-color palette rather than the attribute colors in my_colors.

plot2 <- digimon_list %>%
  ggplot(aes(hp, memory, color = type)) +
  geom_point(alpha = 0.5) +
  # Colors follow the alphabetical factor levels: Data, Free, Vaccine, Virus
  scale_color_manual(values = c("#80B1D3", "#E41A1C", "#33A02C", "#B15928")) +
  xlim(500, 2100) + ylim(0, 25) +
  xlab("Level 50 HP") + ylab("Memory") +
  ggtitle("Type by Memory and Hit Points")
plot2 <- ggplotly(plot2)
plot2

Observations from plot1 and plot2:

There appears to be a weak positive correlation between hp and memory: as the level 50 HP of a Digimon increases, its memory tends to increase as well, but the relationship is not very strong.

Free is the rarest type (37 of 249 Digimon), so it contributes the fewest points to plot2, and those points tend to sit at lower hp and memory values. Virus is the most common type (82 Digimon).

Vaccine is the second most common type (70 Digimon), and its points tend to sit at higher hp and memory than those of the Free type. In plot1, the data points for the Dark attribute have lower hp and memory by comparison.
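
The correlation described above can be quantified with cor(). The sketch below is only an illustration: the six rows are the ones printed by head(digimon_list) earlier, so the resulting number says nothing about the full data set. The names sample_rows and r_spearman are assumptions. Spearman's rank correlation is chosen here because memory takes only a few discrete values; on the full data one would pass digimon_list$hp and digimon_list$memory instead.

```r
# Six rows copied from the head(digimon_list) output shown earlier.
sample_rows <- data.frame(
  name   = c("Kuramon", "Pabumon", "Punimon", "Botamon", "Poyomon", "Koromon"),
  hp     = c(590, 950, 870, 690, 540, 940),
  memory = c(2, 2, 2, 2, 2, 3)
)

# Rank-based correlation between hp and memory; always lies in [-1, 1].
r_spearman <- cor(sample_rows$hp, sample_rows$memory, method = "spearman")
r_spearman
```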

Tai Kamiya; Greymon; Garurumon

Exploring commonality of attribute and type:

Group digimon_list by type and attribute and count the rows in each group, then create the factor ordered_type to fix the display order of the types: Vaccine, Virus, Data, and Free. Finally, merge type_attr with the colors in the my_colors data frame.

type_attr <- digimon_list %>%
  group_by(type, attribute) %>%
  summarize(count = n(), .groups = "drop") %>%  # drop the grouping explicitly
  as.data.frame()
type_attr$ordered_type <- factor(type_attr$type, levels = c('Vaccine','Virus','Data','Free'))
type_attr<-merge(type_attr,my_colors, by= "attribute")

With the new data frame type_attr, show the share of each attribute within each type as a stacked bar chart. Move the legend to the top to keep the labels readable.

plot3 <- type_attr %>% ggplot(aes(x = ordered_type, y = count)) +
  xlab("Type") + ylab("Proportion") +
  # position = 'fill' rescales each bar to 1, so the y-axis is a proportion
  geom_bar(aes(fill = attribute), position = 'fill', color = 'grey', stat = 'identity') +
  scale_fill_manual(name = '', values = color) +
  ggtitle('Digimon Attributes by Type') +
  # keep the x-axis labels so each bar can be matched to its type
  theme(legend.position = 'top')
plot3 <- ggplotly(plot3)
plot3

Observations from plot3:

Based on the plot and the underlying counts, the most common attributes within each type are:

  • For the Vaccine type, the most common attribute is Light (20), followed by Fire (11) and then Wind (10).

  • For the Data type, the most common attribute is Plant (10), followed by Fire and Neutral (8 each).

  • For the Virus type, the most common attribute is Dark (26), followed by Electric (13) and then Fire (11).

  • For the Free type, the most common attribute is Neutral (11), followed by Dark (6) and then Wind (5).

Overall, Dark is concentrated in the Virus type (26 of 37 Dark Digimon) and Light in the Vaccine type (20 of 29 Light Digimon), while the Data type is spread fairly evenly across attributes.
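
The shares that plot3 draws can also be read off numerically. A minimal sketch, assuming the counts from the table(digimon_list$type, digimon_list$attribute) output in the chi-squared section are typed in by hand; the object names type_attr_counts, row_share, and top_attribute are illustrative.

```r
# Type-by-attribute counts, copied from the table() output in this report.
type_attr_counts <- matrix(
  c( 3, 7,  6,  8,  5,  8, 10, 6,  7,   # Data
     6, 4,  0,  3,  2, 11,  4, 2,  5,   # Free
     2, 4,  6, 11, 20,  5,  4, 8, 10,   # Vaccine
    26, 9, 13, 11,  2,  4,  7, 8,  2),  # Virus
  nrow = 4, byrow = TRUE,
  dimnames = list(
    type = c("Data", "Free", "Vaccine", "Virus"),
    attribute = c("Dark", "Earth", "Electric", "Fire", "Light",
                  "Neutral", "Plant", "Water", "Wind")
  )
)

# Number of Digimon per type.
rowSums(type_attr_counts)

# Share of each attribute within a type (each row sums to 1),
# which is what position = 'fill' displays.
row_share <- prop.table(type_attr_counts, margin = 1)
round(row_share, 2)

# Most frequent attribute for each type.
top_attribute <- colnames(type_attr_counts)[apply(type_attr_counts, 1, which.max)]
names(top_attribute) <- rownames(type_attr_counts)
top_attribute
```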

Chi squared test:

The chisq.test() function performs a chi-squared test of independence, which determines whether there is a significant association between type and attribute.

table(digimon_list$type,digimon_list$attribute) 
         
          Dark Earth Electric Fire Light Neutral Plant Water Wind
  Data       3     7        6    8     5       8    10     6    7
  Free       6     4        0    3     2      11     4     2    5
  Vaccine    2     4        6   11    20       5     4     8   10
  Virus     26     9       13   11     2       4     7     8    2
chisq.test(digimon_list$type,digimon_list$attribute)
Warning in chisq.test(digimon_list$type, digimon_list$attribute): Chi-squared
approximation may be incorrect

    Pearson's Chi-squared test

data:  digimon_list$type and digimon_list$attribute
X-squared = 88.03, df = 24, p-value = 3.046e-09

Observations of chi squared test:

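The output shows that there are 24 degrees of freedom for this test. For a test of independence, the degrees of freedom are (rows - 1) × (columns - 1), here (4 - 1) × (9 - 1) = 24. The p-value of 3.046e-09 is very small, which is strong evidence of an association between type and attribute: certain types of Digimon are more likely to have certain attributes. R warns that the chi-squared approximation may be incorrect because some cells have small expected counts (the Free type, with only 37 Digimon, is spread thinly across nine attributes), so the exact p-value should be read with some caution.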
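
R's warning that the chi-squared approximation may be incorrect usually points to small expected cell counts. The sketch below rebuilds the contingency table from the printed counts, inspects the expected counts, and reruns the test with a Monte Carlo p-value via the simulate.p.value argument of chisq.test(). The object names, the seed, and the choice of B = 10000 replicates are assumptions made for illustration.

```r
# Type-by-attribute counts, copied from the table() output above.
type_attr_counts <- matrix(
  c( 3, 7,  6,  8,  5,  8, 10, 6,  7,   # Data
     6, 4,  0,  3,  2, 11,  4, 2,  5,   # Free
     2, 4,  6, 11, 20,  5,  4, 8, 10,   # Vaccine
    26, 9, 13, 11,  2,  4,  7, 8,  2),  # Virus
  nrow = 4, byrow = TRUE,
  dimnames = list(
    type = c("Data", "Free", "Vaccine", "Virus"),
    attribute = c("Dark", "Earth", "Electric", "Fire", "Light",
                  "Neutral", "Plant", "Water", "Wind")
  )
)

# Asymptotic test; suppressWarnings() hides the approximation warning.
asymptotic <- suppressWarnings(chisq.test(type_attr_counts))
asymptotic$statistic
asymptotic$parameter   # degrees of freedom: (4 - 1) * (9 - 1)

# Expected counts under independence; values below 5 trigger the warning.
min(asymptotic$expected)
sum(asymptotic$expected < 5)

# Monte Carlo p-value, which does not rely on the chi-squared approximation.
set.seed(110)  # arbitrary seed, chosen only for reproducibility
simulated <- chisq.test(type_attr_counts, simulate.p.value = TRUE, B = 10000)
simulated$p.value
```

With B replicates the simulated p-value cannot fall below 1 / (B + 1), so it is a coarser figure than the asymptotic one and serves mainly as a check on the conclusion.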

Digimon Characters

Exploring Attributes vs. stages

Create a new data frame called stage_attr with the frequency count of each combination of stage and attribute in the digimon_list data frame. Group digimon_list by stage and attribute then summarize the counts.

stage_attr <- digimon_list %>%
  group_by(stage, attribute) %>%
  summarize(count = n(), .groups = "drop")  # drop the grouping explicitly
stage_attr$ordered_stage <- factor(stage_attr$stage, levels = c('Baby','In-Training','Rookie','Champion','Ultimate','Mega','Ultra','Armor'))
stage_attr<-merge(stage_attr,my_colors,by='attribute')

Use the newly created data frame to create an interactive bar chart that shows how the stages are distributed within each attribute.

plot4 <- plot_ly(stage_attr, x = ~attribute, y = ~count, color = ~stage, colors = "PuOr",
                 type = "bar") %>%  # set the trace type explicitly
  layout(title = "Stage Attribute Distribution",
         xaxis = list(title = "Attribute"),
         yaxis = list(title = "Count"),
         legend = list(title = list(text = "Stage"), orientation = "h", y = -0.2))
plot4

Observations from plot4:

  • The “Mega” stage has the largest number of Digimon with most attributes, followed by “Ultimate” and “Champion” stages.

  • The “Rookie” stage has the largest number of Digimon with the “Plant” attribute, followed by “Champion” stage.

  • The “Electric” attribute is most common among “Mega” stage Digimon, while the “Earth” attribute is most common among “Champion” stage Digimon.

  • The “Light” attribute is most common among “Mega” and “Ultimate” stage Digimon, while the “Fire” attribute is most common among “Champion” stage Digimon.

  • The “Neutral” attribute is distributed more evenly across different stages, but is most common among “Mega” and “Ultimate” stage Digimon.

Attributes vs Stages continued

Create a chord diagram to visualize the relationships between the attributes and stages of the Digimon. Attributes and stages are drawn as sectors around the same circle: attribute sectors use the attribute colors and stage sectors are grey. Each chord links an attribute to a stage in which it occurs and takes the color of its attribute, so the light grey chords belong to the Neutral attribute. Because only the attribute and stage columns are passed to chordDiagram(), every chord has the same width (value1 and value2 are 1 in the output below); chord thickness therefore shows which combinations exist, not how often they occur.

plot5 <- chordDiagram(
  stage_attr[,c(1:2)],
  transparency = 0.15, 
  grid.col = append(color,rep('grey',8)), 
  col= as.character(stage_attr$color))
title(main = "Attributes vs Stages")

plot5
         rn          cn value1 value2 o1 o2 x1 x2       col
1      Dark In-Training      1      1  6  8  6  8 #000000D8
2      Dark    Champion      1      1  5  9  5  9 #000000D8
3      Dark       Ultra      1      1  4  3  4  3 #000000D8
4      Dark        Mega      1      1  3  9  3  9 #000000D8
5      Dark    Ultimate      1      1  2  9  2  9 #000000D8
6      Dark      Rookie      1      1  1  9  1  9 #000000D8
7     Earth       Armor      1      1  1  3  1  3 #B15928D8
8     Earth In-Training      1      1  5  7  6  7 #B15928D8
9     Earth    Ultimate      1      1  3  8  3  8 #B15928D8
10    Earth    Champion      1      1  6  8  5  8 #B15928D8
11    Earth      Rookie      1      1  4  8  2  8 #B15928D8
12    Earth        Mega      1      1  2  8  4  8 #B15928D8
13 Electric    Ultimate      1      1  3  7  2  7 #FFFF00D8
14 Electric    Champion      1      1  1  7  4  7 #FFFF00D8
15 Electric      Rookie      1      1  4  7  1  7 #FFFF00D8
16 Electric        Mega      1      1  2  7  3  7 #FFFF00D8
17     Fire    Ultimate      1      1  6  6  3  6 #E41A1CD8
18     Fire      Rookie      1      1  2  6  2  6 #E41A1CD8
19     Fire    Champion      1      1  1  6  5  6 #E41A1CD8
20     Fire In-Training      1      1  5  6  6  6 #E41A1CD8
21     Fire        Mega      1      1  3  6  4  6 #E41A1CD8
22     Fire       Armor      1      1  4  2  1  2 #E41A1CD8
23    Light    Ultimate      1      1  6  5  3  5 #FFD92FD8
24    Light    Champion      1      1  3  5  6  5 #FFD92FD8
25    Light      Rookie      1      1  1  5  2  5 #FFD92FD8
26    Light       Ultra      1      1  5  2  5  2 #FFD92FD8
27    Light        Mega      1      1  4  5  4  5 #FFD92FD8
28    Light       Armor      1      1  2  1  1  1 #FFD92FD8
29    Light In-Training      1      1  7  5  7  5 #FFD92FD8
30  Neutral        Mega      1      1  6  4  4  4 #E0E0E0D8
31  Neutral    Champion      1      1  7  4  6  4 #E0E0E0D8
32  Neutral In-Training      1      1  4  4  7  4 #E0E0E0D8
33  Neutral    Ultimate      1      1  1  4  3  4 #E0E0E0D8
34  Neutral       Ultra      1      1  5  1  5  1 #E0E0E0D8
35  Neutral        Baby      1      1  2  1  1  1 #E0E0E0D8
36  Neutral      Rookie      1      1  3  4  2  4 #E0E0E0D8
37    Plant      Rookie      1      1  1  3  1  3 #33A02CD8
38    Plant    Ultimate      1      1  2  3  2  3 #33A02CD8
39    Plant    Champion      1      1  4  3  4  3 #33A02CD8
40    Plant        Mega      1      1  3  3  3  3 #33A02CD8
41    Plant In-Training      1      1  5  3  5  3 #33A02CD8
42    Water    Ultimate      1      1  3  2  2  2 #377EB8D8
43    Water    Champion      1      1  1  2  4  2 #377EB8D8
44    Water      Rookie      1      1  4  2  1  2 #377EB8D8
45    Water        Mega      1      1  2  2  3  2 #377EB8D8
46    Water In-Training      1      1  5  2  5  2 #377EB8D8
47     Wind      Rookie      1      1  1  1  1  1 #80B1D3D8
48     Wind        Mega      1      1  3  1  3  1 #80B1D3D8
49     Wind    Ultimate      1      1  2  1  2  1 #80B1D3D8
50     Wind    Champion      1      1  4  1  4  1 #80B1D3D8
51     Wind In-Training      1      1  5  1  5  1 #80B1D3D8
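
In the call above only the attribute and stage columns are passed, so chordDiagram() gives every link a value of 1, as the value1 and value2 columns of the printed table show. When the data frame has a third numeric column, circlize uses it as the link width. The sketch below shows that input shape with a small made-up data frame; the counts in toy_links are placeholders, not values from the Digimon data, and the object names are assumptions.

```r
# Made-up counts for illustration only; NOT taken from the Digimon data.
toy_links <- data.frame(
  attribute = c("Fire", "Fire", "Water", "Water"),
  stage     = c("Rookie", "Mega", "Rookie", "Mega"),
  count     = c(2, 5, 4, 1)
)

# Hypothetical colors: attributes keep their palette color, stages are grey.
sector_colors <- c(Fire = "#E41A1C", Water = "#377EB8",
                   Rookie = "grey", Mega = "grey")

# One color per link, taken from the attribute at the start of the link.
link_colors <- unname(sector_colors[toy_links$attribute])

if (requireNamespace("circlize", quietly = TRUE)) {
  # The third column (count) sets the width of each chord.
  circlize::chordDiagram(
    toy_links,
    grid.col = sector_colors,
    col = link_colors,
    transparency = 0.15
  )
  circlize::circos.clear()  # reset the circular layout after plotting
}
```

Applied to this report, the equivalent input would be the attribute, stage, and count columns of stage_attr.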

Conclusion

The attribute scatterplot shows that the majority of Digimon have a memory of 8 or 12, with very few having memory values between 16 and 24. It also suggests that Digimon with the Fire attribute tend to have higher hit points than those with other attributes. The type counts show that Virus and Vaccine are the most common types (82 and 70 of 249 Digimon), and the type scatterplot suggests that Virus-type Digimon tend to have higher hit points than the other types.

Digimon with the Light attribute and Vaccine type tend to have higher HP and memory than other attributes and types. It may be advantageous to choose such Digimon for battles, as they may have a greater chance of success. More broadly, the scatterplots show how Digimon are distributed by attribute and type, which could be useful for players trying to build a balanced team for battles.