knitr::opts_knit$set(root.dir ="C:/Users/funne/OneDrive/Desktop/DataSets/Digimon")
Data 110 Project 2
Digimon : Digital Monsters
Digimon, short for “Digital Monsters,” is a Japanese multimedia franchise created by Akiyoshi Hongo. The Digimon anime series premiered in Japan in 1999, and quickly gained popularity around the world. It follows a group of young children who, after being transported to a digital world, befriend and partner with digital monsters called Digimon. Together, they battle evil forces and protect both the digital and human worlds. The series has had several iterations, with different storylines and characters, and has spawned numerous movies and spin-offs.
The Digimon trading card game was first introduced in Japan in 1999 and was based on the anime series. It was later released in other countries, including the United States, and has continued to be popular among fans of the franchise. Players use a deck of cards featuring different Digimon characters and abilities to battle against each other. The game has undergone several updates and revisions over the years, and continues to be enjoyed by fans of all ages.
The Digimon database is a database of Digimon and their moves. The information it contains is based on “Digimon Story: Cyber Sleuth,” a video game released for PlayStation 4 in 2016.
First, set the working directory, then load the csv file from the Digimon Database.
library(readr)
Warning: package 'readr' was built under R version 4.2.3
library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.2.3
Warning: package 'ggplot2' was built under R version 4.2.3
Warning: package 'tibble' was built under R version 4.2.3
Warning: package 'dplyr' was built under R version 4.2.3
Warning: package 'lubridate' was built under R version 4.2.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.1 ✔ purrr 1.0.1
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.2 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
library(plotly)
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
library(circlize)
Warning: package 'circlize' was built under R version 4.2.3
========================================
circlize version 0.4.15
CRAN page: https://cran.r-project.org/package=circlize
Github page: https://github.com/jokergoo/circlize
Documentation: https://jokergoo.github.io/circlize_book/book/
If you use it in published research, please cite:
Gu, Z. circlize implements and enhances circular visualization
in R. Bioinformatics 2014.
This message can be suppressed by:
suppressPackageStartupMessages(library(circlize))
========================================
digimon_list <- read_csv("DigiDB_digimonlist.csv")
Rows: 249 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): Digimon, Stage, Type, Attribute
dbl (9): Number, Memory, Equip Slots, Lv 50 HP, Lv50 SP, Lv50 Atk, Lv50 Def,...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Let's investigate digimon_list. It provides the name, stage, type, and attribute of each Digimon, along with its memory, equip slots, and level 50 stats.
head(digimon_list)
# A tibble: 6 × 13
Number Digimon Stage Type Attribute Memory `Equip Slots` `Lv 50 HP` `Lv50 SP`
<dbl> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 Kuramon Baby Free Neutral 2 0 590 77
2 2 Pabumon Baby Free Neutral 2 0 950 62
3 3 Punimon Baby Free Neutral 2 0 870 50
4 4 Botamon Baby Free Neutral 2 0 690 68
5 5 Poyomon Baby Free Neutral 2 0 540 98
6 6 Koromon In-T… Free Fire 3 0 940 52
# ℹ 4 more variables: `Lv50 Atk` <dbl>, `Lv50 Def` <dbl>, `Lv50 Int` <dbl>,
# `Lv50 Spd` <dbl>
Change column names.
colnames(digimon_list) <- c("id", "name", "stage", "type", "attribute", "memory", "equip_slots", "hp", "sp", "atk", "def", "int", "spd")
colnames(digimon_list)
[1] "id" "name" "stage" "type" "attribute"
[6] "memory" "equip_slots" "hp" "sp" "atk"
[11] "def" "int" "spd"
Exploring memory, hp, and attribute:
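“Attributes” are the elemental file types that constitute Digimon. In this dataset, attribute is one of 9 different elements across 3 cycles (Dark, Earth, Electric, Fire, Light, Neutral, Plant, Water, Wind), while type is one of four categories (Data, Free, Vaccine, Virus). Attribute and type together determine how effective an attack will be against a given target Digimon.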
Design a custom color palette to identify the attributes by color.
attribute<-c('Dark','Earth','Electric','Fire','Light','Neutral','Plant','Water','Wind')
color<-c("Black", "#B15928", "Yellow", "#E41A1C", "#FFD92F", "#E0E0E0" ,"#33A02C", "#377EB8", "#80B1D3")
my_colors<-data.frame(attribute,color)
Convert attribute, stage, and type to factors.
digimon_list$attribute <- factor(digimon_list$attribute)
digimon_list$type <- factor(digimon_list$type)
digimon_list$stage <- factor(digimon_list$stage)
head(digimon_list)
# A tibble: 6 × 13
id name stage type attribute memory equip_slots hp sp atk def
<dbl> <chr> <fct> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 Kuramon Baby Free Neutral 2 0 590 77 79 69
2 2 Pabumon Baby Free Neutral 2 0 950 62 76 76
3 3 Punimon Baby Free Neutral 2 0 870 50 97 87
4 4 Botamon Baby Free Neutral 2 0 690 68 77 95
5 5 Poyomon Baby Free Neutral 2 0 540 98 54 59
6 6 Koromon In-T… Free Fire 3 0 940 52 109 93
# ℹ 2 more variables: int <dbl>, spd <dbl>
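The palette above works because the colors are listed in the same alphabetical order as the factor levels of attribute. A minimal sketch of a more defensive variant, assuming the same nine attributes and colors: a named vector ties each color to its attribute explicitly. The name attribute_palette is introduced here for illustration and does not appear in the original analysis.

```r
# Same attributes and colors as the my_colors data frame in this report.
attribute <- c("Dark", "Earth", "Electric", "Fire", "Light",
               "Neutral", "Plant", "Water", "Wind")
color <- c("Black", "#B15928", "Yellow", "#E41A1C", "#FFD92F",
           "#E0E0E0", "#33A02C", "#377EB8", "#80B1D3")

# attribute_palette is a hypothetical helper name: a named character vector
# in which each color is looked up by attribute name, not by position.
attribute_palette <- setNames(color, attribute)

# Look up a single attribute's color by name.
attribute_palette[["Fire"]]
```

A named vector like this can be passed as the values argument of scale_color_manual() or scale_fill_manual(), which match names against the data values, so the mapping no longer depends on the order of the factor levels.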
To win a battle, the attribute of the move used must beat the attribute of the target Digimon. Type is also very important to an effective attack: choosing the right type of Digimon to battle your target is advantageous, because the attack deals greater damage to your opponent.
To determine the ideal moves to carry (assuming just one move), we can look at the distribution of attributes by hit points and memory.
Create a scatterplot using the digimon_list data frame, graphing attribute by memory and hp. Use the colors saved in the my_colors dataframe to color the attributes.
plot1 <- digimon_list %>%
  ggplot(aes(hp, memory, color = attribute)) +
  xlab("Level 50 Digimon HP") +
  ylab("Level 50 Digimon Memory") +
  xlim(500, 2100) +
  ylim(0, 25) +
  geom_point(alpha = 0.5) +
  scale_color_manual(values = c("Black", "#B15928", "Yellow", "#E41A1C", "#FFD92F", "#E0E0E0", "#33A02C", "#377EB8", "#80B1D3")) +
  ggtitle("Attribute by Memory and Hit Point")
plot1 <- ggplotly(plot1)
plot1
Create a second scatterplot using the digimon_list data frame, graphing type by memory and hp. Color the points by type, using one color for each of the four types.
plot2 <- digimon_list %>%
  ggplot(aes(hp, memory, color = type)) +
  xlab("Level 50 Digimon HP") +
  ylab("Level 50 Digimon Memory") +
  xlim(500, 2100) +
  ylim(0, 25) +
  geom_point(alpha = 0.5) +
  scale_color_manual(values = c("#80B1D3", "#E41A1C", "#33A02C", "#B15928")) +
  ggtitle("Type by Memory and Hit Point")
plot2 <- ggplotly(plot2)
plot2
Observations from plot1 and plot2:
There is a weak positive correlation between hp and memory. This means that as the level 50 HP of a Digimon increases, its level 50 memory tends to increase as well, but the relationship is not very strong.
The Free type is the rarest in the dataset (37 of 249 Digimon) and shows fewer data points as a result. Based on the scatterplot, the Free and Virus types tend to have lower hp and memory.
The Vaccine type is more common (70 Digimon) and tends to have higher hp and memory than the Free and Virus types. The data points for the Dark attribute have lower hp and memory by comparison.
Exploring commonality of attribute and type:
Group digimon_list by type and attribute, then create the factor ordered_type for the types of Digimon: Vaccine, Virus, Data, and Free. Finally, merge type_attr with the colors in the my_colors data frame.
type_attr<-data.frame(digimon_list %>% group_by(type, attribute) %>% summarize(count=n()))
`summarise()` has grouped output by 'type'. You can override using the
`.groups` argument.
type_attr$ordered_type <- factor(type_attr$type, levels = c('Vaccine','Virus','Data','Free'))
type_attr<-merge(type_attr,my_colors, by= "attribute")
With the new data frame type_attr, show the share of each attribute within each type as a stacked bar chart of proportions. Move the legend to the top to keep the labels readable.
plot3 <- type_attr %>% ggplot(aes(x= ordered_type,y= count)) + xlab("Ordered Type") + ylab("Count") +
geom_bar(aes(fill=attribute),position='fill',color='grey',stat='identity') +
scale_fill_manual(name='',values=color) +
ggtitle('Digimon Attributes by Types') +
theme(legend.position='top',axis.text.x=element_blank())
plot3 <- ggplotly(plot3)
plot3
Observations from plot3:
Based on the counts behind the plot, the most common attributes for each type are:
For the Vaccine type, the most common attribute is Light (20), followed by Fire (11) and Wind (10).
For the Data type, the most common attribute is Plant (10), followed by Fire and Neutral (8 each).
For the Virus type, the most common attribute is Dark (26), followed by Electric (13) and Fire (11).
Overall, Dark is concentrated in the Virus type and Light in the Vaccine type, while the Data type is spread fairly evenly across attributes.
Chi squared test:
The chisq.test() function performs a chi-squared test of independence, used here to determine whether there is a significant association between type and attribute.
table(digimon_list$type,digimon_list$attribute)
          Dark Earth Electric Fire Light Neutral Plant Water Wind
  Data       3     7        6    8     5       8    10     6    7
  Free       6     4        0    3     2      11     4     2    5
  Vaccine    2     4        6   11    20       5     4     8   10
  Virus     26     9       13   11     2       4     7     8    2
chisq.test(digimon_list$type,digimon_list$attribute)
Warning in chisq.test(digimon_list$type, digimon_list$attribute): Chi-squared
approximation may be incorrect
Pearson's Chi-squared test
data: digimon_list$type and digimon_list$attribute
X-squared = 88.03, df = 24, p-value = 3.046e-09
Observations of chi squared test:
The output shows 24 degrees of freedom for this test. Degrees of freedom are determined by the number of rows and columns in the contingency table: (4 - 1) × (9 - 1) = 24.
The p-value of 3.046e-09 is very small, indicating strong evidence of an association between type and attribute: certain types of Digimon are more likely to have certain attributes. The warning in the output appears because some expected cell counts are below 5, so the chi-squared approximation should be read with some caution.
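The degrees of freedom and the source of the warning can be checked directly from the contingency table. The following is a minimal sketch that rebuilds the table from the counts printed above instead of reading the csv file; the object names tab and res are introduced here for illustration.

```r
# Type x attribute counts, copied from the table() output in this report.
tab <- matrix(
  c( 3, 7,  6,  8,  5,  8, 10, 6,  7,
     6, 4,  0,  3,  2, 11,  4, 2,  5,
     2, 4,  6, 11, 20,  5,  4, 8, 10,
    26, 9, 13, 11,  2,  4,  7, 8,  2),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    type = c("Data", "Free", "Vaccine", "Virus"),
    attribute = c("Dark", "Earth", "Electric", "Fire", "Light",
                  "Neutral", "Plant", "Water", "Wind")
  )
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df_manual <- (nrow(tab) - 1) * (ncol(tab) - 1)

# Same test as in the report; the warning is suppressed here because the
# expected counts are inspected explicitly below.
res <- suppressWarnings(chisq.test(tab))

# Expected counts under independence; cells below 5 trigger the warning.
small_cells <- which(res$expected < 5, arr.ind = TRUE)

df_manual
res$parameter
rownames(tab)[unique(small_cells[, "row"])]
```

With these counts, the cells with an expected count below 5 all fall in the Free row, the smallest type. If the approximation is a concern, chisq.test() also accepts simulate.p.value = TRUE, which computes the p-value by Monte Carlo simulation.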
Exploring Attributes vs. stages
Create a new data frame called stage_attr with the frequency count of each combination of stage and attribute in the digimon_list data frame. Group digimon_list by stage and attribute then summarize the counts.
stage_attr<-digimon_list %>% group_by(stage, attribute) %>% summarize(count=n())
`summarise()` has grouped output by 'stage'. You can override using the
`.groups` argument.
stage_attr$ordered_stage <- factor(stage_attr$stage, levels = c('Baby','In-Training','Rookie','Champion','Ultimate','Mega','Ultra','Armor'))
stage_attr<-merge(stage_attr,my_colors,by='attribute')
Use the newly created data frame to create an interactive plot that shows the distribution of stages within each attribute.
plot4 <- plot_ly(stage_attr,x = ~attribute, y= ~count, color = ~stage, colors ="PuOr") %>% layout(title = "Stage Attribute Distribution",
xaxis = list(title = "Attribute"),
yaxis = list(title = "Count"),
legend = list(title = "Stage", orientation = "h", y = -0.2))
plot4
No trace type specified:
Based on info supplied, a 'bar' trace seems appropriate.
Read more about this trace type -> https://plotly.com/r/reference/#bar
Observations from plot4:
The “Mega” stage has the largest number of Digimon with most attributes, followed by “Ultimate” and “Champion” stages.
The “Rookie” stage has the largest number of Digimon with the “Plant” attribute, followed by “Champion” stage.
The “Electric” attribute is most common among “Mega” stage Digimon, while the “Earth” attribute is most common among “Champion” stage Digimon.
The “Light” attribute is most common among “Mega” and “Ultimate” stage Digimon, while the “Fire” attribute is most common among “Champion” stage Digimon.
The “Neutral” attribute is distributed more evenly across different stages, but is most common among “Mega” and “Ultimate” stage Digimon.
Attributes vs Stages continued
Create a chord diagram to visualize the relationships between the attributes and stages of the Digimon. The sectors around the circle are the attributes, drawn in their palette colors, and the stages, drawn in grey. Each chord connects an attribute to a stage and takes the color of its attribute. Because only the attribute and stage columns are passed to chordDiagram(), every chord has the same width (value1 is 1 in the output below), so a chord shows that a combination occurs in the dataset, not how often it occurs.
plot5 <- chordDiagram(
stage_attr[,c(1:2)],
transparency = 0.15,
grid.col = append(color,rep('grey',8)),
col= as.character(stage_attr$color))
title(main = "Attributes vs Stages")
plot5
rn cn value1 value2 o1 o2 x1 x2 col
1 Dark In-Training 1 1 6 8 6 8 #000000D8
2 Dark Champion 1 1 5 9 5 9 #000000D8
3 Dark Ultra 1 1 4 3 4 3 #000000D8
4 Dark Mega 1 1 3 9 3 9 #000000D8
5 Dark Ultimate 1 1 2 9 2 9 #000000D8
6 Dark Rookie 1 1 1 9 1 9 #000000D8
7 Earth Armor 1 1 1 3 1 3 #B15928D8
8 Earth In-Training 1 1 5 7 6 7 #B15928D8
9 Earth Ultimate 1 1 3 8 3 8 #B15928D8
10 Earth Champion 1 1 6 8 5 8 #B15928D8
11 Earth Rookie 1 1 4 8 2 8 #B15928D8
12 Earth Mega 1 1 2 8 4 8 #B15928D8
13 Electric Ultimate 1 1 3 7 2 7 #FFFF00D8
14 Electric Champion 1 1 1 7 4 7 #FFFF00D8
15 Electric Rookie 1 1 4 7 1 7 #FFFF00D8
16 Electric Mega 1 1 2 7 3 7 #FFFF00D8
17 Fire Ultimate 1 1 6 6 3 6 #E41A1CD8
18 Fire Rookie 1 1 2 6 2 6 #E41A1CD8
19 Fire Champion 1 1 1 6 5 6 #E41A1CD8
20 Fire In-Training 1 1 5 6 6 6 #E41A1CD8
21 Fire Mega 1 1 3 6 4 6 #E41A1CD8
22 Fire Armor 1 1 4 2 1 2 #E41A1CD8
23 Light Ultimate 1 1 6 5 3 5 #FFD92FD8
24 Light Champion 1 1 3 5 6 5 #FFD92FD8
25 Light Rookie 1 1 1 5 2 5 #FFD92FD8
26 Light Ultra 1 1 5 2 5 2 #FFD92FD8
27 Light Mega 1 1 4 5 4 5 #FFD92FD8
28 Light Armor 1 1 2 1 1 1 #FFD92FD8
29 Light In-Training 1 1 7 5 7 5 #FFD92FD8
30 Neutral Mega 1 1 6 4 4 4 #E0E0E0D8
31 Neutral Champion 1 1 7 4 6 4 #E0E0E0D8
32 Neutral In-Training 1 1 4 4 7 4 #E0E0E0D8
33 Neutral Ultimate 1 1 1 4 3 4 #E0E0E0D8
34 Neutral Ultra 1 1 5 1 5 1 #E0E0E0D8
35 Neutral Baby 1 1 2 1 1 1 #E0E0E0D8
36 Neutral Rookie 1 1 3 4 2 4 #E0E0E0D8
37 Plant Rookie 1 1 1 3 1 3 #33A02CD8
38 Plant Ultimate 1 1 2 3 2 3 #33A02CD8
39 Plant Champion 1 1 4 3 4 3 #33A02CD8
40 Plant Mega 1 1 3 3 3 3 #33A02CD8
41 Plant In-Training 1 1 5 3 5 3 #33A02CD8
42 Water Ultimate 1 1 3 2 2 2 #377EB8D8
43 Water Champion 1 1 1 2 4 2 #377EB8D8
44 Water Rookie 1 1 4 2 1 2 #377EB8D8
45 Water Mega 1 1 2 2 3 2 #377EB8D8
46 Water In-Training 1 1 5 2 5 2 #377EB8D8
47 Wind Rookie 1 1 1 1 1 1 #80B1D3D8
48 Wind Mega 1 1 3 1 3 1 #80B1D3D8
49 Wind Ultimate 1 1 2 1 2 1 #80B1D3D8
50 Wind Champion 1 1 4 1 4 1 #80B1D3D8
51 Wind In-Training 1 1 5 1 5 1 #80B1D3D8
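In the output above, value1 and value2 are 1 for every row because only two columns were passed to chordDiagram(). A minimal sketch of how chord widths can be made to follow the counts: pass a third numeric column. The data frame toy below holds made-up counts for illustration only and is not taken from the dataset.

```r
library(circlize)

# Hypothetical toy counts, for illustration only (not from the Digimon data).
toy <- data.frame(
  attribute = c("Fire", "Fire", "Water", "Water"),
  stage     = c("Rookie", "Mega", "Rookie", "Mega"),
  count     = c(2, 5, 4, 1)
)

# With three columns (from, to, value), each chord's width is proportional
# to the value column instead of defaulting to 1.
links <- chordDiagram(toy, transparency = 0.15)
circos.clear()  # reset circlize's plotting state after drawing

# value1 now carries the counts that set the chord widths.
links[, c("rn", "cn", "value1")]
```

In this report, the equivalent change would be to pass stage_attr[, c("attribute", "stage", "count")] in place of stage_attr[, c(1:2)]; the printed link table would then differ from the one shown above.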