Pokemon Catch em All
Greetings trainers,
On this page I will try to tell you a few things about Pokemon
Pokemon is …
Nah, I’m under no obligation to explain what Pokemon is, just Google it
The insights we will try to get:
1. What are the 6 best Pokemon to start your journey with?
2. Where is Pikachu?
3. Pokemon BMI: is it correlated with attack/defense and versatility?
4. What is the strongest legendary Pokemon, for each type?
Loading, Preview and Cleaning Data
First we have to load the data
The capture_rate variable has factor type and we want to convert it to numeric
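A minimal sketch of that conversion, assuming capture_rate really was read in as a factor; the toy values below are made up. Going through as.character() first matters, because as.numeric() applied directly to a factor returns the internal level codes rather than the printed values. If the converted column ends up with a compact range such as 1 to 34, that may be a hint that the codes were kept instead of the values.

```r
# Toy factor standing in for the capture_rate column (values are illustrative)
capture_rate <- factor(c("45", "255", "3", "45"))

# as.numeric() alone returns the level codes, not the values
codes <- as.numeric(capture_rate)

# Converting via character keeps the original numbers
values <- as.numeric(as.character(capture_rate))

print(codes)
print(values)
```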
We’re going to separate the common Pokemon from the legendary ones
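The split could look roughly like this. It is only a sketch: the is_legendary flag column is an assumption about the dataset, and the four rows are toy data. The names common.p and legend.p match the objects used later on this page.

```r
# Toy data; is_legendary is an assumed 0/1 flag column
pokemon <- data.frame(
  name = c("Bulbasaur", "Pikachu", "Mewtwo", "Articuno"),
  is_legendary = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Common Pokemon are the rows where the flag is 0, legendary where it is 1
common.p <- pokemon[pokemon$is_legendary == 0, ]
legend.p <- pokemon[pokemon$is_legendary == 1, ]

print(common.p$name)
print(legend.p$name)
```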
Pokemon Starter Pack
Several criteria decide whether a Pokemon belongs in the starter pack
1. Capture Rate
2. Most Common Type
3. Versatility based on damage taken
4. Attack/Defense ratio
Capture Rate
Analytics
The minimum capture rate is 1 and the maximum is 34. Based on those values we write a function that sorts each Pokemon into a capture_prob category
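The function itself is not shown, so here is a hedged sketch of what such a binning could look like. The cut points, the helper name capture_class, the labels "Hard" and "Easy", and the idea that a higher rate means an easier catch are all assumptions; only the "Moderate" label appears in the code further down.

```r
# Hypothetical binning of capture_rate (1 to 34) into three classes
capture_class <- function(rate) {
  cut(rate,
      breaks = c(0, 11, 22, 34),               # assumed cut points
      labels = c("Hard", "Moderate", "Easy"),  # assumed labels
      include.lowest = TRUE)
}

# Intervals are (0,11], (11,22], (22,34]
capture_prob <- capture_class(c(1, 12, 22, 34))
print(capture_prob)
```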
Graphics
As the pie chart above shows, most Pokemon are neither very easy nor very hard to catch
just like girlfriends ;)
common.p.30 <- head(common.p,30)
ggplot(common.p.30, aes(x = reorder(name, capture_rate), y = capture_rate))+
geom_col(aes(fill = capture_rate))+
coord_flip()+
scale_fill_continuous(high = "#811e09", low = "#fad61d")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Capture Rate")+
scale_y_continuous(limits = c(0,40),
breaks = seq(0,40,5))
The easiest Pokemon to catch in our list is Sandslash
Most Common Type
To help us at the start of our journey, we need to know which types of Pokemon are most common
Knowing this helps us pick the right answer when facing an enemy
common.type <- as.data.frame(table(common.p.easy$type1))
common.type <- common.type[order(common.type$Freq,decreasing = T),]
head(common.type,6)
From the output above, the most common types are:
* water
* normal
* bug
* grass
* ground
* poison
We only use these 6 types
Versatility
Versatility means how well a Pokemon can defend itself against many types of attack
We divide it into 3 categories: Strong, Moderate, Weak
# mean damage taken across all 18 elements (columns 2 to 19)
common.p.easy$vers <- rowSums(common.p.easy[,c(2:19)])/18
# versatility function: the lower the mean damage taken, the sturdier the Pokemon
vr <- function(y){
  if(y <= 0.3472223){
    "Strong"
  }else if(y <= 1.041667){
    "Moderate"
  }else{
    "Weak"
  }
}
common.p.easy$vr_prob <- as.factor(sapply(common.p.easy$vers,vr))
No Pokemon falls into the “Strong” category, so we use “Moderate”
Attack and Defense Ratio
Attack and Defense are important, so we’ll calculate the balance of Attack and Defense
The formula is
((Attack/Defense)+(Defense/Attack))/2
Because Attack is as important as Defense
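To get a feel for the formula, here is a small sketch with made-up stats (ad_ratio is a helper name introduced here, not part of the original code). The score is exactly 1 when attack equals defense and rises above 1 as the two drift apart, in either direction.

```r
# Attack/defense balance score from the formula above
ad_ratio <- function(attack, defense) {
  ((attack / defense) + (defense / attack)) / 2
}

balanced <- ad_ratio(50, 50)    # attack equals defense
lopsided <- ad_ratio(100, 50)   # attack is twice the defense
mirrored <- ad_ratio(50, 100)   # defense is twice the attack

print(c(balanced, lopsided, mirrored))
```

Because the formula is symmetric, swapping attack and defense gives the same score.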
# attack/defense
common.p.easy$ad <- as.numeric(((common.p.easy$attack/common.p.easy$defense)+(common.p.easy$defense/common.p.easy$attack))/2)
# formula passed positionally, which works whatever the first argument is named in your R version
agg.ad <- aggregate(ad ~ name + type1 + height_m + weight_kg, data = common.p.easy, FUN = max)
agg.ad <- agg.ad[order(agg.ad$ad,decreasing = T),]
head(agg.ad,6)
Here it is!!! The best Pokemon for the start of your journey!!!
And here is the top pick for each type, if you want to bring one of each
common.water <- head(agg.ad[agg.ad$type1 == "water",],1)
common.normal <- head(agg.ad[agg.ad$type1 == "normal",],1)
common.bug <- head(agg.ad[agg.ad$type1 == "bug",],1)
common.grass <- head(agg.ad[agg.ad$type1 == "grass",],1)
common.ground <- head(agg.ad[agg.ad$type1 == "ground",],1)
common.poison <- head(agg.ad[agg.ad$type1 == "poison",],1)
common.element <- rbind(common.water,common.normal,common.bug,common.grass,common.ground,common.poison)
common.element
Where is Pikachu?
Pika Pika Pika
We’ll try to find Pikachu and define his position among others
pikachu <- common.p[common.p$name == "Pikachu",]
pikachu[,c("name","capture_prob","attack","defense","type1")]
# Pikachu's versatility
pikachu$vers <- rowSums(pikachu[,c(2:19)])/18
pikachu$vr_prob <- as.factor(sapply(pikachu$vers,vr))
# Pikachu's att/def
pikachu$ad <- as.numeric(((pikachu$attack/pikachu$defense)+(pikachu$defense/pikachu$attack))/2)
#print Pikachu
pikachu[,c("name","capture_prob","ad","vr_prob","type1")]
Electric Types
Analytics
Because Pikachu is an electric type, we will compare him with his comrades
# Electric Pokemon with Moderate capture_prob
electric <- common.p[common.p$capture_prob == "Moderate" & common.p$type1 == "electric",]
# Electric Pokemon's versatility
electric$vers <- rowSums(electric[,c(2:19)])/18
electric$vr_prob <- as.factor(sapply(electric$vers,vr))
# Electric Pokemon's att/def
electric$ad <- as.numeric(((electric$attack/electric$defense)+(electric$defense/electric$attack))/2)
# Print electric
electric[,c("name","capture_prob","ad","vr_prob","type1")]
# Aggregate Electric (formula passed positionally, which works whatever the first argument is named in your R version)
agg.electric <- aggregate(ad ~ name + type1 + height_m + weight_kg + vr_prob, data = electric, FUN = max)
agg.electric <- agg.electric[order(agg.electric$ad,decreasing = T),]
agg.electric
Graphics
ggplot(electric, aes(x = reorder(name, ad), y = ad))+
geom_col(aes(fill = ad))+
coord_flip()+
scale_fill_continuous(high = "#fad61d", low = "#fad61d")+
geom_col(data = pikachu, aes(fill = ad), fill = "#f62d14")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Electric Power")+
scale_y_continuous(limits = c(0,2),
breaks = seq(0,2,0.1))
statpack.plot <- head(agg.ad,6)
statpack.plot <- statpack.plot[,c("name","ad")]
pikachu.plot <- pikachu[,c("name","ad")]
spp.plot <- rbind(statpack.plot,pikachu.plot)
ggplot(spp.plot, aes(x = reorder(name, ad), y = ad))+
geom_col(aes(fill = ad))+
coord_flip()+
scale_fill_continuous(high = "#fad61d", low = "#fad61d")+
geom_col(data = pikachu.plot, aes(fill = ad), fill = "#f62d14")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Where Is Pikachu",subtitle = "Among Top 6 Start Pack")+
scale_y_continuous(limits = c(0,2),
breaks = seq(0,2,0.1))
common.element.plot <- common.element[,c("name","ad")]
pikachu.plot <- pikachu[,c("name","ad")]
cep.plot <- rbind(common.element.plot,pikachu.plot)
ggplot(cep.plot, aes(x = reorder(name, ad), y = ad))+
geom_col(aes(fill = ad))+
coord_flip()+
scale_fill_continuous(high = "#fad61d", low = "#fad61d")+
geom_col(data = pikachu.plot, aes(fill = ad), fill = "#f62d14")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Where Is Pikachu",subtitle = "Among Top 6 based on type")+
scale_y_continuous(limits = c(0,2),
breaks = seq(0,2,0.1))
Based on the graphics, nothing suggests that Pikachu is the strongest, the easiest to catch, the fittest, or anything of the sort
Then why is he so famous?
BECAUSE HE IS CUTE AS HELL!!!
and Ryan Reynolds voices him too :3
Pokemon BMI
Body Masses
Analytics
Humans are not the only ones with a BMI (Body Mass Index), Pokemon have one too
The formula is weight (kg) / height (m)^2
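As a quick worked example of the formula, with made-up measurements (pokemon_bmi is a helper name introduced here for illustration):

```r
# BMI = weight in kg divided by height in metres, squared
pokemon_bmi <- function(weight_kg, height_m) {
  weight_kg / height_m^2
}

small <- pokemon_bmi(6, 0.4)    # 6 / 0.16
dense <- pokemon_bmi(375, 1)    # 375 / 1

print(c(small, dense))
```

A short creature gets a high BMI quickly, because the height is squared in the denominator.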
So let’s begin
cp.bmi <- common.p
cp.bmi$vers <- rowSums(cp.bmi[,c(2:19)])/18
cp.bmi$vr_prob <- as.factor(sapply(cp.bmi$vers,vr))
cp.bmi$ad <- as.numeric(((cp.bmi$attack/cp.bmi$defense)+(cp.bmi$defense/cp.bmi$attack))/2)
cp.bmi$bmi <- as.numeric((cp.bmi$weight_kg)/(cp.bmi$height_m^2))
cp.bmi[,c("name","vr_prob","ad","bmi")]
Oopsie! There are many NAs in our data, which means the data is not that good
What are we supposed to do? Stop here?
Heck no!
We’ll remove those NAs and start over
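The removal step itself is not shown, so this is only one way it could look, on toy data with made-up names. Dropping the rows matters for what comes next: an if() test on an NA value stops with an error instead of returning a category.

```r
# Toy rows: B has no weight and C has no height, so their BMI is NA
cp.toy <- data.frame(
  name = c("A", "B", "C"),
  weight_kg = c(6, NA, 90),
  height_m = c(0.4, 1.2, NA),
  stringsAsFactors = FALSE
)
cp.toy$bmi <- cp.toy$weight_kg / cp.toy$height_m^2

# Keep only the rows where BMI could be computed
cp.clean <- cp.toy[!is.na(cp.toy$bmi), ]
print(cp.clean)
```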
Here are the categories according to BMI value
Underweight : < 18.5
Normal weight : 18.5–24.9
Overweight : 25–29.9
Obesity : BMI of 30 or greater
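The listed ranges leave small gaps (a BMI of 24.95 sits between "Normal weight" and "Overweight"), so a sketch with half-open intervals may be a safer way to bin. This uses base R's cut(); the helper name bmi_category is made up for the example.

```r
# Vectorised BMI binning with intervals of the form [lower, upper)
bmi_category <- function(b) {
  cut(b,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Normal Weight", "Overweight", "Obesity"),
      right = FALSE)
}

cats <- bmi_category(c(10, 18.5, 24.95, 25, 29.95, 30, 375))
print(cats)
```

Since cut() works on whole vectors, no sapply() is needed, and the result is already a factor.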
# BMI function
# upper bounds are exclusive, so a value such as 24.95 does not fall through to "Obesity"
bmi <- function(b){
  if(b < 18.5){
    "Underweight"
  }else if(b < 25){
    "Normal Weight"
  }else if(b < 30){
    "Overweight"
  }else{
    "Obesity"
  }
}
cp.bmi$bmi_cat <- as.factor(sapply(cp.bmi$bmi,bmi))
cp.bmi[,c("name","vr_prob","ad","bmi","bmi_cat")]
# Put each category in objects
underweight <- cp.bmi[cp.bmi$bmi_cat == "Underweight",]
normalweight <- cp.bmi[cp.bmi$bmi_cat == "Normal Weight",]
overweight <- cp.bmi[cp.bmi$bmi_cat == "Overweight",]
obesity <- cp.bmi[cp.bmi$bmi_cat == "Obesity",]
cp.bmi[order(cp.bmi$bmi,decreasing = T), c("name","bmi","bmi_cat","type1")]
Doesn’t seem normal, right? How can a creature have a BMI of 375?
It means you’re looking at a creature 1 m tall that weighs 375 kg!!
But this is the Pokemon world, right? As data scientists we must put “normality” aside LOL
Correlations
Analytics
We want to see the correlation between these variables
* BMI
* vers
* ad
* hp
* speed
* height
* weight
* attack
* defense
* capture_rate
bmicor <- cp.bmi[,c("bmi","vers","ad","hp","speed","height_m","weight_kg","attack","defense","capture_rate")]
bmicor <-cor(bmicor)
bmicor
##                      bmi         vers           ad          hp       speed
## bmi           1.00000000  0.022457688  0.037711265 -0.01555890 -0.22415104
## vers          0.02245769  1.000000000  0.009666441  0.01488890 -0.06326199
## ad            0.03771126  0.009666441  1.000000000 -0.10566317 -0.09301751
## hp           -0.01555890  0.014888900 -0.105663170  1.00000000  0.09498438
## speed        -0.22415104 -0.063261993 -0.093017508  0.09498438  1.00000000
## height_m     -0.13810360  0.049758668  0.003418695  0.40097680  0.16051648
## weight_kg     0.26360237  0.117750362 -0.001992684  0.41010271 -0.02282505
## attack        0.02775554  0.024497092 -0.075502625  0.36697855  0.30158556
## defense       0.14730570  0.054982747  0.196888458  0.19199004 -0.06169946
## capture_rate -0.04394980  0.024202017 -0.035682234  0.24987589  0.24000998
##                  height_m    weight_kg      attack     defense
## bmi          -0.138103601  0.263602372  0.02775554  0.14730570
## vers          0.049758668  0.117750362  0.02449709  0.05498275
## ad            0.003418695 -0.001992684 -0.07550262  0.19688846
## hp            0.400976803  0.410102710  0.36697855  0.19199004
## speed         0.160516484 -0.022825047  0.30158556 -0.06169946
## height_m      1.000000000  0.600558255  0.37242702  0.35986364
## weight_kg     0.600558255  1.000000000  0.41920700  0.45555563
## attack        0.372427017  0.419206995  1.00000000  0.45167441
## defense       0.359863641  0.455555629  0.45167441  1.00000000
## capture_rate  0.258003585  0.209173878  0.34412752  0.31038775
##              capture_rate
## bmi           -0.04394980
## vers           0.02420202
## ad            -0.03568223
## hp             0.24987589
## speed          0.24000998
## height_m       0.25800359
## weight_kg      0.20917388
## attack         0.34412752
## defense        0.31038775
## capture_rate   1.00000000
Graphics
bmicor <- cp.bmi[,c("bmi","vers","ad","hp","speed","height_m","weight_kg","attack","defense","capture_rate")]
bmicor <-cor(bmicor)
corrplot(bmicor, type="upper", order="hclust", method = "number",
col=brewer.pal(n=8, name="RdYlBu"))
The strongest positive correlation is between height_m and weight_kg (0.60)
The strongest negative correlation is between bmi and speed (-0.22)
BMI itself is almost uncorrelated with ad (0.04) and vers (0.02)
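Reading the extremes off a big matrix by eye is error-prone, so here is a hedged sketch of how they could be looked up in code. It runs on a small toy matrix holding three values rounded from the output above (all other pairs are set to 0 for simplicity); set_pair and pair_at are helper names made up for this example.

```r
# Toy correlation matrix with a few values rounded from the output above
vars <- c("bmi", "speed", "height_m", "weight_kg")
m <- diag(4)
dimnames(m) <- list(vars, vars)
set_pair <- function(m, a, b, r) { m[a, b] <- r; m[b, a] <- r; m }
m <- set_pair(m, "height_m", "weight_kg", 0.60)
m <- set_pair(m, "bmi", "speed", -0.22)
m <- set_pair(m, "bmi", "weight_kg", 0.26)

# Blank the diagonal and lower triangle so each pair is counted once
upper <- m
upper[lower.tri(upper, diag = TRUE)] <- NA

# Return the row and column names of the first cell holding a given value
pair_at <- function(mat, value) {
  pos <- which(mat == value, arr.ind = TRUE)[1, ]
  c(rownames(mat)[pos["row"]], colnames(mat)[pos["col"]])
}

top_pos <- pair_at(upper, max(upper, na.rm = TRUE))
top_neg <- pair_at(upper, min(upper, na.rm = TRUE))
print(top_pos)
print(top_neg)
```

The same two helper steps should work on the full bmicor matrix, since it also carries row and column names.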
ggplot(data = cp.bmi, aes(x = capture_prob, y = bmi_cat))+
geom_jitter(aes(size = ad, col = vers))+
facet_grid( ~ as.factor(vr_prob),labeller = label_value)+
theme(legend.position = "right")+
labs(x = "Capture Probability", y = "BMI Category")
Legendary Pokemon
Legendary Pokemon are regarded as the mightiest Pokemon, and some say they have godlike abilities
Based on Attack
Analytics
legend.p.att <- head(legend.p[order(legend.p$attack,decreasing = T),],12)
legend.p.att[,c("name","attack")]
Graphics
ggplot(legend.p.att, aes(x = reorder(name, attack), y = attack))+
geom_col(aes(fill = attack))+
coord_flip()+
scale_fill_continuous(high = "#811e09", low = "#fad61d")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Strongest Pokemon based on Attack")
Based on Defense
Analytics
legend.p.def <- head(legend.p[order(legend.p$defense,decreasing = T),],12)
legend.p.def[,c("name","defense")]
Graphics
ggplot(legend.p.def, aes(x = reorder(name, defense), y = defense))+
geom_col(aes(fill = defense))+
coord_flip()+
scale_fill_continuous(high = "#811e09", low = "#fad61d")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Strongest Pokemon based on Defense")
Based on Attack Defense Ratio
Analytics
legend.p$ad <- as.numeric(((legend.p$attack/legend.p$defense)+(legend.p$defense/legend.p$attack))/2)
legend.p.ad <- head(legend.p[order(legend.p$ad,decreasing = T),],12)
legend.p.ad[,c("name","ad")]
Graphics
ggplot(legend.p.ad, aes(x = reorder(name, ad), y = ad))+
geom_col(aes(fill = ad))+
coord_flip()+
scale_fill_continuous(high = "#811e09", low = "#fad61d")+
theme(plot.title = element_text(hjust = 0.5,size = 16),
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white"),
legend.position = "none")+
labs(x = NULL, y = NULL,title = "Strongest Pokemon based on Attack Defense Ratio")