Pokémon is a large multimedia franchise with many individual aspects. Whether it’s the trading cards, the shows and movies, or the games, it has a large presence everywhere you look.
I have always been interested in the Pokémon games. While I have never played one (as Nintendo consoles are expensive), I have always been fascinated by how they make over a thousand different creatures feel and play completely differently from each other. In this document, I intend to analyze how specific Pokémon stats relate and compare to one another.
To do this, we will need these libraries:
library(tidyverse) # The tidyverse collection of packages
Warning: package 'tidyverse' was built under R version 4.4.3
Warning: package 'ggplot2' was built under R version 4.4.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 4.0.0 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(httr) # Useful for web authentication
library(rvest) # Useful tools for working with HTML and XML
Attaching package: 'rvest'
The following object is masked from 'package:readr':
guess_encoding
library(polite) # Promoting responsible web scraping
Warning: package 'polite' was built under R version 4.4.3
Rows: 1219 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): ID, Name, Type
dbl (7): Total, Hp, Attack, Defense, Sp.Atk, Sp.Def, Speed
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Obtaining the data
Before I begin with analysis, I want to showcase how and where I was able to obtain my data.
There are many websites where you can find the stats and traits of all the different Pokémon. I looked for a website that is open and easy to navigate, and ended up on:
This website gives a great visualization of all the data I will be working with, as well as more in-depth data that can be obtained by going to specific Pokémon pages by clicking on their respective names.
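As a rough illustration of how a stats page like this could be scraped responsibly with the libraries loaded above, here is a minimal sketch. It assumes the stats live in the first HTML table on the page; the function name `scrape_stat_table` and the example URL are hypothetical placeholders, not the actual site or code used for this report.

```r
# Sketch only: `stats_url` is a placeholder argument, not the site used in this report.
# Assumes the polite and rvest packages are installed.
scrape_stat_table <- function(stats_url) {
  # bow() introduces the scraper to the host and checks its robots.txt
  session <- polite::bow(stats_url)
  # scrape() fetches the page while respecting the host's crawl delay
  page <- polite::scrape(session)
  # Assumption: the stats are in the first <table> element on the page
  table_node <- rvest::html_element(page, "table")
  # Convert the HTML table into a tibble
  rvest::html_table(table_node)
}

# Example call (hypothetical URL, not run here):
# pokemonData <- scrape_stat_table("https://example.com/pokedex/all")
```

Wrapping the steps in a function keeps the network request from running until it is called explicitly, which helps when re-rendering the document.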
Analysis: The different stats and how they interact
In the Pokémon games, a Pokémon’s power is divided into multiple “stats”. These stats are:
Hp: How much health a Pokémon has compared to others
Attack: How much power their physical attacks have
Defense: How much damage they can reduce from physical attacks
Sp.Atk (Special Attack): How much power their non-physical, usually more elemental, attacks have
Sp.Def (Special Defense): The same as Defense, but for Special Attacks.
Speed: In battle, whichever Pokémon has the higher speed gets to attack first.
I want to look at the relationships between some of these stats, specifically:
Does an increase in Defense tend to coincide with an increase in Sp.Def?
Does an increase in Attack tend to coincide with an increase in Sp.Atk?
How do increases in Attack relate to Defense?
How do increases in Sp.Atk relate to Sp.Def?
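Alongside the plots, each of these questions can also be summarized with a single number. The sketch below computes the Pearson correlation for each of the four stat pairs. The `toy_stats` data frame holds made-up values purely so the example runs on its own; it is not the real data, and the names `stat_pairs` and `pair_correlations` are my own for illustration.

```r
# Toy stand-in for pokemonData: made-up values, NOT real Pokémon stats.
toy_stats <- data.frame(
  Attack  = c(50, 60, 80, 100, 120),
  Defense = c(45, 65, 70, 95, 110),
  Sp.Atk  = c(40, 70, 60, 110, 90),
  Sp.Def  = c(50, 60, 75, 90, 105)
)

# The four stat pairs from the questions above (x first, y second).
stat_pairs <- list(
  Q1 = c("Defense", "Sp.Def"),
  Q2 = c("Attack", "Sp.Atk"),
  Q3 = c("Attack", "Defense"),
  Q4 = c("Sp.Atk", "Sp.Def")
)

# Pearson correlation per pair; values near 1 mean the two stats rise together.
pair_correlations <- sapply(
  stat_pairs,
  function(pair) cor(toy_stats[[pair[1]]], toy_stats[[pair[2]]])
)
print(round(pair_correlations, 2))
```

Swapping `toy_stats` for `pokemonData` should give the real coefficients, since the column names match the ones in the loaded CSV.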
Q1: Defense vs Sp.Def
When I think of a “tanky” Pokémon, I tend to think of one with high health, Defense, and Special Defense. But is there a relationship between Defense and Special Defense?
pokemonData %>%
  ggplot(aes(x = Defense, y = Sp.Def)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Pokémon Defense vs. Special Defense")
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
While the correlation tends to be positive, there are many outliers. However, if we set these outliers aside as gimmicky or specialist Pokémon, we can see that the two stats usually have a positive association. In fact, if we add an x=y line to the graph...
pokemonData %>%
  ggplot(aes(x = Defense, y = Sp.Def)) +
  geom_point() +
  geom_smooth() +
  geom_abline(slope = 1, color = "red") +
  labs(title = "Pokémon Defense vs. Special Defense")
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
We can see that a substantial number of entries lie on the line where x=y! This makes sense from a gameplay design standpoint: having many Pokémon with very similar Defense and Special Defense allows them to serve as “generalists” that can withstand a beating from both types of attacks in a similar manner.
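To put a number on how many entries sit exactly on the x=y line, one option is to count the rows where the two stats are equal. This is a minimal sketch on made-up values; `toy_defense` is an illustrative stand-in, not the real data set.

```r
# Toy stand-in for pokemonData: made-up values, NOT real Pokémon stats.
toy_defense <- data.frame(
  Defense = c(50, 60, 80, 100, 40),
  Sp.Def  = c(50, 60, 95, 100, 70)
)

# TRUE for every row where Defense and Sp.Def are exactly equal.
on_line <- toy_defense$Defense == toy_defense$Sp.Def

# The mean of a logical vector is the share of TRUE values.
share_on_line <- mean(on_line)
print(share_on_line)  # 3 of the 5 toy rows are on the line, so 0.6
```

Running the same two lines against `pokemonData$Defense` and `pokemonData$Sp.Def` would give the actual share for the real data.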
Q2: Attack vs. Sp.Atk
Going off of what we saw in the previous analysis, I believe that a graph of a Pokémon’s offensive capabilities will look similar to the one of their defensive abilities: a positive trend with many outliers that serve as more specialist Pokémon.
pokemonData %>%
  ggplot(aes(x = Attack, y = Sp.Atk)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Pokémon Attack vs. Special Attack")
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
While the axes in this plot are different from the ones above, we can still see a similar positive association! In fact, we can even see many of the Attack=Sp.Atk Pokémon forming an apparent line in the middle of the plot!
Q3 & Q4: Attack vs. Defense & Sp.Atk vs. Sp.Def
As both of these pairs compare an offensive stat with its defensive counterpart, I opted to analyze them at the same time.
I expect to see an even stronger linear correlation, as Pokémon that specialize in physical combat probably have both a high Attack and a high Defense compared to those that rely on special moves, and vice versa.
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
pokemonData %>%
  ggplot(aes(x = Sp.Atk, y = Sp.Def)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Pokémon Special Attack vs. Special Defense")
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
These two plots tell a very similar story. Although they have completely different variables on their x and y axes, they share a very similar structure: a clear positive association with many outliers that represent gimmicky and specialist Pokémon.
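One way to pick out the outliers by name, rather than by eye, is to flag Pokémon whose two stats differ by more than some cutoff. The sketch below uses made-up rows, and the cutoff `gap_threshold` of 50 is an arbitrary assumption chosen for illustration, not a value from the games or from this analysis.

```r
# Toy stand-in for pokemonData: made-up names and values, NOT real Pokémon stats.
toy_special <- data.frame(
  Name   = c("A", "B", "C", "D"),
  Sp.Atk = c(60, 130, 75, 40),
  Sp.Def = c(65, 50, 80, 120)
)

# Arbitrary cutoff (an assumption): how far apart two stats must be to count as a specialist.
gap_threshold <- 50

# Positive gap = offense-leaning, negative gap = defense-leaning.
toy_special$gap <- toy_special$Sp.Atk - toy_special$Sp.Def

# Keep only the rows whose absolute gap reaches the cutoff.
specialists <- toy_special[abs(toy_special$gap) >= gap_threshold, ]
print(specialists)
```

The same idea works for Attack and Defense by changing the two column names; the right cutoff would depend on how spread out the real stats are.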
Conclusion
After concluding my analysis, I can say that there are certainly many associations between the different stats, at least the offensive and defensive ones. While there are many Pokémon that serve as “gimmicks” or “specialists” that favor one stat over all others, most Pokémon tend to have similar values across these stats as their numbers climb. Even when comparing a Pokémon with many points in both Attack and Defense to one with below-average points in those stats, they tend to distribute these stats in similar proportions.