# Install DT only if it is not already available
if (!requireNamespace("DT", quietly = TRUE)) {
  install.packages("DT", repos = "https://cloud.r-project.org/")
}
# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(DT)
# Load the dataset
pokemon_data <- read.csv("./PokemonStats.csv", stringsAsFactors = FALSE)
# 1.1 Group by 'Type1' and summarize the average 'Total' points
# We group the Pokémon dataset by primary type (Type1) and calculate the average Total points for each type. This shows which types have, on average, the highest and lowest total points.
# WHY?
# 1. Some types may have markedly higher average totals than others, which would indicate that they are generally stronger in combined stats.
# 2. A few types may stand out with particularly low average totals, which would suggest that they are weaker or more specialized.
# 3. The spread of average totals across types may reflect game design decisions, such as balancing gameplay or making certain types harder to master.
group_type1 <- pokemon_data %>%
  group_by(Type1) %>%
  summarize(AvgTotal = mean(Total, na.rm = TRUE))
group_type1
## # A tibble: 18 × 2
## Type1 AvgTotal
## <chr> <dbl>
## 1 Bug 381.
## 2 Dark 450.
## 3 Dragon 528.
## 4 Electric 447.
## 5 Fairy 449.
## 6 Fighting 458.
## 7 Fire 455.
## 8 Flying 450.
## 9 Ghost 438.
## 10 Grass 420.
## 11 Ground 440.
## 12 Ice 439.
## 13 Normal 409.
## 14 Poison 428.
## 15 Psychic 486.
## 16 Rock 447.
## 17 Steel 485
## 18 Water 437.
ggplot(group_type1, aes(x = Type1, y = AvgTotal)) +
  geom_col(fill = "skyblue") +
  coord_flip() +
  labs(title = "Average Total Points by Pokémon Type", x = "Pokémon Type", y = "Average Total Points") +
  theme_minimal()

# INSIGHTS
# Dragon-type Pokémon have the highest average total points (about 528), well ahead of Psychic (486) and Steel (485), suggesting that they are generally stronger in terms of combined stats.
# Bug-type Pokémon have the lowest average total points (about 381), indicating that they are generally weaker or more specialized than other types.
# Significance:
# Players might favor Dragon-type Pokémon due to their higher average stats, especially in competitive settings.
# Bug-type Pokémon, despite their lower average stats, could have other advantages or unique abilities that make them valuable in specific scenarios.
# Further Questions:
# What specific abilities or moves do Bug-type Pokémon have that could compensate for their lower average stats?
# Are Dragon-type Pokémon rarer or harder to evolve, which would justify their higher stats?
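# The ranking behind these insights can also be read off programmatically rather than by eye. The sketch below is self-contained: it re-enters a few of the AvgTotal values printed above as a small tibble and sorts them. The names avg_total_by_type, ranked_types, strongest_type, and weakest_type are illustrative and not part of the original analysis.

```r
library(dplyr)

# A few of the AvgTotal values printed in section 1.1, rounded as shown there.
avg_total_by_type <- tibble::tibble(
  Type1    = c("Bug", "Dragon", "Normal", "Psychic", "Steel"),
  AvgTotal = c(381, 528, 409, 486, 485)
)

# Sort from highest to lowest average; the first and last rows are the extremes.
ranked_types <- avg_total_by_type %>%
  arrange(desc(AvgTotal))

strongest_type <- first(ranked_types$Type1)
weakest_type   <- last(ranked_types$Type1)
```

# With the full data loaded, the same arrange(desc(AvgTotal)) call could be applied to group_type1 directly.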
# 1.2 Group by 'Type1' and summarize the average 'Speed'
# We group the Pokémon dataset by primary type (Type1) and calculate the average Speed for each type. This shows which types are, on average, the fastest.
# WHY?
# 1. Certain Pokémon types might be consistently faster, on average, than other types. These types might be favored in battles where speed is crucial.
# 2. Slow average speeds for certain types might indicate that these Pokémon compensate with other strengths, such as higher defense or attack.
group_type_speed <- pokemon_data %>%
  group_by(Type1) %>%
  summarize(AvgSpeed = mean(Speed, na.rm = TRUE))
group_type_speed
## # A tibble: 18 × 2
## Type1 AvgSpeed
## <chr> <dbl>
## 1 Bug 61.9
## 2 Dark 77.5
## 3 Dragon 84.7
## 4 Electric 87.7
## 5 Fairy 67.1
## 6 Fighting 76.1
## 7 Fire 73.7
## 8 Flying 86.8
## 9 Ghost 63.3
## 10 Grass 61.1
## 11 Ground 64.1
## 12 Ice 67.6
## 13 Normal 70.7
## 14 Poison 65.0
## 15 Psychic 80.4
## 16 Rock 57.4
## 17 Steel 56.0
## 18 Water 68.3
ggplot(group_type_speed, aes(x = Type1, y = AvgSpeed)) +
  geom_col(fill = "salmon") +
  coord_flip() +
  labs(title = "Average Speed by Pokémon Type", x = "Pokémon Type", y = "Average Speed") +
  theme_minimal()

# INSIGHTS
# Electric-type Pokémon have the highest average speed (87.7), narrowly ahead of Flying (86.8) and Dragon (84.7). This aligns with the intuitive notion that electricity is fast, and these Pokémon might be favored in battles where going first can be a decisive advantage.
# Steel-type Pokémon have the lowest average speed (56.0), just below Rock (57.4), suggesting they might be slow but potentially sturdy or strong in other aspects.
# Significance:
# Electric-type Pokémon, with their higher speed, can be crucial in battles where the order of moves determines the outcome. Striking first can sometimes mean knocking out an opponent before it has a chance to react.
# Steel-type Pokémon, while slower, might have higher defenses or other attributes that make them valuable, especially in scenarios where endurance is vital.
# Further Questions:
# What are the average defense and attack stats for Steel-type Pokémon? Does their slower speed correlate with higher defensive capabilities?
# Are there specific moves or abilities unique to Electric-type Pokémon that capitalize on their higher speed?
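# The question about speed versus defense could be explored with a per-type summary of both stats. The sketch below uses toy rows that are NOT values from PokemonStats.csv, and it assumes the real dataset has a Defense column, which the analysis above does not confirm. The names toy_stats, speed_vs_defense, and type_level_cor are illustrative.

```r
library(dplyr)

# Toy rows for illustration only; these are not values from PokemonStats.csv.
# The Defense column name is an assumption about the real dataset.
toy_stats <- tibble::tibble(
  Type1   = c("Steel", "Steel", "Electric", "Electric", "Rock", "Rock"),
  Speed   = c(50, 60, 90, 100, 55, 65),
  Defense = c(120, 110, 60, 50, 100, 90)
)

# Average both stats per primary type.
speed_vs_defense <- toy_stats %>%
  group_by(Type1) %>%
  summarize(
    AvgSpeed   = mean(Speed, na.rm = TRUE),
    AvgDefense = mean(Defense, na.rm = TRUE)
  )

# Correlation across types: a negative value would be consistent with the idea
# that slower types tend to have higher defense.
type_level_cor <- cor(speed_vs_defense$AvgSpeed, speed_vs_defense$AvgDefense)
```

# On the real data, toy_stats would be replaced by pokemon_data, provided the column names match.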
# 1.3 Group by both 'Type1' and 'Type2' and summarize the average 'Attack'
# We now investigate dual-type Pokémon. We group the dataset by both primary (Type1) and secondary (Type2) types and calculate the average Attack points for each type combination. This shows which type combinations have the highest average attack.
# Keep only dual-type Pokémon: single-type entries have an empty Type2
group_dual_type <- pokemon_data %>%
  filter(!is.na(Type2), Type2 != "") %>%
  group_by(Type1, Type2) %>%
  summarize(AvgAttack = mean(Attack, na.rm = TRUE), .groups = "drop")
datatable(group_dual_type, options = list(pageLength = nrow(group_dual_type)))
# The heatmap can reveal type combinations that are generally weaker in terms of attack. Trainers might need to use these Pokémon more strategically or in combination with other Pokémon to succeed in battles.
ggplot(group_dual_type, aes(x = Type1, y = Type2, fill = AvgAttack)) +
  geom_tile() +
  labs(title = "Average Attack by Pokémon Type Combination", x = "Primary Type (Type1)", y = "Secondary Type (Type2)", fill = "Average Attack") +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = mean(group_dual_type$AvgAttack)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# INSIGHTS
# Some type combinations, such as Bug & Fighting, stand out with notably high average attack points.
# Conversely, certain combinations, like Bug & Fairy, have relatively low average attack points.
# Significance:
# Dual-type Pokémon with high average attack points, like Bug & Fighting, can be valuable assets in battles, especially when offensive strategies are favored.
# Type combinations with lower average attack might be specialized in other areas, such as defense or special abilities, and might require more strategic gameplay.
# Further Questions:
# What specific Pokémon belong to the Bug & Fighting type combination?
# For type combinations with lower average attack points, what other strengths or unique abilities can be leveraged in battles?
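# The first question above, which Pokémon belong to a given type combination, comes down to a filter on both type columns. The sketch below uses placeholder rows rather than real Pokémon, and it assumes the dataset has a Name column, which the analysis above does not show. The names toy_pokemon and bug_fighting are illustrative.

```r
library(dplyr)

# Toy rows for illustration only; names and values are placeholders,
# and the Name column is an assumption about the real dataset.
toy_pokemon <- tibble::tibble(
  Name   = c("Example A", "Example B", "Example C", "Example D"),
  Type1  = c("Bug", "Bug", "Bug", "Water"),
  Type2  = c("Fighting", "Fighting", "Fairy", ""),
  Attack = c(125, 139, 45, 70)
)

# Keep only the rows of one type combination, highest Attack first.
bug_fighting <- toy_pokemon %>%
  filter(Type1 == "Bug", Type2 == "Fighting") %>%
  arrange(desc(Attack)) %>%
  select(Name, Attack)
```

# Listing the members this way also shows how many Pokémon each average rests on, which matters for combinations with only a handful of entries.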
# Categorical Variable Combinations
# This section looks at two things:
# - Combinations of 'Type1' and 'Type2' that never appear in the dataset.
# - The most and least common combinations.
# 2.1 Creating a data frame of all possible combinations of 'Type1' and 'Type2'
all_combinations <- expand_grid(Type1 = unique(pokemon_data$Type1), Type2 = unique(pokemon_data$Type2))
# 2.2 Identifying combinations that never appear in the dataset
missing_combinations <- anti_join(all_combinations, pokemon_data, by = c("Type1", "Type2"))
missing_combinations
## # A tibble: 124 × 2
## Type1 Type2
## <chr> <chr>
## 1 Grass Grass
## 2 Grass Rock
## 3 Grass Electric
## 4 Grass Water
## 5 Grass Bug
## 6 Fire Fairy
## 7 Fire Grass
## 8 Fire Electric
## 9 Fire Ice
## 10 Fire Fire
## # ℹ 114 more rows
# Insights from Missing Pokémon Type Combinations:
# We identified 124 type combinations that do not appear in the dataset, for example Grass & Rock, Fire & Fairy, and Flying & Electric. No Pokémon in the dataset has these specific pairings. Note that the grid also contains same-type pairings such as Grass & Grass and Fire & Fire, which cannot occur by definition, so the number of genuinely absent dual-type combinations is smaller than 124.
# Further Questions:
# Are there lore or in-game reasons for the absence of certain type combinations?
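# The count of 124 can be checked with a little arithmetic. The sketch below rebuilds the grid from the 18 type names, plus the empty string that Type2 takes for single-type Pokémon, and compares it with the 218 combinations listed by count() in section 2.3. The names types, grid, n_grid, n_observed, n_missing, and n_same are illustrative. Subtracting the same-type pairings assumes none of them occurs in the data, which the listing suggests but this sketch does not verify.

```r
library(tidyr)
library(dplyr)

# The 18 primary types listed in the summaries above.
types <- c("Bug", "Dark", "Dragon", "Electric", "Fairy", "Fighting", "Fire",
           "Flying", "Ghost", "Grass", "Ground", "Ice", "Normal", "Poison",
           "Psychic", "Rock", "Steel", "Water")

# unique(Type2) also contains "" for single-type Pokémon, giving 19 values.
grid <- expand_grid(Type1 = types, Type2 = c("", types))

n_grid     <- nrow(grid)            # 18 * 19 combinations in the full grid
n_observed <- 218                   # rows in the count() output of section 2.3
n_missing  <- n_grid - n_observed   # combinations absent from the data
n_same     <- nrow(filter(grid, Type1 == Type2))  # same-type pairings
```

# Here n_missing reproduces the 124 rows reported above, and n_missing - n_same gives the number of absent pairings between two different types.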
# 2.3 Identify the most and least common combinations:
# We will use the count() function to determine the frequency of each 'Type1' and 'Type2' combination in the dataset.
common_combinations <- pokemon_data %>%
  count(Type1, Type2, sort = TRUE)
common_combinations
## Type1 Type2 n
## 1 Water 81
## 2 Normal 79
## 3 Psychic 47
## 4 Grass 46
## 5 Electric 37
## 6 Fire 37
## 7 Normal Flying 31
## 8 Fighting 30
## 9 Bug 25
## 10 Ice 22
## 11 Fairy 21
## 12 Rock 20
## 13 Ghost 19
## 14 Ground 17
## 15 Dark 16
## 16 Poison 16
## 17 Grass Poison 15
## 18 Bug Flying 14
## 19 Dragon 13
## 20 Bug Poison 12
## 21 Steel 12
## 22 Ghost Grass 11
## 23 Water Ground 10
## 24 Psychic Fairy 9
## 25 Psychic Flying 9
## 26 Electric Flying 8
## 27 Water Dark 8
## 28 Bug Steel 7
## 29 Dark Flying 7
## 30 Dragon Ground 7
## 31 Fire Fighting 7
## 32 Fire Flying 7
## 33 Grass Flying 7
## 34 Steel Psychic 7
## 35 Water Flying 7
## 36 Water Psychic 7
## 37 Bug Grass 6
## 38 Dragon Flying 6
## 39 Dragon Ice 6
## 40 Rock Flying 6
## 41 Rock Ground 6
## 42 Rock Water 6
## 43 Bug Fighting 5
## 44 Dark Normal 5
## 45 Fire Ghost 5
## 46 Grass Dark 5
## 47 Grass Dragon 5
## 48 Grass Fairy 5
## 49 Grass Fighting 5
## 50 Ground Steel 5
## 51 Normal Fairy 5
## 52 Normal Psychic 5
## 53 Poison Dark 5
## 54 Steel Dragon 5
## 55 Steel Ghost 5
## 56 Water Dragon 5
## 57 Water Rock 5
## 58 Bug Electric 4
## 59 Bug Rock 4
## 60 Dark Dragon 4
## 61 Dark Fire 4
## 62 Dragon Psychic 4
## 63 Electric Steel 4
## 64 Fairy Flying 4
## 65 Fairy Steel 4
## 66 Flying 4
## 67 Ghost Poison 4
## 68 Ground Flying 4
## 69 Ground Ghost 4
## 70 Ice Psychic 4
## 71 Ice Water 4
## 72 Normal Fighting 4
## 73 Poison Dragon 4
## 74 Poison Ground 4
## 75 Psychic Ghost 4
## 76 Rock Electric 4
## 77 Rock Steel 4
## 78 Steel Fairy 4
## 79 Water Fairy 4
## 80 Water Fighting 4
## 81 Water Ghost 4
## 82 Water Ice 4
## 83 Bug Psychic 3
## 84 Bug Water 3
## 85 Dark Fairy 3
## 86 Dark Ice 3
## 87 Dark Steel 3
## 88 Dragon Ghost 3
## 89 Dragon Water 3
## 90 Electric Dragon 3
## 91 Electric Grass 3
## 92 Electric Poison 3
## 93 Fighting Psychic 3
## 94 Fighting Steel 3
## 95 Fire Ground 3
## 96 Fire Psychic 3
## 97 Fire Rock 3
## 98 Ghost Fire 3
## 99 Ghost Flying 3
## 100 Grass Ghost 3
## 101 Grass Ice 3
## 102 Grass Normal 3
## 103 Grass Psychic 3
## 104 Grass Steel 3
## 105 Ground Dark 3
## 106 Ground Rock 3
## 107 Ice Ground 3
## 108 Poison Flying 3
## 109 Poison Water 3
## 110 Psychic Fighting 3
## 111 Rock Fairy 3
## 112 Rock Poison 3
## 113 Steel Rock 3
## 114 Water Grass 3
## 115 Water Poison 3
## 116 Bug Fairy 2
## 117 Bug Fire 2
## 118 Bug Ground 2
## 119 Dark Fighting 2
## 120 Dark Ghost 2
## 121 Dark Grass 2
## 122 Dark Poison 2
## 123 Dark Psychic 2
## 124 Dragon Fighting 2
## 125 Electric Dark 2
## 126 Electric Fairy 2
## 127 Electric Fighting 2
## 128 Electric Ice 2
## 129 Electric Normal 2
## 130 Fighting Dark 2
## 131 Fighting Flying 2
## 132 Fighting Ghost 2
## 133 Fighting Poison 2
## 134 Fighting Water 2
## 135 Fire Bug 2
## 136 Fire Dragon 2
## 137 Fire Normal 2
## 138 Flying Dragon 2
## 139 Ghost Dragon 2
## 140 Ghost Fairy 2
## 141 Ghost Ground 2
## 142 Ground Dragon 2
## 143 Ground Grass 2
## 144 Ground Psychic 2
## 145 Ice Bug 2
## 146 Ice Flying 2
## 147 Ice Steel 2
## 148 Normal Ghost 2
## 149 Normal Grass 2
## 150 Poison Fighting 2
## 151 Poison Fire 2
## 152 Poison Normal 2
## 153 Poison Psychic 2
## 154 Psychic Grass 2
## 155 Psychic Normal 2
## 156 Psychic Steel 2
## 157 Rock Bug 2
## 158 Rock Dark 2
## 159 Rock Dragon 2
## 160 Rock Fire 2
## 161 Rock Grass 2
## 162 Rock Ice 2
## 163 Rock Psychic 2
## 164 Steel Flying 2
## 165 Steel Ground 2
## 166 Steel Poison 2
## 167 Water Bug 2
## 168 Water Electric 2
## 169 Bug Dark 1
## 170 Bug Ghost 1
## 171 Dark Ground 1
## 172 Dragon Dark 1
## 173 Dragon Electric 1
## 174 Dragon Fairy 1
## 175 Dragon Fire 1
## 176 Dragon Normal 1
## 177 Electric Fire 1
## 178 Electric Ghost 1
## 179 Electric Ground 1
## 180 Electric Psychic 1
## 181 Electric Water 1
## 182 Fairy Fighting 1
## 183 Fairy Psychic 1
## 184 Fighting Dragon 1
## 185 Fighting Electric 1
## 186 Fighting Fire 1
## 187 Fighting Ice 1
## 188 Fire Dark 1
## 189 Fire Poison 1
## 190 Fire Steel 1
## 191 Fire Water 1
## 192 Flying Dark 1
## 193 Flying Fighting 1
## 194 Flying Steel 1
## 195 Flying Water 1
## 196 Ghost Dark 1
## 197 Grass Fire 1
## 198 Grass Ground 1
## 199 Ground Electric 1
## 200 Ground Fighting 1
## 201 Ground Fire 1
## 202 Ground Normal 1
## 203 Ice Fairy 1
## 204 Ice Fire 1
## 205 Ice Ghost 1
## 206 Ice Rock 1
## 207 Normal Dragon 1
## 208 Normal Ground 1
## 209 Normal Water 1
## 210 Poison Bug 1
## 211 Poison Fairy 1
## 212 Psychic Dark 1
## 213 Psychic Dragon 1
## 214 Psychic Fire 1
## 215 Psychic Ice 1
## 216 Rock Fighting 1
## 217 Steel Fighting 1
## 218 Water Steel 1
# Visualization of combinations using a heatmap (ggplot2 is already attached via tidyverse)
ggplot(common_combinations, aes(x = Type1, y = Type2, fill = n)) +
  geom_tile() +
  labs(title = "Heatmap of Type Combinations", x = "Type 1", y = "Type 2", fill = "Count of Pokémon") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# INSIGHTS
# Normal & Flying is by far the most common dual-type combination (31 Pokémon), followed by Grass & Poison (15), Bug & Flying (14), Bug & Poison (12), Ghost & Grass (11), and Water & Ground (10). Single-type Pokémon, which appear with an empty Type2, are more common still: Water (81) and Normal (79) top the list.
# Conversely, many cells in the heatmap are empty or have low counts, indicating that certain dual-type combinations are rare or non-existent in the dataset.
# Significance:
# Common dual-type combinations, such as Normal & Flying or Water & Ground, might offer a balanced set of abilities and attributes that make them versatile in various battle scenarios.
# Further Questions:
# What specific attributes and abilities of Pokémon with these common type combinations might make them popular or common?
# Are there certain dual-type combinations that are particularly advantageous or disadvantageous in battles?
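# To rank dual-type combinations on their own, the single-type rows (empty Type2) have to be dropped first. The sketch below re-enters a few rows of the count() output above as a small tibble and keeps the three most frequent dual-type combinations. The names combo_counts and top_dual are illustrative.

```r
library(dplyr)

# A few rows copied from the count() output above; "" marks single-type Pokémon.
combo_counts <- tibble::tibble(
  Type1 = c("Water", "Normal", "Normal", "Grass", "Bug", "Bug", "Ghost", "Water"),
  Type2 = c("", "", "Flying", "Poison", "Flying", "Poison", "Grass", "Ground"),
  n     = c(81, 79, 31, 15, 14, 12, 11, 10)
)

# Drop single-type rows, then keep the three largest counts.
top_dual <- combo_counts %>%
  filter(Type2 != "") %>%
  slice_max(order_by = n, n = 3)
```

# With the full data loaded, the same two steps could be applied to common_combinations directly.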
# Formulating Hypotheses:
# Hypothesis 1: Certain Pokémon types might be rarer in the dataset because they represent newer additions in later Pokémon generations. For instance, if a particular type was introduced in a later game version, it might have fewer Pokémon of that type.
# Test: This requires each Pokémon's generation. By checking the distribution of Pokémon types across generations, we could validate or refute this hypothesis.
# Hypothesis 2: Some Pokémon type combinations might be designed to be rarer to make them more valuable or sought-after in the game.
# Test: This would be harder to test with the dataset alone. It would require external information like player experiences.
# Hypothesis 3: Certain Pokémon types might have higher average stats because they are designed to be "boss" or "elite" Pokémon, meant to be more challenging to defeat.
# Test: By comparing the list of Pokémon with high average stats to known elite or boss Pokémon from the game series, we could get some validation for this hypothesis.
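# The test proposed for Hypothesis 1 could be sketched as a type-by-generation count. The rows below are toy data for illustration only, and Generation is an assumed column name: the analysis above does not show whether PokemonStats.csv contains it. The names toy_generations and type_by_generation are illustrative.

```r
library(dplyr)
library(tidyr)

# Toy rows for illustration only; Generation is an assumed column name.
toy_generations <- tibble::tibble(
  Type1      = c("Fairy", "Fairy", "Water", "Water", "Water", "Normal"),
  Generation = c(6, 6, 1, 2, 6, 1)
)

# Count Pokémon per type and generation, then spread generations into columns
# so that types missing from early generations show up as zeros.
type_by_generation <- toy_generations %>%
  count(Type1, Generation) %>%
  pivot_wider(names_from = Generation, values_from = n,
              names_prefix = "Gen", values_fill = 0)
```

# A type whose counts are zero in early generations and positive only later would be consistent with Hypothesis 1.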
# Further Questions:
# How do the stats vary for Pokémon of different weights and heights?
# Are there patterns in stats based on the Pokémon's name or ID? Perhaps certain naming conventions indicate stronger Pokémon?