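# Install DT only if it is not already available, so the script does not reinstall it on every run
if (!requireNamespace("DT", quietly = TRUE)) {
  install.packages("DT", repos = "https://cloud.r-project.org/")
}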
# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(DT)
# Load the dataset
pokemon_data <- read.csv("./PokemonStats.csv", stringsAsFactors = FALSE)
# 1.1 Group by 'Type1' and summarize the average 'Total' points

# We will group the Pokémon dataset by the primary type (Type1) and calculate the average Total points for each type. This will allow us to understand which Pokémon type, on average, has the highest and lowest total points.

# Why?
# 1. Some Pokémon types might have significantly higher average total points than others, indicating they are generally stronger in terms of combined stats.
# 2. There might be a few types that stand out as having particularly low average total points, suggesting these types, on average, might be weaker or more specialized.
# 3. The variation in average total points across types can indicate game design decisions to balance gameplay or to make certain types more challenging to master.
group_type1 <- pokemon_data %>%
  group_by(Type1) %>%
  summarize(AvgTotal = mean(Total, na.rm = TRUE))

group_type1
## # A tibble: 18 × 2
##    Type1    AvgTotal
##    <chr>       <dbl>
##  1 Bug          381.
##  2 Dark         450.
##  3 Dragon       528.
##  4 Electric     447.
##  5 Fairy        449.
##  6 Fighting     458.
##  7 Fire         455.
##  8 Flying       450.
##  9 Ghost        438.
## 10 Grass        420.
## 11 Ground       440.
## 12 Ice          439.
## 13 Normal       409.
## 14 Poison       428.
## 15 Psychic      486.
## 16 Rock         447.
## 17 Steel        485 
## 18 Water        437.
ggplot(group_type1, aes(x = Type1, y = AvgTotal)) +
  geom_col(fill = "skyblue") +
  coord_flip() +
  labs(title = "Average Total Points by Pokémon Type", x = "Pokémon Type", y = "Average Total Points") +
  theme_minimal()

# INSIGHTS

# Dragon-type Pokémon have the highest average total points (about 528), suggesting that they are generally stronger in terms of combined stats.
# Bug-type Pokémon have the lowest average total points (about 381), indicating that they might be generally weaker or more specialized in comparison to other types.


# Significance:
# Players might favor Dragon-type Pokémon due to their higher average stats, especially in competitive settings.
# Bug-type Pokémon, although having lower average stats, could have other advantages or unique abilities that make them valuable in specific scenarios.

# Further Questions:
# What specific abilities or moves do Bug-type Pokémon have that could compensate for their lower average stats?
# Are Dragon-type Pokémon rarer or harder to evolve, thus justifying their higher stats?
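
# The highest and lowest types named in the insights above can also be read off programmatically instead of by eye.
# The following is a minimal sketch: `extreme_groups` and its parameter names are hypothetical helpers, not part of the original analysis.

```r
library(dplyr)

# Return the n highest- and n lowest-ranked groups for a numeric column.
# `group_col` and `value_col` are unquoted column names (illustrative parameters).
extreme_groups <- function(data, group_col, value_col, n = 1) {
  averages <- data %>%
    group_by({{ group_col }}) %>%
    summarize(Avg = mean({{ value_col }}, na.rm = TRUE), .groups = "drop")

  # Stack both ends of the ranking and label each row with where it came from
  bind_rows(
    highest = slice_max(averages, Avg, n = n, with_ties = FALSE),
    lowest  = slice_min(averages, Avg, n = n, with_ties = FALSE),
    .id = "rank"
  )
}
```

# Usage, assuming pokemon_data is loaded as above: extreme_groups(pokemon_data, Type1, Total)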
# 1.2 Group by 'Type1' and summarize the average 'Speed'

# We will group the Pokémon dataset by the primary type (Type1) and calculate the average Speed for each type. This will help us understand which Pokémon type, on average, is the fastest.

# Why?

# 1. Certain Pokémon types might be consistently faster, on average, than other types. These types might be favored in battles where speed is crucial.
# 2. Slow average speeds for certain types might indicate that these Pokémon compensate with other strengths, such as higher defense or attack.
group_type_speed <- pokemon_data %>%
  group_by(Type1) %>%
  summarize(AvgSpeed = mean(Speed, na.rm = TRUE))

group_type_speed
## # A tibble: 18 × 2
##    Type1    AvgSpeed
##    <chr>       <dbl>
##  1 Bug          61.9
##  2 Dark         77.5
##  3 Dragon       84.7
##  4 Electric     87.7
##  5 Fairy        67.1
##  6 Fighting     76.1
##  7 Fire         73.7
##  8 Flying       86.8
##  9 Ghost        63.3
## 10 Grass        61.1
## 11 Ground       64.1
## 12 Ice          67.6
## 13 Normal       70.7
## 14 Poison       65.0
## 15 Psychic      80.4
## 16 Rock         57.4
## 17 Steel        56.0
## 18 Water        68.3
ggplot(group_type_speed, aes(x = Type1, y = AvgSpeed)) +
  geom_col(fill = "salmon") +
  coord_flip() +
  labs(title = "Average Speed by Pokémon Type", x = "Pokémon Type", y = "Average Speed") +
  theme_minimal()

# INSIGHTS

# Electric-type Pokémon have the highest average speed (87.7), narrowly ahead of Flying (86.8). This aligns with the intuitive notion that electricity is fast, and these Pokémon might be favored in battles where going first can be a decisive advantage.
# Steel-type Pokémon have the lowest average speed (56.0), just below Rock (57.4), suggesting they might be slow but potentially sturdy or strong in other aspects.

# Significance:
# Electric-type Pokémon, with their higher speed, can be crucial in battles where the order of moves can determine the outcome. Being able to strike first can sometimes mean knocking out an opponent before they have a chance to react.
# Steel-type Pokémon, while slower, might have higher defenses or other attributes that make them valuable, especially in scenarios where endurance is vital.

# Further Questions:
# What are the average defense and attack stats for Steel-type Pokémon? Does their slower speed correlate with higher defensive capabilities?
# Are there specific moves or abilities unique to Electric-type Pokémon that capitalize on their higher speed?
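
# The first question above (does slower speed go with higher defense?) could be explored by averaging several stats per type.
# This is a sketch under one assumption: the dataset has a column named `Defense`. Only Total, Speed and Attack are used elsewhere in this script, so that name is a guess; `type_stat_profile` is a hypothetical helper.

```r
library(dplyr)

# Average several stats per primary type so trade-offs can be compared side by side.
# `Defense` is an ASSUMED column name; adjust `stat_cols` to match the actual data.
type_stat_profile <- function(data, stat_cols = c("Speed", "Attack", "Defense")) {
  data %>%
    group_by(Type1) %>%
    summarize(across(all_of(stat_cols), ~ mean(.x, na.rm = TRUE)), .groups = "drop")
}
```

# Usage: profile <- type_stat_profile(pokemon_data); cor(profile$Speed, profile$Defense)
# A clearly negative correlation would be consistent with slower types compensating through defense, although with only 18 types it would be weak evidence.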
# 1.3 Group by both 'Type1' and 'Type2' and summarize the average 'Attack'

# We will investigate dual-type Pokémon. Specifically, we'll group the Pokémon dataset by both primary (Type1) and secondary (Type2) types and then calculate the average Attack points for each type combination. This will help us understand which type combinations have the highest average attack.
# .groups = "drop" returns an ungrouped tibble and avoids the summarise() grouping message
group_dual_type <- pokemon_data %>%
  group_by(Type1, Type2) %>%
  summarize(AvgAttack = mean(Attack, na.rm = TRUE), .groups = "drop")
datatable(group_dual_type, options = list(pageLength = nrow(group_dual_type)))
# The heatmap below can reveal type combinations that are generally weaker in terms of attack. Trainers might need to use these Pokémon more strategically or in combination with other Pokémon to ensure success in battles.

ggplot(group_dual_type, aes(x = Type1, y = Type2, fill = AvgAttack)) +
  geom_tile() +
  labs(title = "Average Attack by Pokémon Type Combination", x = "Primary Type (Type1)", y = "Secondary Type (Type2)", fill = "Average Attack") +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = mean(group_dual_type$AvgAttack)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# INSIGHTS

# Some type combinations, such as Bug & Fighting, stand out with notably high average attack points.
# Conversely, certain combinations, like Bug & Fairy, have relatively low average attack points.

# Significance:

# Dual-type Pokémon with high average attack points, like Bug & Fighting, can be valuable assets in battles, especially when offensive strategies are favored.
# Pokémon combinations with lower average attack might be specialized in other areas, such as defense or special abilities, and might require more strategic gameplay.

# Further Questions:

# What specific Pokémon belong to the Bug & Fighting type combination?
# For type combinations with lower average attack points, what are their other strengths or unique abilities that can be leveraged in battles?
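
# Many type combinations contain only a handful of Pokémon, so an average attack value can rest on one or two rows.
# A minimal sketch that keeps the group size next to each average; `attack_by_type_pair` and `min_n` are hypothetical names introduced for illustration.

```r
library(dplyr)

# Average Attack per (Type1, Type2) pair together with the number of Pokémon behind it.
# `min_n` (illustrative parameter) drops pairs with fewer than that many members.
attack_by_type_pair <- function(data, min_n = 1) {
  data %>%
    group_by(Type1, Type2) %>%
    summarize(AvgAttack = mean(Attack, na.rm = TRUE), n = n(), .groups = "drop") %>%
    filter(n >= min_n) %>%
    arrange(desc(AvgAttack))
}
```

# Usage: attack_by_type_pair(pokemon_data, min_n = 3)
# The members of a single combination can be listed with filter(pokemon_data, Type1 == "Bug", Type2 == "Fighting").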
# Categorical Variable Combinations

# This section examines which combinations of 'Type1' and 'Type2' never appear in the data set, and which combinations are the most and least common.
# 2.1 Creating a data frame of all possible combinations of 'Type1' and 'Type2'

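# Note: unique(pokemon_data$Type2) includes the empty string that marks single-type Pokémon, and Type1 and Type2 can take the same value, so this grid contains more pairs than there are valid dual types.
all_combinations <- expand_grid(Type1 = unique(pokemon_data$Type1), Type2 = unique(pokemon_data$Type2))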
# 2.2 Identifying combinations that never appear in the data set:

missing_combinations <- anti_join(all_combinations, pokemon_data, by = c("Type1", "Type2"))

missing_combinations
## # A tibble: 124 × 2
##    Type1 Type2   
##    <chr> <chr>   
##  1 Grass Grass   
##  2 Grass Rock    
##  3 Grass Electric
##  4 Grass Water   
##  5 Grass Bug     
##  6 Fire  Fairy   
##  7 Fire  Grass   
##  8 Fire  Electric
##  9 Fire  Ice     
## 10 Fire  Fire    
## # ℹ 114 more rows
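# Insights from Missing Pokémon Type Combinations:
# We have identified 124 (Type1, Type2) pairs that are not present in the dataset. Examples include Grass & Rock, Grass & Electric, and Fire & Fairy: no Pokémon in the dataset has these specific dual-type combinations.
# Note that the list also contains same-type pairs such as Grass & Grass and Fire & Fire, which cannot occur as dual types, so the number of genuinely absent dual-type combinations is smaller than 124. Order matters as well: Grass & Rock and Rock & Grass are counted as separate pairs.

# Further Questions:
# Are there lore or in-game reasons for the absence of certain type combinations?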
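
# To count only genuinely absent dual types, same-type pairs and the blank Type2 value can be left out of the grid.
# A sketch assuming every type that occurs as Type2 also occurs as Type1; `missing_dual_types` is a hypothetical helper name.

```r
library(dplyr)
library(tidyr)

# Ordered (Type1, Type2) pairs of two different types that no row in `data` has.
# Building the grid from Type1 alone excludes the empty-string Type2 of single-type rows.
missing_dual_types <- function(data) {
  types <- sort(unique(data$Type1))
  expand_grid(Type1 = types, Type2 = types) %>%
    filter(Type1 != Type2) %>%
    anti_join(data, by = c("Type1", "Type2"))
}
```

# Usage: nrow(missing_dual_types(pokemon_data))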
# 2.3 Identify the most and least common combinations:

# We will use the count() function to determine the frequency of each 'Type1' and 'Type2' combination in the dataset.
common_combinations <- pokemon_data %>%
  count(Type1, Type2, sort = TRUE)

common_combinations
##        Type1    Type2  n
## 1      Water          81
## 2     Normal          79
## 3    Psychic          47
## 4      Grass          46
## 5   Electric          37
## 6       Fire          37
## 7     Normal   Flying 31
## 8   Fighting          30
## 9        Bug          25
## 10       Ice          22
## 11     Fairy          21
## 12      Rock          20
## 13     Ghost          19
## 14    Ground          17
## 15      Dark          16
## 16    Poison          16
## 17     Grass   Poison 15
## 18       Bug   Flying 14
## 19    Dragon          13
## 20       Bug   Poison 12
## 21     Steel          12
## 22     Ghost    Grass 11
## 23     Water   Ground 10
## 24   Psychic    Fairy  9
## 25   Psychic   Flying  9
## 26  Electric   Flying  8
## 27     Water     Dark  8
## 28       Bug    Steel  7
## 29      Dark   Flying  7
## 30    Dragon   Ground  7
## 31      Fire Fighting  7
## 32      Fire   Flying  7
## 33     Grass   Flying  7
## 34     Steel  Psychic  7
## 35     Water   Flying  7
## 36     Water  Psychic  7
## 37       Bug    Grass  6
## 38    Dragon   Flying  6
## 39    Dragon      Ice  6
## 40      Rock   Flying  6
## 41      Rock   Ground  6
## 42      Rock    Water  6
## 43       Bug Fighting  5
## 44      Dark   Normal  5
## 45      Fire    Ghost  5
## 46     Grass     Dark  5
## 47     Grass   Dragon  5
## 48     Grass    Fairy  5
## 49     Grass Fighting  5
## 50    Ground    Steel  5
## 51    Normal    Fairy  5
## 52    Normal  Psychic  5
## 53    Poison     Dark  5
## 54     Steel   Dragon  5
## 55     Steel    Ghost  5
## 56     Water   Dragon  5
## 57     Water     Rock  5
## 58       Bug Electric  4
## 59       Bug     Rock  4
## 60      Dark   Dragon  4
## 61      Dark     Fire  4
## 62    Dragon  Psychic  4
## 63  Electric    Steel  4
## 64     Fairy   Flying  4
## 65     Fairy    Steel  4
## 66    Flying           4
## 67     Ghost   Poison  4
## 68    Ground   Flying  4
## 69    Ground    Ghost  4
## 70       Ice  Psychic  4
## 71       Ice    Water  4
## 72    Normal Fighting  4
## 73    Poison   Dragon  4
## 74    Poison   Ground  4
## 75   Psychic    Ghost  4
## 76      Rock Electric  4
## 77      Rock    Steel  4
## 78     Steel    Fairy  4
## 79     Water    Fairy  4
## 80     Water Fighting  4
## 81     Water    Ghost  4
## 82     Water      Ice  4
## 83       Bug  Psychic  3
## 84       Bug    Water  3
## 85      Dark    Fairy  3
## 86      Dark      Ice  3
## 87      Dark    Steel  3
## 88    Dragon    Ghost  3
## 89    Dragon    Water  3
## 90  Electric   Dragon  3
## 91  Electric    Grass  3
## 92  Electric   Poison  3
## 93  Fighting  Psychic  3
## 94  Fighting    Steel  3
## 95      Fire   Ground  3
## 96      Fire  Psychic  3
## 97      Fire     Rock  3
## 98     Ghost     Fire  3
## 99     Ghost   Flying  3
## 100    Grass    Ghost  3
## 101    Grass      Ice  3
## 102    Grass   Normal  3
## 103    Grass  Psychic  3
## 104    Grass    Steel  3
## 105   Ground     Dark  3
## 106   Ground     Rock  3
## 107      Ice   Ground  3
## 108   Poison   Flying  3
## 109   Poison    Water  3
## 110  Psychic Fighting  3
## 111     Rock    Fairy  3
## 112     Rock   Poison  3
## 113    Steel     Rock  3
## 114    Water    Grass  3
## 115    Water   Poison  3
## 116      Bug    Fairy  2
## 117      Bug     Fire  2
## 118      Bug   Ground  2
## 119     Dark Fighting  2
## 120     Dark    Ghost  2
## 121     Dark    Grass  2
## 122     Dark   Poison  2
## 123     Dark  Psychic  2
## 124   Dragon Fighting  2
## 125 Electric     Dark  2
## 126 Electric    Fairy  2
## 127 Electric Fighting  2
## 128 Electric      Ice  2
## 129 Electric   Normal  2
## 130 Fighting     Dark  2
## 131 Fighting   Flying  2
## 132 Fighting    Ghost  2
## 133 Fighting   Poison  2
## 134 Fighting    Water  2
## 135     Fire      Bug  2
## 136     Fire   Dragon  2
## 137     Fire   Normal  2
## 138   Flying   Dragon  2
## 139    Ghost   Dragon  2
## 140    Ghost    Fairy  2
## 141    Ghost   Ground  2
## 142   Ground   Dragon  2
## 143   Ground    Grass  2
## 144   Ground  Psychic  2
## 145      Ice      Bug  2
## 146      Ice   Flying  2
## 147      Ice    Steel  2
## 148   Normal    Ghost  2
## 149   Normal    Grass  2
## 150   Poison Fighting  2
## 151   Poison     Fire  2
## 152   Poison   Normal  2
## 153   Poison  Psychic  2
## 154  Psychic    Grass  2
## 155  Psychic   Normal  2
## 156  Psychic    Steel  2
## 157     Rock      Bug  2
## 158     Rock     Dark  2
## 159     Rock   Dragon  2
## 160     Rock     Fire  2
## 161     Rock    Grass  2
## 162     Rock      Ice  2
## 163     Rock  Psychic  2
## 164    Steel   Flying  2
## 165    Steel   Ground  2
## 166    Steel   Poison  2
## 167    Water      Bug  2
## 168    Water Electric  2
## 169      Bug     Dark  1
## 170      Bug    Ghost  1
## 171     Dark   Ground  1
## 172   Dragon     Dark  1
## 173   Dragon Electric  1
## 174   Dragon    Fairy  1
## 175   Dragon     Fire  1
## 176   Dragon   Normal  1
## 177 Electric     Fire  1
## 178 Electric    Ghost  1
## 179 Electric   Ground  1
## 180 Electric  Psychic  1
## 181 Electric    Water  1
## 182    Fairy Fighting  1
## 183    Fairy  Psychic  1
## 184 Fighting   Dragon  1
## 185 Fighting Electric  1
## 186 Fighting     Fire  1
## 187 Fighting      Ice  1
## 188     Fire     Dark  1
## 189     Fire   Poison  1
## 190     Fire    Steel  1
## 191     Fire    Water  1
## 192   Flying     Dark  1
## 193   Flying Fighting  1
## 194   Flying    Steel  1
## 195   Flying    Water  1
## 196    Ghost     Dark  1
## 197    Grass     Fire  1
## 198    Grass   Ground  1
## 199   Ground Electric  1
## 200   Ground Fighting  1
## 201   Ground     Fire  1
## 202   Ground   Normal  1
## 203      Ice    Fairy  1
## 204      Ice     Fire  1
## 205      Ice    Ghost  1
## 206      Ice     Rock  1
## 207   Normal   Dragon  1
## 208   Normal   Ground  1
## 209   Normal    Water  1
## 210   Poison      Bug  1
## 211   Poison    Fairy  1
## 212  Psychic     Dark  1
## 213  Psychic   Dragon  1
## 214  Psychic     Fire  1
## 215  Psychic      Ice  1
## 216     Rock Fighting  1
## 217    Steel Fighting  1
## 218    Water    Steel  1
# Visualization of combinations using a heatmap

# ggplot2 is already attached by library(tidyverse) above
ggplot(common_combinations, aes(x = Type1, y = Type2, fill = n)) +
  geom_tile() +
  labs(title = "Heatmap of Type Combinations", x = "Type 1", y = "Type 2", fill = "Count of Pokémon") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# INSIGHTS
# The most frequent rows are single-type Pokémon (blank Type2), led by Water (81) and Normal (79); they appear as the blank row on the Type 2 axis of the heatmap.
# Among dual types, Normal & Flying is by far the most common combination (31 Pokémon), followed by Grass & Poison (15) and Bug & Flying (14). Water & Ground has 10.
# Conversely, many cells in the heatmap are empty or have low counts, indicating that certain dual-type combinations are rare or non-existent in the dataset.

# Significance:
# Common dual-type combinations, like Normal & Flying, might offer a balanced set of abilities and attributes that make them versatile in various battle scenarios.

# Further Questions:
# What are the specific attributes and abilities of Pokémon with the Normal & Flying type combination that might make them popular or common?
# Are there certain dual-type combinations that are particularly advantageous or disadvantageous in battles?
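
# Because single-type Pokémon dominate the top of the count table, ranking dual types is easier once blank Type2 rows are removed.
# A minimal sketch; `dual_type_counts` is a hypothetical helper name, and it treats both "" and NA as "no second type".

```r
library(dplyr)

# Frequency of each (Type1, Type2) pair, restricted to Pokémon that have a second type.
dual_type_counts <- function(data) {
  data %>%
    filter(!is.na(Type2), Type2 != "") %>%
    count(Type1, Type2, sort = TRUE)
}
```

# Usage: head(dual_type_counts(pokemon_data), 10)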
# Formulating Hypotheses:

# Hypothesis 1: Certain Pokémon types might be rarer in the dataset because they represent newer additions in later Pokémon generations. For instance, if a particular type was introduced in a later game version, it might have fewer Pokémon of that type.

# Test: We'd need a Generation column to test this hypothesis (none has been used in the analysis so far). By checking the distribution of Pokémon types across generations, we could validate or refute it.
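
# One way the test for Hypothesis 1 might look, as a sketch only: it assumes the dataset has a numeric column named `Generation`, which does not appear anywhere else in this script. `type_first_generation` is a hypothetical helper name.

```r
library(dplyr)

# For each primary type: the earliest generation it appears in and how many Pokémon have it.
# `Generation` is an ASSUMED column name.
type_first_generation <- function(data) {
  data %>%
    group_by(Type1) %>%
    summarize(
      FirstGeneration = min(Generation, na.rm = TRUE),
      Count = n(),
      .groups = "drop"
    ) %>%
    arrange(FirstGeneration, Count)
}
```

# Usage: type_first_generation(pokemon_data)
# If the rarest types also tend to have the latest FirstGeneration, that would be consistent with the hypothesis.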

# Hypothesis 2: Some Pokémon type combinations might be designed to be rarer to make them more valuable or sought-after in the game.

# Test: This would be harder to test with the dataset alone. It would require external information like player experiences. 

# Hypothesis 3: Certain Pokémon types might have higher average stats because they are designed to be "boss" or "elite" Pokémon, meant to be more challenging to defeat.

# Test: By comparing the list of Pokémon with high average stats to known elite or boss Pokémon from the game series, we could get some validation for this hypothesis.

# Further Questions:

# How do the stats vary for Pokémon of different weights and heights?

# Are there patterns in stats based on the Pokémon's name or ID? Perhaps certain naming conventions indicate stronger Pokémon?