##  speed           dist
##  Min.   : 4.0    Min.   :  2.00
##  1st Qu.:12.0    1st Qu.: 26.00
##  Median :15.0    Median : 36.00
##  Mean   :15.4    Mean   : 42.98
##  3rd Qu.:19.0    3rd Qu.: 56.00
##  Max.   :25.0    Max.   :120.00
This week we will explore confidence intervals by choosing two numeric variables and pairing each with a new column built from it.
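As a minimal sketch of what a confidence interval for a mean looks like in R, the example below computes a 95% t-interval by hand and compares it with the interval reported by `t.test()`. The built-in `cars` data stands in for the Pokemon columns, and the choice of a t-interval at the 95% level is an assumption made for illustration; the analysis itself does not state which method it uses.

```r
# Built-in `cars` data is used only as a stand-in (assumption for illustration)
x <- cars$speed
n <- length(x)

# Manual 95% t-interval: mean +/- critical value * standard error
se     <- sd(x) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)
ci_manual <- c(mean(x) - t_crit * se, mean(x) + t_crit * se)

# The same interval as reported by the one-sample t-test
ci_ttest <- t.test(x, conf.level = 0.95)$conf.int

ci_manual
ci_ttest
```

Both approaches should agree, since `t.test()` builds its interval from the same mean, standard error, and t critical value.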
##  abilities          against_bug      against_dark    against_dragon
##  Length:801         Min.   :0.2500   Min.   :0.250   Min.   :0.0000
##  Class :character   1st Qu.:0.5000   1st Qu.:1.000   1st Qu.:1.0000
##  Mode  :character   Median :1.0000   Median :1.000   Median :1.0000
##                     Mean   :0.9963   Mean   :1.057   Mean   :0.9688
##                     3rd Qu.:1.0000   3rd Qu.:1.000   3rd Qu.:1.0000
##                     Max.   :4.0000   Max.   :4.000   Max.   :2.0000
##
##  against_electric  against_fairy    against_fight    against_fire
##  Min.   :0.000     Min.   :0.250    Min.   :0.000    Min.   :0.250
##  1st Qu.:0.500     1st Qu.:1.000    1st Qu.:0.500    1st Qu.:0.500
##  Median :1.000     Median :1.000    Median :1.000    Median :1.000
##  Mean   :1.074     Mean   :1.069    Mean   :1.066    Mean   :1.135
##  3rd Qu.:1.000     3rd Qu.:1.000    3rd Qu.:1.000    3rd Qu.:2.000
##  Max.   :4.000     Max.   :4.000    Max.   :4.000    Max.   :4.000
##
##  against_flying   against_ghost    against_grass    against_ground
##  Min.   :0.250    Min.   :0.000    Min.   :0.250    Min.   :0.000
##  1st Qu.:1.000    1st Qu.:1.000    1st Qu.:0.500    1st Qu.:1.000
##  Median :1.000    Median :1.000    Median :1.000    Median :1.000
##  Mean   :1.193    Mean   :0.985    Mean   :1.034    Mean   :1.098
##  3rd Qu.:1.000    3rd Qu.:1.000    3rd Qu.:1.000    3rd Qu.:1.000
##  Max.   :4.000    Max.   :4.000    Max.   :4.000    Max.   :4.000
##
##  against_ice      against_normal   against_poison   against_psychic
##  Min.   :0.250    Min.   :0.000    Min.   :0.0000   Min.   :0.000
##  1st Qu.:0.500    1st Qu.:1.000    1st Qu.:0.5000   1st Qu.:1.000
##  Median :1.000    Median :1.000    Median :1.0000   Median :1.000
##  Mean   :1.208    Mean   :0.887    Mean   :0.9753   Mean   :1.005
##  3rd Qu.:2.000    3rd Qu.:1.000    3rd Qu.:1.0000   3rd Qu.:1.000
##  Max.   :4.000    Max.   :1.000    Max.   :4.0000   Max.   :4.000
##
##  against_rock    against_steel    against_water    attack
##  Min.   :0.25    Min.   :0.2500   Min.   :0.250    Min.   :  5.00
##  1st Qu.:1.00    1st Qu.:0.5000   1st Qu.:0.500    1st Qu.: 55.00
##  Median :1.00    Median :1.0000   Median :1.000    Median : 75.00
##  Mean   :1.25    Mean   :0.9835   Mean   :1.058    Mean   : 77.86
##  3rd Qu.:2.00    3rd Qu.:1.0000   3rd Qu.:1.000    3rd Qu.:100.00
##  Max.   :4.00    Max.   :4.0000   Max.   :4.000    Max.   :185.00
##
##  base_egg_steps   base_happiness    base_total      capture_rate
##  Min.   : 1280    Min.   :  0.00    Min.   :180.0   Length:801
##  1st Qu.: 5120    1st Qu.: 70.00    1st Qu.:320.0   Class :character
##  Median : 5120    Median : 70.00    Median :435.0   Mode  :character
##  Mean   : 7191    Mean   : 65.36    Mean   :428.4
##  3rd Qu.: 6400    3rd Qu.: 70.00    3rd Qu.:505.0
##  Max.   :30720    Max.   :140.00    Max.   :780.0
##
##  classfication      defense          experience_growth   height_m
##  Length:801         Min.   :  5.00   Min.   : 600000     Min.   : 0.100
##  Class :character   1st Qu.: 50.00   1st Qu.:1000000     1st Qu.: 0.600
##  Mode  :character   Median : 70.00   Median :1000000     Median : 1.000
##                     Mean   : 73.01   Mean   :1054996     Mean   : 1.164
##                     3rd Qu.: 90.00   3rd Qu.:1059860     3rd Qu.: 1.500
##                     Max.   :230.00   Max.   :1640000     Max.   :14.500
##                                                          NA's   :20
##  hp               japanese_name      name               percentage_male
##  Min.   :  1.00   Length:801         Length:801         Min.   :  0.00
##  1st Qu.: 50.00   Class :character   Class :character   1st Qu.: 50.00
##  Median : 65.00   Mode  :character   Mode  :character   Median : 50.00
##  Mean   : 68.96                                         Mean   : 55.16
##  3rd Qu.: 80.00                                         3rd Qu.: 50.00
##  Max.   :255.00                                         Max.   :100.00
##                                                         NA's   :98
##  pokedex_number   sp_attack        sp_defense       speed
##  Min.   :  1      Min.   : 10.00   Min.   : 20.00   Min.   :  5.00
##  1st Qu.:201      1st Qu.: 45.00   1st Qu.: 50.00   1st Qu.: 45.00
##  Median :401      Median : 65.00   Median : 66.00   Median : 65.00
##  Mean   :401      Mean   : 71.31   Mean   : 70.91   Mean   : 66.33
##  3rd Qu.:601      3rd Qu.: 91.00   3rd Qu.: 90.00   3rd Qu.: 85.00
##  Max.   :801      Max.   :194.00   Max.   :230.00   Max.   :180.00
##
##  type1              type2              weight_kg        generation
##  Length:801         Length:801         Min.   :  0.10   Min.   :1.00
##  Class :character   Class :character   1st Qu.:  9.00   1st Qu.:2.00
##  Mode  :character   Mode  :character   Median : 27.30   Median :4.00
##                                        Mean   : 61.38   Mean   :3.69
##                                        3rd Qu.: 64.80   3rd Qu.:5.00
##                                        Max.   :999.90   Max.   :7.00
##                                        NA's   :20
##  is_legendary
##  Min.   :0.00000
##  1st Qu.:0.00000
##  Median :0.00000
##  Mean   :0.08739
##  3rd Qu.:0.00000
##  Max.   :1.00000
##
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
defense_avg <- pokemon |>
  group_by(type1, type2) |>
  mutate(def_dev = defense - mean(defense),  # signed deviation from the group mean
         def_avg = mean(defense)) |>         # group mean, repeated on every row
  # select() and arrange() replace the deprecated multi-row summarize()
  select(type1, type2, generation, defense, def_dev, def_avg) |>
  arrange(type1, type2)
defense_avg
## # A tibble: 801 × 6
## # Groups:   type1, type2 [166]
##    type1 type2 generation defense def_dev def_avg
##    <chr> <chr>      <int>   <int>   <dbl>   <dbl>
##  1 bug   ""             1      35  -23.2     58.2
##  2 bug   ""             1      55   -3.17    58.2
##  3 bug   ""             1     120   61.8     58.2
##  4 bug   ""             2      90   31.8     58.2
##  5 bug   ""             3      35  -23.2     58.2
##  6 bug   ""             3      55   -3.17    58.2
##  7 bug   ""             3      55   -3.17    58.2
##  8 bug   ""             3      75   16.8     58.2
##  9 bug   ""             3      75   16.8     58.2
## 10 bug   ""             4      41  -17.2     58.2
## # ℹ 791 more rows
This table adds one column for the average defense within each type1/type2 group (def_avg) and another for each Pokemon's deviation from that average (def_dev). Note that def_dev is a signed deviation, not a standard deviation (SD): it measures how far above (positive) or below (negative) the group average a Pokemon's defense value lies. This is helpful because it allows us to investigate whether a Pokemon is ‘weaker’ relative to others with the same type combination. However, we should keep the context of the analysis in mind so that we refrain from making unwarranted assumptions.
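The difference between a per-row deviation from a group mean and a group standard deviation can be sketched on a small example. The data frame `toy` and the column `def_sd` below are hypothetical and are not taken from the Pokemon dataset; only the `def_avg` and `def_dev` calculations mirror the code above.

```r
library(dplyr)

# Hypothetical toy data, not taken from the Pokemon dataset
toy <- data.frame(
  type1   = c("bug", "bug", "bug", "fire", "fire"),
  defense = c(35, 55, 120, 40, 60)
)

toy_dev <- toy |>
  group_by(type1) |>
  mutate(
    def_avg = mean(defense),      # group mean, repeated on every row
    def_dev = defense - def_avg,  # signed deviation: differs row by row
    def_sd  = sd(defense)         # standard deviation: one value per group
  ) |>
  ungroup()

toy_dev
```

The deviations within each group sum to zero and carry a sign, whereas the standard deviation is a single non-negative number summarizing the spread of the whole group.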
att_rank <- pokemon |>
  # grouping by attack as well as generation means each group holds a single
  # attack value, so rank() returns the tie average (n + 1) / 2 for a group of n
  group_by(generation, attack) |>
  mutate(attack_rank = rank(attack)) |>
  # select() and arrange() replace the deprecated multi-row summarize()
  select(generation, attack, attack_rank) |>
  arrange(generation, attack)
att_rank
## # A tibble: 801 × 3
## # Groups:   generation, attack [338]
##    generation attack attack_rank
##         <int>  <int>       <dbl>
##  1          1      5         1
##  2          1     10         1
##  3          1     20         1.5
##  4          1     20         1.5
##  5          1     25         1
##  6          1     30         1.5
##  7          1     30         1.5
##  8          1     35         3
##  9          1     35         3
## 10          1     35         3
## # ℹ 791 more rows
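In the code above, the groups are defined by both generation and attack, so every group contains a single attack value and rank() can only return tie averages such as 1, 1.5, or 3. The sketch below uses hypothetical toy data and assumes the goal is to rank attack within each generation; whether that was the intent is not stated in the analysis.

```r
library(dplyr)

# Hypothetical toy data, not taken from the Pokemon dataset
toy <- data.frame(
  generation = c(1, 1, 1, 2, 2),
  attack     = c(5, 20, 20, 30, 10)
)

# Grouping as in the code above: rank() only ever sees identical values
by_gen_attack <- toy |>
  group_by(generation, attack) |>
  mutate(attack_rank = rank(attack)) |>
  ungroup()

# Grouping by generation only: rank() compares Pokemon within a generation
by_gen <- toy |>
  group_by(generation) |>
  mutate(attack_rank = rank(attack)) |>
  ungroup()

by_gen_attack$attack_rank
by_gen$attack_rank
```

With generation as the only grouping variable, the ranks run from 1 up to the number of Pokemon in that generation, and ties still share their average rank.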
library(ggplot2)
att_rank |>
  ungroup() |>  # drop the grouping before converting a grouping column
  mutate(generation = factor(generation)) |>
  ggplot() +
  geom_boxplot(mapping = aes(x = generation, y = attack_rank)) +
  labs(title = "Attack Rank by Generation") +  # plot title
  theme_minimal()