library(readr)
pokemon <- read_csv("C://Users//ASUS//Downloads//pokemon.csv")
library(dplyr)
library(sjPlot)
library(snakecase)
library(ggplot2)
library(dunn.test)
Your boss has asked you to do the following tasks and present them in an .html:
Compare the share of legendary pokemons across all the generations (variables: is_legendary, generation). If yes, name the “most legendary” generation of all.
table1 <- table(pokemon$is_legendary, pokemon$generation)
table1
##
## 1 2 3 4 5 6 7
## 0 146 94 125 94 143 66 63
## 1 5 6 10 13 13 6 17
The table shows the number of pokemons in each generation: row “0” holds the non-legendary pokemons and row “1” the legendary ones. The 7th generation has the largest number of legendary pokemons, but the raw counts alone do not tell us whether generation and legendary status are related. Let’s check this more precisely with a chi-squared test.
chi <- chisq.test(table1)
chi
##
## Pearson's Chi-squared test
##
## data: table1
## X-squared = 24.127, df = 6, p-value = 0.0004949
chi$stdres
##
## 1 2 3 4 5 6
## 0 2.6217914 1.0367789 0.6008514 -1.3420409 0.1999744 0.1277893
## 1 -2.6217914 -1.0367789 -0.6008514 1.3420409 -0.1999744 -0.1277893
##
## 7
## 0 -4.1764550
## 1 4.1764550
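Conclusion: the p-value is below .05, so generation and legendary status are associated. To see which cells drive this, we look at the standardized residuals: an absolute value above roughly 2 (1.96 at the 5% level) marks a cell that deviates noticeably from what independence would predict, and the larger the absolute value, the stronger the deviation. The strongest deviation is for the 7th generation: its residual for being legendary is 4.18, which means that pokemons from the 7th generation are legendary more often than expected (17 of 80, about 21%). The 1st generation deviates in the opposite direction (residual -2.62 for being legendary), so it has fewer legendary pokemons than expected. The 7th generation is therefore the “most legendary” one.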
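The task asks about shares rather than counts, so it may help to see them next to the residuals. The sketch below rebuilds the contingency table from the counts printed above (the object names `counts`, `shares`, `chi_check` and `flagged` are introduced here for illustration only), computes the share of legendary pokemons per generation, and flags the generations whose standardized residual exceeds the usual 1.96 cutoff.

```r
# Counts copied from the printed table (rows: is_legendary, columns: generation)
counts <- matrix(
  c(146, 94, 125, 94, 143, 66, 63,
      5,  6,  10, 13,  13,  6, 17),
  nrow = 2, byrow = TRUE,
  dimnames = list(is_legendary = c("0", "1"),
                  generation   = as.character(1:7))
)

# Share of legendary pokemons within each generation (column proportions)
shares <- prop.table(counts, margin = 2)["1", ]
round(100 * shares, 1)

# Same chi-squared test as above, run on the rebuilt table
chi_check <- chisq.test(counts)
chi_check$statistic

# Generations whose "legendary" cell deviates at the 5% level
cutoff  <- qnorm(0.975)                       # about 1.96
flagged <- abs(chi_check$stdres["1", ]) > cutoff
names(flagged)[flagged]
```

Because the residuals are checked for 14 cells at once, a stricter cutoff (for example a Bonferroni-style `qnorm(1 - 0.05 / (2 * 14))`) could be argued for; that choice is a judgment call rather than part of the original analysis.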
Are the attack, defense, weight, and speed characteristics of pokemons related? Run a correlation matrix to learn that! Report the statistically significant results (variables: sp_attack, sp_defense, speed, weight_kg).
task_2 <- pokemon %>% select(sp_attack, sp_defense, speed, weight_kg)
task_2 <- na.omit(task_2)
cor(task_2)
## sp_attack sp_defense speed weight_kg
## sp_attack 1.0000000 0.5048682 0.44534428 0.24521797
## sp_defense 0.5048682 1.0000000 0.22357290 0.30652308
## speed 0.4453443 0.2235729 1.00000000 0.05138394
## weight_kg 0.2452180 0.3065231 0.05138394 1.00000000
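The correlation matrix shows positive associations of varying strength. Special attack is moderately correlated with special defense (r = 0.50) and with speed (r = 0.45); weight is weakly correlated with special defense (r = 0.31) and special attack (r = 0.25); special defense and speed are weakly correlated (r = 0.22). Note that cor() reports only the coefficients, not p-values. With roughly 800 pokemons in the data, correlations of 0.22 and above are statistically significant, whereas the correlation between weight and speed (r = 0.05) is close to zero and is the only one that does not reach significance at the .05 level. The same correlations are shown in the plot below.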
sjp.corr(task_2, show.legend = T)
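Since `cor()` prints no p-values, the significance of each coefficient can be approximated from the coefficient and the sample size. The sketch below does this for the six pairs in the matrix above. The sample size `n = 801` is an assumption: it is the full dataset size and therefore an upper bound, because `na.omit()` may have dropped some rows. On the real data, `cor.test()` on each pair of columns would give the exact values.

```r
# Correlation coefficients copied from the cor() output above
r <- c(attack_defense = 0.5048682,
       attack_speed   = 0.44534428,
       attack_weight  = 0.24521797,
       defense_speed  = 0.22357290,
       defense_weight = 0.30652308,
       speed_weight   = 0.05138394)

n <- 801  # ASSUMPTION: upper bound; the true n after na.omit() may be smaller

# t statistic and two-sided p-value for H0: correlation = 0
t_stat  <- r * sqrt((n - 2) / (1 - r^2))
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

signif(p_value, 3)

# On the actual data one would rather run, for each pair:
# cor.test(task_2$speed, task_2$weight_kg)
```

A smaller `n` only makes the p-values larger, so the speed and weight pair stays non-significant under this assumption.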
Is the speed of pokemons different across the primary type of pokemons (variables:speed, type1)? Check with a formal test. Name the most distinct type of pokemons by speed.
First, let’s visualise it to see any noticeable differences.
boxplot(pokemon$speed ~ pokemon$type1)
Well… it seems that the typical speed differs quite a lot between types, but we cannot yet tell whether these differences are statistically significant. I’ll use a one-way ANOVA to check this.
oneway.test(pokemon$speed ~ pokemon$type1, var.equal = T)
##
## One-way analysis of means
##
## data: pokemon$speed and pokemon$type1
## F = 3.5553, num df = 17, denom df = 783, p-value = 1.672e-06
aov.out <- aov(pokemon$speed ~ pokemon$type1)
summary(aov.out)
## Df Sum Sq Mean Sq F value Pr(>F)
## pokemon$type1 17 47906 2818.0 3.555 1.67e-06 ***
## Residuals 783 620616 792.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is very small (1.67e-06), so we reject the hypothesis that all types have the same mean speed. Now we can say: YES, the speed of pokemons IS different across the primary types of pokemons.
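As a sanity check, the F statistic and p-value can be recomputed by hand from the sums of squares in the ANOVA table above. The sketch also adds eta-squared as a rough effect size, which is not part of the original analysis; the variable names are illustrative.

```r
# Values copied from summary(aov.out)
ss_between <- 47906;  df_between <- 17
ss_within  <- 620616; df_within  <- 783

# F = mean square between groups / mean square within groups
f_value <- (ss_between / df_between) / (ss_within / df_within)

# Upper-tail probability of the F distribution
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

# Share of the total variance in speed attributable to type (eta-squared)
eta_sq <- ss_between / (ss_between + ss_within)

c(F = f_value, p = p_value, eta_sq = eta_sq)
```

Under these numbers the primary type accounts for only about 7% of the variance in speed, so the effect is significant but modest.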
To find the most distinct type of pokemons by speed, I’ll use post-hoc pairwise comparisons.
Attempt 1: Tukey HSD
TukeyHSD(aov.out)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = pokemon$speed ~ pokemon$type1)
##
## $`pokemon$type1`
## diff lwr upr p adj
## dark-bug 11.74090038 -9.934098 33.415899 0.9127770
## dragon-bug 12.54166667 -9.698247 34.781581 0.8808551
## electric-bug 21.84081197 2.246655 41.434969 0.0124578
## fairy-bug -9.90277778 -35.873402 16.067847 0.9974961
## fighting-bug 0.71626984 -21.232914 22.665454 1.0000000
## fire-bug 9.77670940 -8.158504 27.711923 0.9082681
## flying-bug 36.09722222 -21.974859 94.169304 0.7691356
## ghost-bug -5.23611111 -27.476025 17.003803 0.9999965
## grass-bug -4.54380342 -20.650101 11.562494 0.9999515
## ground-bug -3.60069444 -24.538881 17.337492 1.0000000
## ice-bug -0.83031401 -24.434839 22.774211 1.0000000
## normal-bug 5.96388889 -9.115696 21.043473 0.9961502
## poison-bug 0.61805556 -20.320131 21.556242 1.0000000
## psychic-bug 11.58149895 -6.255199 29.418197 0.7028121
## rock-bug -6.14722222 -24.874906 12.580461 0.9996185
## steel-bug -6.98611111 -30.214944 16.242721 0.9998820
## water-bug 0.35160819 -14.483866 15.187082 1.0000000
## dragon-dark 0.80076628 -25.555068 27.156601 1.0000000
## electric-dark 10.09991158 -14.065088 34.264911 0.9927961
## fairy-dark -21.64367816 -51.215448 7.928091 0.4834427
## fighting-dark -11.02463054 -37.135603 15.086342 0.9919431
## fire-dark -1.96419098 -24.804684 20.876302 1.0000000
## flying-dark 24.35632184 -35.413092 84.125736 0.9945749
## ghost-dark -16.97701149 -43.332846 9.378823 0.7156480
## grass-dark -16.28470380 -37.719000 5.149592 0.4109502
## ground-dark -15.34159483 -40.608646 9.925457 0.8007264
## ice-dark -12.57121439 -40.088297 14.945868 0.9814886
## normal-dark -5.77701149 -26.450910 14.896887 0.9999576
## poison-dark -11.12284483 -36.389896 14.144207 0.9873816
## psychic-dark -0.15940143 -22.922618 22.603815 1.0000000
## rock-dark -17.88812261 -41.356023 5.579778 0.4046695
## steel-dark -18.72701149 -45.922505 8.468482 0.6000999
## water-dark -11.38929220 -31.885817 9.107233 0.8934512
## electric-dragon 9.29914530 -15.373824 33.972115 0.9978221
## fairy-dragon -22.44444444 -52.432738 7.543850 0.4399475
## fighting-dragon -11.82539683 -38.407178 14.756385 0.9859111
## fire-dragon -2.76495726 -26.142218 20.612304 1.0000000
## flying-dragon 23.55555556 -36.421032 83.532143 0.9964503
## ghost-dragon -17.77777778 -44.600123 9.044568 0.6684113
## grass-dragon -17.08547009 -39.090862 4.919922 0.3696336
## ground-dragon -16.14236111 -41.895654 9.610932 0.7571817
## ice-dragon -13.37198068 -41.336209 14.592248 0.9709684
## normal-dragon -6.57777778 -27.843203 14.687647 0.9998271
## poison-dragon -11.92361111 -37.676904 13.829682 0.9788303
## psychic-dragon -0.96016771 -24.261932 22.341597 1.0000000
## rock-dragon -18.68888889 -42.679524 5.301746 0.3633989
## steel-dragon -19.52777778 -47.175619 8.120063 0.5523110
## water-dragon -12.19005848 -33.283084 8.902967 0.8574912
## fairy-electric -31.74358974 -59.825879 -3.661300 0.0101273
## fighting-electric -21.12454212 -45.535775 3.286691 0.1888464
## fire-electric -12.06410256 -32.940250 8.812045 0.8575433
## flying-electric 14.25641026 -44.790252 73.303072 0.9999949
## ghost-electric -27.07692308 -51.749892 -2.403954 0.0155631
## grass-electric -26.38461538 -45.712172 -7.057058 0.0003012
## ground-electric -25.44150641 -48.947874 -1.935139 0.0188444
## ice-electric -22.67112598 -48.580878 3.238626 0.1737773
## normal-electric -15.87692308 -34.357602 2.603756 0.1991003
## poison-electric -21.22275641 -44.729124 2.283611 0.1352016
## psychic-electric -10.25931301 -31.050884 10.532258 0.9609914
## rock-electric -27.98803419 -49.548827 -6.427242 0.0008871
## steel-electric -28.82692308 -54.394878 -3.258968 0.0105268
## water-electric -21.48920378 -39.771242 -3.207165 0.0054784
## fighting-fairy 10.61904762 -19.154274 40.392369 0.9988937
## fire-fairy 19.67948718 -7.271503 46.630478 0.4879937
## flying-fairy 46.00000000 -15.457714 107.457714 0.4398388
## ghost-fairy 4.66666667 -25.321627 34.654961 1.0000000
## grass-fairy 5.35897436 -20.411102 31.129050 0.9999995
## ground-fairy 6.30208333 -22.733957 35.338124 0.9999990
## ice-fairy 9.07246377 -21.941366 40.086293 0.9999195
## normal-fairy 15.86666667 -9.274492 41.007825 0.7471384
## poison-fairy 10.52083333 -18.515207 39.556874 0.9986537
## psychic-fairy 21.48427673 -5.401255 48.369808 0.3167357
## rock-fairy 3.75555556 -23.729170 31.240281 1.0000000
## steel-fairy 2.91666667 -27.812190 33.645524 1.0000000
## water-fairy 10.25438596 -14.741120 35.249891 0.9941478
## fire-fighting 9.06043956 -14.040410 32.161289 0.9965062
## flying-fighting 35.38095238 -24.488439 95.250343 0.8334572
## ghost-fighting -5.95238095 -32.534162 20.629401 0.9999983
## grass-fighting -5.26007326 -26.971594 16.451448 0.9999946
## ground-fighting -4.31696429 -29.819611 21.185683 1.0000000
## ice-fighting -1.54658385 -29.280155 26.186987 1.0000000
## normal-fighting 5.24761905 -15.713564 26.208802 0.9999913
## poison-fighting -0.09821429 -25.600861 25.404433 1.0000000
## psychic-fighting 10.86522911 -12.159217 33.889675 0.9744544
## rock-fighting -6.86349206 -30.584864 16.857879 0.9999309
## steel-fighting -7.70238095 -35.116903 19.712141 0.9999542
## water-fighting -0.36466165 -21.150922 20.421599 1.0000000
## flying-fire 26.32051282 -32.196572 84.837598 0.9842099
## ghost-fire -15.01282051 -38.390082 8.364441 0.7204420
## grass-fire -14.32051282 -31.964078 3.323052 0.2893215
## ground-fire -13.37740385 -35.519902 8.765094 0.8070720
## ice-fire -10.60702341 -35.286087 14.072040 0.9902234
## normal-fire -3.81282051 -20.524386 12.898745 0.9999978
## poison-fire -9.15865385 -31.301152 12.983844 0.9935846
## psychic-fire 1.80478955 -17.431384 21.040963 1.0000000
## rock-fire -15.92393162 -35.989043 4.141180 0.3291539
## steel-fire -16.76282051 -41.082796 7.557155 0.5983406
## water-fire -9.42510121 -25.916731 7.066529 0.8685085
## ghost-flying -41.33333333 -101.309921 18.643255 0.5986192
## grass-flying -40.64102564 -98.623697 17.341645 0.5668859
## ground-flying -39.69791667 -99.204093 19.808259 0.6571568
## ice-flying -36.92753623 -97.423412 23.568339 0.7938291
## normal-flying -30.13333333 -87.839236 27.572569 0.9355629
## poison-flying -35.47916667 -94.985343 24.027009 0.8231807
## psychic-flying -24.51572327 -83.002689 33.971242 0.9925622
## rock-flying -42.24444444 -101.009259 16.520370 0.5183565
## steel-flying -43.08333333 -103.433611 17.266944 0.5318048
## water-flying -35.74561404 -93.388208 21.896980 0.7724532
## grass-ghost 0.69230769 -21.313084 22.697699 1.0000000
## ground-ghost 1.63541667 -24.117876 27.388710 1.0000000
## ice-ghost 4.40579710 -23.558431 32.370025 1.0000000
## normal-ghost 11.20000000 -10.065425 32.465425 0.9307322
## poison-ghost 5.85416667 -19.899126 31.607460 0.9999979
## psychic-ghost 16.81761006 -6.484155 40.119375 0.5106646
## rock-ghost -0.91111111 -24.901746 23.079524 1.0000000
## steel-ghost -1.75000000 -29.397841 25.897841 1.0000000
## water-ghost 5.58771930 -15.505307 26.680745 0.9999802
## ground-grass 0.94310897 -19.745805 21.632023 1.0000000
## ice-grass 3.71348941 -19.670204 27.097183 1.0000000
## normal-grass 10.50769231 -4.223817 25.239201 0.5334465
## poison-grass 5.16185897 -15.527055 25.850773 0.9999917
## psychic-grass 16.12530237 -1.418109 33.668714 0.1158630
## rock-grass -1.60341880 -20.051986 16.845148 1.0000000
## steel-grass -2.44230769 -25.446702 20.562086 1.0000000
## water-grass 4.89541161 -9.586121 19.376944 0.9994404
## ice-ground 2.77038043 -24.170146 29.710907 1.0000000
## normal-ground 9.56458333 -10.335472 29.464639 0.9695145
## poison-ground 4.21875000 -20.419148 28.856648 1.0000000
## psychic-ground 15.18219340 -6.880584 37.244970 0.6013661
## rock-ground -2.54652778 -25.335658 20.242603 1.0000000
## steel-ground -3.38541667 -29.997388 23.226554 1.0000000
## water-ground 3.95230263 -15.763418 23.668023 0.9999997
## normal-ice 6.79420290 -15.894520 29.482926 0.9998888
## poison-ice 1.44836957 -25.492157 28.388896 1.0000000
## psychic-ice 12.41181296 -12.195749 37.019375 0.9523844
## rock-ice -5.31690821 -30.577756 19.943939 0.9999994
## steel-ice -6.15579710 -34.912761 22.601166 0.9999992
## water-ice 1.18192220 -21.345297 23.709141 1.0000000
## poison-normal -5.34583333 -25.245889 14.554222 0.9999758
## psychic-normal 5.61761006 -10.988182 22.223402 0.9994351
## rock-normal -12.11111111 -29.670458 5.448236 0.5970948
## steel-normal -12.95000000 -35.247605 9.347605 0.8524052
## water-normal -5.61228070 -18.942541 7.717979 0.9921919
## psychic-poison 10.96344340 -11.099334 33.026220 0.9583906
## rock-poison -6.76527778 -29.554408 16.023853 0.9999014
## steel-poison -7.60416667 -34.216138 19.007804 0.9999420
## water-poison -0.26644737 -19.982168 19.449273 1.0000000
## rock-psychic -17.72872117 -37.705823 2.248381 0.1556390
## steel-psychic -18.56761006 -42.815025 5.679805 0.3958533
## water-psychic -11.22989076 -27.614327 5.154546 0.6087110
## steel-rock -0.83888889 -25.749037 24.071259 1.0000000
## water-rock 6.49883041 -10.851331 23.848992 0.9979790
## water-steel 7.33771930 -14.795528 29.470966 0.9995654
plot(TukeyHSD(aov.out), las = 1)
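Attempt 2: pairwise t-tests
# Note: pairwise.t.test() has no `adjust` argument, so the original `adjust = "bonferroni"` was ignored and the default Holm correction was applied (see the footer of the output).
pairwise.t.test(pokemon$speed, pokemon$type1, p.adjust.method = "holm")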
##
## Pairwise comparisons using t tests with pooled SD
##
## data: pokemon$speed and pokemon$type1
##
## bug dark dragon electric fairy fighting fire flying
## dark 1.00000 - - - - - - -
## dragon 1.00000 1.00000 - - - - - -
## electric 0.01534 1.00000 1.00000 - - - - -
## fairy 1.00000 1.00000 1.00000 0.01242 - - - -
## fighting 1.00000 1.00000 1.00000 0.35708 1.00000 - - -
## fire 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 - -
## flying 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 -
## ghost 1.00000 1.00000 1.00000 0.01942 1.00000 1.00000 1.00000 1.00000
## grass 1.00000 1.00000 0.90657 0.00032 1.00000 1.00000 0.64091 1.00000
## ground 1.00000 1.00000 1.00000 0.02381 1.00000 1.00000 1.00000 1.00000
## ice 1.00000 1.00000 1.00000 0.32185 1.00000 1.00000 1.00000 1.00000
## normal 1.00000 1.00000 1.00000 0.38078 1.00000 1.00000 1.00000 1.00000
## poison 1.00000 1.00000 1.00000 0.23556 1.00000 1.00000 1.00000 1.00000
## psychic 1.00000 1.00000 1.00000 1.00000 0.72865 1.00000 1.00000 1.00000
## rock 1.00000 1.00000 0.88884 0.00097 1.00000 1.00000 0.76711 1.00000
## steel 1.00000 1.00000 1.00000 0.01287 1.00000 1.00000 1.00000 1.00000
## water 1.00000 1.00000 1.00000 0.00648 1.00000 1.00000 1.00000 1.00000
## ghost grass ground ice normal poison psychic rock
## dark - - - - - - - -
## dragon - - - - - - - -
## electric - - - - - - - -
## fairy - - - - - - - -
## fighting - - - - - - - -
## fire - - - - - - - -
## flying - - - - - - - -
## ghost - - - - - - - -
## grass 1.00000 - - - - - - -
## ground 1.00000 1.00000 - - - - - -
## ice 1.00000 1.00000 1.00000 - - - - -
## normal 1.00000 1.00000 1.00000 1.00000 - - - -
## poison 1.00000 1.00000 1.00000 1.00000 1.00000 - - -
## psychic 1.00000 0.19518 1.00000 1.00000 1.00000 1.00000 - -
## rock 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.28045 -
## steel 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000
## water 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000
## steel
## dark -
## dragon -
## electric -
## fairy -
## fighting -
## fire -
## flying -
## ghost -
## grass -
## ground -
## ice -
## normal -
## poison -
## psychic -
## rock -
## steel -
## water 1.00000
##
## P value adjustment method: holm
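We are not sure this is the proper way to do it (and will probably lose a point for it), but both procedures agree. The pair grass & electric shows the most significant difference (adjusted p ≈ 0.0003), and every pair with an adjusted p-value below .05 involves electric: it is significantly faster than bug, fairy, ghost, grass, ground, rock, steel and water. So electric is the most distinct type of pokemons by speed.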
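One way to turn the long Tukey table into an answer is to count how often each type appears among the significant pairs. The sketch below does this for the eight pairs with an adjusted p-value below .05, copied by hand from the TukeyHSD output above (the data frame `sig` and the other names are illustrative).

```r
# Significant pairs (p adj < 0.05) copied from the TukeyHSD output
sig <- data.frame(
  pair  = c("electric-bug", "fairy-electric", "ghost-electric",
            "grass-electric", "ground-electric", "rock-electric",
            "steel-electric", "water-electric"),
  p_adj = c(0.0124578, 0.0101273, 0.0155631, 0.0003012,
            0.0188444, 0.0008871, 0.0105268, 0.0054784),
  stringsAsFactors = FALSE
)

# Split "a-b" into its two types and count appearances
types       <- unlist(strsplit(sig$pair, "-", fixed = TRUE))
type_counts <- sort(table(types), decreasing = TRUE)
type_counts

# Pair with the smallest adjusted p-value
sig$pair[which.min(sig$p_adj)]

# On the real object the filtering step would look like:
# tuk <- TukeyHSD(aov.out)[[1]]
# tuk[tuk[, "p adj"] < 0.05, ]
```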
Is the attack skill of legendary pokemons statistically significantly higher than among the non-legendary pokemons? Check with a formal test.
First - visualise:
boxplot(pokemon$sp_attack ~ pokemon$is_legendary)
The boxplot suggests that legendary pokemons have a higher attack skill, but we can’t yet say whether this difference is significant. Let’s check it with a t-test.
t.test(pokemon$sp_attack ~ pokemon$is_legendary, var.equal = T)
##
## Two Sample t-test
##
## data: pokemon$sp_attack by pokemon$is_legendary
## t = -12.568, df = 799, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -53.78143 -39.25132
## sample estimates:
## mean in group 0 mean in group 1
## 67.24077 113.75714
The p-value is below 0.05, so the difference is statistically significant. The mean attack skill is 113.8 for legendary pokemons versus 67.2 for non-legendary ones, so we can say that the attack skill of legendary pokemons is statistically significantly higher than among the non-legendary pokemons.
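The question is directional ("higher"), while the test above is two-sided. A minimal sketch of how the one-sided p-value and a standardized effect size could be derived from the printed output follows. The group sizes 731 and 70 are taken from the generation table in the first task and assume no missing `sp_attack` values, which is consistent with df = 799; all variable names are illustrative.

```r
# Values copied from the t.test() output
t_stat <- -12.568
df     <- 799
mean_nonleg <- 67.24077
mean_leg    <- 113.75714

# One-sided p-value for H1: mean(non-legendary) < mean(legendary)
p_one_sided <- pt(t_stat, df)

# Difference in means (legendary minus non-legendary)
mean_diff <- mean_leg - mean_nonleg

# Cohen's d recovered from the pooled-variance t statistic
n_nonleg <- 731  # ASSUMPTION: row totals of the is_legendary table
n_leg    <- 70
cohens_d <- abs(t_stat) * sqrt(1 / n_nonleg + 1 / n_leg)

c(p_one_sided = p_one_sided, mean_diff = mean_diff, cohens_d = cohens_d)

# On the real data the one-sided test would be:
# t.test(sp_attack ~ is_legendary, data = pokemon,
#        var.equal = TRUE, alternative = "less")
```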
Now, predict the hit point (hp) of pokemons with their defense skill, attack skill, and their legendary status. Your boss asks you to check whether defense works differently for legendary pokemons as compared to non-legendary ones. (Variables: hp, sp_attack, sp_defense, is_legendary).
To predict the hit point, let’s use linear regression. First, the additive model:
model_ad <- lm(hp ~ sp_attack + sp_defense + is_legendary, data = pokemon)
summary(model_ad)
##
## Call:
## lm(formula = hp ~ sp_attack + sp_defense + is_legendary, data = pokemon)
##
## Residuals:
## Min 1Q Median 3Q Max
## -70.475 -13.666 -3.656 9.178 181.106
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 41.90938 2.58978 16.183 < 2e-16 ***
## sp_attack 0.15816 0.03165 4.997 7.17e-07 ***
## sp_defense 0.20428 0.03566 5.729 1.43e-08 ***
## is_legendary 14.71406 3.31557 4.438 1.04e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 23.85 on 797 degrees of freedom
## Multiple R-squared: 0.1975, Adjusted R-squared: 0.1945
## F-statistic: 65.4 on 3 and 797 DF, p-value: < 2.2e-16
In this model, the p-value is less than 0.05, meaning that the model fits better than an intercept-only model. The adjusted R-squared is 0.19, meaning that the model explains about 19% of the variance of the predicted variable.
The equation of the linear model is the following:
\[ HitPointOfPokemons = 41.9 + 0.16 * AttackSkill + 0.2 * DefenseSkill + 14.7 * IfLegendary \]
This means that a non-legendary pokemon whose defense skill and attack skill are both equal to 0 has a predicted hit point of 41.9. Holding the other predictors constant, each additional point of attack skill adds 0.16 to the predicted hit point, each additional point of defense skill adds 0.2, and being legendary adds 14.7.
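The interpretation above can be checked with a small helper that applies the fitted equation directly. The function name `predict_hp_additive` is made up for this sketch; the coefficients are copied from `summary(model_ad)`.

```r
# Prediction from the additive model, using the estimated coefficients
predict_hp_additive <- function(sp_attack, sp_defense, is_legendary) {
  41.90938 +
    0.15816  * sp_attack +
    0.20428  * sp_defense +
    14.71406 * is_legendary
}

# Two pokemons with identical skills that differ only in legendary status
hp_nonleg <- predict_hp_additive(sp_attack = 50, sp_defense = 50, is_legendary = 0)
hp_leg    <- predict_hp_additive(sp_attack = 50, sp_defense = 50, is_legendary = 1)

c(non_legendary = hp_nonleg, legendary = hp_leg, gap = hp_leg - hp_nonleg)

# With the fitted object available, the equivalent call would be:
# predict(model_ad, newdata = data.frame(sp_attack = 50, sp_defense = 50,
#                                        is_legendary = c(0, 1)))
```

In an additive model the gap between the two groups is the same at every skill level, which is exactly what the interaction model further down relaxes.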
To check whether defense works differently for legendary pokemons as compared to non-legendary ones, I will fit a linear model with an interaction term:
model_int <- lm(hp ~ sp_defense * is_legendary, data = pokemon)
summary(model_int)
##
## Call:
## lm(formula = hp ~ sp_defense * is_legendary, data = pokemon)
##
## Residuals:
## Min 1Q Median 3Q Max
## -98.076 -14.111 -2.736 9.734 171.765
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 44.7677 2.4947 17.945 < 2e-16 ***
## sp_defense 0.3187 0.0343 9.292 < 2e-16 ***
## is_legendary 53.5197 10.7736 4.968 8.29e-07 ***
## sp_defense:is_legendary -0.3468 0.1047 -3.312 0.000967 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.06 on 797 degrees of freedom
## Multiple R-squared: 0.1836, Adjusted R-squared: 0.1806
## F-statistic: 59.76 on 3 and 797 DF, p-value: < 2.2e-16
In this model, the p-value is less than 0.05, meaning that the model fits better than an intercept-only model. The adjusted R-squared is 0.18, meaning that the model explains about 18% of the variance of the predicted variable. The interaction term is significant (p < 0.001), so the effect of defense does differ between the two groups.
The first equation is the full model, which covers both legendary and non-legendary pokemons; the second is what it reduces to for non-legendary pokemons (IfLegendary = 0):
\[ HitPointOfPokemons = 44.77 + 0.319 * DefenseSkill + 53.52 * IfLegendary - 0.347 * DefenseSkill * IfLegendary \]
\[ HitPointOfPokemons(NL) = 44.77 + 0.319 * DefenseSkill \]
For legendary pokemons the slope of defense is 0.319 - 0.347 = -0.028, which is practically zero: their hit point starts from a much higher level but barely changes with defense. In other words, if a pokemon’s defense skill is 50 and it is legendary, its predicted hit point will be equal to:
\[ 44.77 + 0.319 * 50 + 53.52 - 0.347 * 50 = 96.9 \]
… and if it is non-legendary, its predicted hit point will be equal to:
\[ 44.77 + 0.319 * 50 = 60.7 \]
The graph below shows that defense works differently for legendary pokemons as compared to non-legendary ones: hit point rises with defense for non-legendary pokemons but stays roughly flat for legendary ones. The gap between the two groups is clear while the defense skill is below about 125. When the defense skill is above 125, the difference is no longer that pronounced.
plot_model(model_int, type="int")
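The pattern in the interaction plot can also be read off the coefficients. The sketch below, using the estimates from `summary(model_int)`, computes the defense slope for each group and the defense value at which the two predicted lines meet. The helper `predict_hp_int` and the other names are illustrative, and the crossing point is a point estimate only; it says nothing about the uncertainty around it.

```r
# Coefficients copied from summary(model_int)
b0    <- 44.7677   # intercept
b_def <- 0.3187    # sp_defense
b_leg <- 53.5197   # is_legendary
b_int <- -0.3468   # sp_defense:is_legendary

predict_hp_int <- function(sp_defense, is_legendary) {
  b0 + b_def * sp_defense + b_leg * is_legendary +
    b_int * sp_defense * is_legendary
}

# Simple slopes of defense within each group
slope_nonleg <- b_def
slope_leg    <- b_def + b_int

# Defense value where both groups have the same predicted hit point
crossing <- -b_leg / b_int

c(slope_nonleg = slope_nonleg, slope_leg = slope_leg, crossing = crossing)

# Predictions at a defense skill of 50
predict_hp_int(50, is_legendary = c(0, 1))
```

The near-zero slope for legendary pokemons and a crossing point at a defense skill of roughly 154 are consistent with the gap between the groups narrowing at high defense values.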