class: center, middle, inverse, title-slide .title[ # An Introduction to R and RStudio for Educational Researchers ] .subtitle[ ##
Descriptive and Inferential Statistics:
Parametric Tests ] .author[ ### Jorge Sinval ] .date[ ### 2025-11-18 ] --- class: inverse, center, middle # _t_-test <style> .orange { color: #EB811B; } .kbd { display: inline-block; padding: .2em .5em; font-size: 0.75em; line-height: 1.75; color: #555; vertical-align: middle; background-color: #fcfcfc; border: solid 1px #ccc; border-bottom-color: #bbb; border-radius: 3px; box-shadow: inset 0 -1px 0 #bbb } </style>
--- class: inverse, center, middle # _t_-test (independent samples) <html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html> --- # _t_-test (independent samples) .panelset[ .panel[.panel-name[Assumptions] * quantitative dependent variable * independent samples * `\(Y_i\sim \mathcal{N}(\mu_i, \sigma_i)\)` (d.v. follows normal distribution in both groups) * `\(\sigma^2_1=\sigma^2_2\)` (homogeneity of variance) ] .panel[.panel-name[Hypotheses] `\(H_0: \mu_1 = \mu_2\)` vs. `\(H_1: \mu_1 \neq \mu_2\)` (two-tailed test) `\(H_0: \mu_1 \leq \mu_2\)` vs. `\(H_1: \mu_1 > \mu_2\)` (right-tailed test) `\(H_0: \mu_1 \geq \mu_2\)` vs. `\(H_1: \mu_1 < \mu_2\)` (left-tailed test) ] .panel[.panel-name[Test Statistic] .pull-left[ `\(T=\frac{(\bar X_1-\bar X_2)-(\mu_1 - \mu_2)}{\hat S\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim \mathcal{t}_{(n_1 + n_2 -2)}\)` ] .pull-right[ where `\(\hat S = \sqrt{\frac{(n_1-1)S'^2_1+(n_2-1)S'^2_2}{n_1+n_2-2}}\)` ] ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)` ] .panel[.panel-name[R Code] ``` r ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 ')) #assure that the independent variable is a factor ds$Country <- as.factor(ds$Country) #t-test independent samples 
*t.test(formula = Age~Country, data = ds, var.equal=T) ``` `var.equal` argument can be: * `FALSE` for a .orange[heteroscedasticity] (default) it will conduct a Welch _t_-test, * `TRUE` for a .orange[homoscedasticity], `alternative` argument can be: * `"two.sided"` for a .orange[two-tailed test] (default), * `"less"` for a .orange[left-tailed test], and, * `"greater"` for a .orange[right-tailed test]. ] .panel[.panel-name[Output] .scroll-box-16[ ``` ## ## Two Sample t-test ## ## data: Age by Country *## t = -1.1685, df = 1039, p-value = 0.2429 ## alternative hypothesis: true difference in means between group Brazil and group Portugal is not equal to 0 ## 95 percent confidence interval: ## -1.9309654 0.4896006 ## sample estimates: ## mean in group Brazil mean in group Portugal ## 35.11006 35.83074 ``` ] ] .panel[.panel-name[Effect Size] .pull-left[ `\(d=\frac{|\bar X_1-\bar X_2|}{\hat S} \approx \frac{2t}{\sqrt{n_1+n_2-2}}\)` where `\(\hat S = \sqrt{\frac{(n_1-1)S'^2_1+(n_2-1)S'^2_2}{n_1+n_2-2}}\)` ] .pull-right[ <table> <caption>How to classify? ⚠️</caption> <thead> <tr> <th style="text-align:left;"> Effect Size </th> <th style="text-align:left;"> \( d \) 📔 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Small </td> <td style="text-align:left;"> 0.2 </td> </tr> <tr> <td style="text-align:left;"> Medium </td> <td style="text-align:left;"> 0.5 </td> </tr> <tr> <td style="text-align:left;"> Large </td> <td style="text-align:left;"> 0.8 </td> </tr> </tbody> <tfoot> <tr> <td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions, it is always context dependent.</td> </tr> </tfoot> </table> .font60[ <sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. 
[https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)
]
]
]

.panel[.panel-name[R Code]

``` r
*library(lsr) # package that estimates the effect size measure(s)
*cohensD(formula = Age ~ Country, data = ds, method = "pooled")
```

`method`:

* `"pooled"` when equal variances are assumed;
* `"paired"` for paired samples;
* `"unequal"` when equal variances are not assumed (i.e., Welch _t_-test).

]

.panel[.panel-name[Output]

```
*## [1] 0.07243522
```

]

.panel[.panel-name[Statistical Analysis]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The normality assumption could not be validated for either group `\((W_{Br(527)} = 0.949; p_{Br} < .001;\)` `\(W_{Pt(514)} = 0.956; p_{Pt} < .001)\)`, but the skewness and kurtosis values did not reveal severe departures from normality in either group `\((sk_{Br}=0.896;\)` `\(ku_{Br}=0.844;\)` `\(sk_{Pt}=0.69;\)` `\(ku_{Pt}=-0.044)\)`. The mean ages were compared with Student's _t_-test after validating the assumption of homoscedasticity with Levene's test `\((F_{(1;1039)} = 0.129;\)` `\(p = 0.719)\)`. An `\(\alpha = .05\)` was considered for all statistical analyses.
]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean age of the workers from Brazil `\((M = 35.11;\)` `\(SD=10.133;\)` `\(n=599)\)` was slightly lower than the mean age of the workers from Portugal `\((M = 35.831;\)` `\(SD=9.758;\)` `\(n=572)\)`. However, this difference is not statistically significant `\((t_{(1039)}= -1.168;\)` `\(p = 0.243;\)` `\(d=0.072;\)` `\(95\%\ CI\ \rbrack -1.93 ; 0.49 \lbrack)\)`.

.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.).
ReportNumber.]]] ] --- class: inverse, center, middle # Welch's _t_-test <html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html> --- # Welch's _t_-test .panelset[ .panel[.panel-name[Assumptions] * quantitative dependent variable * independent samples * `\(Y_i\sim \mathcal{N}(\mu_i, \sigma_i)\)` (d.v. follows normal distribution in both groups) ] .panel[.panel-name[Hypotheses] `\(H_0: \mu_1 = \mu_2\)` vs. `\(H_1: \mu_1 \neq \mu_2\)` (two-tailed test) `\(H_0: \mu_1 \leq \mu_2\)` vs. `\(H_1: \mu_1 > \mu_2\)` (right-tailed test) `\(H_0: \mu_1 \geq \mu_2\)` vs. `\(H_1: \mu_1 < \mu_2\)` (left-tailed test) ] .panel[.panel-name[Test Statistic] .pull-left[ `\(T=\frac{(\bar X_1-\bar X_2)-(\mu_1 - \mu_2)}{\sqrt{\frac{S'^2_1}{n_1}+\frac{S'^2_2}{n_2}}} \sim \mathcal{t}_{(\nu)}\)` ] .pull-right[ where `\(\nu = \frac{\left[\frac{S'^2_1}{n_1}+\frac{S'^2_2}{n_2}\right]^2 }{\frac{\left(\frac{S'^2_1}{n_1}\right)^2}{n_1-1}+\frac{\left(\frac{S'^2_2}{n_2}\right)^2}{n_2-1}}\)` ] ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)` ] .panel[.panel-name[R Code] ``` r ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 ')) #assure that the independent variable is a factor ds$Country <- 
as.factor(ds$Country) #t-test independent samples *t.test(formula = Age~Country, data = ds, var.equal=F) ``` `var.equal` argument can be: * `FALSE` for a .orange[heteroscedasticity] (default) it will conduct a Welch _t_-test, * `TRUE` for a .orange[homoscedasticity], `alternative` argument can be: * `"two.sided"` for a .orange[two-tailed test] (default), * `"less"` for a .orange[left-tailed test], and, * `"greater"` for a .orange[right-tailed test]. ] .panel[.panel-name[Output] .scroll-box-16[ ``` ## ## Welch Two Sample t-test ## ## data: Age by Country *## t = -1.169, df = 1038.8, p-value = 0.2427 ## alternative hypothesis: true difference in means between group Brazil and group Portugal is not equal to 0 ## 95 percent confidence interval: ## -1.9303961 0.4890313 ## sample estimates: ## mean in group Brazil mean in group Portugal ## 35.11006 35.83074 ``` ] ] .panel[.panel-name[Effect Size] .pull-left[ `\(d=\frac{|\bar X_1-\bar X_2|}{\sqrt{\frac{S'^2_1+S'^2_2}{2}}}\)` ] .pull-right[ <table> <caption>How to classify? ⚠️</caption> <thead> <tr> <th style="text-align:left;"> Effect Size </th> <th style="text-align:left;"> \( d \) 📔 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Small </td> <td style="text-align:left;"> 0.2 </td> </tr> <tr> <td style="text-align:left;"> Medium </td> <td style="text-align:left;"> 0.5 </td> </tr> <tr> <td style="text-align:left;"> Large </td> <td style="text-align:left;"> 0.8 </td> </tr> </tbody> <tfoot> <tr> <td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions, it is always context dependent.</td> </tr> </tfoot> </table> .font60[ <sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. 
[https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)
]
]
]

.panel[.panel-name[R Code]

``` r
*library(lsr) # package that estimates the effect size measure(s)
*cohensD(formula = Age ~ Country, data = ds, method = "unequal")
```

`method`:

* `"pooled"` when equal variances are assumed;
* `"paired"` for paired samples;
* `"unequal"` when equal variances are not assumed (i.e., Welch _t_-test).

]

.panel[.panel-name[Output]

```
*## [1] 0.07245228
```

]

.panel[.panel-name[Statistical Analysis]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The normality assumption could not be validated for either group `\((W_{Br(527)} = 0.949; p_{Br} < .001;\)` `\(W_{Pt(514)} = 0.956; p_{Pt} < .001)\)`, but the skewness and kurtosis values did not reveal severe departures from normality in either group `\((sk_{Br}=0.896;\)` `\(ku_{Br}=0.844;\)` `\(sk_{Pt}=0.69;\)` `\(ku_{Pt}=-0.044)\)`. The mean ages were compared with Welch's _t_-test. An `\(\alpha = .05\)` was considered for all statistical analyses.
]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean age of the workers from Brazil `\((M = 35.11;\)` `\(SD=10.133;\)` `\(n=599)\)` was slightly lower than the mean age of the workers from Portugal `\((M = 35.831;\)` `\(SD=9.758;\)` `\(n=572)\)`. However, this difference is not statistically significant `\((t_{(1038.833)}= -1.169;\)` `\(p = 0.243;\)` `\(d=0.072;\)` `\(95\%\ CI\ \rbrack -1.93 ; 0.49 \lbrack)\)`.

.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.).
ReportNumber.]]]
]

---
class: inverse, center, middle

# _t_-test (paired samples)

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# _t_-test (paired samples)

.panelset[
.panel[.panel-name[Assumptions]

* quantitative dependent variable
* paired samples
* `\(D\sim \mathcal{N}(\mu_D, \sigma_D)\)`, where `\(D = X_1-X_2\)` <sup>⚠️</sup>

<br> <br> <br>

.footnote[<sup>⚠️</sup>Remember: if `\(X_1 \sim \mathcal{N}(\mu_{X_1};\sigma_{X_1})\)` and `\(X_2 \sim \mathcal{N}(\mu_{X_2};\sigma_{X_2})\)` are jointly normal with covariance `\(\sigma_{X_1X_2}\)`, then `\(X_1-X_2 \sim \mathcal{N}(\mu_{X_1}-\mu_{X_2};\sqrt{\sigma^2_{X_1}+\sigma^2_{X_2}-2\sigma_{X_1X_2}})\)`]

]

.panel[.panel-name[Hypotheses]

`\(H_0: \mu_1 = \mu_2\)` vs. `\(H_1: \mu_1 \neq \mu_2\)` (two-tailed test)

`\(H_0: \mu_1 \leq \mu_2\)` vs. `\(H_1: \mu_1 > \mu_2\)` (right-tailed test)

`\(H_0: \mu_1 \geq \mu_2\)` vs. `\(H_1: \mu_1 < \mu_2\)` (left-tailed test)

]

.panel[.panel-name[Test Statistic]

.pull-left[
`\(T=\frac{\bar D-(\mu_1 - \mu_2)}{\frac{S'_D}{\sqrt{n}}} \sim \mathcal{t}_{(n-1)}\)`
]

.pull-right[
where `\(D = X_1-X_2\)`
]

]

.panel[.panel-name[Decision]

In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)`

]

.panel[.panel-name[R Code]

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
# create two composite variables (using the mean)
ds$disengagement <- rowMeans(x = ds[,c("OLBI1","OLBI3")], na.rm = TRUE)
ds$exhaustion <- rowMeans(x = ds[,c("OLBI2","OLBI4")], na.rm = TRUE)
*t.test(ds$exhaustion, ds$disengagement, paired = TRUE) # paired samples t-test
```

The `paired` argument can be:

* `TRUE` for a .orange[paired samples t-test],
* `FALSE` for an .orange[independent samples t-test].

The `alternative` argument can be:

* `"two.sided"` for a .orange[two-tailed test] (default),
* `"less"` for a .orange[left-tailed test], and
* `"greater"` for a .orange[right-tailed test].

]

.panel[.panel-name[Output]

.scroll-box-16[

```
## 
##  Paired t-test
## 
## data:  ds$exhaustion and ds$disengagement
*## t = 28.497, df = 1111, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.8139162 0.9342853
## sample estimates:
## mean difference 
##       0.8741007
```

]
]

.panel[.panel-name[Effect Size]

.pull-left[
`\(d=\frac{|\bar D|}{S'_D}\)`

where `\(D = X_1-X_2\)`
]

.pull-right[
<table>
<caption>How to classify? ⚠️</caption>
 <thead>
  <tr>
   <th style="text-align:left;"> Effect Size </th>
   <th style="text-align:left;"> \( d \) 📔 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Small </td>
   <td style="text-align:left;"> 0.2 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Medium </td>
   <td style="text-align:left;"> 0.5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Large </td>
   <td style="text-align:left;"> 0.8 </td>
  </tr>
</tbody>
<tfoot>
<tr>
<td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions; it is always context dependent.</td>
</tr>
</tfoot>
</table>

.font60[
<sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159.
[https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)
]
]
]

.panel[.panel-name[R Code]

``` r
*library(lsr) # package that estimates the effect size measure(s)
*cohensD(ds$exhaustion, ds$disengagement, method = "paired")
```

`method`:

* `"pooled"` when equal variances are assumed;
* `"paired"` for paired samples;
* `"unequal"` when equal variances are not assumed (i.e., Welch _t_-test).

]

.panel[.panel-name[Output]

```
*## [1] 0.8545659
```

]

.panel[.panel-name[Statistical Analysis]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean of the differences between exhaustion and disengagement scores was tested with Student's _t_-test for paired samples. The normality assumption could not be validated for the difference variable `\((W_{D(1112)} = 0.977; p_{D} < .001)\)`, but its skewness and kurtosis values did not reveal severe departures from normality `\((sk_{D}=-0.027;\)` `\(ku_{D}=0.224)\)`. An `\(\alpha = .05\)` was considered for all statistical analyses.
]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean difference between exhaustion and disengagement `\((M = 0.874;\)` `\(SD=1.023;\)` `\(n=1112)\)` was statistically significant `\((t_{(1111)}= 28.497;\)` `\(p < .001;\)` `\(d=0.855;\)` `\(95\%\ CI\ \rbrack 0.81 ; 0.93 \lbrack)\)`.

.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). ReportNumber.]]]

]

---
class: inverse, center, middle

# ANOVA

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# ANOVA

The **AN**alysis **O**f **VA**riance (.orange[ANOVA]) is one of the most widely used methods of inferential statistics. It compares means by analyzing variances.

There are various types of ANOVA, depending on the design:

* one-way (two or more independent samples, one i.v., and one d.v.)
* two-way (two or more independent samples, two i.v., and one d.v.)
* repeated measures/randomized blocks (two or more paired samples, one i.v., and one d.v.)
* ...

---
name: anova_ss

# ANOVA

.orange[Sum of Squares] (SS) Type (I, II, or III)<sup>🤓</sup>:

Usually the hypothesis of interest is about the significance of one factor while controlling for the level of the other factors. This equates to using type II or type III SS. In general, if there is no significant interaction effect, type II is more powerful and follows the principle of marginality. If an interaction is present, type II is inappropriate, while type III can still be used, but the results need to be interpreted with caution (in the presence of interactions, main effects are rarely interpretable).

.footnote[<sup>🤓</sup> When the data are balanced, the factors are orthogonal, and types I, II, and III all give the same results. In conclusion, type I SS should be avoided, type II SS should be used if there are no interactions, and type III SS are always usable. ]

---
class: inverse, center, middle

# One-Way ANOVA

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# One-Way ANOVA

.panelset[
.panel[.panel-name[Assumptions]

* quantitative dependent variable
* independent samples
* `\(Y_i\sim \mathcal{N}(\mu_i, \sigma_i); (i=1,2,...,k)\)` (d.v. follows a normal distribution in all groups)
* `\(\sigma^2_1=\sigma^2_2=...=\sigma^2_k\)` (homogeneity of variance)

]

.panel[.panel-name[Hypotheses]

<br> <br> <br>

`\(H_0: \mu_1 = \mu_2=...= \mu_k\)` vs. `\(H_1:\exists i,j: \mu_i \neq \mu_j;i\neq j;i,j=1,...,k\)` (two-tailed test)<sup>⚠️</sup>

`\(H_0:\)` there are no significant differences between the `\(k\)` population means

`\(H_1:\)` there is at least one pair of means with significant differences

<sup>⚠️</sup> an ANOVA is always a two-tailed test.

]

.panel[.panel-name[Test Statistic]

ANOVA compares means by analyzing variances. How?
The sum of squares total (SST) is divided into two additive components: 1. .orange[Between-group or factorial (SSF)]: d.v. variation due to the effect of treatments 2. .orange[Within-group or residual variation (SSE)]: d.v. variation due to measurement errors, natural variability between subjects, etc... <table class="table" style="font-size: 13px; color: black; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:center;"> Source of Variation </td> <td style="text-align:center;"> Sums of Squares (\(SS\)) </td> <td style="text-align:center;"> Degrees of Freedom (\(df\)) </td> <td style="text-align:center;"> Mean Squares (\(MS\)) </td> <td style="text-align:center;"> \(\mathcal{F}_{(k-1; n-k)} \) </td> </tr> <tr> <td style="text-align:center;"> Factor </td> <td style="text-align:center;"> \(SSF=\sum \limits^{k}_{i=1} n_i (\bar Y_i - \bar Y)^2\) </td> <td style="text-align:center;"> \(df_{F}=k-1\) </td> <td style="text-align:center;"> \(MSF=\frac{SSF}{k-1}\) </td> <td style="text-align:center;"> \(\frac{MSF}{MSE}\) </td> </tr> <tr> <td style="text-align:center;"> Error </td> <td style="text-align:center;"> \(SSE=\sum \limits^{k}_{i=1} \sum \limits^{n_i}_{j=1} (Y_{ij} - \bar Y_i)^2 =\sum \limits^{k}_{i=1} (n_i-1)S'^2_i\) </td> <td style="text-align:center;"> \(df_{E}=n-k\) </td> <td style="text-align:center;"> \(MSE=\frac{SSE}{n-k}\) </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:center;"> Total </td> <td style="text-align:center;"> \(SST= \sum \limits^{k}_{i=1} \sum \limits^{n_i}_{j=1} (Y_{ij} - \bar Y)^2 = (n-1)S'^2\) </td> <td style="text-align:center;"> \(df_{T}=n-1\) </td> <td style="text-align:center;"> \(MST=\frac{SST}{n-1}\) </td> <td style="text-align:center;"> </td> </tr> </tbody> </table> ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 
450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)` If the null hypothesis is rejected (and more than two groups were compared) it means that there are at least a couple of significantly different means (i.e., groups). To identify the means that are different, it is necessary to do the .orange[Post-Hoc] tests (see, [Tukey-Kramer's test](#TukeyKramer)) ] .panel[.panel-name[R Code] Example: >Compare the age means between the academic levels: Unfinished graduation, Graduation, and Masters ``` r ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 ')) #select just three occupational groups and create a new dataset ds_a <- dplyr::filter(ds, Academic_level == "Unfinished graduation"| Academic_level == "Graduation"|Academic_level == "Master") #make sure it is a factor ↓ ds_a$Academic_level <- as.factor(ds_a$Academic_level) #check normality using Shapiro-wilk or |sk| and |ku| values #... #check homogeneity of variances using Levene's test #if you have homoscedasticity use the one-way ANOVA *anova_output <- aov(formula = Age~Academic_level, data = ds_a) #one ANOVA *summary(anova_output) #see the summary of the object ``` ] .panel[.panel-name[Output] .scroll-box-16[ ``` ## Df Sum Sq Mean Sq F value Pr(>F) *## Academic_level 2 610 304.98 3.365 0.0352 * ## Residuals 670 60726 90.64 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 ``` ]] .panel[.panel-name[Effect Size] .pull-left[ `\(\eta^2=SSF/SST\)` `\(f=\sqrt{\frac{\eta^2}{1-\eta^2}}\)` `\(\omega^2=\frac{SSF-df_F \times MSE}{SST+MSE}\)` ] .pull-right[ <table> <caption>How to classify? ⚠️</caption> <thead> <tr> <th style="text-align:left;"> Effect Size </th> <th style="text-align:left;"> \( f \) 📔 </th> <th style="text-align:left;"> \(\eta^2\); 📔 \(\omega^2\) 📜 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Small </td> <td style="text-align:left;"> 0.1 </td> <td style="text-align:left;"> 0.01 </td> </tr> <tr> <td style="text-align:left;"> Medium </td> <td style="text-align:left;"> 0.25 </td> <td style="text-align:left;"> 0.06 </td> </tr> <tr> <td style="text-align:left;"> Large </td> <td style="text-align:left;"> 0.4 </td> <td style="text-align:left;"> 0.14 </td> </tr> </tbody> <tfoot> <tr> <td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions, it is always context dependent.</td> </tr> </tfoot> </table> .font60[ <sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. [https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155) <sup>📜</sup> Kirk, R. E. (1996). Practical significance: A concept whose time has come. _Educational and Psychological Measurement, 56_(5), 746–759. 
[https://doi.org/10.1177/0013164496056005002](https://doi.org/10.1177/0013164496056005002) ] ] ] .panel[.panel-name[R Code] ``` r *library(effectsize) #the package estimate the effect size measure(s) anova_output <- aov(Age~Academic_level,data = ds_a) #anova object *omega_squared(model = anova_output, partial = F, ci = NULL) *cohens_f(model = anova_output, partial = F, ci = NULL) *eta_squared(model = anova_output, partial = F, ci = NULL) ``` `ci` confidence interval: * numeric value `0 < ci < 1` (default `.95`) * `NULL` does not present the confidence interval `partial`: * `TRUE` the effects of other independent variables and interactions are partialled out. * `FALSE` the effects of other independent variables and interactions are **not** partialled out. ] .panel[.panel-name[Output] ``` ## # Effect Size for ANOVA ## ## Parameter | Omega2 ## ------------------------- *## Academic_level | 6.98e-03 ``` ``` ## # Effect Size for ANOVA ## ## Parameter | Cohen's f ## -------------------------- *## Academic_level | 0.10 ``` ``` ## # Effect Size for ANOVA ## ## Parameter | Eta2 ## ------------------------- *## Academic_level | 9.94e-03 ``` ] .panel[.panel-name[Statistical Analysis] .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[ It was not possible to validate the Normality assumption in all groups: Unfinished graduation `\((p < .001)\)`, Graduation `\((p < .001)\)`, and Master `\((p < .001)\)`. However, the skewness, and kurtosis values did not reveal severe normality violations for all groups: Unfinished graduation `\((sk =0.675;\)` `\(ku=-0.343)\)`, Graduation `\((sk =0.993;\)` `\(ku =1.373)\)`, and Master `\((sk =1.085;\)` `\(ku=0.413)\)`. The means of age were compared with a one-way ANOVA test after validating the assumptions of homoscedasticity with the Levene test `\((F_{(2; 670)} = 1.617;\)` `\(p =0.199)\)`. An `\(\alpha = .05\)` is considered for all statistical analyses. 
]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean ages of the Unfinished graduation group `\((M = 32.376;\)` `\(SD=9.69;\)` `\(n=93)\)`, the Graduation group `\((M = 35.169;\)` `\(SD=9.928;\)` `\(n=332)\)`, and the Master group `\((M = 35.085;\)` `\(SD=8.877;\)` `\(n=248)\)` differed significantly `\((F_{(2; 670)}= 3.365;\)` `\(p = 0.035;\)` `\(\omega^2 = 0.007)\)`.

.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). ReportNumber.]]]

]

---
class: inverse, center, middle

# Post-Hoc

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# Post-Hoc

Post-hoc analyses (from Latin _post hoc_, "after this") are also known as _a posteriori tests_, _unplanned tests_, or _multiple comparison tests_.

--

**If `\(k\)` groups exist, why not run `\(m\)` Student's _t_-tests (i.e., `\(m=\frac{k!}{2!\,(k-2)!}\)`)?**

Example: If there are 3 groups, why not use 3 Student's _t_-tests (i.e., group I vs. group II, group I vs. group III, and group II vs. group III)?

--

.pull-left[
Comparing `\(k \geq 3\)` groups with multiple Student's _t_-tests inflates the family-wise error rate (FWER).
If `\(m\)` independent comparisons are performed, the FWER is given by:

`\(\bar \alpha = 1- (1-\alpha)^m\)`

For example, with `\(\alpha = .05\)` and `\(m = 3\)`, `\(\bar \alpha = 1 - .95^3 \approx .143\)`.
]

<style type="text/css">
/* custom.css */
.left-code { color: #777; width: 38%; height: 92%; float: left; }
.right-plot { width: 60%; float: right; padding-left: 1%; }
.plot-callout { height: 187.5; width: 375px; bottom: 15%; right: 5%; position: absolute; padding: 0px; z-index: 100; }
.plot-callout img { width: 100%; border: 4px solid #23373B; }
</style>

.pull-right[
.plot-callout[
<img src="data:image/png;base64,#slides7of9_files/figure-html/large-plot-callout-1.png" width="100%" height="99%" />
]
]

---

# Post-Hoc

<br>
<center>
<img src="data:image/png;base64,#slides7of9_files/figure-html/large-plot-callout-1.png" width="900px" height="99%" />
</center>

---

# Post-Hoc

For one-way ANOVA (when comparing more than two groups), the Tukey-Kramer test will be used. It is similar to Tukey's Honest Significant Differences (HSD) test, but the Tukey-Kramer test accounts for comparisons with unequal sample sizes.

.content-box-blue[
**Other** frequently used post-hoc tests are:

* Fisher LSD (should only be used for 4 or fewer groups)
* Games-Howell
* Nemenyi
* Conover
* ...
]

---
class: inverse, center, middle

# Tukey-Kramer's test

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
name: TukeyKramer

# Tukey-Kramer's test

.panelset[
.panel[.panel-name[Assumptions]

**Tukey-Kramer's test** has the same assumptions as ANOVA; in fact, .orange[it should only be used if ANOVA's H<sub>0</sub> is rejected.]

]

.panel[.panel-name[Hypotheses]

<br> <br> <br>

`\(H_0: \mu_i = \mu_j\)` vs. 
`\(H_1:\mu_i \neq \mu_j\)` for all possible pairs of means `\(i,j\)` ] .panel[.panel-name[Test Statistic] A `\(Q\)` test statistic should be calculated for each pair of means `\(i\)` and `\(j\)` of the factor under study: `\(Q=\frac{|\bar Y_i - \bar Y_j |}{\sqrt{\frac{MSE}{2} \times \bigg(\frac{1}{n_i}+\frac{1}{n_j} \bigg) }}\)` ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0\)`; concluding that the pair of means `\(i\)` and `\(j\)` are different. 
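
A minimal sketch of how `\(Q\)` and its _p_-value could be computed by hand for one pair, assuming only the summary statistics reported on the one-way ANOVA slides (group means and sizes, `\(MSE = 90.64\)`, `\(df_E = 670\)`, `\(k = 3\)`). The helper `tukey_kramer_q()` is hypothetical (it is not part of any package); `ptukey()` is base R's studentized range distribution function. The result need not coincide with the `HSD.test()` output, which may treat unequal sample sizes differently.

```r
# Summary statistics copied from the one-way ANOVA slides (Age by Academic_level)
mse  <- 90.64  # residual mean square (MSE) from summary(anova_output)
df_e <- 670    # residual degrees of freedom (n - k)
k    <- 3      # number of groups compared

# Hypothetical helper (illustration only): Tukey-Kramer Q for one pair of means
tukey_kramer_q <- function(mean_i, mean_j, n_i, n_j, mse) {
  abs(mean_i - mean_j) / sqrt((mse / 2) * (1 / n_i + 1 / n_j))
}

# Graduation (M = 35.169, n = 332) vs. Unfinished graduation (M = 32.376, n = 93)
q_obs <- tukey_kramer_q(35.169, 32.376, 332, 93, mse)

# Upper-tail probability of the studentized range distribution
p_val <- ptukey(q_obs, nmeans = k, df = df_e, lower.tail = FALSE)

q_obs
p_val
```

If `p_val` is at or below `\(\alpha\)`, the decision rule above leads to rejecting `\(H_0\)` for that pair.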
] .panel[.panel-name[R Code] Example: >Identify which age means of the academic levels compared (i.e., Unfinished graduation, Graduation, and Master) differ between them: ``` r ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 ')) #select just three socioeconomic level groups and create a new dataset ds_a <- dplyr::filter(ds,Academic_level == "Unfinished graduation"| Academic_level == "Graduation"|Academic_level == "Master") ds_a$Academic_level <- as.factor(ds_a$Academic_level) #make sure it is a factor anova_output <- aov(formula = Age~Academic_level, data = ds_a) summary(anova_output) #one-way anova, if H0 is rejected, Tukey-Kramer can be done *library(agricolae) *post_hoc <- HSD.test(anova_output, "Academic_level",group = F,unbalanced =T) *post_hoc$comparison ``` `unbalanced = T` when groups have unequal sample sizes, if sample sizes are equal use `F` ] .panel[.panel-name[Output] .scroll-box-16[ ``` ## difference pvalue signif. LCL *## Graduation - Master 0.08399728 0.9964 -2.3518287 *## Graduation - Unfinished graduation 2.79233061 0.0199 * 0.3565046 *## Master - Unfinished graduation 2.70833333 0.0249 * 0.2725074 ## UCL ## Graduation - Master 2.519823 ## Graduation - Unfinished graduation 5.228157 ## Master - Unfinished graduation 5.144159 ``` ]] .panel[.panel-name[Statistical Analysis] .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[ As a post-hoc test to the conducted ANOVA for independent samples, the Tukey-Kramer test was used (Kramer, 1956; Salkind, 2007). An `\(\alpha = .05\)` is considered for all statistical analyses. ]] .panel[.panel-name[Results] .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[ After rejecting the ANOVA's `\(H_0\)` the post-hoc results revealed that there were statistically significant differences between the means of age among the groups Graduation and Unfinished graduation `\((p = 0.02)\)` , and between the groups Unfinished graduation and Master `\((p = 0.025)\)`. 
However, no statistically significant differences were observed between the groups Master and Graduation `\((p = 0.996)\)`.
.tr[📚 Fox, J., & Weisberg, S. (2019). _An R Companion to applied regression_ (3rd ed.). Sage. [https://socialsciences.mcmaster.ca/jfox/Books/Companion/](https://socialsciences.mcmaster.ca/jfox/Books/Companion/)]]]
]

---
class: inverse, center, middle

# Welch's ANOVA

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
background-image: url("data:image/png;base64,#assets/img/welchanova.jpg")
background-size: contain

# Welch's ANOVA

.font60[.pull-down[Source: [https://www.questionpro.com/blog/anova-testing/](https://www.questionpro.com/blog/anova-testing/)]]

---
name: welch_one_anova

# Welch's ANOVA

.panelset[
.panel[.panel-name[Assumptions]

* quantitative dependent variable
* independent samples
* `\(Y_i\sim \mathcal{N}(\mu_i, \sigma_i); (i=1,2,...,k)\)` (d.v. follows a normal distribution in all groups; however, Welch's ANOVA is robust to violations of this assumption)

⚠️ when the assumptions of the one-way ANOVA are all valid, the one-way ANOVA has higher statistical power than Welch's ANOVA!
]
.panel[.panel-name[Hypotheses]
<br>
<br>
<br>

`\(H_0: \mu_1 = \mu_2=...= \mu_k\)` vs. `\(H_1:\exists i,j: \mu_i \neq \mu_j;i\neq j;i,j=1,...,k\)` (two-tailed test)<sup>⚠️</sup>

`\(H_0:\)` there are no significant differences between the `\(k\)` population means

`\(H_1:\)` there is at least one pair of means with significant differences

<sup>⚠️</sup> an ANOVA is always a two-tailed test.
] .panel[.panel-name[Test Statistic] .pull-left[ `\(F_{W}=\frac{\frac{1}{k-1}\sum\limits^k_{i=1}w_i(\bar Y_i- \bar Y^*)^2}{1+\bigg[\frac{2(k-2)}{k^2-1}\bigg]\sum\limits^k_{i=1}\frac{\big[1-\frac{w_i}{W}\big]^2}{n_i-1}} \sim \mathcal{F}_{(df_n;df_d)}\)` ] .pull-right[ with degrees of freedom: * `\(df_{n}=k-1\)` * `\(df_{d}=\frac{k^2-1}{3\sum\limits^k_{i=1}\frac{\big[1-\frac{w_i}{W}\big]^2}{n_i-1}}\)` ] ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)` If the null hypothesis is rejected (and more than two groups were compared) it means that there are at least a couple of significantly different means (i.e., groups). To identify the means that are different, it is necessary to do the .orange[Post-Hoc] tests (see, [Games-Howell's test](#GamesHowell)) ] .panel[.panel-name[R Code] Example: >Compare the age means between the socioeconomic levels: A1, A2, and B2. 
``` r ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 ')) #select just three socioeconomic levels and create a new dataset ds_s <- dplyr::filter(ds,Socioeconomic_status == "A1"| Socioeconomic_status == "A2"| Socioeconomic_status == "B2") ds_s$Socioeconomic_status <- as.factor(ds_s$Socioeconomic_status) #as factor #check normality using Shapiro-wilk or |sk| and |ku| values #check homogeneity of variances using Levene test car::leveneTest(Age~Socioeconomic_status, ds_s, "mean") #if you have heteroscedasticity use the Welch's ANOVA #Welch's ANOVA *oneway.test(Age~Socioeconomic_status, data = ds_s,var.equal = F) ``` ] .panel[.panel-name[Output] .scroll-box-16[ ``` ## *## One-way analysis of means (not assuming equal variances) ## ## data: Age and Socioeconomic_status *## F = 12.654, num df = 2.00, denom df = 157.27, p-value = 0.000008018 ``` ]] .panel[.panel-name[Effect Size] .pull-left[ `\(\eta^2=SSF/SST\)` `\(f=\sqrt{\frac{\eta^2}{1-\eta^2}}\)` `\(\omega^2=\frac{SSF-df_F \times MSE}{SST+MSE}\)` ] .pull-right[ <table> <caption>How to classify? ⚠️</caption> <thead> <tr> <th style="text-align:left;"> Effect Size </th> <th style="text-align:left;"> \( f \) 📔 </th> <th style="text-align:left;"> \(\eta^2\); 📔 \(\omega^2\) 📜 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Small </td> <td style="text-align:left;"> 0.1 </td> <td style="text-align:left;"> 0.01 </td> </tr> <tr> <td style="text-align:left;"> Medium </td> <td style="text-align:left;"> 0.25 </td> <td style="text-align:left;"> 0.06 </td> </tr> <tr> <td style="text-align:left;"> Large </td> <td style="text-align:left;"> 0.4 </td> <td style="text-align:left;"> 0.14 </td> </tr> </tbody> <tfoot> <tr> <td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions, it is always context dependent.</td> </tr> </tfoot> </table> .font60[ <sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. 
[https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)

<sup>📜</sup> Kirk, R. E. (1996). Practical significance: A concept whose time has come. _Educational and Psychological Measurement, 56_(5), 746–759. [https://doi.org/10.1177/0013164496056005002](https://doi.org/10.1177/0013164496056005002)
]]]
.panel[.panel-name[R Code]
``` r
*library(effectsize) #this package estimates the effect size measure(s)
welch_anova_output <- oneway.test(Age~Socioeconomic_status, data = ds_s,var.equal = FALSE) #Welch ANOVA object
*omega_squared(model = welch_anova_output, partial = FALSE, ci = NULL)
*cohens_f(model = welch_anova_output, partial = FALSE, ci = NULL)
*eta_squared(model = welch_anova_output, partial = FALSE, ci = NULL)
```

`ci` confidence interval:

* numeric value `0 < ci < 1` (default `.95`)
* `NULL` does not present the confidence interval

`partial`:

* `TRUE` the effects of other independent variables and interactions are partialled out.
* `FALSE` the effects of other independent variables and interactions are **not** partialled out.

.font60[<sup>⚠️</sup> For one-way between-subjects designs, the partial versions are equivalent to the regular ones.]
]
.panel[.panel-name[Output]
```
## # Effect Size for ANOVA
##
## Omega2
## ------
*## 0.13
```

```
## # Effect Size for ANOVA
##
## Cohen's f
## ---------
*## 0.40
```

```
## # Effect Size for ANOVA
##
## Eta2
## ----
*## 0.14
```

.font60[<sup>⚠️</sup> `var.equal = FALSE`: the effect size is an approximation.]
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
It was not possible to validate the normality assumption in any of the three groups: _A1_ `\((p < .001)\)`, _A2_ `\((p < .001)\)`, and _B2_ `\((p < .001)\)`. However, the skewness and kurtosis values of the _A1_ `\((sk=0.346;\)` `\(ku=-0.957)\)`, _A2_ `\((sk=0.767;\)` `\(ku=0.671)\)`, and _B2_ `\((sk=0.993;\)` `\(ku=0.335)\)` groups did not reveal severe normality violations.
The assumption of homoscedasticity did not hold `\((F_{(2; 712)} = 3.807;\)` `\(p =0.023)\)`. Due to the lack of homoscedasticity, the age means were compared with Welch's ANOVA. An `\(\alpha = .05\)` is considered for all statistical analyses.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The mean ages of the _A1_ group `\((M = 40.292;\)` `\(SD=10.736;\)` `\(n=120)\)`, _A2_ group `\((M = 35.788;\)` `\(SD=9.414;\)` `\(n=518)\)`, and _B2_ group `\((M = 33.234;\)` `\(SD=9.818;\)` `\(n=77)\)` presented statistically significant differences `\((F_{w(2; 157.268)}= 12.654;\)` `\(p <.001;\)` `\(\omega^2 =0.127)\)`.
.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). ReportNumber.]]]]

---
class: inverse, center, middle

# Post-Hoc

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
class: inverse, center, middle

# Games-Howell's test

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
name: GamesHowell

# Games-Howell's test

.panelset[
.panel[.panel-name[Assumptions]
**Games-Howell's test**<sup>📜</sup> has the same assumptions as Welch's ANOVA; in fact, .orange[it should only be used if Welch's ANOVA H<sub>0</sub> is rejected.]

.font50[
<sup>📜</sup> Games, P. A., & Howell, J. F. (1976). Pairwise multiple comparison procedures with unequal n’s and/or variances: A Monte Carlo study. _Journal of Educational Statistics, 1_(2), 113–125. [https://doi.org/10.2307/1164979](https://doi.org/10.2307/1164979)
]
]
.panel[.panel-name[Hypotheses]
<br>
<br>
<br>

`\(H_0: \mu_i = \mu_j\)` vs.
`\(H_1:\mu_i \neq \mu_j\)` for all possible pairs of means `\(i,j\)` ] .panel[.panel-name[Test Statistic] .pull-left[ A `\(t\)` test statistic should be calculated for each pair of means `\(i\)` and `\(j\)` of the factor under study: `\(T_{ij}=\frac{\bar Y_i - \bar Y_j }{\sqrt{{\left(\frac{S'^2_i}{n_i} + \frac{S'^2_j}{n_j}\right)}}} \sim \mathcal{t}_{(\nu_{ij})}\)` `\(S'^2_i\)` is the variance of the `\(i\)`-th group. `\(S'^2_j\)` is the variance of the `\(j\)`-th group. ] .pull-right[ With Welch's approximate solution for calculating the degrees of freedom `\(\nu\)` for each pair `\(i\)`, and `\(j\)`: `\(\nu_{ij} = \frac{\left(\frac{S'^2_i}{n_i} + \frac{S'^2_j}{n_j}\right)^2}{\frac{\left(\frac{S'^2_i}{n_i}\right)^2}{n_i-1} +\frac{\left(\frac{S'^2_j}{n_j}\right)^2}{n_j-1}}\)` ] ] .panel[.panel-name[Decision] For `\(k\)` groups, the number of comparisons is given by `\(m = k(k − 1)/2\)`. `\(p-value = \mathrm{P} \left\{ |t_{ij}| \sqrt{2} \ge q_{m; v_{ij}; \alpha}\right\}_{ij}\)` In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0\)`; concluding that the pair of means `\(i\)` and `\(j\)` are different. 
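
The statistic `\(T_{ij}\)`, the Welch degrees of freedom `\(\nu_{ij}\)`, and the resulting `\(p\)`-value can be sketched by hand for one pair of groups. The block below is an illustration under stated assumptions, not the `rosetta` implementation: it uses the built-in `chickwts` data (not the course dataset), an arbitrary pair of groups, and it assumes that the studentized range distribution is evaluated with the number of groups `\(k\)`, as is usual for this test.

``` r
# Games-Howell for one pair of means, computed by hand (illustration only)
y <- chickwts$weight
g <- chickwts$feed                       # built-in data, unequal group sizes
m <- tapply(y, g, mean)                  # group means
v <- tapply(y, g, var)                   # group variances
n <- tapply(y, g, length)                # group sizes
k <- nlevels(g)                          # number of groups
i <- "casein"; j <- "meatmeal"           # arbitrary pair, for illustration
se2 <- v[[i]] / n[[i]] + v[[j]] / n[[j]] # squared standard error of the difference
t_ij <- (m[[i]] - m[[j]]) / sqrt(se2)
# Welch's approximate degrees of freedom for this pair
nu_ij <- se2^2 / ((v[[i]] / n[[i]])^2 / (n[[i]] - 1) +
                  (v[[j]] / n[[j]])^2 / (n[[j]] - 1))
# p-value from the studentized range distribution (assumption: k groups)
p_ij <- ptukey(abs(t_ij) * sqrt(2), nmeans = k, df = nu_ij, lower.tail = FALSE)
c(t = t_ij, df = nu_ij, p = p_ij)
```

`t_ij` and `nu_ij` should agree with the statistic and degrees of freedom returned by `t.test(y[g == i], y[g == j])`, which runs Welch's _t_-test by default; only the `\(p\)`-value differs, because it is adjusted for the multiple comparisons.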
]
.panel[.panel-name[R Code]

Example:

>Identify which age means of the socioeconomic levels compared (i.e., A1, A2, and B2) differ from one another:

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
#select just three socioeconomic levels and create a new dataset
ds_s <- dplyr::filter(ds,Socioeconomic_status == "A1"|Socioeconomic_status == "A2"| Socioeconomic_status == "B2")
ds_s$Socioeconomic_status <- as.factor(ds_s$Socioeconomic_status) #as factor
#...
#if you have heteroscedasticity use the Welch's ANOVA
#Welch's ANOVA with Games-Howell post-hoc test for unequal variances
oneway.test(Age~Socioeconomic_status, data = ds_s,var.equal = FALSE)
*library(rosetta)
*posthocTGH(y = ds_s$Age, x = ds_s$Socioeconomic_status)
```
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## n means variances
## A1 120 40 115
## A2 518 36 89
## B2 77 33 96
```

```
## diff ci.lo ci.hi t df p
## A2-A1 -4.5 -7.0 -1.99 4.2 164 <.01
## B2-A1 -7.1 -10.6 -3.54 4.7 173 <.01
## B2-A2 -2.6 -5.4 0.28 2.1 98 .09
```
]
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
As a post-hoc test to the conducted Welch's ANOVA for independent samples (Field, Miles, and Field, 2012), the Games-Howell test was used via the package _rosetta_ (Peters and Verboon, 2023). An `\(\alpha = .05\)` is considered for all statistical analyses. The statistical analysis was conducted with _R_ (R Core Team, 2021) through the integrated development environment _RStudio_ (RStudio Team, 2021).
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
After rejecting Welch's ANOVA `\(H_0\)`, the post-hoc results revealed statistically significant differences in mean age between the groups A2 and A1 `\((p < .001)\)`, and between the groups B2 and A1 `\((p < .001)\)`.
However, no statistically significant differences were observed between the groups B2 and A2 `\((p = 0.087)\)`.
.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). ReportNumber.
💻 Peters, G.-J. Y., & Verboon, P. (2023). rosetta: Parallel use of statistical packages in teaching (R package version 0.3.12) [Computer software]. https://cran.r-project.org/package=rosetta ]]]
]

---
class: inverse, center, middle

# Two-Way ANOVA

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# Two-Way ANOVA

**When to use?**

To test whether there are statistically significant differences between the levels of two independent factors (i.e., _two-way_).

.center[ .caption[ The variance partition. Source: Pace (2012) ]]

--

.center[ .caption[Visualizing two-way ANOVA. Source: Pace (2012)]]

---

# Two-Way ANOVA

**When to use?**

To assess whether the effect of one factor depends on the levels at which the other factor occurs (**interaction** or **moderation** effect).

.center[
<div class="figure"> <img src="data:image/png;base64,#slides7of9_files/figure-html/int_2way-1.png" alt="Visualizing interactions" width="57%" /> <p class="caption">Visualizing interactions</p> </div>
]

---

# Two-Way ANOVA

.panelset[
.panel[.panel-name[Assumptions]

* quantitative dependent variable
* independent samples
* `\(Y_{ij}\sim \mathcal{N}(\mu_{ij}, \sigma_{ij}); (i=1,2,...,a;j=1,...,b)\)` (d.v. follows a normal distribution in all groups)
* `\(\sigma^2_{11}=\sigma^2_{12}=...=\sigma^2_{ab}; (i=1,...,a;j=1,...,b)\)` (homogeneity of variance)
]
.panel[.panel-name[Hypotheses]

**Factor A:** `\(H_0^A: \mu_1 = \mu_2=...= \mu_a\)` vs. `\(H_1^A:\exists i,j: \mu_i \neq \mu_j;i\neq j;i,j=1,...,a\)`<sup>⚠️</sup>

**Factor B:** `\(H_0^B: \mu_1 = \mu_2=...= \mu_b\)` vs. `\(H_1^B:\exists i,j: \mu_i \neq \mu_j;i\neq j;i,j=1,...,b\)`<sup>⚠️</sup>

**Interaction/Moderation<sup>💡</sup>:** `\(H_0^\gamma: \gamma = 0\)` (there is no interaction) vs.
`\(H_1^\gamma:\gamma \neq 0\)` (there is an interaction) <sup>⚠️</sup> <br> <sup>⚠️</sup> an ANOVA is always a two-tailed test. <sup>💡</sup> In factorial designs, one should always start by analyzing the interaction hypothesis, since if the interaction is statistically significant, the effect of each of the main factors is conditioned by the level(s) at which the other factors occur. ] .panel[.panel-name[Test Statistic] <table class="table" style="font-size: 13px; color: black; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:center;"> Source of Variation </td> <td style="text-align:center;"> Sums of Squares (\(SS\)) </td> <td style="text-align:center;"> Degrees of Freedom (\(df\)) </td> <td style="text-align:center;"> Mean Squares (\(MS\)) </td> <td style="text-align:center;"> \(\mathcal{F}_{(df_{A/B/AB};df_E)}\) </td> </tr> <tr> <td style="text-align:center;"> Factor A </td> <td style="text-align:center;"> \(SSF_A= b \times r \times \sum \limits^{a}_{i=1} (\bar Y_i - \bar Y)^2\) </td> <td style="text-align:center;"> \(df_{F_A}=a-1\) </td> <td style="text-align:center;"> \(MSF_A=\frac{SSF_A}{a-1}\) </td> <td style="text-align:center;"> \(F_A=\frac{MSF_A}{MSE}\) </td> </tr> <tr> <td style="text-align:center;"> Factor B </td> <td style="text-align:center;"> \(SSF_B= a \times r \times \sum \limits^{b}_{j=1} (\bar Y_j - \bar Y)^2\) </td> <td style="text-align:center;"> \(df_{F_B}=b-1\) </td> <td style="text-align:center;"> \(MSF_B=\frac{SSF_B}{b-1}\) </td> <td style="text-align:center;"> \(F_B=\frac{MSF_B}{MSE}\) </td> </tr> <tr> <td style="text-align:center;"> Factor AB </td> <td style="text-align:center;"> \(SSF_{AB}= r \times \sum \limits^{a}_{i=1} \sum \limits^{b}_{j=1} (\bar Y_{ij} - \bar Y_{i} -\bar Y_{j} + \bar Y)^2\) </td> <td style="text-align:center;"> \(df_{F_{AB}}=(a-1)(b-1)\) </td> <td style="text-align:center;"> \(MSF_{AB}=\frac{SSF_{AB}}{(a-1)(b-1)}\) </td> <td style="text-align:center;"> \(F_{AB}=\frac{MSF_{AB}}{MSE}\) </td> 
</tr>
<tr> <td style="text-align:center;"> Error </td> <td style="text-align:center;"> \(SSE=\sum \limits^{a}_{i=1} \sum \limits^{b}_{j=1} \sum \limits^{r}_{l=1} (Y_{ijl} - \bar Y_{ij})^2\) </td> <td style="text-align:center;"> \(df_{E}=(r-1)ab\) </td> <td style="text-align:center;"> \(MSE=\frac{SSE}{(r-1)ab}\) </td> <td style="text-align:center;"> </td> </tr>
<tr> <td style="text-align:center;"> Total </td> <td style="text-align:center;"> \(SST=\sum \limits^{a}_{i=1} \sum \limits^{b}_{j=1} \sum \limits^{r}_{l=1} (Y_{ijl} - \bar Y)^2\) </td> <td style="text-align:center;"> \(df_{T}=n-1\) </td> <td style="text-align:center;"> \(MST=\frac{SST}{n-1}\) </td> <td style="text-align:center;"> </td> </tr>
</tbody> </table>

There is one source of variation for each pair of hypotheses. If the design is balanced:

`\(\tiny SSF_A=b\times r \times (a-1)S'^2_{\bar Y_{A.}}; SSF_B=a\times r \times (b-1)S'^2_{\bar Y_{.B}} ; SSE=\sum \limits_{i=1}^a \sum \limits_{j=1}^b (n_{ij}-1)S'^2_{ij}; SST=(n-1)S'^2 \normalsize\)`

.font60[in<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> use type III SS (see [SS](#anova_ss))]
]
.panel[.panel-name[Decision]
In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581
226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)`

If the null hypothesis is rejected (and more than two groups were compared), at least one pair of means (i.e., groups) differs significantly. To identify which means differ, it is necessary to run .orange[Post-Hoc] tests (see [Tukey-Kramer's test](#TukeyKramertwoway))
]
.panel[.panel-name[R Code]

Example:

>Is there an interaction between socioeconomic level (i.e., _A1_ and _B2_) and sex (i.e., _Female_ and _Male_) regarding age means?

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
ds_s <- dplyr::filter(ds, Socioeconomic_status == "A1"| Socioeconomic_status == "B2")#two socioeconomic levels
ds_s$Socioeconomic_status <- as.factor(ds_s$Socioeconomic_status) #as a factor
#check normality using Shapiro-Wilk or |sk| and |ku| values
*library(car)
#if you have homoscedasticity (check with Levene's test) use the two-way ANOVA
leveneTest(Age~Sex*Socioeconomic_status,data= ds_s, center="mean")
#two-way ANOVA (using type III SS)
*Anova(lm(Age ~ Socioeconomic_status * Sex, data=ds_s,
*contrasts=list(Socioeconomic_status=contr.sum,Sex=contr.sum)), type="III")
```
]
.panel[.panel-name[Output]
```
## Anova Table (Type III tests)
##
## Response: Age
## Sum Sq Df F value Pr(>F)
## (Intercept) 234036 1 2213.0413 < 2.2e-16 ***
*## Socioeconomic_status 2837 1 26.8289 0.0000005567 ***
*## Sex 4 1 0.0420 0.83788
*## Socioeconomic_status:Sex 579 1 5.4706 0.02036 *
## Residuals 20410 193
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

.center[ ]
]
.panel[.panel-name[Effect Size]
.pull-left[
For designs with **more than one factor**, use the **partial versions** of the estimators. <sup>🤓</sup>

`\(\omega^2_p=\frac{F-1}{F+\frac{df_E+1}{df_F}}; \varepsilon^2_p=\frac{F-1}{F+\frac{df_E}{df_F}}\)`<sup>💡</sup>

`\(\eta^2_p=\frac{F\times df_F}{F\times df_F + df_E}\)`
<br>
<br>

.font60[<sup>💡</sup> The recommended estimators are `\(\omega^2_p\)` or `\(\varepsilon^2_p\)`, since they are less biased than `\(\eta^2_p\)` (Albers and Lakens, 2018; Keselman, 1975; Carroll and Nordholm, 1975).

<sup>🤓</sup> It is suggested to use the partial versions of the effect size estimators, since they refer to the variance of the DV accounted for by one particular IV, with the effects of the other IVs partialled out.]
]
.pull-right[
<table> <caption>How to classify?
⚠️</caption>
<thead> <tr> <th style="text-align:left;"> Effect Size </th> <th style="text-align:left;"> \( \varepsilon^2_p\) 📜 \(;\eta_p^2\) 📔 \(;\omega^2_p\) 📜 </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> Small </td> <td style="text-align:left;"> 0.01 </td> </tr>
<tr> <td style="text-align:left;"> Medium </td> <td style="text-align:left;"> 0.06 </td> </tr>
<tr> <td style="text-align:left;"> Large </td> <td style="text-align:left;"> 0.14 </td> </tr>
</tbody>
<tfoot> <tr> <td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions, it is always context dependent.</td> </tr> </tfoot>
</table>

.font60[
<sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. [https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)

<sup>📜</sup> Kirk, R. E. (1996). Practical significance: A concept whose time has come. _Educational and Psychological Measurement, 56_(5), 746–759. [https://doi.org/10.1177/0013164496056005002](https://doi.org/10.1177/0013164496056005002)
]
]
]
.panel[.panel-name[R Code]
``` r
*library(effectsize) #this package estimates the effect size measure(s)
anova_output <- car::Anova(lm(Age ~ Socioeconomic_status * Sex, data=ds_s, contrasts=list(Socioeconomic_status=contr.sum,Sex=contr.sum)), type="III")#anova object
*omega_squared(model = anova_output, partial = TRUE, ci = NULL)
*epsilon_squared(model = anova_output, partial = TRUE, ci = NULL)
*eta_squared(model = anova_output, partial = TRUE, ci = NULL)
```

`ci` confidence interval:

* numeric value `0 < ci < 1` (default `.95`)
* `NULL` does not present the confidence interval

`partial`:

* `TRUE` the effects of other independent variables and interactions are partialled out.
* `FALSE` the effects of other independent variables and interactions are **not** partialled out.
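
The partial estimators can also be checked by hand from the `\(F\)` values and degrees of freedom of the Type III ANOVA table, using the formulas of the Effect Size panel. The block below is a minimal sketch: the numbers are typed in from that table, and the object names (`F_val`, `df_F`, `df_E`) are only illustrative.

``` r
# F values and degrees of freedom copied from the Type III ANOVA table
F_val <- c(Socioeconomic_status = 26.8289,
           Sex = 0.0420,
           `Socioeconomic_status:Sex` = 5.4706)
df_F <- 1     # each effect has 1 degree of freedom in this 2 x 2 design
df_E <- 193   # residual degrees of freedom
omega2_p <- (F_val - 1) / (F_val + (df_E + 1) / df_F)
eps2_p   <- (F_val - 1) / (F_val + df_E / df_F)
eta2_p   <- (F_val * df_F) / (F_val * df_F + df_E)
round(cbind(omega2_p, eps2_p, eta2_p), 3)
```

When `\(F < 1\)` (as for _Sex_ here), the formulas for `\(\omega^2_p\)` and `\(\varepsilon^2_p\)` return slightly negative values, which indicates an effect that is indistinguishable from zero.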
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## # Effect Size for ANOVA (Type III)
##
## Parameter | Omega2 (partial)
## -------------------------------------------
*## Socioeconomic_status | 0.12
*## Sex | 0.00
*## Socioeconomic_status:Sex | 0.02
```

```
## # Effect Size for ANOVA (Type III)
##
## Parameter | Epsilon2 (partial)
## ---------------------------------------------
*## Socioeconomic_status | 0.12
*## Sex | 0.00
*## Socioeconomic_status:Sex | 0.02
```

```
## # Effect Size for ANOVA (Type III)
##
## Parameter | Eta2 (partial)
## -----------------------------------------
*## Socioeconomic_status | 0.12
*## Sex | 2.17e-04
*## Socioeconomic_status:Sex | 0.03
```
]
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
It was not possible to validate the normality assumption in all groups: Female-A1 `\((p < .001)\)`, Female-B2 `\((p < .001)\)`, Male-A1 `\((p =0.406)\)`, and Male-B2 `\((p =0.1)\)`. However, the skewness and kurtosis values did not reveal severe normality violations in any group: Female-A1 `\((sk =0.531;\)` `\(ku=-0.87)\)`, Female-B2 `\((sk =1.045;\)` `\(ku =0.079)\)`, Male-A1 `\((sk =0.153;\)` `\(ku=-0.812)\)`, and Male-B2 `\((sk =0.868;\)` `\(ku=0.61)\)`. The age means were compared with a two-way ANOVA after validating the assumption of homoscedasticity with the Levene test `\((F_{(3; 193)} = 1.013;\)` `\(p =0.388)\)`. An `\(\alpha = .05\)` is considered for all statistical analyses.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
For the mean ages of the Female-A1 group `\((M = 38.897;\)` `\(SD=11.046;\)` `\(n=78)\)`, Female-B2 group `\((M = 34.449;\)` `\(SD=10.19;\)` `\(n=49)\)`, Male-A1 group `\((M = 42.881;\)` `\(SD=9.739;\)` `\(n=42)\)`, and Male-B2 group `\((M = 31.107;\)` `\(SD=8.908;\)` `\(n=28)\)`, the interaction between socioeconomic level and sex was statistically significant `\((F_{\gamma (1; 193)}= 5.471;\)` `\(p_{\gamma} =0.02;\)` `\(\omega^2_{p_{\gamma}} =0.022)\)`.
.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). ReportNumber.]]]
]

---
class: inverse, center, middle

# Post-Hoc

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
class: inverse, center, middle

# Tukey-Kramer's test

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
name: TukeyKramertwoway

# Tukey-Kramer's test

.panelset[
.panel[.panel-name[Assumptions]
**Tukey-Kramer's test** has the same assumptions as ANOVA; in fact, .orange[it should only be used if ANOVA's H<sub>0</sub> is rejected.]
]
.panel[.panel-name[Hypotheses]
<br>
<br>
<br>

`\(H_0: \mu_i = \mu_j\)` vs.
`\(H_1:\mu_i \neq \mu_j\)` for all possible pairs of means `\(i,j\)` ] .panel[.panel-name[Test Statistic] A `\(Q\)` test statistic should be calculated for each pair of means `\(i\)` and `\(j\)` of the factor under study: `\(Q=\frac{|\bar Y_i - \bar Y_j |}{\sqrt{\frac{MSE}{2} \times \bigg(\frac{1}{n_i}+\frac{1}{n_j} \bigg) }}\)` ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0\)`; concluding that the pair of means `\(i\)` and `\(j\)` are different. 
]
.panel[.panel-name[R Code]

Example:

>Identify which age means of the interaction socioeconomic level (i.e., _A1_ and _B2_) `\(\times\)` sex (i.e., _Female_ and _Male_) differ from one another:

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
ds_s <- dplyr::filter(ds, Socioeconomic_status == "A1"| Socioeconomic_status == "B2")#two socioeconomic levels
ds_s$Socioeconomic_status <- as.factor(ds_s$Socioeconomic_status) #as a factor
#check normality using Shapiro-Wilk or |sk| and |ku| values
car::leveneTest(Age~Sex*Socioeconomic_status,ds_s, center="mean")#Homoscedasticity
#two-way ANOVA (using type III SS)
tw_m <- lm(Age ~ Socioeconomic_status * Sex, data=ds_s, contrasts=list(Socioeconomic_status=contr.sum,Sex=contr.sum))
*library(agricolae)
*post_hoc <- HSD.test(tw_m, c("Socioeconomic_status","Sex"),group=FALSE,unbalanced=TRUE)
*post_hoc$comparison
```
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## difference pvalue signif. LCL UCL
*## A1:Female - A1:Male -3.983516 0.2771 -9.722765 1.755732
*## A1:Female - B2:Female 4.448456 0.1883 -1.290793 10.187705
*## A1:Female - B2:Male 7.790293 0.0030 ** 2.051044 13.529542
*## A1:Male - B2:Female 8.431973 0.0011 ** 2.692724 14.171222
*## A1:Male - B2:Male 11.773810 0.0000 *** 6.034561 17.513058
*## B2:Female - B2:Male 3.341837 0.4341 -2.397412 9.081086
```
]
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
As a post-hoc test to the conducted two-way ANOVA for independent samples, the Tukey-Kramer test was used (Kramer, 1956; Salkind, 2007). An `\(\alpha = .05\)` is considered for all statistical analyses.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
After rejecting the two-way ANOVA's interaction `\(H_0^{\gamma}\)`, the post-hoc results revealed statistically significant differences in mean age between the groups A1:Female vs.
B2:Male `\((p = 0.003)\)`, between the groups A1:Male vs. B2:Female `\((p = 0.001)\)`, and between the groups A1:Male vs. B2:Male `\((p < .001)\)`. However, no statistically significant differences were observed in any of the other comparisons, considering `\((\alpha = .05)\)`.
.tr[📚 Fox, J., & Weisberg, S. (2019). _An R Companion to applied regression_ (3rd ed.). Sage. [https://socialsciences.mcmaster.ca/jfox/Books/Companion/](https://socialsciences.mcmaster.ca/jfox/Books/Companion/)]]]
]

---
class: inverse, center, middle

# Repeated-Measures ANOVA

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---

# Repeated-Measures ANOVA

**When to use?**

Compare population means of two or more repeated measurements/paired samples.

Check whether the two or more paired samples under study come from the same population.

.center[ .caption[ The variance partition. Source: Pace (2012) ]]

--

.center[ .caption[Visualizing repeated measures ANOVA. Source: Pace (2012)]]

---

# Repeated-Measures ANOVA

.panelset[
.panel[.panel-name[Assumptions]

* quantitative dependent variable
* paired samples
* `\(Y_{i}\sim \mathcal{N}(\mu_{i}, \sigma_{i}); (i=1,2,...,k)\)` (d.v. follows a normal distribution in all groups)
* `\(\bf{\Sigma} = \sigma^2 \bf{I}\)` (sphericity; homogeneous variances, and null covariances)
]
.panel[.panel-name[Hypotheses]

`\(H_0: \mu_1 = \mu_2=...= \mu_k\)` vs. `\(H_1:\exists i,j: \mu_i \neq \mu_j;i\neq j;i,j=1,...,k\)` (two-tailed test)<sup>⚠️</sup>

`\(H_0:\)` there are no significant differences between the `\(k\)` population means

`\(H_1:\)` there is at least one pair of means with significant differences

<sup>⚠️</sup> an ANOVA is always a two-tailed test.
<br>
<br>
<br>
]
.panel[.panel-name[Test Statistic]
Same basic principle as the .orange[Analysis of Variance]: compare the variation between samples (due to treatment) with the variation within samples (due to errors and natural variation between individuals).
**But** now **discounting the variation between Blocks or Subjects**. <table class="table" style="font-size: 13px; color: black; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:center;"> Source of Variation </td> <td style="text-align:center;"> Sums of Squares (\(SS\)) </td> <td style="text-align:center;"> Degrees of Freedom (\(df\)) </td> <td style="text-align:center;"> Mean Squares (\(MS\)) </td> <td style="text-align:center;"> \(\mathcal{F}_{(df_F;df_E)}\) </td> </tr> <tr> <td style="text-align:center;"> Factor </td> <td style="text-align:center;"> \(SSF= b (k-1) S'^2_{\bar Y_{.j}}\) </td> <td style="text-align:center;"> \(df_{F}=k-1\) </td> <td style="text-align:center;"> \(MSF=\frac{SSF}{k-1}\) </td> <td style="text-align:center;"> \(F=\frac{MSF}{MSE}\) </td> </tr> <tr> <td style="text-align:center;"> Block or Subject </td> <td style="text-align:center;"> \(SSB= k (b-1) S'^2_{\bar Y_{i.}}\) </td> <td style="text-align:center;"> \(df_{B}=b-1\) </td> <td style="text-align:center;"> \(MSB=\frac{SSB}{b-1}\) </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:center;"> Error </td> <td style="text-align:center;"> \(SSE=SST-SSF-SSB\) </td> <td style="text-align:center;"> \(df_{E}=(k-1)(b-1)\) </td> <td style="text-align:center;"> \(MSE=\frac{SSE}{(k-1)(b-1)}\) </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:center;"> Total </td> <td style="text-align:center;"> \(SST=(n-1)S'^2\) </td> <td style="text-align:center;"> \(df_{T}=kb-1\) </td> <td style="text-align:center;"> \(MST=\frac{SST}{kb-1}\) </td> <td style="text-align:center;"> </td> </tr> </tbody> </table> ] .panel[.panel-name[Decision] In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 
480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)`

<sup>⚠️</sup> If the null hypothesis is rejected (and more than two groups were compared), at least one pair of means (i.e., groups) differs significantly. To identify which means differ, it is necessary to run .orange[Post-Hoc] tests (see [Pairwise paired _t_-test](#Pairwisepaired_t_test)).

]

.panel[.panel-name[R Code]

Example:

>Are the levels of three UWES dimensions (i.e., _Vigor_, _Dedication_, and _Absorption_) equal?

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
ds$vi <- (ds$UWES1+ds$UWES2+ds$UWES5)/3 #create vigor variable
ds$ded <- (ds$UWES3+ds$UWES4+ds$UWES7)/3 #create dedication variable
ds$abs <- (ds$UWES6+ds$UWES8+ds$UWES9)/3 #create absorption variable
#check normality using Shapiro-Wilk or |sk| and |ku| values
#the Mauchly's test will be produced together with the repeated measures ANOVA
ds <- dplyr::rename(ds,"id"="...1") #change variable "...1" name to "id"
*ds_long <- rstatix::gather(ds,"vi", "ded", "abs",
*                           key = dv, value = score) #pass the data to the long format
*rep_aov <- afex::aov_ez(id = 'id',dv = 'score', fun_aggregate = mean,data =
*                        ds_long,within='dv',include_aov = TRUE,type = "III") #ANOVA repeated measures
*summary(rep_aov, multivariate=FALSE, univariate=TRUE)
```

]

.panel[.panel-name[Output]

.font70[

```
##
## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
##
## Sum Sq num Df Error SS den Df F value Pr(>F)
## (Intercept) 59097 1 5904.5 1099 10999.65 < 2.2e-16 ***
*## dv 91 2 867.2 
2198 114.84 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Mauchly Tests for Sphericity
##
## Test statistic p-value
*## dv 0.92439 1.7983e-19
##
##
## Greenhouse-Geisser and Huynh-Feldt Corrections
## for Departure from Sphericity
##
## GG eps Pr(>F[GG])
*## dv 0.92971 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## HF eps Pr(>F[HF])
*## dv 0.9312177 4.687739e-45
```

]

]

.panel[.panel-name[Effect Size]

.pull-left[

In repeated-measures designs, `\(\eta^2_G\)` or `\(\omega^2_G\)`<sup>🤓</sup> should be preferred, as they estimate what a within-subject effect size would have been had that predictor been manipulated between subjects (Olejnik and Algina, 2003).

`\(\eta^2_G=\frac{SSF}{SSF+SSB+SSE}\)`<sup>💡</sup>

<br>

`\(\omega^2_G=\frac{df_F \times \left(MSF-MSE\right)}{SST+MSB}\)`<sup>💡</sup>

.font60[<sup>💡</sup> For one-way repeated measures, `\(\eta^2_G\)` and `\(\omega^2_G\)` will be equivalent to `\(\eta^2\)` and `\(\omega^2\)`, respectively.

<sup>🤓</sup> The package `MOTE` (Buchanan, Gillenwaters, Scofield, and Valentine, 2019) has a function in which all the required quantities can be entered manually to calculate `\(\omega^2_G\)`.]

]

.pull-right[

<table>
<caption>How to classify? ⚠️</caption>
<thead>
<tr>
<th style="text-align:left;"> Effect Size </th>
<th style="text-align:left;"> \(\eta_G^2\) 📔 \(;\omega^2_G\) 📜 </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"> Small </td>
<td style="text-align:left;"> 0.01 </td>
</tr>
<tr>
<td style="text-align:left;"> Medium </td>
<td style="text-align:left;"> 0.06 </td>
</tr>
<tr>
<td style="text-align:left;"> Large </td>
<td style="text-align:left;"> 0.14 </td>
</tr>
</tbody>
<tfoot>
<tr>
<td style = 'padding: 0; border:0;' colspan='100%'><sup></sup> ⚠️ No precise definitions exist; the classification is always context dependent.</td>
</tr>
</tfoot>
</table>

.font60[

<sup>📔</sup> Cohen, J. (1992). A power primer. _Psychological Bulletin, 112_(1), 155–159. 
[https://doi.org/10.1037/0033-2909.112.1.155](https://doi.org/10.1037/0033-2909.112.1.155)

<sup>📜</sup> Kirk, R. E. (1996). Practical significance: A concept whose time has come. _Educational and Psychological Measurement, 56_(5), 746–759. [https://doi.org/10.1177/0013164496056005002](https://doi.org/10.1177/0013164496056005002)

]

]

]

.panel[.panel-name[R Code]

``` r
#continue from the previous code (i.e., `rep_aov` object)
*effectsize::eta_squared(model = rep_aov, generalized = "dv", ci = NULL) #load package and run function
*library(MOTE) #the package to estimate the effect size measure(s)
*srma <- summary(rep_aov, multivariate=FALSE, univariate=TRUE) # create summary object
*omega.partial.SS.rm(dfm = srma$univariate.tests["dv","num Df"], #df_F
*                    dfe = srma$univariate.tests["dv","den Df"], #df_E
*                    msm = srma$univariate.tests["dv","Sum Sq"]/srma$univariate.tests["dv","num Df"], #MSF
*                    mse = srma$univariate.tests["dv","Error SS"]/srma$univariate.tests["dv","den Df"], #MSE
*                    mss = srma$univariate.tests["(Intercept)","Error SS"]/srma$univariate.tests["(Intercept)","den Df"], #MSB
*                    ssm = srma$univariate.tests["dv","Sum Sq"], #SSF
*                    sse = srma$univariate.tests["dv","Error SS"], #SSE
*                    sss = srma$univariate.tests["(Intercept)","Error SS"], #SSB
*                    a = .05) #alpha level
```

`generalized`: the name of the independent variable must be inserted.
`ci`: confidence level; numeric value `0 < ci < 1` (default `.95`) or `NULL`

`a`: desired alpha level; numeric value `0 < a < 1` (default `.05`)

]

.panel[.panel-name[Output]

.scroll-box-16[

```
## # Effect Size for ANOVA (Type III)
##
*## Parameter | Eta2 (generalized)
## ------------------------------
*## dv | 0.01
##
*## - Observed variables: dv
$omega
*## [1] 0.01307909
##
## $omegalow
## [1] 0.004954374
##
## $omegahigh
## [1] 0.02362347
##
## $dfm
## [1] 2
##
## $dfe
## [1] 2198
##
## $F
## [1] 114.8373
##
## $p
## [1] 3.665186e-48
##
## $estimate
## [1] "$\\omega^2_{p}$ = 0.01, 95\\% CI [0.00, 0.02]"
##
## $statistic
## [1] "$F$(2, 2198) = 114.84, $p$ < .001"
```

]

]

.panel[.panel-name[Statistical Analysis]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt3[

The normality assumption was not supported in any of the groups: Vigor `\((p < .001)\)`, Dedication `\((p < .001)\)`, and Absorption `\((p < .001)\)`. However, the skewness and kurtosis values did not reveal severe violations of normality in any group: Vigor `\((sk =-0.645;\)` `\(ku=-0.261)\)`, Dedication `\((sk =-0.888;\)` `\(ku =0.049)\)`, and Absorption `\((sk =-0.938;\)` `\(ku=0.35)\)`. The means of the three UWES dimensions were compared with a repeated measures ANOVA (Marôco, 2021). Regarding sphericity, the `\(H_0\)` of Mauchly's test was rejected `\((w_{(2)}=0.924; p< .001)\)`; thus, the Huynh-Feldt correction was used `\((\varepsilon = 0.931)\)`. An `\(\alpha = .05\)` is considered for all statistical analyses.

]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[

The means of Vigor `\((M = 4;\)` `\(SD=1.412)\)`, Dedication `\((M = 4.376;\)` `\(SD=1.456)\)`, and Absorption `\((M = 4.32;\)` `\(SD=1.431)\)` presented statistically significant differences `\((F_{(1.862; 2046.817)}= 114.837;\)` `\(p <.001;\)` `\(\eta^2_G =0.013; \omega^2_G = 0.013)\)`.

.tr[📚 Marôco, J. (2021). _Análise estatística com o SPSS statistics_ (8th ed.). 
ReportNumber.]]]

]

---
class: inverse, center, middle

# Post-Hoc

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
class: inverse, center, middle

# Pairwise Paired _t_-test

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
name: Pairwisepaired_t_test

# Pairwise Paired _t_-test

.panelset[
.panel[.panel-name[Assumptions]

The **pairwise paired _t_-test** has the same assumptions as the repeated measures ANOVA; in fact, .orange[it should only be used if repeated measures ANOVA's H<sub>0</sub> is rejected.]

]

.panel[.panel-name[Hypotheses]

<br>
<br>
<br>

For all possible pairs of means `\(i,j\)`:

<br>

`\(H_0: \mu_i = \mu_j\)`

<br>

vs.

<br>

`\(H_1:\mu_i \neq \mu_j\)`

]

.panel[.panel-name[Test Statistic]

A `\(T\)` test statistic should be calculated for each pair of means `\(i\)` and `\(j\)` of the factor under study:

`\(T=\frac{\bar D }{\sqrt{\frac{S'^2_D}{b}}}\)`

with `\(D = Y_i-Y_j\)`

`\(b\)`: number of blocks or subjects

]

.panel[.panel-name[Decision]

In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0\)`, concluding that the means `\(i\)` and `\(j\)` differ.
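
Because this rule is applied once per pair, the chance of at least one false rejection grows with the number of comparisons. The following is a minimal sketch, assuming the three unadjusted _p_-values reported (rounded) in the output of this example, of how base R's `p.adjust()` can change the decisions. The object names `p_raw`, `p_bonf`, `p_holm`, and `decisions` are illustrative and not part of the original analysis.

```r
# Unadjusted p-values copied (rounded) from the pairwise paired t-test output
# of the UWES example; the element names are illustrative labels only.
p_raw <- c(abs_vs_ded = 3e-2, abs_vs_vi = 3.8e-25, ded_vs_vi = 1.52e-49)
alpha <- .05

# Two common adjustments for multiple comparisons available in base R
p_bonf <- p.adjust(p_raw, method = "bonferroni") # each p times the number of tests (capped at 1)
p_holm <- p.adjust(p_raw, method = "holm")       # step-down variant of Bonferroni

# Decision per pair: TRUE means "reject H0" at the chosen alpha
decisions <- data.frame(
  none       = p_raw  <= alpha,
  bonferroni = p_bonf <= alpha,
  holm       = p_holm <= alpha
)
print(decisions)
```

With these rounded values, _absorption_ vs. _dedication_ is no longer rejected under Bonferroni (0.03 × 3 = 0.09), while it still is under Holm. In `rstatix::pairwise_t_test()`, the same adjustments can be requested through the `p.adjust.method` argument.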
]

.panel[.panel-name[R Code]

Example:

>Identify which of the means (i.e., vigor, dedication, and absorption) differ from each other:

``` r
ds <- readr::read_csv(trimws('https://ndownloader.figshare.com/files/22299075 '))
ds$vi <- (ds$UWES1+ds$UWES2+ds$UWES5)/3 #create vigor variable
ds$ded <- (ds$UWES3+ds$UWES4+ds$UWES7)/3 #create dedication variable
ds$abs <- (ds$UWES6+ds$UWES8+ds$UWES9)/3 #create absorption variable
#check normality using Shapiro-Wilk or |sk| and |ku| values
#the Mauchly's test will be produced together with the repeated measures ANOVA
ds <- dplyr::rename(ds,"id"="...1") #change variable "...1" name to "id"
*ds_long <- rstatix::gather(ds,"vi", "ded", "abs",
*                           key = dv, value = score) #pass the data to the long format
rep_aov <- afex::aov_ez(id = 'id',dv = 'score', fun_aggregate = mean,data = ds_long,within='dv',include_aov = TRUE,type = "III") #ANOVA repeated measures
*library(rstatix)
*pairwise_t_test(ds_long, score~dv,paired=TRUE,p.adjust.method = "none")
```

`paired`: `TRUE` for paired samples, `FALSE` for independent samples.

]

.panel[.panel-name[Output]

.scroll-box-16[

```
## # A tibble: 3 × 10
## .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
*## 1 score abs ded 1100 1100 -2.17 1099 3 e- 2 3 e- 2 *
*## 2 score abs vi 1100 1100 10.6 1099 3.8 e-25 3.8 e-25 ****
*## 3 score ded vi 1100 1100 15.6 1099 1.52e-49 1.52e-49 ****
```

]

]

.panel[.panel-name[Statistical Analysis]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[

As a post-hoc test following the repeated measures ANOVA, the pairwise paired _t_-test was used, with no adjustment of the _p_-values. An `\(\alpha = .05\)` is considered for all statistical analyses.

]]

.panel[.panel-name[Results]

.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[

After rejecting the repeated measures ANOVA's `\(H_0\)`, the post-hoc results revealed statistically significant differences between the means of all groups: _absorption_ vs. 
_dedication_ `\((p = 0.03)\)`, _absorption_ vs. _vigor_ `\((p < .001)\)`, and _dedication_ vs. _vigor_ `\((p < .001)\)`.

.tr[📚 Fox, J., & Weisberg, S. (2019). _An R Companion to applied regression_ (3rd ed.). Sage. [https://socialsciences.mcmaster.ca/jfox/Books/Companion/](https://socialsciences.mcmaster.ca/jfox/Books/Companion/)]]]

]

---

# References

Albers, C. and D. Lakens (2018). "When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias". In: _Journal of Experimental Social Psychology_ 74, pp. 187-195. DOI: [10.1016/j.jesp.2017.09.004](https://doi.org/10.1016%2Fj.jesp.2017.09.004). URL: [https://linkinghub.elsevier.com/retrieve/pii/S002210311630230X](https://linkinghub.elsevier.com/retrieve/pii/S002210311630230X).

Buchanan, E. M., A. Gillenwaters, J. E. Scofield, et al. (2019). _MOTE: Measure of the effect: Package to assist in effect size calculations and their confidence intervals (R package version 1.0.2) [Computer software]_. URL: [http://github.com/doomlab/MOTE](http://github.com/doomlab/MOTE).

Carroll, R. M. and L. A. Nordholm (1975). "Sampling characteristics of Kelley's `\(\epsilon\)` and Hays' `\(\omega\)`". In: _Educational and Psychological Measurement_ 35.3, pp. 541-554. ISSN: 0013-1644. DOI: [10.1177/001316447503500304](https://doi.org/10.1177%2F001316447503500304). URL: [http://journals.sagepub.com/doi/10.1177/001316447503500304](http://journals.sagepub.com/doi/10.1177/001316447503500304).

Field, A. P., J. Miles, and Z. Field (2012). _Discovering statistics using R_. London: Sage. ISBN: 978-1-4462-0045-2.

---

# References

Keselman, H. J. (1975). "A Monte Carlo investigation of three estimates of treatment magnitude: Epsilon squared, eta squared, and omega squared". In: _Canadian Psychological Review/Psychologie canadienne_ 16.1, pp. 44-48. ISSN: 0318-2096. DOI: [10.1037/h0081789](https://doi.org/10.1037%2Fh0081789). 
URL: [http://doi.apa.org/getdoi.cfm?doi=10.1037/h0081789](http://doi.apa.org/getdoi.cfm?doi=10.1037/h0081789).

Kramer, C. Y. (1956). "Extension of multiple range tests to group means with unequal numbers of replications". In: _Biometrics_ 12.3, pp. 307-310. ISSN: 0006341X. DOI: [10.2307/3001469](https://doi.org/10.2307%2F3001469). URL: [https://www.jstor.org/stable/3001469?origin=crossref](https://www.jstor.org/stable/3001469?origin=crossref).

Marôco, J. (2021). _Análise estatística com o SPSS statistics_. 8th ed. Pêro Pinheiro: ReportNumber.

Olejnik, S. and J. Algina (2003). "Generalized eta and omega squared statistics: Measures of effect size for some common research designs". In: _Psychological Methods_ 8.4, pp. 434-447. ISSN: 1939-1463. DOI: [10.1037/1082-989X.8.4.434](https://doi.org/10.1037%2F1082-989X.8.4.434).

---

# References

Pace, L. A. (2012). _Beginning R: An introduction to statistical programming_. Berkeley, CA: Apress. ISBN: 978-1-4302-4554-4. DOI: [10.1007/978-1-4302-4555-1](https://doi.org/10.1007%2F978-1-4302-4555-1). URL: [http://link.springer.com/10.1007/978-1-4302-4555-1](http://link.springer.com/10.1007/978-1-4302-4555-1).

Peters, G. Y. and P. Verboon (2023). _rosetta: Parallel use of statistical packages in teaching (R package version 0.3.12) [Computer software]_. URL: [https://cran.r-project.org/package=rosetta](https://cran.r-project.org/package=rosetta).

R Core Team (2021). _R: A language and environment for statistical computing (version 4.1.1) [Computer software]_. Vienna. URL: [https://www.r-project.org/](https://www.r-project.org/).

RStudio Team (2021). _RStudio: Integrated development for R (version 1.4.1717) [Computer software]_. Boston, MA. URL: [http://www.rstudio.com/](http://www.rstudio.com/).

--- class: center, bottom, inverse # More info -- Slides created with the <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> package [`xaringan`](https://github.com/yihui/xaringan). -- <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;fill:currentColor;position:relative;display:inline-block;top:.1em;"> <g label="icon" id="layer6" groupmode="layer"> <path id="path2" d="M 132.62426,316.69067 C 119.2805,301.94483 112.56962,274.5073 112.56962,234.39862 v -54.79191 c 0,-37.32217 -5.81677,-63.58084 -17.532347,-78.83466 -11.6757,-15.293118 -31.159702,-22.922596 -58.353466,-22.922596 -5.958581,0 -11.409226,0.22492 -16.45319,0.5917 -5.04455,0.427121 -9.742846,1.037046 -14.1564111,1.83092 V 95.057199 H 16.671281 c 12.325533,0 20.908335,3.82414 25.667559,11.532201 4.77973,7.74964 7.139712,25.48587 7.139712,53.14663 v 68.01321 c 0,42.12298 13.016861,74.19672 39.233939,96.16314 19.627549,16.47424 46.636229,27.23363 81.030059,32.40064 v -20.17708 c -16.3928,-4.27176 -29.04346,-10.51565 -37.11829,-19.44413 z m 246.75144,0 c 13.34377,-14.74584 20.05466,-42.18337 20.05466,-82.29205 v -54.79191 c 0,-37.32217 5.81673,-63.58084 17.53235,-78.83466 11.67568,-15.293118 31.15971,-22.922596 58.35348,-22.922596 5.95858,0 11.40922,0.22492 16.45315,0.5917 5.04457,0.427121 9.74287,1.037046 
14.15645,1.83092 v 14.785125 h -10.59712 c -12.32549,0 -20.90826,3.82414 -25.66752,11.532201 -4.77974,7.74964 -7.13972,25.48587 -7.13972,53.14663 v 68.01321 c 0,42.12298 -13.01688,74.19672 -39.23394,96.16314 -19.6275,16.47424 -46.63622,27.23363 -81.03006,32.40064 v -20.17708 c 16.39279,-4.27176 29.04347,-10.51565 37.11827,-19.44413 z M 303.95857,87.165762 c 8.42049,-6.691524 25.52576,-10.536158 51.23486,-11.492333 V 63.999997 H 156.80716 v 11.673432 c 26.1755,0.956175 43.38268,4.800809 51.68248,11.492333 8.31852,6.73139 12.40691,20.033568 12.40691,39.904818 V 384.6851 c 0,20.80641 -4.08839,34.5146 -12.40691,41.02332 -8.2998,6.56905 -25.50698,10.10729 -51.68248,10.65744 V 448 h 197.71597 l 0.67087,-11.63414 c -25.50471,-0.54955 -42.56835,-4.35266 -51.07201,-11.40918 -8.4182,-6.95638 -12.73153,-20.44184 -12.73153,-40.27158 V 127.07058 c 0,-19.87125 4.16983,-33.173428 12.56922,-39.904818 z" style="stroke-width:0.0753388"></path> </g></svg> + <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> = <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:red;"> [ comment ] <path d="M462.3 62.6C407.5 15.9 326 24.3 275.7 76.2L256 96.5l-19.7-20.3C186.1 24.3 104.5 15.9 49.7 62.6c-62.8 53.6-66.1 149.8-9.9 207.9l193.5 199.8c12.5 12.9 32.8 
12.9 45.3 0l193.5-199.8c56.3-58.1 53-154.3-9.8-207.9z"></path></svg> -- <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> has infinite possibilities. -- Practice is the best strategy for learning. -- . -- _In God we trust, all others bring data_ -- Edwards Deming -- . -- . -- . -- THE END --- class: center, bottom, inverse 