class: center, middle, inverse, title-slide .title[ # An Introduction to
R
and RStudio for Educational Researchers ] .subtitle[ ##
Descriptive and Inferential Statistics:
Statistical Decision Theory ] .author[ ### Jorge Sinval ] .date[ ### 2025-11-18 ] --- class: inverse, center, middle <style> .orange { color: #EB811B; } .kbd { display: inline-block; padding: .2em .5em; font-size: 0.75em; line-height: 1.75; color: #555; vertical-align: middle; background-color: #fcfcfc; border: solid 1px #ccc; border-bottom-color: #bbb; border-radius: 3px; box-shadow: inset 0 -1px 0 #bbb } </style>
# 3. Statistical Inference

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# 3. Statistical Inference

>Extrapolation of the results obtained from the study of the sample to the population in order to infer about the parameter values of the theoretical population from which the samples were obtained and/or to validate hypotheses (on which the theories are based) about these parameters.

Two main fields:

--

1 - **Estimation Theory:** _seeks to estimate the value of the parameters of the theoretical population from sample estimates_

.pull-left[
- Point estimation
]

.pull-right[
- Interval estimation
]

---
# Point estimation

Estimates .orange[one] value for the population parameter from a supposedly representative sample of the population.

Some frequent estimators:

- Population proportion `\((\pi)\)`: `\(\hat \pi=\frac{X}{n}\)`
- Population mean `\((\mu)\)`: `\(\bar X=\frac{1}{n} \sum\limits_{i=1}^{n} X_i\)`
- Population variance `\((\sigma^2)\)`: `\(\hat S'^2=\frac{1}{n-1} \sum\limits_{i=1}^{n} (\bar X-X_i)^2\)`

The problem with this type of estimation is that it does not have any measure of certainty (or uncertainty) associated with the estimate<sup>⚠️</sup>

.orange[The solution lies in interval estimation...]

.footnote[
⚠️ Multiple samples lead to multiple estimates of a population parameter, which is .orange[unique].
]

---
# Interval estimation

Estimation of population parameters through confidence intervals.

<br>

--

<br>

**Confidence Interval (CI)**: Range of values that, for a given significance level `\((\alpha)\)` or confidence level `\([1-\alpha \text{ or } (1-\alpha) \times 100\%]\)`, will contain the true value of the population parameter in `\((1-\alpha) \times 100\%\)` of such intervals.

--

e.g. if `\(X\sim N(\mu;\sigma)\)`, a `\((1-\alpha)\times100\%\)` C.I. for `\(\mu\)` is an interval of the type:

--

<br>

`$$\begin{aligned}\bigg]\bar X - z_{1-\frac{\alpha}{2}}\times\frac{\sigma}{\sqrt{n}};\bar X + z_{1-\frac{\alpha}{2}}\times\frac{\sigma}{\sqrt{n}}\bigg[\end{aligned}$$`

---
# Interval estimation

Out of 100 intervals calculated for 100 representative samples of the same population where `\(X\sim N(0;1)\)`, on average `\(95\)` contain the true value of the population mean (i.e. `\(\mu=0\)`).<sup>⚠️</sup>

.footnote[
⚠️ This is not the same as saying that `\(P[\mu\in \text{CI}_{95\%}]=0.95\)`
]

.center[
<iframe height="315" src="assets/vid/ci.mp4" frameborder="0" allowfullscreen></iframe>
]

---
# Decision theory

<br>

2 - **Decision Theory:** _seeks to substantiate decisions, using hypothesis tests relating to the parameters of the population, supported by a concrete measure of the degree of "(un)certainty" regarding the decision taken_

<br>
<br>

- Hypothesis Tests<sup>🤓</sup>

<br>
<br>

>Inference about the value of a population parameter(s), or operational or theoretical hypotheses, based on sample statistics.

.footnote[
🤓 It is the most widely used area of statistics and data analysis!
]

---
# Hypothesis Tests
## Hypotheses

**Null Hypothesis** `\((H_0)\)`: Assumed to be true until there is significant evidence to the contrary. e.g. The defendant is not guilty

**Alternative Hypothesis** `\((H_1)\)`: Alternative to `\(H_0\)` (the one the researcher actually intends to support)<sup>🤓</sup>

.pull-left[Hypotheses can be defined in accordance with the test type:

.orange[Two-tailed test]

`\(H_0: \mu = \mu_0 \text{ vs. } H_1: \mu \neq \mu_0\)`

.orange[Left-tailed test]

`\(H_0: \mu \geq \mu_0 \text{ vs. } H_1: \mu < \mu_0\)`
.orange[Right-tailed test]

`\(H_0: \mu \leq \mu_0 \text{ vs. } H_1: \mu > \mu_0\)`
]

--

⚠️ A statistical hypothesis `\((H_0 \text{ or } H_1)\)` is never accepted. We can only reject or not reject `\(H_0\)`!

.orange[Note:] Not rejecting `\(H_0\)` does not prove that `\(H_1\)` is correct.

Example: Not rejecting `\(H_0\)`: "the defendant is not guilty", does not prove that he is innocent! It only proves that there is not enough evidence "beyond a reasonable doubt" to reject `\(H_0\)` and conclude that the defendant is guilty…

--

🤓 It is the most widely used area of statistics and data analysis!

---
# Hypothesis Tests
## Test statistic

**Relative distance** (e.g. units of standard deviation) between the hypothetical parameter (defined in `\(H_0\)`) and the sample estimate (calculated on the sample).

--

If this distance is **_too large_**, then, assuming that the sample is representative, the hypothetical population value might not be correct, so `\(H_0\)` should be rejected.

--

<br>
<br>
<br>

**How do we know if the .orange[Test Statistic] is _too big_ or not?**

--

Example: `\(H_0: \mu = \mu_0 \text{ vs. } H_1: \mu \neq \mu_0\)`

`$$\begin{aligned} T = \frac{\bar X- \mu_0}{\frac{S'}{\sqrt{n}}}\end{aligned}$$`

`\(\bar X\)` - sample mean; `\(\mu_0\)` - population mean (under `\(H_0\)`); `\(\frac{S'}{\sqrt{n}}\)` - standard error of the mean

---
# Hypothesis Tests
## Decision

How do we know if the .orange[test statistic] is **too big** or **not**?

--

Knowing the .orange[test statistic sampling distribution] (Student's _t_, _F_, `\(\chi^2\)`, etc.). Two possibilities:

**A. Neyman & Pearson:** If the test statistic belongs to the **rejection region**, reject `\(H_0\)`.

--

.orange[Rejection Region]: region whose probability of containing the test statistic value, under `\(H_0\)`, is `\(\alpha\)` `\((\alpha\)`: level of significance or probability of type I error).

**B. R. Fisher:** Reject `\(H_0\)` if the _p_-value `\(\leq\)` critical probability (e.g. 1/20 or 0.05; usual value)

--

.orange[_p_-value]: probability of obtaining a value equal to or more extreme (in the direction of `\(H_1\)`) than the test statistic if `\(H_0\)` is true.

---
# Hypothesis Tests

**Statistical errors**

When a .orange[statistical decision] is taken, it can be wrong. In statistics, it is possible to **quantify** the probability of taking the wrong decision (.orange[Theory of Neyman & Pearson]):

.center[**Decision**]

| `\(\downarrow\)` Population|Reject `\(H_0\)`|Not Reject `\(H_0\)`|
|-----------------: | --------------: | ---------: |
| `\(H_0\)` is false <br> (the effect is present) |.orange[Correct decision] <br> (effect detected) <br> `\(P[Corr.Dec.]=1-\beta\)`|.orange[Type II Error] <br> (effect not detected) <br> `\(P[\text{Type II Error}]=\beta\)` |
| `\(H_0\)` is true <br> (the effect is not present) | .orange[Type I Error] <br> (effect detected, but it is not true) <br> `\(P[\text{Type I Error}]=\alpha\)` | .orange[Correct decision] <br> (effect not detected) <br> `\(P[Corr.Dec.]=1-\alpha\)` |

--

.footnote[
<br>
<br>
<br>
<br>
- Usual values: `\(\alpha = .001, .01, .05\)`
- Usual values: `\((1-\beta)\geq.80\)`
]

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

A .orange[statistically significant] result just means that the result obtained is different from the one we would expect to obtain by mere chance…

--

_while_

--

...A statistically significant result may have no practical meaning (and vice-versa).
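A minimal sketch of this point, with simulated data (the sample size, means, and SD below are made up for illustration): with a very large `\(n\)`, even a negligible difference comes out "statistically significant".

```r
set.seed(123)
# two groups that differ by only 0.02 standard deviations, but with n = 50,000 each
g1 <- rnorm(n = 50000, mean = 100.0, sd = 15)
g2 <- rnorm(n = 50000, mean = 100.3, sd = 15)
t.test(g1, g2)$p.value          # tiny p-value: "statistically significant"
(mean(g2) - mean(g1)) / 15      # standardized difference of about 0.02: practically trivial
```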
Differences between **practical significance** and **statistical significance** are generally associated with:

--

+ **Sample size**: the greater the `\(n\)`, the greater the statistical significance, but not necessarily the practical significance. At the limit, any effect can become statistically significant by sufficiently increasing the sample size.

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

`\((\bar x_1= 10;\bar x_2= 12;\bar x_3= 13.5; s'_i =2)\)`

.panelset[
.panel[.panel-name[N=12]
.center[]
]
.panel[.panel-name[N=24]
.center[]
]
]

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

- **Significance level `\((\alpha)\)`**: the greater the `\(\alpha\)`, the greater the probability of obtaining statistically significant results.

--

- **Test power `\((1-\beta)\)`**: the greater the power of a test `\((1-\beta)\)`, the greater the probability of detecting a significant difference between the groups.

--

- **Effect size**: the greater the effect size, the greater the practical significance.

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

**What is the ".orange[effect size]"?**

The "**practical significance**" of a result depends not only on the differences between groups (e.g. experimental vs. control) but also on the experimental, socioeconomic, etc. context where these differences or effects occur…

The effect size is a component of practical significance…

--

>Index, standardized or not, that assesses the magnitude of the difference between groups in a given dependent variable or the effect of a factor on the variation of a dependent variable. <sup>📜</sup>

.footnote[
📜 Wilkinson, L. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604. [https://doi.org/10.1037/0003-066X.54.8.594](https://doi.org/10.1037/0003-066X.54.8.594)
]

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

### Effect sizes

• In addition to the .orange[test statistic] and .orange[_p_-values], a measure of the effect size and/or a C.I. should be reported.

--

• If the measurement units are meaningful (e.g. number of cars produced per month), a non-standardized measure (difference between means, regression coefficient) should be provided.

--

• If the units of measure are meaningless (e.g. score on a job satisfaction scale), a standardized effect size should be provided (Cohen's `\(d\)`, Hedges' `\(g\)`, Cohen's `\(f\)`, `\(\eta_p^2\)`, `\(R^2\)`).

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**

### Effect sizes

Cohen's `\(d\)`: Relative difference between means in units of S.D.

`\(d=\frac{\bar X_1-\bar X_2}{\sqrt{\frac{S'^2_1+S'^2_2}{2}}}\)`

--

`\(\eta^2\)`: Proportion of total variability explained by the factor: The `\(\eta^2\)` for each factor depends on the number of other factors and the size of their effects. The sum of `\(\eta^2\)` gives the proportion of the total variation of the d.v. explained by the factors.

`\(\eta^2=SSF/SST\)`

--

Cohen's `\(f\)`: Index of deviation from "no effect":

`\(f=\sqrt{\frac{\eta^2}{1-\eta^2}}\)`

---
# Hypothesis Tests

**Statistical Significance vs. practical significance**
### Effect sizes

`\(\eta^2_p\)`: Proportion of the factor and error variability that is explained by the factor:

`\(\eta^2_p=\frac{SSF}{SSF+SSE}\)`

--

`\(\omega^2\)`: since `\(\eta^2\)` tends to overestimate the true size of the effect in the population, it might be preferable to use `\(\omega^2\)`, which is a less biased estimator of the size of the population effect (do not calculate if `\(F<1\)`):

`\(\omega^2=\frac{SSF-df_F \times MSE}{SST+MSE}\)`

---
# Hypothesis Tests
## Parametric Tests

- Focus on population parameters
- Require the sampling distribution of the dependent variable (under study) to be known; usually `\(N(\mu,\sigma)\)`.
- Various other assumptions (e.g. homogeneity of variance of the dependent variable across the groups under study; sphericity).

--

## Non-Parametric Tests

- Do not require the sampling distribution of the variable under study to be known; or, more strictly, do not require this to follow `\(N(\mu;\sigma)\)`.
- Alternative to parametric tests when the conditions for applying these tests are not fulfilled. <sup>⚠️</sup>
- Generally only consider counts or ranks of observations
- Suitable for qualitative variables

.footnote[<sup>⚠️</sup> usually non-parametric tests have less statistical power `\((1-\beta)\)`.]

---
# Hypothesis Tests
## Parametric vs. Non-Parametric tests

**Why not just use non-parametric tests, as these tests do not require validating the assumptions of parametric tests?**

--

1. Because they generally have .orange[less statistical power] than parametric tests.

--

2. They are .orange[more conservative], i.e. they require greater differences between the population parameter and sample statistics to classify this difference as significant; and/or require larger sample sizes for the same difference.

--

3. Less .orange["availability"] of non-parametric tests than of parametric tests in statistical analysis software, especially for multivariate methods.

---
class: inverse, center, middle

# Independent samples

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# Independent samples

There is no type of relationship or unifying factor between the elements of the samples. The elements are selected at random by group.

The theoretical probability of a given observation belonging to more than one sample is null; e.g. control group vs. experimental group

---
class: inverse, center, middle

# Paired samples

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# Paired samples

**Repeated measurements**: constituted using the same experimental subjects (e.g., longitudinal studies, before vs. after)

**Blocks**: groups with a common relationship where treatments are applied (e.g., twins, couples)

---
# Hypothesis Tests

Main parametric methods used in the social and human sciences:

- Student's _t_-test (.orange[compare one or two means])
- ANOVA (.orange[compare two or more means])
- General Linear Models (ANOVA and linear regression are particular cases)

--

Assumptions?

---
class: inverse, center, middle

# Assumptions

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
class: inverse, center, middle

# Normality

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# Assumptions
## Normality

1. The dependent variable is quantitative and has a normal distribution.
Options:

* Q-Q plots – not exact, but sufficient for most cases, as parametric tests are robust to small deviations from normality
* Kolmogorov-Smirnov test with Lilliefors correction for samples – very sensitive to sample size (small deviations from normality are undervalued in small samples; but are overvalued in large samples)
* Shapiro-Wilk – less sensitive to sample size (especially for samples up to 50 elements)
* Skewness `\((|sk| \leq 3)\)` and kurtosis `\((|ku|\leq7)\)` values

---
# Assumptions
## Normality: Q-Q plots

```r
x_var <- rnorm(n = 1000, mean = 0, sd = 1)
*car::qqPlot(x = x_var)
```

.center[]

---
# Kolmogorov-Smirnov Test<sup>🤓</sup>

.panelset[
.panel[.panel-name[Hypotheses]
`\(H_0^i: Y_i \sim N(\mu_i;\sigma_i)\)` vs. `\(H_1^i: Y_i \nsim N(\mu_i;\sigma_i);~(i=1,...,k)\)`
]
.panel[.panel-name[Test Statistic]
`\(D_n=\max_i\{|F(x_i)-F_0(x_i)|,|F(x_{i-1})-F_0(x_i)|\}\)`

.pull-left[
where

`\(F(x_i)\)` - empirical distribution function of `\(X\)`

`\(F_0(x_i)\)` - theoretical distribution function of `\(X\)` `\((N)\)`, with `\(\mu=\bar x\)` and `\(\sigma=s'\)`
]

.pull-right[
.center[]
]
]
.panel[.panel-name[Decision]
In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\leq\alpha; \text{ reject }H_0.\)`
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The Kolmogorov-Smirnov test (with Lilliefors correction for samples) was used to assess the normality of the distribution.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The variable `\(Age\)` does not follow a normal distribution `\((D_{(1041)} = 0.103; p < .001)\)`.
]]
.panel[.panel-name[R Code]
```r
ds <- readr::read_csv("data/pone.0231474.s001.csv")
# test normal distribution
library(nortest)
*lillie.test(x = ds[complete.cases(ds$Age),]$Age)
```
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## 
## Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  ds[complete.cases(ds$Age), ]$Age
*## D = 0.10314, p-value < 2.2e-16
```
]
]
]

.footnote[<sup>🤓</sup> with Lilliefors correction for samples]

---
# Shapiro-Wilk Test <sup>⚠️</sup>

.panelset[
.panel[.panel-name[Hypotheses]
`\(H_0^i: Y_i \sim N(\mu_i;\sigma_i)\)` vs. `\(H_1^i: Y_i \nsim N(\mu_i;\sigma_i);~(i=1,...,k)\)`
]
.panel[.panel-name[Test Statistic]
`\(W = \frac{\bigg(\sum \limits^n_{i=1} a_i x_{(i)}\bigg)^2}{\sum \limits^n_{i=1} (x_i- \bar x)^2}\)`

Small `\(W\)` values indicate that the distribution is not normal. The test statistic `\(W\)` is equivalent to the correlation coefficient between the quantiles of the Normal Distribution `\((a_i)\)` and the quantiles of the observed distribution.
]
.panel[.panel-name[Decision]
In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\text{-value}=P[W\leq w_n]\leq\alpha; \text{ reject }H_0.\)`
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The Shapiro-Wilk test was used to assess the normality of the distribution.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The variable `\(Age\)` does not follow a normal distribution `\((W_{(1041)} = 0.955; p < .001)\)`.
]]
.panel[.panel-name[R Code]
```r
ds <- readr::read_csv("data/pone.0231474.s001.csv")
# test normal distribution
library(stats)
*shapiro.test(x = ds[complete.cases(ds$Age),]$Age)
```
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## 
## Shapiro-Wilk normality test
## 
## data:  ds[complete.cases(ds$Age), ]$Age
*## W = 0.95481, p-value < 2.2e-16
```
]
]
]

.footnote[
⚠️ should be preferred to the Kolmogorov-Smirnov test (with Lilliefors correction for samples) when `\(n_i<51\)`.
]

---
class: inverse, center, middle

# Homoscedasticity

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# Assumptions
## Homoscedasticity

When two or more **independent samples** are compared, the variances of the populations from which these groups were extracted are homogeneous (homoscedasticity or homogeneity of variances).

* Levene Test

---
# Levene Test

.panelset[
.panel[.panel-name[Hypotheses]
`\(H_0: \sigma^2_1 = \sigma^2_2 = ... = \sigma^2_k\)` vs. `\(H_1: \exists i,j: \sigma^2_i \neq \sigma^2_j; i \neq j; i,j=1,...,k\)`
]
.panel[.panel-name[Test Statistic]
1. Calculate `\(Z_{ij}=|Y_{ij}- \bar Y_i|\)` if `\(Y \sim N(\mu; \sigma)\)` or `\(Z_{ij}=|Y_{ij}- \tilde Y_i|\)` if `\(Y \nsim N(\mu; \sigma)\)`

2. Calculate the test statistic, where `\(k\)` is the number of samples:
`\(W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum \limits^k_{i=1} n_i (\bar Z_i - \bar Z)^2}{\sum \limits^k_{i=1} \sum \limits^{n_i}_{j=1} (Z_{ij} - \bar Z_i)^2} \sim F_{(k-1;N-k)}\)`
]
.panel[.panel-name[Decision]
In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\text{-value}=P[F_{(k-1;N-k)}\geq W]\leq\alpha; \text{ reject }H_0.\)`
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The Levene test was used to assess homoscedasticity.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The variable `\(Age\)` presents homoscedasticity across the `\(Sex\)` groups `\((W_{(1;1038)} = 3.567; p = .059)\)`.
]]
.panel[.panel-name[R Code]
```r
ds <- readr::read_csv("data/pone.0231474.s001.csv")
# test homoscedasticity
library(car)
*leveneTest(y = ds$Age, ds$Sex, center = "mean")
```

`center = "mean"` should be selected when normality is assumed in all compared samples. If the normality assumption does not hold for at least one sample, `center = "median"` should be selected.
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## Levene's Test for Homogeneity of Variance (center = "mean")
##         Df F value  Pr(>F)  
*## group    1  3.5669 0.05922 .
*##       1038                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]
]
]

---
## Assumptions

When three <sup>⚠️</sup> or more **paired samples** are compared, the variances of the populations from which these samples were extracted are homogeneous (homoscedasticity or homogeneity of variances) and their covariances are null, in other words, **sphericity**.

* Mauchly Test

The variances between repeated measurements (or blocks) are homogeneous and the covariances between repeated measurements (or blocks) are null - .orange[Mauchly's Sphericity Test] or Epsilon value (rule of thumb; `\(\varepsilon > .7\)`)

.footnote[
<sup>⚠️</sup> in the case of only two paired samples, there is only one covariance, so it makes no sense to test whether all covariances are null – this assumption does not apply to two paired samples.
]

---
class: inverse, center, middle

# Sphericity

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
class: inverse, center, middle

# Mauchly Test

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html>

---
# Sphericity correction

If the `\(H_0\)` of Mauchly's test is rejected, two corrections to the degrees of freedom can be used: the Greenhouse-Geisser correction (Greenhouse and Geisser, 1959) and the Huynh-Feldt correction (Huynh and Feldt, 1976).
These corrections consist of different `\(\varepsilon\)` values that are multiplied by the factor's and the error's degrees of freedom:

- `\(df_{factor_{corrected}}=\varepsilon (k-1)\)`
- `\(df_{error_{corrected}}=\varepsilon (k-1)(b-1)\)`

--

The Greenhouse-Geisser correction tends to underestimate `\(\varepsilon\)` when `\(\varepsilon\)` is close to `\(1\)` (i.e., it is a conservative correction), whilst the Huynh-Feldt correction tends to overestimate `\(\varepsilon\)` (i.e., it is a more liberal correction). The recommendation is to use the Greenhouse-Geisser correction, especially if the estimated `\(\varepsilon\)` is less than `\(.75\)`. Some recommend using the Huynh-Feldt correction if the estimated `\(\varepsilon\)` is greater than `\(.75\)`. Both corrections produce very similar results, so if the estimated `\(\varepsilon\)` is greater than `\(.75\)` using either of them can be equally justified.

---
# Mauchly Test

.panelset[
.panel[.panel-name[Hypotheses]
`\(H_0: \bf{\Sigma} = \sigma^2 \bf{I}\)` (homoscedasticity and null covariances) vs. `\(H_1: \bf{\Sigma} \neq \sigma^2 \bf{I}\)` (heteroscedasticity and/or non-null covariances)
]
.panel[.panel-name[Test Statistic]
`\(W=|\Theta|/[tr(\Theta)/k]^k\)`

`\(\Theta = M'AM\)`, where `\(A=(Y-XB)'W(Y-XB)\)` is the SSE matrix, with `\(df = \frac{k(k-1)}{2}-1\)`
]
.panel[.panel-name[Decision]
In<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>: `\(p\text{-value} \leq \alpha; \text{ reject }H_0.\)`
]
.panel[.panel-name[Statistical Analysis]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The Mauchly test was used to assess the sphericity assumption.
]]
.panel[.panel-name[Results]
.bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt5[
The three **job satisfaction** items do not present sphericity `\((W = 0.281; p < .001)\)`.
]]
.panel[.panel-name[R Code]
```r
ds <- readr::read_csv("data/pone.0231474.s001.csv")
# join the three repeated measurements
job_satisfaction <- cbind(ds$SIJS1, ds$SIJS2, ds$SIJS4)
# make an mlm object
mlm <- lm(job_satisfaction ~ 1)
# Mauchly's test of sphericity
*mlmfit <- mauchly.test(mlm, x = ~ 1)
mlmfit
```
]
.panel[.panel-name[Output]
.scroll-box-16[
```
## 
## Mauchly's test of sphericity
## 
## data:  SSD matrix from lm(formula = job_satisfaction ~ 1)
*## W = 0.28137, p-value < 2.2e-16
```
]
]
]

---
# References

Greenhouse, S. W. and S. Geisser (1959). "On methods in the analysis of profile data". In: _Psychometrika_ 24.2, pp. 95-112. ISSN: 0033-3123. DOI: [10.1007/BF02289823](https://doi.org/10.1007%2FBF02289823).

Huynh, H. and L. S. Feldt (1976). "Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs". In: _Journal of Educational Statistics_ 1.1, pp. 69-82. ISSN: 0362-9791. DOI: [10.3102/10769986001001069](https://doi.org/10.3102%2F10769986001001069).
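---
# Sphericity correction in R

A minimal sketch of how the corrected tests could be obtained in base R, reusing the `mlm` object from the Mauchly test slide: `anova()` on an `mlm` object with `test = "Spherical"` reports the Greenhouse-Geisser and Huynh-Feldt epsilons together with the corrected _p_-values.

```r
# null model without the intercept: comparing it with `mlm` tests whether
# the three repeated measurements share the same mean
mlm0 <- update(mlm, ~ 0)

# univariate repeated-measures F test; "Spherical" adds the
# Greenhouse-Geisser and Huynh-Feldt epsilon corrections
anova(mlm, mlm0, X = ~ 1, test = "Spherical")
```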
--- class: center, bottom, inverse # More info -- Slides created with the <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> package [`xaringan`](https://github.com/yihui/xaringan). -- <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;fill:currentColor;position:relative;display:inline-block;top:.1em;"> <g label="icon" id="layer6" groupmode="layer"> <path id="path2" d="M 132.62426,316.69067 C 119.2805,301.94483 112.56962,274.5073 112.56962,234.39862 v -54.79191 c 0,-37.32217 -5.81677,-63.58084 -17.532347,-78.83466 -11.6757,-15.293118 -31.159702,-22.922596 -58.353466,-22.922596 -5.958581,0 -11.409226,0.22492 -16.45319,0.5917 -5.04455,0.427121 -9.742846,1.037046 -14.1564111,1.83092 V 95.057199 H 16.671281 c 12.325533,0 20.908335,3.82414 25.667559,11.532201 4.77973,7.74964 7.139712,25.48587 7.139712,53.14663 v 68.01321 c 0,42.12298 13.016861,74.19672 39.233939,96.16314 19.627549,16.47424 46.636229,27.23363 81.030059,32.40064 v -20.17708 c -16.3928,-4.27176 -29.04346,-10.51565 -37.11829,-19.44413 z m 246.75144,0 c 13.34377,-14.74584 20.05466,-42.18337 20.05466,-82.29205 v -54.79191 c 0,-37.32217 5.81673,-63.58084 17.53235,-78.83466 11.67568,-15.293118 31.15971,-22.922596 58.35348,-22.922596 5.95858,0 11.40922,0.22492 16.45315,0.5917 5.04457,0.427121 9.74287,1.037046 14.15645,1.83092 v 14.785125 h -10.59712 c -12.32549,0 -20.90826,3.82414 -25.66752,11.532201 -4.77974,7.74964 -7.13972,25.48587 -7.13972,53.14663 v 68.01321 c 0,42.12298 -13.01688,74.19672 -39.23394,96.16314 -19.6275,16.47424 -46.63622,27.23363 -81.03006,32.40064 v -20.17708 c 16.39279,-4.27176 29.04347,-10.51565 37.11827,-19.44413 z M 303.95857,87.165762 c 8.42049,-6.691524 25.52576,-10.536158 51.23486,-11.492333 V 63.999997 H 156.80716 v 11.673432 c 26.1755,0.956175 43.38268,4.800809 51.68248,11.492333 8.31852,6.73139 12.40691,20.033568 12.40691,39.904818 V 384.6851 c 0,20.80641 -4.08839,34.5146 -12.40691,41.02332 -8.2998,6.56905 -25.50698,10.10729 -51.68248,10.65744 V 448 h 197.71597 l 0.67087,-11.63414 c -25.50471,-0.54955 -42.56835,-4.35266 -51.07201,-11.40918 -8.4182,-6.95638 -12.73153,-20.44184 -12.73153,-40.27158 V 127.07058 c 0,-19.87125 4.16983,-33.173428 12.56922,-39.904818 z" style="stroke-width:0.0753388"></path> </g></svg> + <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 
36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> = <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:red;"> [ comment ] <path d="M462.3 62.6C407.5 15.9 326 24.3 275.7 76.2L256 96.5l-19.7-20.3C186.1 24.3 104.5 15.9 49.7 62.6c-62.8 53.6-66.1 149.8-9.9 207.9l193.5 199.8c12.5 12.9 32.8 12.9 45.3 0l193.5-199.8c56.3-58.1 53-154.3-9.8-207.9z"></path></svg> -- <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> has infinite possibilities. -- Practice is the best strategy for learning. -- . -- _In God we trust, all others bring data_ -- Edwards Deming -- . -- . -- . -- THE END --- class: center, bottom, inverse 