Y            X               Distribution
Continuous   Scalar          T
Continuous   Binary (2)      T
Continuous   Category (>2)   F
Categorical  Categorical     \(\chi^2\)

>2 Groups F-Test (ANOVA)

\(H_0:\mu_1=\mu_2=\mu_3\)
\(H_1:\neg(\mu_1=\mu_2=\mu_3)\), i.e. at least one group mean differs

F-Test

\[\begin{aligned}
f_{stat} &= \frac{\textrm{average variance between groups}}{\textrm{average variance within groups}} \\
\text{between groups} &= \frac{n_{1}(\bar{y}_{1} - \bar{y})^{2} + \dots + n_{G}(\bar{y}_{G} - \bar{y})^{2}}{G-1}, \quad df_1 = G-1 \\
\text{within groups} &= \frac{(n_{1}-1)s_{1}^{2} + \dots + (n_{G}-1)s_{G}^{2}}{N-G}, \quad df_2 = N-G \\
&\text{where } N = n_1 + \dots + n_G \text{ (total sample size)},\ G = \text{number of groups} \\
&\text{compare } f_{stat} \text{ to } \text{qf}(cl,df_1,df_2) \\
&\text{or compare } p_{value} = 1-\text{pf}(f_{stat},df_1,df_2) \text{ to } \alpha
\end{aligned}\]

(91 * (3.23 - 3.89)^2 + 111 * (3.9 - 3.89)^2 + 74 * (4.7 - 3.89)^2)/2
## [1] 44.10105
# Two samples - for use in determining to pool or not to pool in a t-test.
# var.test() or for multiple variables
# Read the example data set (columns include Dosage and Alertness)
datafilename <- "http://personality-project.org/r/datasets/R.appendix1.data"
data.ex1 <- read.table(datafilename, header = TRUE)
aov.ex1 <- aov(Alertness ~ Dosage, data = data.ex1)
summary(aov.ex1)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Dosage       2  426.2  213.12   8.789 0.00298 **
## Residuals   15  363.8   24.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
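
The formulas above can be checked by hand against `aov()`. The sketch below is illustrative only: the data in `y` and `g` are made-up toy values, not the Alertness data set, and `alpha = 0.05` is an assumed significance level (so `cl = 1 - alpha`).

```r
# Hypothetical toy data (not from the notes): three groups of three scores each.
y <- c(4, 5, 6, 7, 8, 9, 10, 11, 12)
g <- factor(rep(c("a", "b", "c"), each = 3))

n    <- tapply(y, g, length)  # group sizes n_1..n_G
ybar <- tapply(y, g, mean)    # group means
s2   <- tapply(y, g, var)     # group variances s_1^2..s_G^2
G    <- nlevels(g)            # number of groups
N    <- length(y)             # total sample size

# Average variance between and within groups, as in the formulas above
between <- sum(n * (ybar - mean(y))^2) / (G - 1)
within  <- sum((n - 1) * s2) / (N - G)
f_stat  <- between / within

# Decision: critical value and p-value (alpha is an assumed choice)
alpha   <- 0.05
f_crit  <- qf(1 - alpha, G - 1, N - G)
p_value <- 1 - pf(f_stat, G - 1, N - G)

# Cross-check against the F value reported by aov()
f_aov <- summary(aov(y ~ g))[[1]][["F value"]][1]
```

Here `f_stat` and `f_aov` should agree, and the null is rejected when `f_stat > f_crit` or, equivalently, when `p_value < alpha`.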

\(\chi^2\)

  1. \(H_0\): The variables are independent of each other.
    \(H_1\): The variables are not independent (i.e., they are dependent).
  2. Calculate \(f_{e}\) for each cell. Shortcut: \(f_{e} = \frac{\textrm{(row total)(column total)}}{\textrm{overall total}}\)
  3. Calculate \(\chi^{2}= \sum \frac{(f_{o}-f_{e})^{2}}{f_{e}}\)
  4. Calculate \(df = (r-1)(c-1)\)
  5. Calculate the critical value \(\chi^{2}_{crit}=\)qchisq\((cl,df)\), with \(cl=1-\alpha\), and reject the null if the test statistic \(\chi^{2}\) from step 3 is greater than it. Or calculate \(p_{value}=1-\)pchisq\((\chi^2,df)\) directly and reject the null if it is less than your chosen \(\alpha\).
(sexparty <- data.frame(dem = c(573, 386), indep = c(516, 475), rep = c(422, 399), 
    row.names = c("female", "male")))
##        dem indep rep
## female 573   516 422
## male   386   475 399
chisq.test(sexparty)
## 
##  Pearson's Chi-squared test
## 
## data:  sexparty
## X-squared = 16.202, df = 2, p-value = 0.0003033
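
The five steps can also be carried out by hand and compared with the `chisq.test()` output above. This is a minimal sketch using the same counts as the `sexparty` table; the variable names (`obs`, `f_e`, `chi2`, `chi_crit`) and `alpha = 0.05` are assumptions made for illustration.

```r
# Same counts as the sexparty table, stored as a matrix (filled column by column)
obs <- matrix(c(573, 386, 516, 475, 422, 399), nrow = 2,
              dimnames = list(c("female", "male"), c("dem", "indep", "rep")))

# Step 2: expected counts = (row total)(column total) / overall total
f_e <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Step 3: chi-squared test statistic
chi2 <- sum((obs - f_e)^2 / f_e)

# Step 4: degrees of freedom
df <- (nrow(obs) - 1) * (ncol(obs) - 1)

# Step 5: critical value and p-value (alpha is an assumed choice)
alpha    <- 0.05
chi_crit <- qchisq(1 - alpha, df)
p_value  <- 1 - pchisq(chi2, df)

# Cross-check against the built-in test
chi2_builtin <- unname(chisq.test(obs)$statistic)
```

The hand-computed `chi2`, `df`, and `p_value` should match the `X-squared`, `df`, and `p-value` reported by `chisq.test()`, and the null is rejected when `chi2 > chi_crit`.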