13.4, 13.11, 14.4, 14.20, 14.23
In a famous sociological study called Middletown, Lynd and Lynd (1956) administered questionnaires to 784 white high school students. The students were asked which two of ten given attributes were most desirable in their fathers. The following table shows how the desirability of the attribute “being a college graduate” was rated by male and female students. Did the males and females value this attribute differently?
| | Male | Female |
|---|---|---|
| Mentioned | 86 | 55 |
| Not Mentioned | 283 | 360 |
Solution
First, let’s state the hypotheses being tested:
\(H_0:\) the distribution of responses is the same for males and females
\(H_1:\) the two distributions differ
To determine whether the two populations value this attribute differently, let’s extend the table with its row and column totals:
| | Male | Female | Totals |
|---|---|---|---|
| Mentioned | 86 | 55 | 141 |
| Not Mentioned | 283 | 360 | 643 |
| Totals | 369 | 415 | 784 |
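First, the degrees of freedom are \(df = (2-1)(2-1) = 1\).
Next, let’s write down the expected cell counts under \(H_0\). In the usual notation, each one is the row total times the column total, divided by the grand total:
\(E_{11} = (141 \times 369)/784 = 66.36\)
\(E_{21} = (643 \times 369)/784 = 302.64\)
\(E_{12} = (141 \times 415)/784 = 74.64\)
\(E_{22} = (643 \times 415)/784 = 340.36\)
From these we can calculate Pearson’s chi-square statistic, \(X^2 = \sum (O - E)^2 / E\), summed over the four cells. We carry out the remaining arithmetic in R: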
expected <- c(66.36, 302.64, 74.64, 340.36)
observed <- c(86, 283, 55, 360)
test_stat <- sum((expected - observed)^2 / expected)
p_value <- 1 - pchisq(q = test_stat, df = 1); p_value
## [1] 0.0002531856
The p-value is far below any conventional significance level, so we reject the null hypothesis and conclude that males and females value this attribute differently.
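As a cross-check, the same test can be run with R's built-in `chisq.test` on the 2×2 table of observed counts. This is only a sketch: the matrix layout and the names `counts` and `result` are our own choices. Setting `correct = FALSE` turns off the continuity correction, so the statistic should agree with the hand computation up to the rounding of the expected counts.

```r
# Observed counts: rows = response, columns = sex (matrix() fills column by column)
counts <- matrix(c(86, 283, 55, 360), nrow = 2,
                 dimnames = list(c("Mentioned", "Not Mentioned"),
                                 c("Male", "Female")))

# Pearson chi-square test without Yates' continuity correction
result <- chisq.test(counts, correct = FALSE)

result$expected   # expected cell counts under H0
result$statistic  # Pearson chi-square statistic
result$p.value    # p-value on 1 degree of freedom
```

Working from the unrounded expected counts avoids the small rounding error introduced by typing them in to two decimal places.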
Derive the likelihood ratio test of homogeneity.
Calculate the likelihood ratio test statistic for the example of Section 13.3, and compare it to Pearson’s chi-square statistic.
Derive the likelihood ratio test of independence.
Calculate the likelihood ratio test statistic for the example of Section 13.4, and compare it to Pearson’s chi-square statistic.
Solution
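The likelihood ratio test is the same one that we developed using the multinomial. Applying those results here, the test statistic is \[-2 \log \Lambda = 2 \sum_i \text{Obs}_i \log\left( \frac{\text{Obs}_i}{\text{Exp}_i} \right),\] where the sum runs over all cells of the table. Under the null hypothesis it is approximately chi-square distributed with \((I-1)(J-1)\) degrees of freedom for a table with \(I\) rows and \(J\) columns.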
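For reference, here is a sketch of where this statistic comes from. The notation is ours, not the solution's: \(n_{ij}\) is the observed count in row \(i\) and column \(j\), and \(n_{i\cdot}\), \(n_{\cdot j}\), \(n_{\cdot\cdot}\) are the row, column, and grand totals. For homogeneity, the \(J\) columns are independent multinomials with cell probabilities \(\pi_{ij}\), and the null hypothesis is \(\pi_{ij} = \pi_i\) for every \(j\). The maximum likelihood estimates are \(\hat\pi_i = n_{i\cdot}/n_{\cdot\cdot}\) under the null and \(\tilde\pi_{ij} = n_{ij}/n_{\cdot j}\) without restriction. The multinomial coefficients cancel in the ratio, leaving

\[\Lambda = \prod_{j=1}^{J} \prod_{i=1}^{I} \left( \frac{\hat\pi_i}{\tilde\pi_{ij}} \right)^{n_{ij}}, \qquad -2\log\Lambda = 2 \sum_{i=1}^{I} \sum_{j=1}^{J} n_{ij} \log\left( \frac{n_{ij}}{E_{ij}} \right), \qquad E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}.\]

The degrees of freedom are \(J(I-1) - (I-1) = (I-1)(J-1)\). For independence there is a single multinomial over all \(IJ\) cells. The null hypothesis \(\pi_{ij} = \pi_{i\cdot}\pi_{\cdot j}\) gives \(\hat\pi_{ij} = n_{i\cdot} n_{\cdot j}/n_{\cdot\cdot}^2\), and the unrestricted estimate is \(\tilde\pi_{ij} = n_{ij}/n_{\cdot\cdot}\), so the ratio \(\tilde\pi_{ij}/\hat\pi_{ij} = n_{ij}/E_{ij}\) is the same and so is the statistic. The degrees of freedom are \((IJ-1) - (I-1) - (J-1) = (I-1)(J-1)\).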
We compute the statistic in R, first for the example of Section 13.3:
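# Observed counts per category for the imitator ("admirer") and for Austen
admiror_o <- c(83, 29, 15, 22, 43, 4)
austin_o <- c(434, 62, 86, 236, 161, 38)
# Expected counts under homogeneity: row total * column total / grand total
row_totals <- admiror_o + austin_o
admiror_e <- row_totals * sum(admiror_o) / sum(row_totals)
austin_e <- row_totals * sum(austin_o) / sum(row_totals)
# Likelihood ratio statistic: 2 * sum(observed * log(observed / expected))
2 * (sum(admiror_o * log(admiror_o / admiror_e)) + sum(austin_o * log(austin_o / austin_e)))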
## [1] 31.73677
# The same computation wrapped in a function, applied below to the Section 13.4 example
calculate_lr <- function(pop1, pop2) {
  row_totals <- pop1 + pop2
  pop1_expected <- row_totals * sum(pop1) / sum(row_totals)
  pop2_expected <- row_totals * sum(pop2) / sum(row_totals)
  return(2 * (sum(pop1 * log(pop1 / pop1_expected)) + sum(pop2 * log(pop2 / pop2_expected))))
}
college <- c(550, 61)
no_college <- c(681, 144)
print(calculate_lr(college, no_college))
## [1] 16.54576
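The exercise also asks for a comparison with Pearson's chi-square statistic, which the code above does not compute. The sketch below fills that in, using the same expected counts as the likelihood ratio computation. The helper `calculate_pearson` and the variable names are hypothetical, introduced here only for illustration.

```r
# Pearson chi-square statistic for a two-way table given as two count vectors
# (hypothetical helper; expected counts as in the likelihood ratio version)
calculate_pearson <- function(pop1, pop2) {
  row_totals <- pop1 + pop2
  n <- sum(row_totals)
  pop1_expected <- row_totals * sum(pop1) / n
  pop2_expected <- row_totals * sum(pop2) / n
  sum((pop1 - pop1_expected)^2 / pop1_expected) +
    sum((pop2 - pop2_expected)^2 / pop2_expected)
}

# Section 13.3 data (homogeneity), 5 degrees of freedom
pearson_13_3 <- calculate_pearson(c(83, 29, 15, 22, 43, 4),
                                  c(434, 62, 86, 236, 161, 38))

# Section 13.4 data (independence), 1 degree of freedom
pearson_13_4 <- calculate_pearson(c(550, 61), c(681, 144))

print(c(pearson_13_3, pearson_13_4))
```

With samples this large, both Pearson statistics should land close to the likelihood ratio values 31.74 and 16.55 reported above, and both tests lead to the same conclusion.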
Give the form of the design matrix for such a model.
Solution
The design matrix is as follows:
\[\mathbf{X} = \begin{bmatrix} I_F(1) &I_M(1) &x_1\\ I_F(2) &I_M(2) &x_2\\ I_F(3) &I_M(3) &x_3\\ \vdots &\vdots &\vdots \\ I_F(n) &I_M(n) &x_n \end{bmatrix} \]
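A matrix of this form can be built in R with `model.matrix`. The sketch below uses made-up data: the group labels `F` and `M`, the covariate values, and the names `sex` and `x` are all assumptions for illustration. Dropping the intercept with `0 +` makes R produce one indicator column per group, matching the two indicator columns above.

```r
# Hypothetical data: a two-level group label and a numeric covariate
sex <- factor(c("F", "M", "M", "F", "M"), levels = c("F", "M"))
x   <- c(1.2, 0.7, 3.1, 2.4, 1.9)

# "0 +" removes the intercept, so sex is coded with one indicator per level
X <- model.matrix(~ 0 + sex + x)
print(X)
```

Each row has exactly one 1 across the two indicator columns, followed by that observation's covariate value.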