13.4, 13.11, 14.4, 14.20, 14.23

13.4

In a famous sociological study called Middletown, Lynd and Lynd (1956) administered questionnaires to 784 white high school students. The students were asked which two of ten given attributes were most desirable in their fathers. The following table shows how the desirability of the attribute “being a college graduate” was rated by male and female students. Did the males and females value this attribute differently?

               Male  Female
Mentioned        86      55
Not Mentioned   283     360

Solution

First, let’s state the hypotheses being tested.

\(H_0:\) males and females have the same distribution over the two responses

\(H_1:\) the two distributions differ

To determine whether the two populations value this attribute differently, let’s extend the table with its row and column totals.

               Male  Female  Totals
Mentioned        86      55     141
Not Mentioned   283     360     643
Totals          369     415     784

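The test has \(df = (2-1)(2-1) = 1\) degree of freedom.

Next we compute the expected cell counts. In the usual notation, with \(n_{i\cdot}\) the row totals, \(n_{\cdot j}\) the column totals, and \(n_{\cdot\cdot}\) the grand total, they are

\[E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}\]

so that, for example, \(E_{11} = 141 \times 369 / 784 \approx 66.36\). The test statistic is Pearson’s chi-square statistic

\[X^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}\]

where \(O_{ij}\) are the observed counts. We finish the calculation in R.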

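# Expected counts under H0: (row total * column total) / 784, rounded to two decimals
expected <- c(66.36, 302.64, 74.64, 340.36)
# Observed counts in the same order: male mentioned, male not, female mentioned, female not
observed <- c(86, 283, 55, 360)

# Pearson chi-square statistic and its upper-tail p-value with 1 degree of freedom
test_stat <- sum((expected - observed)^2 / expected)
p_value <- 1 - pchisq(q = test_stat, df = 1); p_value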
## [1] 0.0002531856

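The p-value is about 0.00025, far below any conventional significance level, so we reject the null hypothesis and conclude that male and female students valued this attribute differently: about 23% of the males mentioned it (86 of 369) against about 13% of the females (55 of 415).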
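As a cross-check on the hand-entered expected counts, the same test can be run from the observed table alone. The sketch below is one way to do that, not the only one: it builds the expected counts with `outer()` and compares the result with R's built-in `chisq.test()`, with `correct = FALSE` so that no continuity correction is applied. The matrix layout and the name `builtin` are choices made here for illustration.

```r
# Observed 2x2 table from the problem; matrix() fills column by column
observed <- matrix(c(86, 283, 55, 360), nrow = 2,
                   dimnames = list(c("Mentioned", "Not Mentioned"),
                                   c("Male", "Female")))

# Expected counts under H0: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic and its upper-tail p-value with 1 degree of freedom
test_stat <- sum((observed - expected)^2 / expected)
p_value <- pchisq(test_stat, df = 1, lower.tail = FALSE)

# Built-in version; correct = FALSE turns off the Yates continuity correction
builtin <- chisq.test(observed, correct = FALSE)

print(c(manual = test_stat, builtin = unname(builtin$statistic)))
print(c(manual = p_value, builtin = builtin$p.value))
```

Because the expected counts are not rounded here, the statistic differs slightly from the one computed with the two-decimal values, but the conclusion is unchanged.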

13.11

  1. Derive the likelihood ratio test of homogeneity.

  2. Calculate the likelihood ratio test statistic for the example of Section 13.3, and compare it to Pearson’s chi-square statistic.

  3. Derive the likelihood ratio test of independence.

  4. Calculate the likelihood ratio test statistic for the example of Section 13.4, and compare it to Pearson’s chi-square statistic.


Solution

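  1. The likelihood ratio test is the same as the one we developed for the multinomial. Applying those results here, the test statistic is \[2 \sum \text{Obs}_i \log\left( \frac{\text{Obs}_i}{\text{Exp}_i} \right)\] where the sum runs over all cells of the table and each expected count is (row total × column total) / grand total. Under the null hypothesis of homogeneity, the statistic is approximately chi-square distributed with \((I-1)(J-1)\) degrees of freedom for a table with \(I\) rows and \(J\) columns.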

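  2. We compute the statistic in R for the data of Section 13.3.

# Observed counts for the two populations being compared
admiror_o <- c(83,29,15,22,43,4)
austin_o <- c(434,62,86,236,161,38)

# Expected counts under homogeneity: (row total * column total) / grand total
row_totals <- admiror_o + austin_o
admiror_e <- row_totals*sum(admiror_o)/sum(row_totals)
austin_e <- row_totals*sum(austin_o)/sum(row_totals)

# Likelihood ratio statistic: 2 * sum(Obs * log(Obs / Exp))
2*(sum(admiror_o*log(admiror_o/admiror_e))+ sum(austin_o*log(austin_o/austin_e)))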
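## [1] 31.73677

  3. The likelihood ratio test of independence has the same form. Under independence the fitted cell counts are again (row total × column total) / grand total, so the statistic is \(2 \sum \text{Obs}_i \log(\text{Obs}_i / \text{Exp}_i)\) with \((I-1)(J-1)\) degrees of freedom.

  4. For the example of Section 13.4, we wrap the same calculation in a function.
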
calculate_lr <- function(pop1, pop2) {
  row_totals <- pop1 + pop2
  pop1_expected <- row_totals*sum(pop1)/sum(row_totals)
  pop2_expected <- row_totals*sum(pop2)/sum(row_totals)
  return(2 * (sum(pop1*log(pop1/pop1_expected))+ sum(pop2*log(pop2/pop2_expected))))
}
# Observed counts for the example of Section 13.4
college <- c(550, 61)
no_college <- c(681, 144)

print(calculate_lr(college, no_college))
## [1] 16.54576
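Parts 2 and 4 also ask for a comparison with Pearson's chi-square statistic. A minimal sketch of that comparison follows. `calculate_pearson` is a hypothetical helper introduced here for illustration; it forms the expected counts exactly as `calculate_lr` does and only changes the summand.

```r
# Likelihood ratio statistic, as in calculate_lr above
calculate_lr <- function(pop1, pop2) {
  row_totals <- pop1 + pop2
  pop1_expected <- row_totals * sum(pop1) / sum(row_totals)
  pop2_expected <- row_totals * sum(pop2) / sum(row_totals)
  2 * (sum(pop1 * log(pop1 / pop1_expected)) +
       sum(pop2 * log(pop2 / pop2_expected)))
}

# Hypothetical helper: Pearson's statistic with the same expected counts
calculate_pearson <- function(pop1, pop2) {
  row_totals <- pop1 + pop2
  pop1_expected <- row_totals * sum(pop1) / sum(row_totals)
  pop2_expected <- row_totals * sum(pop2) / sum(row_totals)
  sum((pop1 - pop1_expected)^2 / pop1_expected) +
    sum((pop2 - pop2_expected)^2 / pop2_expected)
}

# Example of Section 13.3
admiror_o <- c(83, 29, 15, 22, 43, 4)
austin_o <- c(434, 62, 86, 236, 161, 38)
lr_13_3 <- calculate_lr(admiror_o, austin_o)
pearson_13_3 <- calculate_pearson(admiror_o, austin_o)

# Example of Section 13.4
college <- c(550, 61)
no_college <- c(681, 144)
lr_13_4 <- calculate_lr(college, no_college)
pearson_13_4 <- calculate_pearson(college, no_college)

# Side-by-side comparison, one row per example
print(rbind("13.3" = c(lr = lr_13_3, pearson = pearson_13_3),
            "13.4" = c(lr = lr_13_4, pearson = pearson_13_4)))
```

Pearson's statistic comes out near 32.8 for the first table and near 16.0 for the second, so in both examples the two statistics are close and lead to the same conclusion.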

14.4

Give the form of the design matrix for such a model.


Solution

The design matrix is as follows:

\[\mathbf{X} = \begin{bmatrix} I_F(1) &I_M(1) &x_1\\ I_F(2) &I_M(2) &x_2\\ I_F(3) &I_M(3) &x_3\\ \vdots &\vdots &\vdots \\ I_F(n) &I_M(n) &x_n \end{bmatrix} \]
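To see a matrix of this shape concretely, the sketch below builds one with `model.matrix()`. It assumes that \(I_F(i)\) and \(I_M(i)\) are indicators of observation \(i\) belonging to the female or male group and that \(x_i\) is a numeric covariate; the data values and the names `sex` and `x` are made up for illustration.

```r
# Hypothetical data: group labels and covariate values are invented
sex <- factor(c("F", "M", "M", "F"), levels = c("F", "M"))
x <- c(1.2, 0.7, 3.1, 2.4)

# Dropping the intercept (0 +) gives one indicator column per level of sex,
# followed by the covariate column
X <- model.matrix(~ 0 + sex + x)
print(X)
```

The columns `sexF` and `sexM` play the roles of \(I_F\) and \(I_M\), so each group gets its own intercept while the two groups share one slope for `x`.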

14.20