Exercise 9.1 Consider the data set DaytonSurvey (described in Example 2.6), giving results of a survey of use of alcohol (A), cigarettes (C), and marijuana (M) among high school seniors. For this exercise, ignore the variables sex and race, by working with the marginal table Dayton.ACM, a 2 × 2 × 2 table in frequency data frame form.
data("DaytonSurvey", package = "vcdExtra")
Dayton.ACM <- aggregate(Freq ~ cigarette + alcohol + marijuana, data = DaytonSurvey, FUN = sum)
Dayton.ACM
## cigarette alcohol marijuana Freq
## 1 Yes Yes Yes 911
## 2 No Yes Yes 44
## 3 Yes No Yes 3
## 4 No No Yes 2
## 5 Yes Yes No 538
## 6 No Yes No 456
## 7 Yes No No 43
## 8 No No No 279
Fit the following models:
Joint independence [AC][M]
Conditional independence [AM][CM]
Homogeneous model [AC][AM][CM]
Mutual independence [A][C][M]
Saturated model [ACM]
library(MASS)  # provides loglm()
Dayton.joint <- loglm(Freq ~ alcohol * cigarette + marijuana, data = Dayton.ACM)      # [AC][M]
Dayton.cond <- loglm(Freq ~ marijuana * (alcohol + cigarette), data = Dayton.ACM)     # [AM][CM]
Dayton.hom <- loglm(Freq ~ (alcohol + cigarette + marijuana)^2, data = Dayton.ACM)    # [AC][AM][CM]
Dayton.mutual <- loglm(Freq ~ alcohol + cigarette + marijuana, data = Dayton.ACM)     # [A][C][M]
Dayton.sat <- loglm(Freq ~ alcohol * cigarette * marijuana, data = Dayton.ACM)        # [ACM]
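As a cross-check on the loglm() fits, the same log-linear models can be fit as Poisson regressions with glm(). The sketch below is only an illustration, not part of the exercise: it rebuilds the marginal table from the frequencies printed above (so it does not need vcdExtra) and refits the homogeneous association model. The names acm and hom.glm are made up for this example.

```r
# Marginal A x C x M table, typed in from the printed Dayton.ACM rows above
acm <- data.frame(
  cigarette = rep(c("Yes", "No"), times = 4),
  alcohol   = rep(rep(c("Yes", "No"), each = 2), times = 2),
  marijuana = rep(c("Yes", "No"), each = 4),
  Freq      = c(911, 44, 3, 2, 538, 456, 43, 279)
)

# Homogeneous association model [AC][AM][CM] as a Poisson GLM:
# all main effects and all two-way interactions, no three-way term
hom.glm <- glm(Freq ~ (alcohol + cigarette + marijuana)^2,
               family = poisson, data = acm)

# Residual deviance (G^2) and residual df of the fit
c(deviance = deviance(hom.glm), df = df.residual(hom.glm))
```

The residual deviance and df reported here should agree with those of Dayton.hom, up to the convergence tolerance of the two fitting algorithms.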
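Prepare a table comparing the goodness of fit (GOF) of these models. Note that anova() sorts the models by decreasing residual df, so the model numbers in the output do not follow the order of the arguments:
anova(Dayton.joint, Dayton.cond, Dayton.hom, Dayton.mutual, Dayton.sat, test = "chisq")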
## LR tests for hierarchical log-linear models
##
## Model 1:
## Freq ~ alcohol + cigarette + marijuana
## Model 2:
## Freq ~ alcohol * cigarette + marijuana
## Model 3:
## Freq ~ marijuana * (alcohol + cigarette)
## Model 4:
## Freq ~ (alcohol + cigarette + marijuana)^2
## Model 5:
## Freq ~ (alcohol * cigarette * marijuana)
##
## Deviance df Delta(Dev) Delta(df) P(> Delta(Dev)
## Model 1 1286.0199544 4
## Model 2 843.8266437 3 442.1933108 1 0.00000
## Model 3 187.7543029 2 656.0723408 1 0.00000
## Model 4 0.3739859 1 187.3803170 1 0.00000
## Model 5 0.0000000 0 0.3739859 1 0.54084
## Saturated 0.0000000 0 0.0000000 0 1.00000
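The homogeneous association model [AC][AM][CM] (Model 4 in the output) is the most acceptable model in terms of fit. Its deviance is G^2 = 0.374 on 1 df, and the test against the saturated model gives p = 0.54, so there is no evidence of lack of fit. The simpler models all fit poorly: mutual independence (G^2 = 1286.0 on 4 df), joint independence (G^2 = 843.8 on 3 df), and conditional independence (G^2 = 187.8 on 2 df) have deviances far larger than their df. Adding the [AC] term to the conditional independence model reduces the deviance by 187.4 on 1 df (p < 0.001). The saturated model fits perfectly by construction, but it offers no simplification of the data.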
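The overall fit of each unsaturated model can also be judged directly by referring its deviance to a chi-square distribution on its residual df. A minimal sketch, with the deviances and df copied by hand from the anova() table above (the data frame gof is a name made up for this illustration):

```r
# Deviances (G^2) and residual df copied from the anova() output above
gof <- data.frame(
  model    = c("[A][C][M]", "[AC][M]", "[AM][CM]", "[AC][AM][CM]"),
  deviance = c(1286.0199544, 843.8266437, 187.7543029, 0.3739859),
  df       = c(4, 3, 2, 1)
)

# Upper-tail chi-square probability of a deviance this large
# if the model were correct; small values indicate lack of fit
gof$p.value <- pchisq(gof$deviance, df = gof$df, lower.tail = FALSE)

print(gof)
```

A small p-value means the model is rejected relative to the saturated model. Only the model with a large p-value gives an adequate description of the table.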