library(logmult)
## Loading required package: gnm
##
## Attaching package: 'logmult'
## The following object is masked from 'package:gnm':
##
## se
library(MASS)
library(vcd)
## Loading required package: grid
##
## Attaching package: 'vcd'
## The following object is masked from 'package:logmult':
##
## assoc
library(vcdExtra)
Exercise 5.1 The data set criminal in the package logmult gives the 4 x 5 table below of the number of men aged 15-19 charged with a criminal case for whom charges were dropped in Denmark from 1955-1958.
(a) Test whether there is an association between Year and Age.
Is there evidence that dropping of charges in relation to age changed over the years recorded here?
data("criminal")
criminal
## Age
## Year 15 16 17 18 19
## 1955 141 285 320 441 427
## 1956 144 292 342 441 396
## 1957 196 380 424 462 427
## 1958 212 424 399 442 430
loglm( ~Year + Age, criminal)
## Call:
## loglm(formula = ~Year + Age, data = criminal)
##
## Statistics:
## X^2 df P(> X^2)
## Likelihood Ratio 38.24466 12 0.0001400372
## Pearson 38.41033 12 0.0001315495
It seems “Year” is the response variable, and “Age” is the predictor variable. In this case, the minimal baseline model is the mutual independence model, [Year][Age],
which states that Year is independent of Age. This is also indicated by both the likelihood ratio and Pearson (above).
(b) Use mosaic() with the option shade = TRUE to display the pattern of signs and magnitudes of the residuals. Compare this with the result of mosaic() using “Friendly shading,” from the option gp=shading_Friendly. Describe verbally what you see in each regarding the pattern of association in this table.
We note that there is a strong association between age 19 and the year 1955, as well as age 16 and the year 1958. In each of the two cases there is a strong indication that charges were dropped.
Also, only for the years, 15 and 16, does there seem to be a pattern of the dropping of charges in relation to age, over the years: the numbers are steadily increasing over the years.
More specically, there is a strong association between age 16 and year 1958; and, there is a strong association between age 19 and year 1955. (See below.)
mosaic(criminal, shade = TRUE, legend = FALSE)

Below, we measure, as well, the strength of the association between Year and Age:
vcd::assocstats(criminal)
## X^2 df P(> X^2)
## Likelihood Ratio 38.245 12 0.00014004
## Pearson 38.410 12 0.00013155
##
## Phi-Coefficient : NA
## Contingency Coeff.: 0.074
## Cramer's V : 0.043
We note that Cramer’s V is closer to 0 than it is to 1.0. This means that there is complete independence: there is significant association.
Below, we look at the result mosaic() using “Friendly shading” from the option gp = shading_Friendly to compare the patterns of association: we compare Friendly with mosaic() option’s shade=TRUE.
pairs(criminal, gp = shading_Friendly)

Since Year and Age are independent (also by the design of the experiment), we can use gp = shading_Friendly to emphasize this.
Both options indicate that there seems to be stronger associations between age 19 and year 1955; and, age 16 and year 1958.
Exercise 5.9
Bertin’s data are contained in “Accident” in vcdExtra. A frequency data frame representing his 5 x 2 x 4 x 2 table of the variables, age, result (died or injured), mode of transportation, and gender, is shown:
The structure of the data is as follows:
data("Accident")
str(Accident, vec.len=2)
## 'data.frame': 80 obs. of 5 variables:
## $ age : Ord.factor w/ 5 levels "0-9"<"10-19"<..: 5 5 5 5 5 ...
## $ result: Factor w/ 2 levels "Died","Injured": 1 1 1 1 1 ...
## $ mode : Factor w/ 4 levels "4-Wheeled","Bicycle",..: 4 4 2 2 3 ...
## $ gender: Factor w/ 2 levels "Female","Male": 2 1 2 1 2 ...
## $ Freq : int 704 378 396 56 742 ...
(a) Use the loglm() function to fit the model of mutual independence,
Freq ~ age + mode + gender + result.
loglm(Freq~age + mode + gender + result, Accident)
## Call:
## loglm(formula = Freq ~ age + mode + gender + result, data = Accident)
##
## Statistics:
## X^2 df P(> X^2)
## Likelihood Ratio 60320.05 70 0
## Pearson 76865.31 70 0
Both tests of association seem to indicate that there is a strong independence, or complete independence–the perfect model–between the variables. (i.e., G-square = 0 and Chi-square = 0)
(b) Use the mosaic function to produce an interpretable mosaic plot of the associations among all variables under the model of mutual independence. Try different orders of the variables in the mosaic. (Hint: the abbreviate component of the “labeling_args” argument to mosaic() will be useful to avoid some overlap of the category labels.)
FA <-loglm(Freq ~ age + mode + result + gender, Accident)
mosaic(FA, shade = TRUE, labeling_args = TRUE, Legend = FALSE)

FMA <- loglm(Freq~mode * age * gender * result, Accident)
mosaic(FMA, shade =TRUE, labeling_args = TRUE, Legend = FALSE)
## Warning in legend(residuals, gpfun, residuals_type): All residuals are
## zero.

Please note that all combinations do not work.
FM <- loglm(Freq ~ mode + age + gender + result, Accident)
mosaic(FM, shade = TRUE, labeling_args = FALSE, Legend = FALSE)

(c) Treat result (“Died” vs. “Injured”) as the response variable, and fit the model Freq ~ age * mode * gender + result that asserts independence of result from all others jointly.
loglm(Freq ~ age*mode*gender + result, Accident)
## Call:
## loglm(formula = Freq ~ age * mode * gender + result, data = Accident)
##
## Statistics:
## X^2 df P(> X^2)
## Likelihood Ratio 2217.72 39 0
## Pearson 2347.60 39 0
LL <- loglm(Freq ~ age*mode*gender + result, Accident)
(d) Construct a mosaic display for the residual associations in this model. Which combinations of the predictor factors are more likely to result in death?
Let’s look at the mosaic plot of the LL association:
mosaic(LL, shade = TRUE, LEGEND = FALSE)

It is very difficult to read and interpret this graph. Hence, other tools may be preferred. However, the combinations of predictor factors that are attributable to death are: 4-wheel 50+, motorcycle 50+, pedestrian 50+, bicycle 50+, 4-wheel 10-19, bicycle 10-19, motorcycle. It seems both sexes are affected.