Exercise 5.1 The data set criminal in the package logmult gives the 4 × 5 table below of the number of men aged 15–19 charged with a criminal case for whom charges were dropped in Denmark from 1955–1958.

library(MASS)
library(vcd)
library(vcdExtra)
#install.packages("logmult")
library(logmult)
data("criminal",package="logmult")
#criminal
  1. Use loglm() to test whether there is an association between Year and Age. Is there evidence that dropping of charges in relation to age changed over the years recorded here?
loglm(~Year + Age, data = criminal)
## Call:
## loglm(formula = ~Year + Age, data = criminal)
## 
## Statistics:
##                       X^2 df     P(> X^2)
## Likelihood Ratio 38.24466 12 0.0001400372
## Pearson          38.41033 12 0.0001315495

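Both tests reject independence of Year and Age (likelihood ratio X^2 = 38.24 on 12 df, p = 0.00014). This is evidence that the relation between age and the dropping of charges changed over the years 1955 to 1958.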
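As a cross-check on the printout, the p-value can be recomputed from the components stored on the fitted object. This is a minimal sketch, assuming the MASS and logmult packages are installed; the names fit, g2, df and p_value are illustrative and not part of the original solution.

```r
library(MASS)
data("criminal", package = "logmult")

# Fit the independence model [Year][Age] to the 4 x 5 table
fit <- loglm(~ Year + Age, data = criminal)

# A loglm object stores the likelihood-ratio statistic and its degrees of freedom
g2 <- fit$lrt
df <- fit$df

# Upper-tail chi-squared probability, i.e. the P(> X^2) column of the printout
p_value <- pchisq(g2, df, lower.tail = FALSE)
c(G2 = g2, df = df, p = p_value)
```

The degrees of freedom follow from (4 - 1) x (5 - 1) = 12 for a two-way independence model.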

  1. Use mosaic() with the option shade=TRUE to display the pattern of signs and magnitudes of the residuals. Compare this with the result of mosaic() using “Friendly shading,” from the option gp=shading_Friendly. Describe verbally what you see in each regarding the pattern of association in this table.
mosaic(criminal, shade = TRUE, labeling = labeling_residuals)

mosaic(criminal, gp = shading_Friendly, labeling = labeling_residuals)

There is a strong association between age 19 and year 1955, and also between age 16 and year 1958.
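The residuals that drive the shading can also be listed numerically, which makes it easier to name the cells that stand out. The sketch below uses chisq.test() from base R rather than the mosaic machinery; that is a substitution on my part, chosen because it returns the Pearson residuals under independence directly. The names res and big are illustrative.

```r
data("criminal", package = "logmult")

# Pearson residuals, (observed - expected) / sqrt(expected), under independence
res <- chisq.test(criminal)$residuals
round(res, 2)

# Cells with |residual| > 2, the first default shading cutoff in mosaic().
# Positive values mean more cases than independence predicts.
big <- which(abs(res) > 2, arr.ind = TRUE)
data.frame(Year = rownames(res)[big[, 1]],
           Age = colnames(res)[big[, 2]],
           residual = round(res[big], 2))
```

The squared residuals sum to the Pearson X^2 statistic reported by loglm().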

Exercise 5.9 Bertin (1983, pp. 30–31) used a 4-way table of frequencies of traffic accident victims in France in 1958 to illustrate his scheme for classifying data sets by numerous variables, each of which could have various types and could be assigned to various visual attributes. His data are contained in Accident in vcdExtra, a frequency data frame representing his 5 × 2 × 4 × 2 table of the variables age, result (died or injured), mode of transportation, and gender.

  1. Use loglm() to fit the model of mutual independence, Freq ~ age+mode+gender+result to this data set.
data("Accident",package="vcdExtra")
Accident
##      age  result       mode gender  Freq
## 1    50+    Died Pedestrian   Male   704
## 2    50+    Died Pedestrian Female   378
## 3    50+    Died    Bicycle   Male   396
## 4    50+    Died    Bicycle Female    56
## 5    50+    Died Motorcycle   Male   742
## 6    50+    Died Motorcycle Female    78
## 7    50+    Died  4-Wheeled   Male   513
## 8    50+    Died  4-Wheeled Female   253
## 9    50+ Injured Pedestrian   Male  5206
## 10   50+ Injured Pedestrian Female  5449
## 11   50+ Injured    Bicycle   Male  3863
## 12   50+ Injured    Bicycle Female  1030
## 13   50+ Injured Motorcycle   Male  8597
## 14   50+ Injured Motorcycle Female  1387
## 15   50+ Injured  4-Wheeled   Male  7423
## 16   50+ Injured  4-Wheeled Female  5552
## 17 30-49    Died Pedestrian   Male   223
## 18 30-49    Died Pedestrian Female    49
## 19 30-49    Died    Bicycle   Male   146
## 20 30-49    Died    Bicycle Female    24
## 21 30-49    Died Motorcycle   Male   889
## 22 30-49    Died Motorcycle Female    98
## 23 30-49    Died  4-Wheeled   Male   720
## 24 30-49    Died  4-Wheeled Female   199
## 25 30-49 Injured Pedestrian   Male  3178
## 26 30-49 Injured Pedestrian Female  1814
## 27 30-49 Injured    Bicycle   Male  3024
## 28 30-49 Injured    Bicycle Female  1118
## 29 30-49 Injured Motorcycle   Male 18909
## 30 30-49 Injured Motorcycle Female  3664
## 31 30-49 Injured  4-Wheeled   Male 15086
## 32 30-49 Injured  4-Wheeled Female  7712
## 33 20-29    Died Pedestrian   Male    78
## 34 20-29    Died Pedestrian Female    24
## 35 20-29    Died    Bicycle   Male    55
## 36 20-29    Died    Bicycle Female    10
## 37 20-29    Died Motorcycle   Male   660
## 38 20-29    Died Motorcycle Female    82
## 39 20-29    Died  4-Wheeled   Male   353
## 40 20-29    Died  4-Wheeled Female   107
## 41 20-29 Injured Pedestrian   Male  1521
## 42 20-29 Injured Pedestrian Female   864
## 43 20-29 Injured    Bicycle   Male  1565
## 44 20-29 Injured    Bicycle Female   609
## 45 20-29 Injured Motorcycle   Male 18558
## 46 20-29 Injured Motorcycle Female  4010
## 47 20-29 Injured  4-Wheeled   Male  9084
## 48 20-29 Injured  4-Wheeled Female  4361
## 49 10-19    Died Pedestrian   Male    70
## 50 10-19    Died Pedestrian Female    28
## 51 10-19    Died    Bicycle   Male    76
## 52 10-19    Died    Bicycle Female    31
## 53 10-19    Died Motorcycle   Male   362
## 54 10-19    Died Motorcycle Female    54
## 55 10-19    Died  4-Wheeled   Male   150
## 56 10-19    Died  4-Wheeled Female    61
## 57 10-19 Injured Pedestrian   Male  1827
## 58 10-19 Injured Pedestrian Female  1495
## 59 10-19 Injured    Bicycle   Male  3407
## 60 10-19 Injured    Bicycle Female  7218
## 61 10-19 Injured Motorcycle   Male 12311
## 62 10-19 Injured Motorcycle Female  3587
## 63 10-19 Injured  4-Wheeled   Male  3543
## 64 10-19 Injured  4-Wheeled Female  2593
## 65   0-9    Died Pedestrian   Male   150
## 66   0-9    Died Pedestrian Female    89
## 67   0-9    Died    Bicycle   Male    26
## 68   0-9    Died    Bicycle Female     5
## 69   0-9    Died Motorcycle   Male     6
## 70   0-9    Died Motorcycle Female     6
## 71   0-9    Died  4-Wheeled   Male    70
## 72   0-9    Died  4-Wheeled Female    65
## 73   0-9 Injured Pedestrian   Male  3341
## 74   0-9 Injured Pedestrian Female  1967
## 75   0-9 Injured    Bicycle   Male   378
## 76   0-9 Injured    Bicycle Female   126
## 77   0-9 Injured Motorcycle   Male   181
## 78   0-9 Injured Motorcycle Female   131
## 79   0-9 Injured  4-Wheeled   Male  1593
## 80   0-9 Injured  4-Wheeled Female  1362
str(Accident)
## 'data.frame':    80 obs. of  5 variables:
##  $ age   : Ord.factor w/ 5 levels "0-9"<"10-19"<..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ result: Factor w/ 2 levels "Died","Injured": 1 1 1 1 1 1 1 1 2 2 ...
##  $ mode  : Factor w/ 4 levels "4-Wheeled","Bicycle",..: 4 4 2 2 3 3 1 1 4 4 ...
##  $ gender: Factor w/ 2 levels "Female","Male": 2 1 2 1 2 1 2 1 2 1 ...
##  $ Freq  : int  704 378 396 56 742 78 513 253 5206 5449 ...
loglm(Freq ~ age + mode + gender + result, data = Accident)
## Call:
## loglm(formula = Freq ~ age + mode + gender + result, data = Accident)
## 
## Statistics:
##                       X^2 df P(> X^2)
## Likelihood Ratio 60320.05 70        0
## Pearson          76865.31 70        0
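The 70 degrees of freedom can be checked by counting: the mutual independence model has an intercept plus one set of main-effect parameters per factor. A short sketch of that arithmetic, with illustrative variable names:

```r
# Number of levels of each factor, as reported by str(Accident)
n_levels <- c(age = 5, result = 2, mode = 4, gender = 2)

n_cells   <- prod(n_levels)          # cells in the 5 x 2 x 4 x 2 table
n_params  <- 1 + sum(n_levels - 1)   # intercept plus main effects only
df_mutual <- n_cells - n_params      # residual degrees of freedom
df_mutual
```

With statistics above 60,000 on 70 df, the mutual independence model fits very poorly, which is expected for a table this large with clearly related factors.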
  1. Use mosaic() to produce an interpretable mosaic plot of the associations among all variables under the model of mutual independence. Try different orders of the variables in the mosaic. (Hint: the abbreviate component of the labeling_args argument to mosaic() will be useful to avoid some overlap of the category labels.)
# Reorder the levels of mode: 4-Wheeled, Motorcycle, Bicycle, Pedestrian
Accident$mode <- ordered(Accident$mode, levels = levels(Accident$mode)[c(1, 3, 2, 4)])
mosaic(loglm(Freq ~ age + mode + gender + result, data = Accident),
       labeling_args = list(abbreviate_labs = c(mode = 4, gender = 1, result = 1)))

  1. Treat result (“Died” vs. “Injured”) as the response variable, and fit the model Freq ~ age * mode * gender + result that asserts independence of result from all others jointly.
mosaic(Freq ~ gender + mode + result + age, data = Accident, shade = TRUE, labeling_args = list(clip = c(result = TRUE)))
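The joint independence model itself can be fitted with loglm() in the same way as the mutual independence model. The following is a sketch only, assuming MASS and vcdExtra are installed; it works on a private copy of the data (acc, an illustrative name) so that the reordered Accident data frame in the workspace is left untouched.

```r
library(MASS)

# Load a fresh copy of the data into a separate environment
e <- new.env()
data("Accident", package = "vcdExtra", envir = e)
acc <- e$Accident

# Joint independence [age mode gender][result]: result is independent of
# every combination of the three predictors
joint <- loglm(Freq ~ age * mode * gender + result, data = acc)
joint

# Residual df: 80 cells - 1 intercept - (5*4*2 - 1) predictor terms - (2 - 1)
joint$df
```

Because this model contains the mutual independence model as a special case, its likelihood-ratio statistic cannot exceed the 60320.05 reported for that model.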

  1. Construct a mosaic display for the residual associations in this model. Which combinations of the predictor factors are more likely to result in death?
mosaic(loglm(Freq ~ age * mode * gender + result, data = Accident), gp = shading_Friendly, labeling = labeling_residuals)

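Male pedestrians aged 50 and over are the most likely to die: 704 of the 5,910 victims in that group (about 12%) died, the highest proportion of any age, mode and gender combination.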
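The reading of the mosaic can be backed up by computing the proportion who died within each combination of the predictors. This is a minimal sketch using base R's xtabs() and prop.table(); the names acc, tab, p_died and rates are illustrative and not part of the original solution.

```r
# Load a fresh copy of the data into a separate environment
e <- new.env()
data("Accident", package = "vcdExtra", envir = e)
acc <- e$Accident

# 4-way table of counts, then the proportion who died within each
# age x mode x gender combination
tab <- xtabs(Freq ~ age + mode + gender + result, data = acc)
p_died <- prop.table(tab, margin = 1:3)[, , , "Died"]

# Long format, sorted so the highest death proportions come first
rates <- as.data.frame(as.table(p_died), responseName = "p_died")
head(rates[order(-rates$p_died), ])
```

Sorting the 40 combinations this way shows which groups sit at the top, and the same groups should appear as the positive "Died" residuals in the mosaic.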