The data set criminal in the package logmult gives the 4 × 5 table below of the number of men aged 15-19 charged with a criminal case for whom charges were dropped in Denmark from 1955-1958.

library(logmult)
## Loading required package: gnm
## 
## Attaching package: 'logmult'
## The following object is masked from 'package:gnm':
## 
##     se
data("criminal", package = "logmult")
criminal
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430
? loglm()
## starting httpd help server ... done
  1. Use loglm() to test whether there is an association between Year and Age. Is there evidence that dropping of charges in relation to age changed over the years recorded here?
xtabs(Freq ~ Age + Year, data = criminal)
##     Year
## Age  1955 1956 1957 1958
##   15  141  144  196  212
##   16  285  292  380  424
##   17  320  342  424  399
##   18  441  441  462  442
##   19  427  396  427  430
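The transposed table above does not by itself test the association. A minimal sketch of the independence test the exercise asks for follows, assuming loglm() from the MASS package. The counts are re-entered from the printed table so the block runs without logmult, and the names crim and fit_indep are illustrative, not part of the original answer.

```r
library(MASS)  # provides loglm()

# Counts copied from the printed criminal table (rows = Year, columns = Age)
crim <- as.table(matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958),
                  Age  = as.character(15:19))
))

# Independence model: Year and Age enter as main effects only
fit_indep <- loglm(~ Year + Age, data = crim)

# Prints the likelihood-ratio and Pearson chi-square statistics,
# each on (4 - 1) * (5 - 1) = 12 degrees of freedom
print(fit_indep)
```

A small p-value in the printed output would be evidence that the age distribution of dropped charges changed across the four years.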
  2. Use mosaic() with the option shade=TRUE to display the pattern of signs and magnitudes of the residuals. Compare this with the result of mosaic() using “Friendly shading,” from the option gp=shading_Friendly. Describe verbally what you see in each regarding the pattern of association in this table.
library(vcd)
## Loading required package: grid
## 
## Attaching package: 'vcd'
## The following object is masked from 'package:logmult':
## 
##     assoc
mosaic(xtabs(Freq ~ Age + Year, data = criminal))

margin.table(criminal, 1:2)
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430
criminalmargin <- margin.table(criminal, 1:2)
mosaic(criminalmargin, shade = TRUE)

mosaic(criminalmargin, gp=shading_Friendly)

If Year and Age were independent, the tiles in each row of the mosaic would line up across the columns. Each tile is shaded according to its Pearson residual, (observed − expected) / √expected: blue where the observed frequency exceeds the expected frequency and red where it falls short. The shading intensity grows with the size of the residual. With the default cutoffs in vcd, tiles with an absolute residual below 2 are left uncolored, those between 2 and 4 are shaded lightly, and those above 4 are shaded darkly. In this table the tiles for age 16 in 1958 and for age 19 in 1955 are blue, so the observed frequency in those cells is greater than the frequency expected under independence.
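The residuals behind the shading can be checked by hand. The sketch below uses base R only and re-enters the counts from the printed table; the names crim, expected and pearson are illustrative.

```r
# Counts copied from the printed criminal table (rows = Year, columns = Age)
crim <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958),
                  Age  = as.character(15:19))
)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(crim), colSums(crim)) / sum(crim)

# Pearson residuals: positive where observed exceeds expected
pearson <- (crim - expected) / sqrt(expected)
round(pearson, 2)
```

Positive entries correspond to blue tiles and negative entries to red tiles in the shaded mosaics.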

Bertin (1983, pp. 30-31) used a 4-way table of frequencies of traffic accident victims in France in 1958 to illustrate his scheme for classifying data sets by numerous variables, each of which could have various types and could be assigned to various visual attributes. His data are contained in Accident in vcdExtra, a frequency data frame representing his 5 × 2 × 4 × 2 table of the variables age, result (died or injured), mode of transportation, and gender.

data("Accident", package = "vcdExtra")
str(Accident, vec.len=2)
## 'data.frame':    80 obs. of  5 variables:
##  $ age   : Ord.factor w/ 5 levels "0-9"<"10-19"<..: 5 5 5 5 5 ...
##  $ result: Factor w/ 2 levels "Died","Injured": 1 1 1 1 1 ...
##  $ mode  : Factor w/ 4 levels "4-Wheeled","Bicycle",..: 4 4 2 2 3 ...
##  $ gender: Factor w/ 2 levels "Female","Male": 2 1 2 1 2 ...
##  $ Freq  : int  704 378 396 56 742 ...
  1. Use loglm() to fit the model of mutual independence, Freq ~ age+mode+gender+result to this data set.
summary(Accident)
##     age         result           mode       gender        Freq        
##  0-9  :16   Died   :40   4-Wheeled :20   Female:40   Min.   :    5.0  
##  10-19:16   Injured:40   Bicycle   :20   Male  :40   1st Qu.:   81.0  
##  20-29:16                Motorcycle:20               Median :  634.5  
##  30-49:16                Pedestrian:20               Mean   : 2334.1  
##  50+  :16                                            3rd Qu.: 3218.8  
##                                                      Max.   :18909.0
View(Accident)
xtabs(Freq ~ age + mode + gender + result, data = Accident)
## , , gender = Female, result = Died
## 
##        mode
## age     4-Wheeled Bicycle Motorcycle Pedestrian
##   0-9          65       5          6         89
##   10-19        61      31         54         28
##   20-29       107      10         82         24
##   30-49       199      24         98         49
##   50+         253      56         78        378
## 
## , , gender = Male, result = Died
## 
##        mode
## age     4-Wheeled Bicycle Motorcycle Pedestrian
##   0-9          70      26          6        150
##   10-19       150      76        362         70
##   20-29       353      55        660         78
##   30-49       720     146        889        223
##   50+         513     396        742        704
## 
## , , gender = Female, result = Injured
## 
##        mode
## age     4-Wheeled Bicycle Motorcycle Pedestrian
##   0-9        1362     126        131       1967
##   10-19      2593    7218       3587       1495
##   20-29      4361     609       4010        864
##   30-49      7712    1118       3664       1814
##   50+        5552    1030       1387       5449
## 
## , , gender = Male, result = Injured
## 
##        mode
## age     4-Wheeled Bicycle Motorcycle Pedestrian
##   0-9        1593     378        181       3341
##   10-19      3543    3407      12311       1827
##   20-29      9084    1565      18558       1521
##   30-49     15086    3024      18909       3178
##   50+        7423    3863       8597       5206
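The cross-tabulation above shows the cell counts but does not fit the model. A minimal sketch of the mutual independence fit follows, assuming loglm() from MASS and the Accident data frame from vcdExtra as loaded earlier; the name acc_mutual is illustrative.

```r
library(MASS)  # provides loglm()
data("Accident", package = "vcdExtra")

# Mutual independence: main effects only, no interaction terms.
# With a data frame, the left-hand side names the column of counts.
acc_mutual <- loglm(Freq ~ age + mode + gender + result, data = Accident)

# Prints likelihood-ratio and Pearson chi-square statistics.
# Degrees of freedom: 80 cells - (1 + 4 + 3 + 1 + 1) parameters = 70
print(acc_mutual)
```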
  2. Use mosaic() to produce an interpretable mosaic plot of the associations among all variables under the model of mutual independence. Try different orders of the variables in the mosaic. (Hint: the abbreviate component of the labeling_args argument to mosaic() will be useful to avoid some overlap of the category labels.)
mosaic(xtabs(Freq ~ age + mode + gender + result, data = Accident))

mosaic(xtabs(Freq ~ gender + mode + result + age, data = Accident))

mosaic(xtabs(Freq ~ result + gender + age + mode, data = Accident))
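The calls above draw unshaded mosaics with full-length labels. A hedged sketch of a shaded version with shortened labels follows. It assumes that the hint's "abbreviate" component corresponds to the abbreviate_labs argument in current versions of vcd, and that a named vector is matched to the table's variables; the name acc_tab and the abbreviation lengths are illustrative choices.

```r
library(vcd)
data("Accident", package = "vcdExtra")

acc_tab <- xtabs(Freq ~ age + mode + gender + result, data = Accident)

# shade = TRUE colors tiles by Pearson residuals from the mutual
# independence model, which is the baseline when no other model is given.
# abbreviate_labs (assumed argument name, see above) shortens the category
# labels of the named variables to the given number of characters.
mosaic(acc_tab, shade = TRUE,
       labeling_args = list(
         abbreviate_labs = c(mode = 4, gender = 1, result = 1)
       ))
```

Reordering the variables in the xtabs() formula changes the order of the splits, so the same options can be reused for each ordering tried.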

  3. Treat result (“Died” vs. “Injured”) as the response variable, and fit the model Freq ~ age*mode*gender + result that asserts independence of result from all others jointly.
str(Accident)
## 'data.frame':    80 obs. of  5 variables:
##  $ age   : Ord.factor w/ 5 levels "0-9"<"10-19"<..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ result: Factor w/ 2 levels "Died","Injured": 1 1 1 1 1 1 1 1 2 2 ...
##  $ mode  : Factor w/ 4 levels "4-Wheeled","Bicycle",..: 4 4 2 2 3 3 1 1 4 4 ...
##  $ gender: Factor w/ 2 levels "Female","Male": 2 1 2 1 2 1 2 1 2 1 ...
##  $ Freq  : int  704 378 396 56 742 78 513 253 5206 5449 ...
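The str() output only describes the data. A minimal sketch of the joint independence fit follows, assuming loglm() from MASS and the Accident data from vcdExtra; the name acc_joint is illustrative.

```r
library(MASS)  # provides loglm()
data("Accident", package = "vcdExtra")

# Joint independence: all associations among age, mode and gender are
# fitted, while result is independent of the three of them jointly.
acc_joint <- loglm(Freq ~ age * mode * gender + result, data = Accident)

# Degrees of freedom: 80 cells - (40 + 1) parameters = 39
print(acc_joint)
```

A large chi-square statistic relative to its degrees of freedom would indicate that the outcome depends on at least one of age, mode, or gender.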