library(vcd)
## Warning: package 'vcd' was built under R version 3.4.4
## Loading required package: grid
library(vcdExtra)
## Warning: package 'vcdExtra' was built under R version 3.4.4
## Loading required package: gnm
## Warning: package 'gnm' was built under R version 3.4.4
library(ca)
## Warning: package 'ca' was built under R version 3.4.4
library(logmult)
## Warning: package 'logmult' was built under R version 3.4.4
## 
## Attaching package: 'logmult'
## The following object is masked from 'package:gnm':
## 
##     se
## The following object is masked from 'package:vcd':
## 
##     assoc
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.4.4

Exercise 6.2 The data set criminal in the package logmult gives the 4 × 5 table below of the number of men aged 15-19 charged with a criminal case for whom charges were dropped in Denmark from 1955-1958.

Carry out a simple correspondence analysis on this table.

(a) What percentages of the Pearson χ² for association are explained by the various dimensions?

data("criminal", package = "logmult")
criminal
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430
crim <- margin.table(criminal, 1:2)
(crim.ca <- ca(crim))
## 
##  Principal inertias (eigenvalues):
##            1        2        3      
## Value      0.004939 0.000491 3.8e-05
## Percentage 90.33%   8.98%    0.69%  
## 
## 
##  Rows:
##              1955     1956      1957      1958
## Mass     0.229751 0.229893  0.268897  0.271459
## ChiDist  0.090897 0.061048  0.047585  0.088033
## Inertia  0.001898 0.000857  0.000609  0.002104
## Dim. 1   1.253085 0.827543 -0.553684 -1.212927
## Dim. 2  -0.984738 0.733468  1.206411 -0.982745
## 
## 
##  Columns:
##                15        16        17       18        19
## Mass     0.098648  0.196584  0.211388 0.254235  0.239146
## ChiDist  0.101134  0.093089  0.044072 0.071068  0.066594
## Inertia  0.001009  0.001703  0.000411 0.001284  0.001061
## Dim. 1  -1.433374 -1.297270 -0.332608 1.000960  0.887539
## Dim. 2  -0.333181 -0.808352  1.676250 0.307874 -1.007063

Dimension 1 accounts for 90.33% of the Pearson χ² for association (the total inertia), Dimension 2 for 8.98%, and Dimension 3 for the remaining 0.69%. The first two dimensions together capture 99.31% of the association between year and age, so a 2D plot represents the table almost exactly.
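As a check on where these percentages come from, the sketch below reproduces them in base R without the ca package. It assumes the counts printed above are typed in by hand (the names crim_tab, S, inertia, and pct are illustrative, not part of the original solution), takes the singular value decomposition of the standardized residuals, and uses the fact that the squared singular values are the principal inertias, which sum to χ²/n.

```r
# Counts copied from the printed `criminal` table (rows = Year, columns = Age)
crim_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

n  <- sum(crim_tab)
P  <- crim_tab / n          # correspondence matrix
r  <- rowSums(P)            # row masses
cm <- colSums(P)            # column masses

# Standardized residuals from independence
S <- (P - outer(r, cm)) / sqrt(outer(r, cm))

# A 4 x 5 table has min(4, 5) - 1 = 3 nontrivial dimensions
k       <- min(dim(crim_tab)) - 1
inertia <- svd(S)$d[seq_len(k)]^2      # principal inertias
pct     <- 100 * inertia / sum(inertia)
chisq   <- n * sum(inertia)            # total inertia times n = Pearson chi-square

round(pct, 2)
chisq
```

The percentages should agree with the "Percentage" row of the ca() output, and chisq should agree with the statistic reported by chisq.test(crim_tab).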

(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between year and age.

plot(crim.ca)

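Reading the plot together with the coordinates printed above, we can observe the following associations. Dimension 1 orders the years from 1955 and 1956 (positive) to 1957 and 1958 (negative), and separates ages 18 and 19 (positive) from ages 15 and 16 (negative), with age 17 closer to the origin. The earlier years are therefore associated with relatively more dropped charges among 18 and 19 year olds, and the later years with relatively more among 15 and 16 year olds. Dimension 2, which carries only about 9% of the association, mainly sets age 17 together with 1956 and 1957 against ages 16 and 19 together with 1955 and 1958.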
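The pattern of association along Dimension 1 can be read off numerically as well as from the plot. The following is a minimal base R sketch, assuming the counts are re-entered from the printed table; the names crim_tab, row_pc, col_pc, and same_side are illustrative. It computes principal coordinates from the SVD of the standardized residuals and flags which year and age categories fall on the same side of the first axis. The overall sign of each axis is arbitrary in an SVD, so only agreement in sign between a row and a column is meaningful.

```r
# Counts copied from the printed `criminal` table (rows = Year, columns = Age)
crim_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

P  <- crim_tab / sum(crim_tab)
r  <- rowSums(P)
cm <- colSums(P)
S  <- (P - outer(r, cm)) / sqrt(outer(r, cm))
dec <- svd(S)

# Principal coordinates on the first two dimensions:
# standard coordinates (singular vectors / sqrt(mass)) scaled by singular values
row_pc <- sweep(dec$u[, 1:2], 1, sqrt(r),  "/") %*% diag(dec$d[1:2])
col_pc <- sweep(dec$v[, 1:2], 1, sqrt(cm), "/") %*% diag(dec$d[1:2])
dimnames(row_pc) <- list(rownames(crim_tab), c("Dim1", "Dim2"))
dimnames(col_pc) <- list(colnames(crim_tab), c("Dim1", "Dim2"))

# TRUE where a year and an age lie on the same side of Dimension 1
same_side <- outer(sign(row_pc[, "Dim1"]), sign(col_pc[, "Dim1"])) > 0
same_side
```

Rows and columns marked TRUE point in the same direction on the dominant axis, which is the sense in which those years and ages are positively associated.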

Exercise 6.11 The data set Vietnam in vcdExtra gives a 2 × 5 × 4 contingency table in frequency form reflecting a survey of student opinion on the Vietnam War at the University of North Carolina in May 1967. The table variables are sex, year in school, and response, which has categories: (A) Defeat North Vietnam by widespread bombing and land invasion; (B) Maintain the present policy; (C) De-escalate military activity, stop bombing and begin negotiations; (D) Withdraw military forces immediately.

data("Vietnam", package="vcdExtra")
str(Vietnam)
## 'data.frame':    40 obs. of  4 variables:
##  $ sex     : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
##  $ year    : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ response: Factor w/ 4 levels "A","B","C","D": 1 2 3 4 1 2 3 4 1 2 ...
##  $ Freq    : int  13 19 40 5 5 9 33 3 22 29 ...

(a) Using the stacking approach, carry out a correspondence analysis corresponding to the loglinear model [R][YS], which asserts that the response is independent of the combinations of year and sex.

Vietnam <- within(Vietnam, {year_sex <- paste(year, toupper(substr(sex, 1, 1)))})
Vietnam_year_sex <- xtabs(Freq ~ year_sex + response, data = Vietnam)
Vietnam.ca <- ca(Vietnam_year_sex)
summary(Vietnam_year_sex)
## Call: xtabs(formula = Freq ~ year_sex + response, data = Vietnam)
## Number of cases in table: 3147 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 366.4, df = 27, p-value = 3.387e-61
##  Chi-squared approximation may be incorrect
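To make the stacking step concrete, here is a small self-contained sketch. The data frame toy and its counts are hypothetical and are not the Vietnam data; they only mimic the frequency-form layout. The two explanatory factors are combined into one factor with interaction(), so the three-way table collapses to a two-way table whose rows are the year-sex combinations.

```r
# Hypothetical toy counts in frequency form (NOT the Vietnam data)
toy <- expand.grid(
  response = c("A", "B"),
  year     = 1:2,
  sex      = c("Female", "Male")
)
toy$Freq <- c(10, 20, 15, 25, 30, 5, 12, 8)

# Stack year and sex into a single combined factor, e.g. "1 F", "2 M"
toy$year_sex <- interaction(toy$year, substr(toy$sex, 1, 1), sep = " ")

# Two-way table: stacked year-sex rows by response columns
stacked <- xtabs(Freq ~ year_sex + response, data = toy)
stacked

# Degrees of freedom for independence of response and the stacked factor
df <- (nrow(stacked) - 1) * (ncol(stacked) - 1)
df
```

Applied to the real data, the same formula with 10 year-sex combinations and 4 responses gives (10 - 1) × (4 - 1) = 27, which matches the df reported by summary() above.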

(b) Construct an informative 2D plot of the solution, and interpret in terms of how the response varies with year for males and females.

plot(Vietnam.ca)

There is an association between the following combinations of response and year/sex:

(c) Use mjca() to carry out an MCA on the three-way table. Make a useful plot of the solution and interpret in terms of the relationship of the response to year and sex.

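# MCA needs the full three-way table, not the stacked two-way table
Vietnam.tab <- xtabs(Freq ~ sex + year + response, data = Vietnam)
Vietnam.mjca <- mjca(Vietnam.tab)
plot(Vietnam.mjca)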

There is an association between the following categories of response, year, and sex: