Homework #5

Exercise 6.2 The data set criminal in the package logmult gives the 4 × 5 table below of the number of men aged 15-19 charged with a criminal case for whom charges were dropped in Denmark from 1955-1958.

library("logmult")

## Warning: package 'logmult' was built under R version 3.4.4

## Loading required package: gnm

## Warning: package 'gnm' was built under R version 3.4.4

## 
## Attaching package: 'logmult'

## The following object is masked from 'package:gnm':
## 
##     se

data("criminal",package = "logmult")
criminal

##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430

Carry out a simple correspondence analysis on this table.

library(ca)

## Warning: package 'ca' was built under R version 3.4.4

criminal_ca=ca(criminal)
summary(criminal_ca)

## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.004939  90.3  90.3  ***********************  
##  2      0.000491   9.0  99.3  **                       
##  3      0.000038   0.7 100.0                           
##         -------- -----                                 
##  Total: 0.005468 100.0                                 
## 
## 
## Rows:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 | 1955 |  230  996  347 |   88 939 361 |  -22  58 223 |
## 2 | 1956 |  230  978  157 |   58 908 157 |   16  71 124 |
## 3 | 1957 |  269  984  111 |  -39 669  82 |   27 315 391 |
## 4 | 1958 |  271  999  385 |  -85 938 399 |  -22  61 262 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 |   15 |   99  998  185 | -101 992 203 |   -7   5  11 |
## 2 |   16 |  197  996  312 |  -91 959 331 |  -18  37 128 |
## 3 |   17 |  211  991   75 |  -23 281  23 |   37 710 594 |
## 4 |   18 |  254  989  235 |   70 980 255 |    7   9  24 |
## 5 |   19 |  239  990  194 |   62 877 188 |  -22 112 243 |

(a) What percentages of the Pearson χ² for association are explained by the various dimensions?

The principal inertias in the summary decompose the Pearson χ² for association (the total inertia, 0.005468, equals χ²/n). Dimension 1 accounts for 90.3% of it, Dimension 2 for 9.0%, and Dimension 3 for the remaining 0.7%. A two-dimensional solution therefore captures 99.3% of the association between year and age, and Dimension 1 alone captures most of it.
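As a check on these percentages, the principal inertias can be recomputed in base R from the singular value decomposition of the standardized residuals. This is a minimal sketch, not the internals of ca(): the counts are retyped from the table printed above, and the names criminal_tab, S, inertia and pct are illustrative.

```r
# Counts retyped from the printed table (rows = year, columns = age)
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

n      <- sum(criminal_tab)
P      <- criminal_tab / n            # correspondence matrix
r_mass <- rowSums(P)                  # row masses
c_mass <- colSums(P)                  # column masses
E      <- outer(r_mass, c_mass)       # expected proportions under independence

# Standardized residuals; their squared singular values are the principal inertias
S       <- (P - E) / sqrt(E)
inertia <- svd(S)$d^2
inertia <- inertia[seq_len(min(dim(criminal_tab)) - 1)]  # drop the trivial zero dimension

pct   <- 100 * inertia / sum(inertia) # share of the Pearson chi-square per dimension
chisq <- unname(chisq.test(criminal_tab)$statistic)

round(pct, 1)
all.equal(chisq, n * sum(inertia))    # Pearson chi-square = n * total inertia
```

The rounded percentages should agree with the % column of the summary, and the last line confirms that the total inertia is the Pearson χ² divided by the sample size.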

(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between year and age.

plot(criminal_ca)

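The plot shows a clear association between year and age, and it is almost one-dimensional. Dimension 1 orders the years from 1955 and 1956 on the positive side to 1957 and 1958 on the negative side, and it orders the ages the same way, with 18 and 19 positive and 15 and 16 negative. The closest pairings are 1958 with ages 15-16, 1957 with age 17, 1956 with age 18, and 1955 with age 19, so over the four years the dropped charges shifted toward younger men.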
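The same pattern can be read from the row profiles, that is, the age distribution within each year. The sketch below uses only base R; the counts are retyped from the printed table, and the names criminal_tab, profiles, young and old are illustrative.

```r
# Counts retyped from the printed table (rows = year, columns = age)
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

# Row profiles: each row is rescaled to sum to 1
profiles <- prop.table(criminal_tab, margin = 1)
round(profiles, 3)

# Share of the two youngest and the two oldest ages in each year
young <- rowSums(profiles[, c("15", "16")])
old   <- rowSums(profiles[, c("18", "19")])
round(cbind(young, old), 3)
```

If the reading of the plot is right, the young share should rise and the old share should fall from 1955 to 1958.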

Exercise 6.11 The data set Vietnam in vcdExtra gives a 2 × 5 × 4 contingency table in frequency form reflecting a survey of student opinion on the Vietnam War at the University of North Carolina in May 1967. The table variables are sex, year in school, and response, which has categories: (A) Defeat North Vietnam by widespread bombing and land invasion; (B) Maintain the present policy; (C) De-escalate military activity, stop bombing and begin negotiations; (D) Withdraw military forces immediately.

library("vcdExtra")

## Warning: package 'vcdExtra' was built under R version 3.4.4

## Loading required package: vcd

## Warning: package 'vcd' was built under R version 3.4.4

## Loading required package: grid

## 
## Attaching package: 'vcd'

## The following object is masked from 'package:logmult':
## 
##     assoc

data("Vietnam", package="vcdExtra")
str(Vietnam)

## 'data.frame':    40 obs. of  4 variables:
##  $ sex     : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
##  $ year    : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ response: Factor w/ 4 levels "A","B","C","D": 1 2 3 4 1 2 3 4 1 2 ...
##  $ Freq    : int  13 19 40 5 5 9 33 3 22 29 ...

Using the stacking approach, carry out a correspondence analysis corresponding to the loglinear model [R][YS], which asserts that the response is independent of the combinations of year and sex.

Vietnam=within(Vietnam, {year_sex=paste(year, toupper(substr(sex,1,1)))})
Vietnam_lm=xtabs(Freq ~ year_sex + response, data=Vietnam)
Vietnam_Ca=ca(Vietnam_lm)
summary(Vietnam_Ca)

## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.085680  73.6  73.6  ******************       
##  2      0.027881  23.9  97.5  ******                   
##  3      0.002854   2.5 100.0  *                        
##         -------- -----                                 
##  Total: 0.116415 100.0                                 
## 
## 
## Rows:
##      name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1  |   1F |   24  818   13 | -167 452   8 | -150 367  20 |
## 2  |   1M |  139  997  181 |  386 986 242 |  -41  11   8 |
## 3  |   2F |   16  995   35 | -407 647  31 | -299 349  51 |
## 4  |   2M |  140  984  131 |  326 982 175 |  -15   2   1 |
## 5  |   3F |   53  999  112 | -334 453  69 | -367 547 256 |
## 6  |   3M |  138  904   40 |  175 904  49 |   -4   0   0 |
## 7  |   4F |   32  982   37 | -344 887  44 | -113  95  15 |
## 8  |   4M |  149  383   23 |   81 372  11 |   14  11   1 |
## 9  |   5F |   59  994  153 | -453 686 143 | -304 309 197 |
## 10 |   5M |  248 1000  276 | -281 608 228 |  225 391 451 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 |    A |  255  985  381 |  414 985 509 |   -1   0   0 |
## 2 |    B |  235  720   60 |  135 608  50 |   58 112  28 |
## 3 |    C |  419  999  283 | -247 773 298 | -133 226 267 |
## 4 |    D |   92  995  276 | -366 383 143 |  463 612 705 |
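The link between the stacked analysis and the model [R][YS] can be checked directly: the Pearson χ² for independence in the stacked two-way table is the Pearson χ² for [R][YS], and dividing it by the sample size gives the total inertia. This is a minimal sketch that assumes the vcdExtra package is installed; the names stacked, pearson and total_inertia are illustrative.

```r
library(vcdExtra)   # assumed installed; used here only for the Vietnam data

data("Vietnam", package = "vcdExtra")

# Stack year and sex into one factor, e.g. "1F", "1M", ..., "5M"
Vietnam$year_sex <- paste0(Vietnam$year,
                           toupper(substr(as.character(Vietnam$sex), 1, 1)))
stacked <- xtabs(Freq ~ year_sex + response, data = Vietnam)

# Pearson chi-square for independence of response and (year, sex)
pearson <- chisq.test(stacked)
pearson$parameter                     # degrees of freedom: (10 - 1) * (4 - 1)

# Total inertia = chi-square / n
total_inertia <- unname(pearson$statistic) / sum(stacked)
total_inertia
```

The value of total_inertia should match the Total line of the summary above.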

Construct an informative 2D plot of the solution, and interpret in terms of how the response varies with year for males and females.

plot(Vietnam_Ca)

The plot shows how the response varies with year and sex. Dimension 1 (73.6% of the inertia) separates responses A and B on the positive side from C and D on the negative side. Males in years 1 and 2 lie closest to response A, males in years 3 and 4 closest to B, and males in year 5 closest to D, so male opinion moves from escalation toward withdrawal with more years in school. Females in every year lie on the negative side of Dimension 1, nearest response C.
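One way to make the plot more informative is to join the year points separately for each sex, so the trajectory over years is visible. The sketch below is one possible approach, assuming the ca and vcdExtra packages are installed; it relies on plot() for a ca object invisibly returning the plotted coordinates. The names stacked, stacked_ca, res and is_female are illustrative.

```r
library(ca)         # assumed installed
library(vcdExtra)   # assumed installed; provides the Vietnam data

data("Vietnam", package = "vcdExtra")
Vietnam$year_sex <- paste0(Vietnam$year,
                           toupper(substr(as.character(Vietnam$sex), 1, 1)))
stacked    <- xtabs(Freq ~ year_sex + response, data = Vietnam)
stacked_ca <- ca(stacked)

# Keep the plotted coordinates so lines can be added to the same map
res <- plot(stacked_ca)

# Rows are ordered by year within the stacked labels, so each line runs from year 1 to year 5
is_female <- grepl("F$", rownames(stacked))
lines(res$rows[is_female, ],  col = "red",  lty = 2)   # females, years 1 to 5
lines(res$rows[!is_female, ], col = "blue", lty = 1)   # males, years 1 to 5
legend("topright", legend = c("Female", "Male"),
       col = c("red", "blue"), lty = c(2, 1), bty = "n")
```

The two lines make it easier to compare how far each sex moves along Dimension 1 as year in school increases.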

Use mjca() to carry out an MCA on the three-way table. Make a useful plot of the solution and interpret in terms of the relationship of the response to year and sex.

Vietnam_MCA=mjca(Vietnam_lm)
plot(Vietnam_MCA)

The MCA plot, fitted here to the stacked year_sex × response table, shows the same associations as the simple correspondence analysis in the previous part: response A with males in years 1 and 2, response B with males in years 3 and 4, response C with females in years 1 and 4, and response D with males in year 5.
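Since the exercise asks for an MCA on the three-way table, an alternative is to pass mjca() the full sex × year × response table, so that sex, year and response each get their own category points. This is a hedged sketch that assumes the ca and vcdExtra packages are installed; the names vietnam_tab and vietnam_mca are illustrative, and its output has not been compared with the results reported above.

```r
library(ca)         # assumed installed
library(vcdExtra)   # assumed installed; provides the Vietnam data

data("Vietnam", package = "vcdExtra")

# Three-way table: sex (2) x year (5) x response (4)
vietnam_tab <- xtabs(Freq ~ sex + year + response, data = Vietnam)

# MCA on the three-way table, using the default settings of mjca()
vietnam_mca <- mjca(vietnam_tab)
summary(vietnam_mca)

# Each category of sex, year and response is plotted as a separate point
plot(vietnam_mca)
```

In this form the interpretation looks at where the response categories fall relative to the separate sex and year points, rather than relative to combined year-sex points.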

Juan Colunga

July 4, 2018