library(ca)
data("criminal", package = "logmult")
criminal
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430

(a) What percentages of the Pearson χ² for association are explained by the various dimensions?

(criminal.ca <- ca(criminal))
## 
##  Principal inertias (eigenvalues):
##            1        2        3      
## Value      0.004939 0.000491 3.8e-05
## Percentage 90.33%   8.98%    0.69%  
## 
## 
##  Rows:
##              1955     1956      1957      1958
## Mass     0.229751 0.229893  0.268897  0.271459
## ChiDist  0.090897 0.061048  0.047585  0.088033
## Inertia  0.001898 0.000857  0.000609  0.002104
## Dim. 1   1.253085 0.827543 -0.553684 -1.212927
## Dim. 2  -0.984738 0.733468  1.206411 -0.982745
## 
## 
##  Columns:
##                15        16        17       18        19
## Mass     0.098648  0.196584  0.211388 0.254235  0.239146
## ChiDist  0.101134  0.093089  0.044072 0.071068  0.066594
## Inertia  0.001009  0.001703  0.000411 0.001284  0.001061
## Dim. 1  -1.433374 -1.297270 -0.332608 1.000960  0.887539
## Dim. 2  -0.333181 -0.808352  1.676250 0.307874 -1.007063
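
The "Percentage" row above answers part (a): the principal inertias sum to the total inertia, which equals the Pearson χ² divided by the sample size n. A minimal base-R sketch of that decomposition follows. It re-enters the table printed above so that it runs on its own, and the object names (`inertia`, `pct`, `chisq_by_dim`) are illustrative, not part of the ca package.

```r
# Counts copied from the printed `criminal` table (Year x Age)
criminal <- matrix(c(141, 285, 320, 441, 427,
                     144, 292, 342, 441, 396,
                     196, 380, 424, 462, 427,
                     212, 424, 399, 442, 430),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(Year = as.character(1955:1958),
                                   Age  = as.character(15:19)))

n     <- sum(criminal)
chisq <- unname(chisq.test(criminal)$statistic)  # Pearson chi-square

# Correspondence matrix, row/column masses, standardized residual matrix
P  <- criminal / n
r  <- rowSums(P)
cm <- colSums(P)
S  <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))

# Squared singular values are the principal inertias;
# a 4 x 5 table has at most min(4, 5) - 1 = 3 nontrivial dimensions
inertia      <- svd(S)$d[1:3]^2
pct          <- 100 * inertia / sum(inertia)
chisq_by_dim <- n * inertia   # share of the chi-square carried by each dimension

print(round(pct, 2))          # compare with the Percentage row printed by ca()
print(round(chisq_by_dim, 2))
print(round(chisq, 2))
```

The components in `chisq_by_dim` add up to `chisq`, so each percentage can be read directly as a share of the Pearson χ².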

(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between year and age.

plot(criminal.ca)

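The plot shows the two-dimensional solution. The first dimension accounts for 90.3% of the Pearson χ² and the second for 9.0%, so together they explain 99.3% of the association. Row (year) and column (age) points are plotted on the same axes. Ages 15 and 18 lie close to the horizontal axis, so they are described almost entirely by the first dimension, while Age 17 is separated from the other ages mainly along the second dimension. When a row point and a column point lie in the same direction from the origin (for example, Age 17 and Year 1957, or Age 18 and Year 1956), the two categories are positively associated: the observed frequency is larger than expected under independence. Points in opposite directions indicate negative association (the expected frequency is larger than the observed frequency). Overall, the first dimension contrasts 1955 and 1956, which are associated with ages 18 and 19, with 1957 and 1958, which are associated with ages 15 and 16.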

summary(criminal.ca)
## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.004939  90.3  90.3  ***********************  
##  2      0.000491   9.0  99.3  **                       
##  3      3.8e-050   0.7 100.0                           
##         -------- -----                                 
##  Total: 0.005468 100.0                                 
## 
## 
## Rows:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 | 1955 |  230  996  347 |   88 939 361 |  -22  58 223 |
## 2 | 1956 |  230  978  157 |   58 908 157 |   16  71 124 |
## 3 | 1957 |  269  984  111 |  -39 669  82 |   27 315 391 |
## 4 | 1958 |  271  999  385 |  -85 938 399 |  -22  61 262 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 |   15 |   99  998  185 | -101 992 203 |   -7   5  11 |
## 2 |   16 |  197  996  312 |  -91 959 331 |  -18  37 128 |
## 3 |   17 |  211  991   75 |  -23 281  23 |   37 710 594 |
## 4 |   18 |  254  989  235 |   70 980 255 |    7   9  24 |
## 5 |   19 |  239  990  194 |   62 877 188 |  -22 112 243 |
library(vcd)
## Loading required package: grid
mosaic(criminal, shade=TRUE, labeling=labeling_residuals)

The mosaic plot, shaded by Pearson residuals, shows that the largest residual is for Age 16 in 1958 and the second largest is for Age 19 in 1955. Both are positive: more cases were observed in these cells than expected under independence.
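
The residuals that the mosaic plot labels can also be computed directly, which makes it easy to check the sign and ranking of the largest cells. A minimal base-R sketch, re-entering the table printed earlier so that it runs on its own (the names `res` and `top` are illustrative):

```r
# Counts copied from the printed `criminal` table (Year x Age)
criminal <- matrix(c(141, 285, 320, 441, 427,
                     144, 292, 342, 441, 396,
                     196, 380, 424, 462, 427,
                     212, 424, 399, 442, 430),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(Year = as.character(1955:1958),
                                   Age  = as.character(15:19)))

# Pearson residuals: (observed - expected) / sqrt(expected)
res <- chisq.test(criminal)$residuals
print(round(res, 2))

# The two cells with the largest absolute residuals
ord <- order(abs(res), decreasing = TRUE)[1:2]
top <- data.frame(Year     = rownames(res)[row(res)[ord]],
                  Age      = colnames(res)[col(res)[ord]],
                  residual = res[ord])
print(top)
```

A positive residual means the observed count exceeds the count expected under independence; a negative residual means it falls short.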

Exercise 6.11:

data("Vietnam", package = "vcdExtra")
str(Vietnam)
## 'data.frame':    40 obs. of  4 variables:
##  $ sex     : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
##  $ year    : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ response: Factor w/ 4 levels "A","B","C","D": 1 2 3 4 1 2 3 4 1 2 ...
##  $ Freq    : int  13 19 40 5 5 9 33 3 22 29 ...
View(Vietnam)
  1. Using the stacking approach, carry out a correspondence analysis corresponding to the loglinear model [R][YS], which asserts that the response is independent of the combinations of year and sex.
Vietnam <- within(Vietnam, {YS <- paste(year, toupper(substr(sex, 1, 1)))})
Vietnam.tab <- xtabs(Freq ~ YS + response, data=Vietnam)
Vietnam.tab
##      response
## YS      A   B   C   D
##   1 F  13  19  40   5
##   1 M 175 116 131  17
##   2 F   5   9  33   3
##   2 M 160 126 135  21
##   3 F  22  29 110   6
##   3 M 132 120 154  29
##   4 F  12  21  58  10
##   4 M 145  95 185  44
##   5 F  19  27 128  13
##   5 M 118 176 345 141
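
The link between the stacked table and the model [R][YS] can be checked numerically: the Pearson χ² of the independence model for this two-way table, divided by n, is the total inertia that the correspondence analysis decomposes. A minimal sketch using `loglin()` from base R's stats package, with the counts re-entered from the table above so that it runs on its own (the names `fit` and `total_inertia` are illustrative):

```r
# Stacked table copied from the printed Vietnam.tab (YS x response)
Vietnam.tab <- matrix(c( 13,  19,  40,   5,
                        175, 116, 131,  17,
                          5,   9,  33,   3,
                        160, 126, 135,  21,
                         22,  29, 110,   6,
                        132, 120, 154,  29,
                         12,  21,  58,  10,
                        145,  95, 185,  44,
                         19,  27, 128,  13,
                        118, 176, 345, 141),
                      nrow = 10, byrow = TRUE,
                      dimnames = list(YS = paste(rep(1:5, each = 2), c("F", "M")),
                                      response = c("A", "B", "C", "D")))

n <- sum(Vietnam.tab)

# Model [R][YS]: fit only the two one-way margins (independence)
fit <- loglin(Vietnam.tab, margin = list(1, 2), fit = TRUE, print = FALSE)

print(fit$pearson)   # Pearson chi-square for [R][YS]
print(fit$df)        # (10 - 1) * (4 - 1) residual degrees of freedom

total_inertia <- fit$pearson / n
print(total_inertia) # compare with "Total" in summary(ca(Vietnam.tab))
```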
  2. Construct an informative 2D plot of the solution, and interpret in terms of how the response varies with year for males and females.
Vietnam.ca <- ca(Vietnam.tab)
summary(Vietnam.ca)
## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.085680  73.6  73.6  ******************       
##  2      0.027881  23.9  97.5  ******                   
##  3      0.002854   2.5 100.0  *                        
##         -------- -----                                 
##  Total: 0.116415 100.0                                 
## 
## 
## Rows:
##      name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1  |   1F |   24  818   13 | -167 452   8 | -150 367  20 |
## 2  |   1M |  139  997  181 |  386 986 242 |  -41  11   8 |
## 3  |   2F |   16  995   35 | -407 647  31 | -299 349  51 |
## 4  |   2M |  140  984  131 |  326 982 175 |  -15   2   1 |
## 5  |   3F |   53  999  112 | -334 453  69 | -367 547 256 |
## 6  |   3M |  138  904   40 |  175 904  49 |   -4   0   0 |
## 7  |   4F |   32  982   37 | -344 887  44 | -113  95  15 |
## 8  |   4M |  149  383   23 |   81 372  11 |   14  11   1 |
## 9  |   5F |   59  994  153 | -453 686 143 | -304 309 197 |
## 10 |   5M |  248 1000  276 | -281 608 228 |  225 391 451 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 |    A |  255  985  381 |  414 985 509 |   -1   0   0 |
## 2 |    B |  235  720   60 |  135 608  50 |   58 112  28 |
## 3 |    C |  419  999  283 | -247 773 298 | -133 226 267 |
## 4 |    D |   92  995  276 | -366 383 143 |  463 612 705 |
plot(Vietnam.ca)
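
One way to make the plot more informative is to join the row points in year order, separately for each sex, so that the trend over years is visible. The sketch below is a hedged illustration, not the ca package's own plotting code: it recomputes the principal coordinates with `svd()` in base R from the counts printed above, and the names `rows`, `cols`, and `female` are illustrative. The signs of the axes are arbitrary and may be flipped relative to `plot(Vietnam.ca)`.

```r
# Stacked table copied from the printed Vietnam.tab (YS x response)
Vietnam.tab <- matrix(c( 13,  19,  40,   5,
                        175, 116, 131,  17,
                          5,   9,  33,   3,
                        160, 126, 135,  21,
                         22,  29, 110,   6,
                        132, 120, 154,  29,
                         12,  21,  58,  10,
                        145,  95, 185,  44,
                         19,  27, 128,  13,
                        118, 176, 345, 141),
                      nrow = 10, byrow = TRUE,
                      dimnames = list(YS = paste(rep(1:5, each = 2), c("F", "M")),
                                      response = c("A", "B", "C", "D")))

# Correspondence analysis by SVD of the standardized residual matrix
P   <- Vietnam.tab / sum(Vietnam.tab)
r   <- rowSums(P)
cm  <- colSums(P)
S   <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))
dec <- svd(S)

# Principal coordinates on the first two dimensions
rows <- diag(1 / sqrt(r))  %*% dec$u[, 1:2] %*% diag(dec$d[1:2])
cols <- diag(1 / sqrt(cm)) %*% dec$v[, 1:2] %*% diag(dec$d[1:2])
rownames(rows) <- rownames(Vietnam.tab)
rownames(cols) <- colnames(Vietnam.tab)

# Rows are ordered 1 F, 1 M, 2 F, ..., so each subset is already in year order
female <- grepl("F$", rownames(rows))

plot(rbind(rows, cols), type = "n", asp = 1,
     xlab = "Dimension 1", ylab = "Dimension 2")
abline(h = 0, v = 0, lty = 3)
lines(rows[female, ],  col = "red")    # females, years 1 to 5
lines(rows[!female, ], col = "blue")   # males, years 1 to 5
text(rows, labels = rownames(rows), col = ifelse(female, "red", "blue"))
text(cols, labels = rownames(cols), font = 2)
```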

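The plot shows that females in every year lie closest to response C, males in year 5 are closest to response D, males in years 1 and 2 are closest to response A, and males in years 3 and 4 are closest to response B. The male points move away from A as year increases, while the female points stay near C throughout.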
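
The reading of the plot can be cross-checked against the row profiles, that is, the percentage of each response within each year-sex group. A minimal base-R sketch, with the counts re-entered from the stacked table shown earlier (the name `profiles` is illustrative):

```r
# Stacked table copied from the printed Vietnam.tab (YS x response)
Vietnam.tab <- matrix(c( 13,  19,  40,   5,
                        175, 116, 131,  17,
                          5,   9,  33,   3,
                        160, 126, 135,  21,
                         22,  29, 110,   6,
                        132, 120, 154,  29,
                         12,  21,  58,  10,
                        145,  95, 185,  44,
                         19,  27, 128,  13,
                        118, 176, 345, 141),
                      nrow = 10, byrow = TRUE,
                      dimnames = list(YS = paste(rep(1:5, each = 2), c("F", "M")),
                                      response = c("A", "B", "C", "D")))

# Row profiles: response percentages within each year-sex combination
profiles <- 100 * prop.table(Vietnam.tab, margin = 1)
print(round(profiles, 1))

# Average profile (column masses) for comparison
print(round(100 * colSums(Vietnam.tab) / sum(Vietnam.tab), 1))
```

A group is associated with a response when its percentage for that response is above the average profile.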

  3. Use mjca() to carry out an MCA on the three-way table. Make a useful plot of the solution and interpret in terms of the relationship of the response to year and sex.
Vietnam.mca <- mjca(Vietnam)
summary(Vietnam.mca)
## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.250000  12.7  12.7  ****                     
##  2      0.246386  12.5  25.3  ****                     
##  3      0.240002  12.2  37.5  ****                     
##  4      0.216676  11.0  48.5  ****                     
##  5      0.204820  10.4  58.9  ***                      
##  6      0.062500   3.2  62.1  *                        
##  7      0.062500   3.2  65.3  *                        
##  8      0.061381   3.1  68.4  *                        
##  9      0.052626   2.7  71.1  *                        
##  10     0.047937   2.4  73.5  *                        
##  11     0.035939   1.8  75.4  *                        
##  12     0.031659   1.6  77.0  *                        
##  13     00000000   0.0  77.0                           
##  14     00000000   0.0  77.0                           
##  15     00000000   0.0  77.0                           
##  16     00000000   0.0  77.0                           
##  17     00000000   0.0  77.0                           
##  18     00000000   0.0  77.0                           
##  19     00000000   0.0  77.0                           
##  20     00000000   0.0  77.0                           
##  21     00000000   0.0  77.0                           
##  22     00000000   0.0  77.0                           
##  23     00000000   0.0  77.0                           
##  24     00000000   0.0  77.0                           
##  25     00000000   0.0  77.0                           
##         -------- -----                                 
##  Total: 1.965000                                       
## 
## 
## Columns:
##            name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr  
## 1  | sex:Female |  100  260   16 |     0   0   0 |   351 260  50 |
## 2  |   sex:Male |  100  260   16 |     0   0   0 |  -351 260  50 |
## 3  |     year:1 |   40  158   25 |   323  59  17 |   419  99  29 |
## 4  |     year:2 |   40   59   25 |   323  56  17 |    74   3   1 |
## 5  |     year:3 |   40  833   27 | -1291 833 267 |     0   0   0 |
## 6  |     year:4 |   40  574   26 |   323  54  17 | -1000 520 162 |
## 7  |     year:5 |   40  196   25 |   323  56  17 |   507 139  42 |
## 8  | response:A |   50    0   15 |     0   0   0 |    10   0   0 |
## 9  | response:B |   50    0   15 |     0   0   0 |    -7   0   0 |
## 10 | response:C |   50    0   17 |     0   0   0 |     0   0   0 |
## 11 | response:D |   50    0   14 |     0   0   0 |    -3   0   0 |
## 12 |     Freq:3 |    5   71   16 |   323  25   2 |   444  46   4 |
## 13 |     Freq:5 |   10  256   15 |   323  56   4 |   613 200  15 |
## 14 |     Freq:6 |    5  417   16 | -1291 392  33 |   327  25   2 |
## 15 |     Freq:9 |    5   71   16 |   323  25   2 |   442  46   4 |
## 16 |    Freq:10 |    5  122   16 |   323  25   2 |  -645  98   8 |
## 17 |    Freq:12 |    5  120   16 |   323  25   2 |  -638  96   8 |
## 18 |    Freq:13 |   10  415   15 |   323  56   4 |   821 360  27 |
## 19 |    Freq:17 |    5   26   16 |   323  25   2 |    67   1   0 |
## 20 |    Freq:19 |   10  414   15 |   323  56   4 |   820 359  27 |
## 21 |    Freq:21 |   10  201   14 |   323  64   4 |  -472 137   9 |
## 22 |    Freq:22 |    5  418   16 | -1291 392  33 |   334  26   2 |
## 23 |    Freq:27 |    5  198   16 |   323  25   2 |   860 174  15 |
## 24 |    Freq:29 |   10  741   16 | -1291 741  67 |    -3   0   0 |
## 25 |    Freq:33 |    5   71   16 |   323  25   2 |   446  47   4 |
## 26 |    Freq:40 |    5  166   16 |   323  25   2 |   776 142  12 |
## 27 |    Freq:44 |    5  468   16 |   323  25   2 | -1373 444  38 |
## 28 |    Freq:58 |    5  122   16 |   323  25   2 |  -643  97   8 |
## 29 |    Freq:95 |    5  469   16 |   323  25   2 | -1375 445  38 |
## 30 |   Freq:110 |    5  418   16 | -1291 392  33 |   329  25   2 |
## 31 |   Freq:116 |    5   26   16 |   323  25   2 |    65   1   0 |
## 32 |   Freq:118 |    5   31   16 |   323  25   2 |   163   6   1 |
## 33 |   Freq:120 |    5  418   16 | -1291 392  33 |  -332  26   2 |
## 34 |   Freq:126 |    5   46   16 |   323  25   2 |  -300  21   2 |
## 35 |   Freq:128 |    5  200   16 |   323  25   2 |   863 175  15 |
## 36 |   Freq:131 |    5   26   16 |   323  25   2 |    69   1   0 |
## 37 |   Freq:132 |    5  417   16 | -1291 392  33 |  -324  25   2 |
## 38 |   Freq:135 |    5   45   16 |   323  25   2 |  -296  21   2 |
## 39 |   Freq:141 |    5   30   16 |   323  25   2 |   156   6   0 |
## 40 |   Freq:145 |    5  464   16 |   323  25   2 | -1366 439  38 |
## 41 |   Freq:154 |    5  418   16 | -1291 392  33 |  -329  25   2 |
## 42 |   Freq:160 |    5   44   16 |   323  25   2 |  -291  20   2 |
## 43 |   Freq:175 |    5   26   16 |   323  25   2 |    74   1   0 |
## 44 |   Freq:176 |    5   30   16 |   323  25   2 |   154   6   0 |
## 45 |   Freq:185 |    5  467   16 |   323  25   2 | -1371 442  38 |
## 46 |   Freq:345 |    5   30   16 |   323  25   2 |   157   6   1 |
## 47 |     YS:1 F |   20  272   21 |   323  41   8 |   770 231  48 |
## 48 |     YS:1 M |   20   31   25 |   323  30   8 |    69   1   0 |
## 49 |     YS:2 F |   20   99   24 |   323  33   8 |   459  66  17 |
## 50 |     YS:2 M |   20   63   24 |   323  33   8 |  -311  30   8 |
## 51 |     YS:3 F |   20  551   24 | -1291 523 133 |   301  28   7 |
## 52 |     YS:3 M |   20  551   24 | -1291 523 133 |  -301  28   7 |
## 53 |     YS:4 F |   20  157   24 |   323  33   8 |  -629 124  32 |
## 54 |     YS:4 M |   20  567   25 |   323  30   8 | -1371 537 153 |
## 55 |     YS:5 F |   20  291   23 |   323  36   8 |   856 255  59 |
## 56 |     YS:5 M |   20   37   25 |   323  30   8 |   157   7   2 |
plot(Vietnam.mca)
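
The column list in the summary above includes categories such as Freq:3 and YS:1 F, which suggests that mjca() treated every column of the frequency-form data frame, including the counts and the stacked YS variable, as a factor. A minimal sketch of the intended three-way analysis follows. It assumes that mjca() accepts a table object (as the ca package documentation describes) and rebuilds the frequency-form data from the stacked table shown earlier so that the block runs without vcdExtra. The name `Vietnam.tab3` is illustrative.

```r
library(ca)

# Frequency-form data rebuilt from the printed stacked table;
# expand.grid varies response fastest, then sex, then year
Vietnam <- expand.grid(response = c("A", "B", "C", "D"),
                       sex      = c("Female", "Male"),
                       year     = 1:5)
Vietnam$Freq <- c( 13,  19,  40,   5,   175, 116, 131,  17,
                    5,   9,  33,   3,   160, 126, 135,  21,
                   22,  29, 110,   6,   132, 120, 154,  29,
                   12,  21,  58,  10,   145,  95, 185,  44,
                   19,  27, 128,  13,   118, 176, 345, 141)

# Three-way table with Freq as the cell counts, not as a variable
Vietnam.tab3 <- xtabs(Freq ~ sex + year + response, data = Vietnam)

# MCA on the three categorical variables only
Vietnam.mca <- mjca(Vietnam.tab3)
summary(Vietnam.mca)
plot(Vietnam.mca)
```

With this input the solution should contain only the 11 categories of sex, year, and response, weighted by the observed counts.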