library(ca)
data("criminal", package = "logmult")
criminal
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430
(a) What percentages of the Pearson χ² for association are explained by the various dimensions?
(criminal.ca <- ca(criminal))
##
## Principal inertias (eigenvalues):
## 1 2 3
## Value 0.004939 0.000491 3.8e-05
## Percentage 90.33% 8.98% 0.69%
##
##
## Rows:
## 1955 1956 1957 1958
## Mass 0.229751 0.229893 0.268897 0.271459
## ChiDist 0.090897 0.061048 0.047585 0.088033
## Inertia 0.001898 0.000857 0.000609 0.002104
## Dim. 1 1.253085 0.827543 -0.553684 -1.212927
## Dim. 2 -0.984738 0.733468 1.206411 -0.982745
##
##
## Columns:
## 15 16 17 18 19
## Mass 0.098648 0.196584 0.211388 0.254235 0.239146
## ChiDist 0.101134 0.093089 0.044072 0.071068 0.066594
## Inertia 0.001009 0.001703 0.000411 0.001284 0.001061
## Dim. 1 -1.433374 -1.297270 -0.332608 1.000960 0.887539
## Dim. 2 -0.333181 -0.808352 1.676250 0.307874 -1.007063
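The percentages above can be checked by hand, because the total inertia equals the Pearson χ² divided by the sample size, and the principal inertias are the squared singular values of the standardized residual matrix. The following is a minimal sketch in base R, assuming the counts printed above; the names `criminal_tab`, `S`, `inertia`, and `pct` are introduced here for illustration only.

```r
# Counts copied from the printed `criminal` table (rows = year, columns = age)
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

n <- sum(criminal_tab)
chisq <- unname(chisq.test(criminal_tab)$statistic)  # Pearson chi-square

# Standardized residuals of the correspondence matrix:
# (p_ij - r_i * c_j) / sqrt(r_i * c_j)
P <- criminal_tab / n
r <- rowSums(P)
cc <- colSums(P)
S <- (P - outer(r, cc)) / sqrt(outer(r, cc))

# Squared singular values are the principal inertias; a 4 x 5 table has
# min(4 - 1, 5 - 1) = 3 non-trivial dimensions
inertia <- svd(S)$d[1:3]^2
pct <- round(100 * inertia / sum(inertia), 2)

print(c(chisq_over_n = chisq / n, total_inertia = sum(inertia)))
print(pct)  # compare with the Percentage row of the ca() printout
```

If the counts were transcribed correctly, `chisq / n` and `sum(inertia)` should agree, and `pct` should match the percentages reported by `ca()`.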
(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between year and age.
plot(criminal.ca)
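Dimension 1 accounts for 90.3% of the Pearson χ² (total inertia) and dimension 2 for a further 9.0%, so the 2D map captures 99.3% of the association. The plot shows the row (year) and column (age) scores. Dimension 1 separates 1955 and 1956, together with ages 18 and 19, on the positive side from 1957 and 1958, together with ages 15 and 16, on the negative side: the older ages were relatively more frequent in the earlier years and the younger ages in the later years. Ages 15 and 18 lie close to the horizontal axis, so they are described almost entirely by dimension 1. A year and an age that lie in the same direction from the origin (for example, age 17 and 1957, or age 18 and 1956) are positively associated, meaning the observed frequency exceeds the frequency expected under independence; points on opposite sides of the origin are negatively associated (observed frequency below expected).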
summary(criminal.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.004939 90.3 90.3 ***********************
## 2 0.000491 9.0 99.3 **
## 3 3.8e-050 0.7 100.0
## -------- -----
## Total: 0.005468 100.0
##
##
## Rows:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 1955 | 230 996 347 | 88 939 361 | -22 58 223 |
## 2 | 1956 | 230 978 157 | 58 908 157 | 16 71 124 |
## 3 | 1957 | 269 984 111 | -39 669 82 | 27 315 391 |
## 4 | 1958 | 271 999 385 | -85 938 399 | -22 61 262 |
##
## Columns:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 15 | 99 998 185 | -101 992 203 | -7 5 11 |
## 2 | 16 | 197 996 312 | -91 959 331 | -18 37 128 |
## 3 | 17 | 211 991 75 | -23 281 23 | 37 710 594 |
## 4 | 18 | 254 989 235 | 70 980 255 | 7 9 24 |
## 5 | 19 | 239 990 194 | 62 877 188 | -22 112 243 |
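The `ca()` printout and the `summary()` table report different coordinates: the printout gives standard coordinates, while `summary()` gives principal coordinates multiplied by 1000. The sketch below, which uses only numbers copied from the output above, shows how the two are related and how the `ctr` column arises; the object names are illustrative.

```r
# Values copied from the ca() printout above
eig <- c(0.004939, 0.000491)             # principal inertias, dims 1 and 2
mass <- c(0.229751, 0.229893, 0.268897, 0.271459)
row_std <- rbind(                        # standard coordinates of the years
  "1955" = c( 1.253085, -0.984738),
  "1956" = c( 0.827543,  0.733468),
  "1957" = c(-0.553684,  1.206411),
  "1958" = c(-1.212927, -0.982745)
)

# Principal coordinate = standard coordinate * singular value,
# where the singular value is the square root of the principal inertia
row_prin <- sweep(row_std, 2, sqrt(eig), `*`)
k <- round(1000 * row_prin)   # compare with the k=1 and k=2 columns

# Contribution (per mille) of each row to each dimension:
# mass * principal coordinate^2 / principal inertia
ctr <- round(1000 * sweep(mass * row_prin^2, 2, eig, `/`))

print(k)
print(ctr)
```

The same arithmetic applies to the column (age) points.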
library(vcd)
## Loading required package: grid
mosaic(criminal, shade=TRUE, labeling=labeling_residuals)
The mosaic plot shows that the largest Pearson residual belongs to age 16 in 1958 (positive, about 2.5) and the second largest to age 19 in 1955 (also positive, about 2.1). In both cells the observed count exceeds the count expected under independence.
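The residuals behind the mosaic shading can also be inspected numerically. This is a small sketch in base R, assuming the shading is based on Pearson residuals from the independence model (the usual default for `shade = TRUE`); `criminal_tab` and `ranked` are illustrative names, and the counts are copied from the table printed earlier.

```r
# Counts copied from the printed `criminal` table (rows = year, columns = age)
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

# Pearson residuals: (observed - expected) / sqrt(expected)
res <- chisq.test(criminal_tab)$residuals
print(round(res, 2))

# Rank the cells by the absolute size of their residual
ranked <- as.data.frame(as.table(res))
ranked <- ranked[order(-abs(ranked$Freq)), ]
print(head(ranked, 4))
```

A positive residual means the cell holds more cases than independence predicts; a negative residual means fewer.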
Exercise 6.11:
data("Vietnam", package = "vcdExtra")
str(Vietnam)
## 'data.frame': 40 obs. of 4 variables:
## $ sex : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
## $ year : int 1 1 1 1 2 2 2 2 3 3 ...
## $ response: Factor w/ 4 levels "A","B","C","D": 1 2 3 4 1 2 3 4 1 2 ...
## $ Freq : int 13 19 40 5 5 9 33 3 22 29 ...
View(Vietnam)
Vietnam <- within(Vietnam, {YS <- paste(year, toupper(substr(sex, 1, 1)))})
Vietnam.tab <- xtabs(Freq ~ YS + response, data=Vietnam)
Vietnam.tab
##      response
## YS      A   B   C   D
##   1 F  13  19  40   5
##   1 M 175 116 131  17
##   2 F   5   9  33   3
##   2 M 160 126 135  21
##   3 F  22  29 110   6
##   3 M 132 120 154  29
##   4 F  12  21  58  10
##   4 M 145  95 185  44
##   5 F  19  27 128  13
##   5 M 118 176 345 141
Vietnam.ca <- ca(Vietnam.tab)
summary(Vietnam.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.085680 73.6 73.6 ******************
## 2 0.027881 23.9 97.5 ******
## 3 0.002854 2.5 100.0 *
## -------- -----
## Total: 0.116415 100.0
##
##
## Rows:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 1F | 24 818 13 | -167 452 8 | -150 367 20 |
## 2 | 1M | 139 997 181 | 386 986 242 | -41 11 8 |
## 3 | 2F | 16 995 35 | -407 647 31 | -299 349 51 |
## 4 | 2M | 140 984 131 | 326 982 175 | -15 2 1 |
## 5 | 3F | 53 999 112 | -334 453 69 | -367 547 256 |
## 6 | 3M | 138 904 40 | 175 904 49 | -4 0 0 |
## 7 | 4F | 32 982 37 | -344 887 44 | -113 95 15 |
## 8 | 4M | 149 383 23 | 81 372 11 | 14 11 1 |
## 9 | 5F | 59 994 153 | -453 686 143 | -304 309 197 |
## 10 | 5M | 248 1000 276 | -281 608 228 | 225 391 451 |
##
## Columns:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | A | 255 985 381 | 414 985 509 | -1 0 0 |
## 2 | B | 235 720 60 | 135 608 50 | 58 112 28 |
## 3 | C | 419 999 283 | -247 773 298 | -133 226 267 |
## 4 | D | 92 995 276 | -366 383 143 | 463 612 705 |
plot(Vietnam.ca)
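The map suggests that females in every year are associated with response C, males in year 5 with response D, and males in years 1 and 2 with response A. Males in years 3 and 4 fall nearest response B, but both lie close to the origin and 4M is poorly represented in two dimensions (qlt = 383), so that association is weak.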
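One way to check a reading of the map is to compare each group's response profile with the average profile. The sketch below is in base R and assumes the counts printed in `Vietnam.tab`; `vietnam_tab`, `ratio`, and `top` are names introduced here for illustration. A ratio above 1 means the response is more common in that group than in the sample as a whole.

```r
# Counts copied from the Vietnam.tab printout (rows = year and sex, columns = response)
vietnam_tab <- matrix(
  c( 13,  19,  40,   5,
    175, 116, 131,  17,
      5,   9,  33,   3,
    160, 126, 135,  21,
     22,  29, 110,   6,
    132, 120, 154,  29,
     12,  21,  58,  10,
    145,  95, 185,  44,
     19,  27, 128,  13,
    118, 176, 345, 141),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    YS = c("1 F", "1 M", "2 F", "2 M", "3 F", "3 M", "4 F", "4 M", "5 F", "5 M"),
    response = c("A", "B", "C", "D")
  )
)

# Row profiles (each row sums to 1) and the average profile (column masses)
profiles <- prop.table(vietnam_tab, margin = 1)
average <- colSums(vietnam_tab) / sum(vietnam_tab)

# Ratio of each row profile to the average profile
ratio <- sweep(profiles, 2, average, `/`)
print(round(ratio, 2))

# Response that is most over-represented in each group
top <- colnames(ratio)[apply(ratio, 1, which.max)]
names(top) <- rownames(ratio)
print(top)
```

Groups whose largest ratio is only slightly above 1 sit near the origin of the map, so their nearest response point should be read with caution.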
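# Note: mjca() is applied here to the frequency-form data frame, so each of the
# 40 rows counts as one case and Freq and YS enter as categorical variables
Vietnam.mca <- mjca(Vietnam)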
summary(Vietnam.mca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.250000 12.7 12.7 ****
## 2 0.246386 12.5 25.3 ****
## 3 0.240002 12.2 37.5 ****
## 4 0.216676 11.0 48.5 ****
## 5 0.204820 10.4 58.9 ***
## 6 0.062500 3.2 62.1 *
## 7 0.062500 3.2 65.3 *
## 8 0.061381 3.1 68.4 *
## 9 0.052626 2.7 71.1 *
## 10 0.047937 2.4 73.5 *
## 11 0.035939 1.8 75.4 *
## 12 0.031659 1.6 77.0 *
## 13 00000000 0.0 77.0
## 14 00000000 0.0 77.0
## 15 00000000 0.0 77.0
## 16 00000000 0.0 77.0
## 17 00000000 0.0 77.0
## 18 00000000 0.0 77.0
## 19 00000000 0.0 77.0
## 20 00000000 0.0 77.0
## 21 00000000 0.0 77.0
## 22 00000000 0.0 77.0
## 23 00000000 0.0 77.0
## 24 00000000 0.0 77.0
## 25 00000000 0.0 77.0
## -------- -----
## Total: 1.965000
##
##
## Columns:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | sex:Female | 100 260 16 | 0 0 0 | 351 260 50 |
## 2 | sex:Male | 100 260 16 | 0 0 0 | -351 260 50 |
## 3 | year:1 | 40 158 25 | 323 59 17 | 419 99 29 |
## 4 | year:2 | 40 59 25 | 323 56 17 | 74 3 1 |
## 5 | year:3 | 40 833 27 | -1291 833 267 | 0 0 0 |
## 6 | year:4 | 40 574 26 | 323 54 17 | -1000 520 162 |
## 7 | year:5 | 40 196 25 | 323 56 17 | 507 139 42 |
## 8 | response:A | 50 0 15 | 0 0 0 | 10 0 0 |
## 9 | response:B | 50 0 15 | 0 0 0 | -7 0 0 |
## 10 | response:C | 50 0 17 | 0 0 0 | 0 0 0 |
## 11 | response:D | 50 0 14 | 0 0 0 | -3 0 0 |
## 12 | Freq:3 | 5 71 16 | 323 25 2 | 444 46 4 |
## 13 | Freq:5 | 10 256 15 | 323 56 4 | 613 200 15 |
## 14 | Freq:6 | 5 417 16 | -1291 392 33 | 327 25 2 |
## 15 | Freq:9 | 5 71 16 | 323 25 2 | 442 46 4 |
## 16 | Freq:10 | 5 122 16 | 323 25 2 | -645 98 8 |
## 17 | Freq:12 | 5 120 16 | 323 25 2 | -638 96 8 |
## 18 | Freq:13 | 10 415 15 | 323 56 4 | 821 360 27 |
## 19 | Freq:17 | 5 26 16 | 323 25 2 | 67 1 0 |
## 20 | Freq:19 | 10 414 15 | 323 56 4 | 820 359 27 |
## 21 | Freq:21 | 10 201 14 | 323 64 4 | -472 137 9 |
## 22 | Freq:22 | 5 418 16 | -1291 392 33 | 334 26 2 |
## 23 | Freq:27 | 5 198 16 | 323 25 2 | 860 174 15 |
## 24 | Freq:29 | 10 741 16 | -1291 741 67 | -3 0 0 |
## 25 | Freq:33 | 5 71 16 | 323 25 2 | 446 47 4 |
## 26 | Freq:40 | 5 166 16 | 323 25 2 | 776 142 12 |
## 27 | Freq:44 | 5 468 16 | 323 25 2 | -1373 444 38 |
## 28 | Freq:58 | 5 122 16 | 323 25 2 | -643 97 8 |
## 29 | Freq:95 | 5 469 16 | 323 25 2 | -1375 445 38 |
## 30 | Freq:110 | 5 418 16 | -1291 392 33 | 329 25 2 |
## 31 | Freq:116 | 5 26 16 | 323 25 2 | 65 1 0 |
## 32 | Freq:118 | 5 31 16 | 323 25 2 | 163 6 1 |
## 33 | Freq:120 | 5 418 16 | -1291 392 33 | -332 26 2 |
## 34 | Freq:126 | 5 46 16 | 323 25 2 | -300 21 2 |
## 35 | Freq:128 | 5 200 16 | 323 25 2 | 863 175 15 |
## 36 | Freq:131 | 5 26 16 | 323 25 2 | 69 1 0 |
## 37 | Freq:132 | 5 417 16 | -1291 392 33 | -324 25 2 |
## 38 | Freq:135 | 5 45 16 | 323 25 2 | -296 21 2 |
## 39 | Freq:141 | 5 30 16 | 323 25 2 | 156 6 0 |
## 40 | Freq:145 | 5 464 16 | 323 25 2 | -1366 439 38 |
## 41 | Freq:154 | 5 418 16 | -1291 392 33 | -329 25 2 |
## 42 | Freq:160 | 5 44 16 | 323 25 2 | -291 20 2 |
## 43 | Freq:175 | 5 26 16 | 323 25 2 | 74 1 0 |
## 44 | Freq:176 | 5 30 16 | 323 25 2 | 154 6 0 |
## 45 | Freq:185 | 5 467 16 | 323 25 2 | -1371 442 38 |
## 46 | Freq:345 | 5 30 16 | 323 25 2 | 157 6 1 |
## 47 | YS:1 F | 20 272 21 | 323 41 8 | 770 231 48 |
## 48 | YS:1 M | 20 31 25 | 323 30 8 | 69 1 0 |
## 49 | YS:2 F | 20 99 24 | 323 33 8 | 459 66 17 |
## 50 | YS:2 M | 20 63 24 | 323 33 8 | -311 30 8 |
## 51 | YS:3 F | 20 551 24 | -1291 523 133 | 301 28 7 |
## 52 | YS:3 M | 20 551 24 | -1291 523 133 | -301 28 7 |
## 53 | YS:4 F | 20 157 24 | 323 33 8 | -629 124 32 |
## 54 | YS:4 M | 20 567 25 | 323 30 8 | -1371 537 153 |
## 55 | YS:5 F | 20 291 23 | 323 36 8 | 856 255 59 |
## 56 | YS:5 M | 20 37 25 | 323 30 8 | 157 7 2 |
plot(Vietnam.mca)
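In the summary above, every distinct value of `Freq` and every level of `YS` appears as a category, because `mjca()` was given the frequency-form data frame and treated each of its 40 rows as a single case. A possible alternative, sketched here under the assumption that the intended analysis uses only sex, year, and response, is to expand the counts to case form first. The counts are copied from the `Vietnam.tab` printout, and `counts`, `freq_form`, and `case_form` are illustrative names; the `mjca()` call is skipped if the `ca` package is not installed.

```r
# Counts copied from the Vietnam.tab printout (rows = year and sex, columns = response)
counts <- matrix(
  c( 13,  19,  40,   5,
    175, 116, 131,  17,
      5,   9,  33,   3,
    160, 126, 135,  21,
     22,  29, 110,   6,
    132, 120, 154,  29,
     12,  21,  58,  10,
    145,  95, 185,  44,
     19,  27, 128,  13,
    118, 176, 345, 141),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    YS = c("1 F", "1 M", "2 F", "2 M", "3 F", "3 M", "4 F", "4 M", "5 F", "5 M"),
    response = c("A", "B", "C", "D")
  )
)

# Frequency form: one row per (YS, response) cell, with YS split back
# into separate year and sex factors
freq_form <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
freq_form$year <- factor(substr(freq_form$YS, 1, 1))
freq_form$sex <- factor(substr(freq_form$YS, 3, 3))
freq_form$response <- factor(freq_form$response)

# Case form: repeat each cell Freq times and keep only the three factors
case_form <- freq_form[rep(seq_len(nrow(freq_form)), freq_form$Freq),
                       c("sex", "year", "response")]

# Multiple correspondence analysis on the three categorical variables
if (requireNamespace("ca", quietly = TRUE)) {
  vietnam_mca <- ca::mjca(case_form)
  print(summary(vietnam_mca))
}
```

With this input the categories in the solution are limited to the levels of sex, year, and response, and `plot(vietnam_mca)` would draw the corresponding map.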