install.packages('vcd',repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/yanning/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
## package 'vcd' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\yanning\AppData\Local\Temp\RtmpMN3xWE\downloaded_packages
install.packages('vcdExtra',repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/yanning/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
## package 'vcdExtra' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\yanning\AppData\Local\Temp\RtmpMN3xWE\downloaded_packages
install.packages('logmult',repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/yanning/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
## package 'logmult' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\yanning\AppData\Local\Temp\RtmpMN3xWE\downloaded_packages
install.packages('ca',repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/yanning/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
## package 'ca' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\yanning\AppData\Local\Temp\RtmpMN3xWE\downloaded_packages
library(vcd)
## Warning: package 'vcd' was built under R version 3.4.3
## Loading required package: grid
library(vcdExtra)
## Warning: package 'vcdExtra' was built under R version 3.4.3
## Loading required package: gnm
## Warning: package 'gnm' was built under R version 3.4.3
library(ca)
## Warning: package 'ca' was built under R version 3.4.3
library(logmult)
## Warning: package 'logmult' was built under R version 3.4.3
##
## Attaching package: 'logmult'
## The following object is masked from 'package:gnm':
##
## se
## The following object is masked from 'package:vcd':
##
## assoc
data("criminal",package = "logmult")
criminal
## Age
## Year 15 16 17 18 19
## 1955 141 285 320 441 427
## 1956 144 292 342 441 396
## 1957 196 380 424 462 427
## 1958 212 424 399 442 430
(a) What percentages of the Pearson χ² for association are explained by the various dimensions?
criminal.ca <- ca(criminal)
summary(criminal.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.004939 90.3 90.3 ***********************
## 2 0.000491 9.0 99.3 **
## 3 3.8e-05 0.7 100.0
## -------- -----
## Total: 0.005468 100.0
##
##
## Rows:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 1955 | 230 996 347 | 88 939 361 | -22 58 223 |
## 2 | 1956 | 230 978 157 | 58 908 157 | 16 71 124 |
## 3 | 1957 | 269 984 111 | -39 669 82 | 27 315 391 |
## 4 | 1958 | 271 999 385 | -85 938 399 | -22 61 262 |
##
## Columns:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 15 | 99 998 185 | -101 992 203 | -7 5 11 |
## 2 | 16 | 197 996 312 | -91 959 331 | -18 37 128 |
## 3 | 17 | 211 991 75 | -23 281 23 | 37 710 594 |
## 4 | 18 | 254 989 235 | 70 980 255 | 7 9 24 |
## 5 | 19 | 239 990 194 | 62 877 188 | -22 112 243 |
Based on the summary above, dimension 1 accounts for 90.3% of the Pearson χ² for association, dimension 2 for 9.0%, and dimension 3 for the remaining 0.7%. The first two dimensions together therefore account for 99.3%.
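The percentages can be cross-checked in base R without the ca package. The following is a minimal sketch, assuming the counts are retyped from the printed `criminal` table; the object names (`criminal_counts`, `inertia`, `pct`) are introduced here for illustration only. It relies on the standard identity that the total inertia equals the Pearson χ² divided by the sample size.

```r
# Counts retyped from the printed `criminal` table (rows = Year, columns = Age).
criminal_counts <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = 1955:1958, Age = 15:19)
)

n <- sum(criminal_counts)
P <- criminal_counts / n            # correspondence matrix
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals; their singular values squared are the principal inertias.
S <- (P - row_mass %o% col_mass) / sqrt(row_mass %o% col_mass)
inertia <- svd(S)$d^2
pct <- 100 * inertia / sum(inertia)

# Pearson chi-square for the same table; total inertia should equal chisq / n.
chisq <- unname(chisq.test(criminal_counts, correct = FALSE)$statistic)

round(pct, 1)        # compare with the "%" column of summary(criminal.ca)
sum(inertia) * n     # compare with chisq
chisq
```

The last singular value is numerically zero because a 4 x 5 table has at most three nontrivial dimensions.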
(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between year and age.
plot(criminal.ca)
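In the plot, a year and an age that lie close together tend to be positively associated. Age 17 lies near 1957, age 18 near 1956, and age 16 near 1958. More broadly, the dimension 1 coordinates in the summary separate 1955 and 1956, together with ages 18 and 19 (positive), from 1957 and 1958, together with ages 15 to 17 (negative), which suggests that the later years had relatively more of the younger ages.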
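The proximity reading can be checked against the coordinates themselves. The sketch below, again assuming the counts are retyped from the printed table and using illustrative names (`row_pc`, `col_pc`), computes the principal coordinates in base R. The sign of each axis is arbitrary in a singular value decomposition, so only the relative signs of rows and columns are meaningful.

```r
# Counts retyped from the printed `criminal` table (rows = Year, columns = Age).
criminal_counts <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = 1955:1958, Age = 15:19)
)

P <- criminal_counts / sum(criminal_counts)
row_mass <- rowSums(P)
col_mass <- colSums(P)
S <- (P - row_mass %o% col_mass) / sqrt(row_mass %o% col_mass)
dec <- svd(S)

# Principal coordinates: standard coordinates scaled by the singular values.
row_pc <- (dec$u / sqrt(row_mass)) %*% diag(dec$d)
col_pc <- (dec$v / sqrt(col_mass)) %*% diag(dec$d)
rownames(row_pc) <- rownames(criminal_counts)
rownames(col_pc) <- colnames(criminal_counts)

# First two dimensions; compare magnitudes with k=1 and k=2 in the summary
# (the summary reports them multiplied by 1000).
round(row_pc[, 1:2], 3)
round(col_pc[, 1:2], 3)
```

Years and ages whose dimension 1 coordinates share a sign fall on the same side of the map, which is the pattern the plot displays.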
Exercise 6.11
data("Vietnam", package = "vcdExtra")
str(Vietnam)
## 'data.frame': 40 obs. of 4 variables:
## $ sex : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
## $ year : int 1 1 1 1 2 2 2 2 3 3 ...
## $ response: Factor w/ 4 levels "A","B","C","D": 1 2 3 4 1 2 3 4 1 2 ...
## $ Freq : int 13 19 40 5 5 9 33 3 22 29 ...
Vietnam <- within(Vietnam, {year_sex <- paste(year, toupper(substr(sex,1,1)))})
Vietnam.year_sex <- xtabs(Freq ~ year_sex + response, data=Vietnam)
Vietnam.ca <- ca(Vietnam.year_sex)
summary(Vietnam.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.085680 73.6 73.6 ******************
## 2 0.027881 23.9 97.5 ******
## 3 0.002854 2.5 100.0 *
## -------- -----
## Total: 0.116415 100.0
##
##
## Rows:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | 1F | 24 818 13 | -167 452 8 | -150 367 20 |
## 2 | 1M | 139 997 181 | 386 986 242 | -41 11 8 |
## 3 | 2F | 16 995 35 | -407 647 31 | -299 349 51 |
## 4 | 2M | 140 984 131 | 326 982 175 | -15 2 1 |
## 5 | 3F | 53 999 112 | -334 453 69 | -367 547 256 |
## 6 | 3M | 138 904 40 | 175 904 49 | -4 0 0 |
## 7 | 4F | 32 982 37 | -344 887 44 | -113 95 15 |
## 8 | 4M | 149 383 23 | 81 372 11 | 14 11 1 |
## 9 | 5F | 59 994 153 | -453 686 143 | -304 309 197 |
## 10 | 5M | 248 1000 276 | -281 608 228 | 225 391 451 |
##
## Columns:
## name mass qlt inr k=1 cor ctr k=2 cor ctr
## 1 | A | 255 985 381 | 414 985 509 | -1 0 0 |
## 2 | B | 235 720 60 | 135 608 50 | 58 112 28 |
## 3 | C | 419 999 283 | -247 773 298 | -133 226 267 |
## 4 | D | 92 995 276 | -366 383 143 | 463 612 705 |
plot(Vietnam.ca)
In the plot, a row point and a column point that lie close together indicate an association between the two categories. The plot suggests that females in years 1 and 4 tended to give response C, males in years 3 and 4 tended to give response B, and males in years 1 and 2 tended to give response A. The other associations do not appear strong enough to interpret.
vietnam.mca <- mjca(Vietnam.year_sex)
plot(vietnam.mca)
Based on the plot above, males in years 1 and 2 tended to give response A, males in years 3 and 4 tended to give response B, and females in years 1 and 4 tended to give response C. The other associations do not appear strong enough to interpret.
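The call above passes the stacked two-way table to `mjca()`. A possible alternative, offered only as a sketch and assuming the aim is to treat sex, year, and response as three separate factors, is to build the three-way table and pass that instead. The names `Vietnam.tab` and `Vietnam.mca3` are introduced here for illustration, and the code requires the ca and vcdExtra packages to be installed.

```r
library(ca)

data("Vietnam", package = "vcdExtra")

# Three-way frequency table: sex x year x response.
Vietnam.tab <- xtabs(Freq ~ sex + year + response, data = Vietnam)

# mjca() accepts a frequency table and analyses all factors jointly.
Vietnam.mca3 <- mjca(Vietnam.tab)
summary(Vietnam.mca3)
plot(Vietnam.mca3)
```

In this form each level of sex, year, and response gets its own point, so the map can be compared with the stacked analysis rather than repeating it.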