The data set criminal in the package logmult gives the 4 × 5 table below: the number of men aged 15-19 charged with a criminal case for whom charges were dropped, in Denmark, from 1955 to 1958.
data("criminal", package="logmult")
criminal
##       Age
## Year    15  16  17  18  19
##   1955 141 285 320 441 427
##   1956 144 292 342 441 396
##   1957 196 380 424 462 427
##   1958 212 424 399 442 430
1.1 Use loglm() to test whether there is an association between Year and Age. Is there evidence that dropping of charges in relation to age changed over the years recorded here?
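library(MASS)  # loglm()
library(vcd)   # assoc(), mosaic()
# Note: the formula Year ~ Age has only Age on the right-hand side, so this call fits
# the Age-only model (15 df, equal expected counts in every year), as the summary shows.
# The independence model would be ~ Year + Age, with (4 - 1) * (5 - 1) = 12 df.
c.lm <- loglm(Year ~ Age, data=criminal, fitted=TRUE)
summary(c.lm)
## Formula: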
## Year ~ Age
## attr(,"variables")
## list(Year, Age)
## attr(,"factors")
##      Age
## Year   0
## Age    1
## attr(,"term.labels")
## [1] "Age"
## attr(,"order")
## [1] 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: R_GlobalEnv>
##
## Statistics:
##                       X^2 df     P(> X^2)
## Likelihood Ratio 84.14370 15 1.210687e-11
## Pearson          84.29411 15 1.135747e-11
a <- residuals(c.lm)  # deviance residuals (the default type)
a
##       Age
## Year           15         16         17         18         19
##   1955 -2.5327533 -3.3444912 -2.7248845 -0.2608238  0.3406228
##   1956 -2.2896215 -2.9446979 -1.5386917 -0.2608238 -1.1825075
##   1957  1.6925059  1.8400733  2.6764500  0.7293514  0.3406228
##   1958  2.8433554  4.0907254  1.4228176 -0.2133211  0.4860327
sum(a^2)  # squared deviance residuals sum to the likelihood-ratio statistic
## [1] 84.1437
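assoc(criminal, shade=TRUE)
The association plot shows an association between Year and Age. In 1955 the largest excess of dropped charges is at age 19; over the following years the excess shifts toward younger ages, and by 1958 it is at age 16. So there is evidence that the relation between dropped charges and age changed over these years.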
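Because the loglm() call above uses the formula Year ~ Age, its statistics refer to the Age-only model rather than to independence. The following is a minimal base R sketch of both tests, assuming the counts are re-entered by hand from the printed table (the object names criminal_tab, indep, g2_indep and g2_age are made up for this illustration). The Age-only statistic should reproduce the 84.14 reported in the summary; the independence test has 12 degrees of freedom.

```r
# Counts transcribed from the table printed above (hypothetical object name).
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

# Pearson chi-square test of independence: (4 - 1) * (5 - 1) = 12 df.
indep <- chisq.test(criminal_tab)

# Likelihood-ratio statistic for the same independence model.
g2_indep <- 2 * sum(criminal_tab * log(criminal_tab / indep$expected))

# Age-only model: each year is expected to have the same count at a given age,
# i.e. the column total divided by the number of years.
expected_age <- matrix(colSums(criminal_tab) / nrow(criminal_tab),
                       nrow = 4, ncol = 5, byrow = TRUE)
g2_age <- 2 * sum(criminal_tab * log(criminal_tab / expected_age))
df_age <- length(criminal_tab) - ncol(criminal_tab)  # 20 cells - 5 parameters

print(indep)
print(c(g2_indep = g2_indep, g2_age = g2_age, df_age = df_age))
```

The gap between the two likelihood-ratio statistics is the part of the Age-only lack of fit that comes from the yearly totals differing, not from a Year by Age association.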
1.2 (b) Use mosaic() with the option shade=TRUE to display the pattern of signs and magnitudes of the residuals. Compare this with the result of mosaic() using “Friendly shading,” from the option gp=shading_Friendly. Describe verbally what you see in each regarding the pattern of association in this table.
m1 <- mosaic(criminal, shade=TRUE)
m2 <- mosaic(criminal, gp=shading_Friendly)
m1
##      Age  15  16  17  18  19
## Year
## 1955     141 285 320 441 427
## 1956     144 292 342 441 396
## 1957     196 380 424 462 427
## 1958     212 424 399 442 430
m2
##      Age  15  16  17  18  19
## Year
## 1955     141 285 320 441 427
## 1956     144 292 342 441 396
## 1957     196 380 424 462 427
## 1958     212 424 399 442 430
Both displays show an association between Age and Year. In 1955, 19-year-olds are over-represented among the men whose charges were dropped, whereas in 1958 the excess is among 16-year-olds.
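The shading in both mosaics is driven by the Pearson residuals from the independence model. As a rough check on the verbal description, the sketch below recomputes those residuals in base R from the printed counts and reports, for each year, the age with the largest positive residual. The cutoff of 2 on the absolute residual is an assumption about the shading defaults in vcd, and the names criminal_tab, res and peak_age are made up for this illustration.

```r
# Counts transcribed from the table printed above (hypothetical object name).
criminal_tab <- matrix(
  c(141, 285, 320, 441, 427,
    144, 292, 342, 441, 396,
    196, 380, 424, 462, 427,
    212, 424, 399, 442, 430),
  nrow = 4, byrow = TRUE,
  dimnames = list(Year = as.character(1955:1958), Age = as.character(15:19))
)

# Pearson residuals under independence: (observed - expected) / sqrt(expected).
res <- chisq.test(criminal_tab)$residuals
print(round(res, 2))

# Age with the largest positive residual in each year.
peak_age <- colnames(res)[apply(res, 1, which.max)]
names(peak_age) <- rownames(res)
print(peak_age)

# Cells whose absolute residual exceeds 2 (assumed first shading cutoff).
print(which(abs(res) > 2, arr.ind = TRUE))
```

Cells below the cutoff would be left unshaded, so the sign pattern across the whole table is often more informative here than the shaded cells alone.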
Bertin (1983, pp. 30-31) used a 4-way table of frequencies of traffic accident victims in France in 1958 to illustrate his scheme for classifying data sets by numerous variables, each of which could have various types and could be assigned to various visual attributes. His data are contained in Accident in vcdExtra, a frequency data frame representing his 5 × 2 × 4 × 2 table of the variables age, result (died or injured), mode of transportation, and gender.
2.1 Use loglm() to fit the model of mutual independence, Freq ~ age+mode+gender+result to this data set.
data("Accident", package="vcdExtra")
str(Accident, vec.len=2)
## 'data.frame': 80 obs. of 5 variables:
## $ age : Ord.factor w/ 5 levels "0-9"<"10-19"<..: 5 5 5 5 5 ...
## $ result: Factor w/ 2 levels "Died","Injured": 1 1 1 1 1 ...
## $ mode : Factor w/ 4 levels "4-Wheeled","Bicycle",..: 4 4 2 2 3 ...
## $ gender: Factor w/ 2 levels "Female","Male": 2 1 2 1 2 ...
## $ Freq : int 704 378 396 56 742 ...
f.lm <- loglm(Freq ~ age + mode + gender + result, data=Accident)
summary(f.lm)
## Formula:
## Freq ~ age + mode + gender + result
## attr(,"variables")
## list(Freq, age, mode, gender, result)
## attr(,"factors")
##        age mode gender result
## Freq     0    0      0      0
## age      1    0      0      0
## mode     0    1      0      0
## gender   0    0      1      0
## result   0    0      0      1
## attr(,"term.labels")
## [1] "age" "mode" "gender" "result"
## attr(,"order")
## [1] 1 1 1 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: R_GlobalEnv>
## attr(,"predvars")
## list(Freq, age, mode, gender, result)
## attr(,"dataClasses")
## Freq age mode gender result
## "numeric" "ordered" "factor" "factor" "factor"
##
## Statistics:
##                       X^2 df P(> X^2)
## Likelihood Ratio 60320.05 70        0
## Pearson          76865.31 70        0
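The 70 degrees of freedom reported for the mutual independence model can be checked by counting cells and parameters. The sketch below uses only the numbers of factor levels shown by str(Accident); the variable names are made up for this illustration.

```r
# Number of levels of each factor, as listed in the str(Accident) output.
n_levels <- c(age = 5, result = 2, mode = 4, gender = 2)

n_cells   <- prod(n_levels)          # 5 * 2 * 4 * 2 = 80 cells
n_params  <- 1 + sum(n_levels - 1)   # intercept plus main effects: 1 + 4 + 1 + 3 + 1
df_mutual <- n_cells - n_params      # residual degrees of freedom

print(c(cells = n_cells, parameters = n_params, df = df_mutual))
```

With a likelihood-ratio statistic above 60000 on these degrees of freedom, mutual independence is clearly untenable, which is why the mosaic displays that follow are needed to see where the model fails.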
f.m1 <- mosaic(f.lm, abbreviate_labs=TRUE, clip=FALSE)
# Reordering the terms leaves the fit unchanged; it changes only the order of the
# variables in the table, and hence the order of the splits in the mosaic.
f.lm2 <- loglm(Freq ~ mode + gender + age + result, data=Accident)
f.m2 <- mosaic(f.lm2, abbreviate_labs=TRUE)
f.lm3 <- loglm(Freq ~ gender + result + age + mode, data=Accident)
f.m3 <- mosaic(f.lm3, abbreviate_labs=TRUE)
f.lm4 <- loglm(Freq ~ mode + result + gender + age, data=Accident)
f.m4 <- mosaic(f.lm4, abbreviate_labs=TRUE)
Treat result (“Died” vs. “Injured”) as the response variable, and fit the model Freq ~ age*mode*gender + result that asserts independence of result from all others jointly.
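f.lm5 <- loglm(Freq ~ age*mode*gender + result, data=Accident)
# Note: the call below summarizes f.lm2 (mutual independence, 70 df), not f.lm5;
# summary(f.lm5) would report the joint independence model, which has 39 df.
summary(f.lm2)
## Formula: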
## Freq ~ mode + gender + age + result
## attr(,"variables")
## list(Freq, mode, gender, age, result)
## attr(,"factors")
##        mode gender age result
## Freq      0      0   0      0
## mode      1      0   0      0
## gender    0      1   0      0
## age       0      0   1      0
## result    0      0   0      1
## attr(,"term.labels")
## [1] "mode" "gender" "age" "result"
## attr(,"order")
## [1] 1 1 1 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: R_GlobalEnv>
## attr(,"predvars")
## list(Freq, mode, gender, age, result)
## attr(,"dataClasses")
## Freq mode gender age result
## "numeric" "factor" "factor" "ordered" "factor"
##
## Statistics:
##                       X^2 df P(> X^2)
## Likelihood Ratio 60320.05 70        0
## Pearson          76865.31 70        0
f.m5 <- mosaic(f.lm5, abbreviate_labs=TRUE)
From the mosaic we can see that males aged over 50 have a higher death rate. Females aged 10-19 travelling by bicycle also have a high death rate.
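The joint independence model leaves far fewer degrees of freedom than the mutual independence model, because age*mode*gender is saturated in the three explanatory variables. The sketch below counts them and, only if MASS and vcdExtra happen to be installed, compares the two fits with anova(). The object names mutual and joint are made up for this illustration, and no output is shown because it was not part of the original run.

```r
# Number of levels of each factor in the 5 x 2 x 4 x 2 table.
n_levels <- c(age = 5, mode = 4, gender = 2, result = 2)
n_cells  <- prod(n_levels)  # 80 cells

# age*mode*gender uses 5 * 4 * 2 = 40 parameters (intercept included);
# the result main effect adds 2 - 1 = 1 more.
n_params_joint <- prod(n_levels[c("age", "mode", "gender")]) +
  (n_levels[["result"]] - 1)
df_joint <- n_cells - n_params_joint

print(c(parameters = n_params_joint, df = df_joint))

# Optional comparison of the two models; skipped when the packages are missing.
if (requireNamespace("MASS", quietly = TRUE) &&
    requireNamespace("vcdExtra", quietly = TRUE)) {
  library(MASS)
  data("Accident", package = "vcdExtra")
  mutual <- loglm(Freq ~ age + mode + gender + result, data = Accident)
  joint  <- loglm(Freq ~ age * mode * gender + result, data = Accident)
  print(anova(mutual, joint))
}
```

Any lack of fit that remains in the joint model reflects dependence of result on age, mode, or gender, which is what the shaded cells in the last mosaic display.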