2.4
- Read the data and find the total number of cases represented in this table.
data("DanishWelfare", package = "vcd")
sum(DanishWelfare$Freq)
## [1] 5144
- See the structure of the data and change the variables Alcohol and Income into ordered variables.
str(DanishWelfare)
## 'data.frame': 180 obs. of 5 variables:
## $ Freq : num 1 4 1 8 6 14 8 41 100 175 ...
## $ Alcohol: Factor w/ 3 levels "<1","1-2",">2": 1 1 1 1 1 1 1 1 1 1 ...
## $ Income : Factor w/ 4 levels "0-50","50-100",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Status : Factor w/ 3 levels "Widow","Married",..: 1 1 1 1 1 2 2 2 2 2 ...
## $ Urban : Factor w/ 5 levels "Copenhagen","SubCopenhagen",..: 1 2 3 4 5 1 2 3 4 5 ...
levels(DanishWelfare$Alcohol)
## [1] "<1" "1-2" ">2"
levels(DanishWelfare$Income)
## [1] "0-50" "50-100" "100-150" ">150"
DanishWelfare$Alcohol = as.ordered(DanishWelfare$Alcohol)
DanishWelfare$Income = as.ordered(DanishWelfare$Income)
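A quick way to confirm that the conversion worked is to test for ordering and compare two levels. The sketch below uses a small made-up factor (`alc`, a hypothetical name) with the same levels as Alcohol, so it runs without loading vcd:

```r
# Toy factor with the same levels as DanishWelfare$Alcohol (values are made up)
alc <- factor(c("<1", "1-2", ">2", "1-2"), levels = c("<1", "1-2", ">2"))
is.ordered(alc)         # FALSE: a plain factor has no ordering

alc <- as.ordered(alc)  # keeps the existing level order
is.ordered(alc)         # TRUE
alc[1] < alc[3]         # TRUE: "<1" sorts below ">2"
```

The same two checks, is.ordered() and a comparison between levels, should apply to DanishWelfare$Alcohol and DanishWelfare$Income after the assignments above.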
- Convert this data frame to table form, DanishWelfare.tab, a 4-way array containing the frequencies with appropriate variable names and level names (hint: review xtabs()).
DanishWelfare.tab = xtabs(Freq ~ Alcohol + Urban + Status + Income, data = DanishWelfare)
# xtabs() alone returns the 4-way array; ftable() flattens it for display
DanishWelfare.ftab = ftable(DanishWelfare.tab)
str(DanishWelfare.ftab)
## 'ftable' num [1:45, 1:4] 1 14 6 4 8 1 1 41 2 8 ...
## - attr(*, "row.vars")=List of 3
## ..$ Alcohol: chr [1:3] "<1" "1-2" ">2"
## ..$ Urban : chr [1:5] "Copenhagen" "SubCopenhagen" "LargeCity" "City" ...
## ..$ Status : chr [1:3] "Widow" "Married" "Unmarried"
## - attr(*, "col.vars")=List of 1
## ..$ Income: chr [1:4] "0-50" "50-100" "100-150" ">150"
head(DanishWelfare.ftab)
##                                     "Income" "0-50" "50-100" "100-150" ">150"
## "Alcohol" "Urban"         "Status"
## "<1"      "Copenhagen"    "Widow"                 1        8         2     42
##                           "Married"              14       42        21     24
##                           "Unmarried"             6        7         3     33
##           "SubCopenhagen" "Widow"                 4        2         3     29
##                           "Married"               8       51        30     30
##                           "Unmarried"             1        5         2     24
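The difference between the array returned by xtabs() and the flattened view produced by ftable() is easier to see on a small example. This sketch uses a made-up 2 x 2 x 2 frequency data frame (`toy`, a hypothetical name), not DanishWelfare:

```r
# Made-up frequency data: one row per cell of a 2 x 2 x 2 table
toy <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), C = c("c1", "c2"))
toy$Freq <- 1:8

toy.tab <- xtabs(Freq ~ A + B + C, data = toy)
dim(toy.tab)              # 2 2 2: a genuine 3-way array
dimnames(toy.tab)$B       # level names are carried along

toy.flat <- ftable(toy.tab)
dim(toy.flat)             # 4 2: rows are A x B combinations, columns are C

# Marginal totals come straight from the array
margin.table(toy.tab, 1)  # totals for A: 16 and 20
```

Keeping the array around makes it possible to collapse over any subset of variables later, while the ftable is mainly a display format.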
- The variable Urban has 5 categories. Find the total frequencies in each of these.
aggregate(Freq ~ Urban, data = DanishWelfare, FUN = sum)
## Urban Freq
## 1 Copenhagen 552
## 2 SubCopenhagen 614
## 3 LargeCity 594
## 4 City 1765
## 5 Country 1619
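As a sanity check, the five Urban totals printed above should add up to the overall total of 5144 found in the first step. A minimal sketch, with the totals re-entered by hand into a vector (`urban_totals`, a hypothetical name):

```r
# Urban totals copied from the aggregate() output above
urban_totals <- c(Copenhagen = 552, SubCopenhagen = 614, LargeCity = 594,
                  City = 1765, Country = 1619)

sum(urban_totals)                            # 5144, matching sum(DanishWelfare$Freq)
round(urban_totals / sum(urban_totals), 3)   # share of cases in each category
```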
2.5
data("UKSoccer", package = "vcd")
ftable(UKSoccer)
## Away 0 1 2 3 4
## Home
## 0 27 29 10 8 2
## 1 59 53 14 12 4
## 2 28 32 14 12 4
## 3 19 14 7 4 1
## 4 7 8 10 2 0
- Verify that the total number of games represented in this table is 380.
summary(UKSoccer)
## Number of cases in table: 380
## Number of factors: 2
## Test for independence of all factors:
## Chisq = 18.699, df = 16, p-value = 0.2846
## Chi-squared approximation may be incorrect
- Express each of the marginal totals as proportions.
prop_table = addmargins(prop.table(UKSoccer))
# Home
prop_table[1:5, 'Sum']
## 0 1 2 3 4
## 0.20000000 0.37368421 0.23684211 0.11842105 0.07105263
# Away
prop_table['Sum', 1:5]
## 0 1 2 3 4
## 0.36842105 0.35789474 0.14473684 0.10000000 0.02894737
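The same totals and proportions can be reached without addmargins() by combining margin.table() and prop.table(). The sketch below re-enters the counts from the ftable(UKSoccer) printout above into a table (`soccer`, a hypothetical name) so that it runs without vcd:

```r
# Counts re-entered from the printout (rows = Home goals, columns = Away goals)
soccer <- matrix(c(27, 29, 10,  8, 2,
                   59, 53, 14, 12, 4,
                   28, 32, 14, 12, 4,
                   19, 14,  7,  4, 1,
                    7,  8, 10,  2, 0),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(Home = as.character(0:4),
                                 Away = as.character(0:4)))
soccer <- as.table(soccer)

sum(soccer)                           # 380 games in total
prop.table(margin.table(soccer, 1))   # Home marginal proportions
prop.table(margin.table(soccer, 2))   # Away marginal proportions
```

Each set of marginal proportions sums to 1, which is a convenient check that the right margin was taken.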
2.6
data("Saxony", package = "vcd")
Saxony
## nMales
## 0 1 2 3 4 5 6 7 8 9 10 11 12
## 3 24 104 286 670 1033 1343 1112 829 478 181 45 7
str(Saxony)
## 'table' num [1:13(1d)] 3 24 104 286 670 ...
## - attr(*, "dimnames")=List of 1
## ..$ nMales: chr [1:13] "0" "1" "2" "3" ...
data("Geissler", package = "vcdExtra")
str(Geissler)
## 'data.frame': 90 obs. of 4 variables:
## $ boys : int 0 0 0 0 0 0 0 0 0 0 ...
## $ girls: num 1 2 3 4 5 6 7 8 9 10 ...
## $ size : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Freq : int 108719 42860 17395 7004 2839 1096 436 161 66 30 ...
- Use subset() to create a data frame, sax12, containing the Geissler observations in families with size==12.
sax12 = subset(Geissler, size == 12)
sax12
## boys girls size Freq
## 12 0 12 12 3
## 24 1 11 12 24
## 35 2 10 12 104
## 45 3 9 12 286
## 54 4 8 12 670
## 62 5 7 12 1033
## 69 6 6 12 1343
## 75 7 5 12 1112
## 80 8 4 12 829
## 84 9 3 12 478
## 87 10 2 12 181
## 89 11 1 12 45
## 90 12 0 12 7
- Select the columns for boys and Freq.
subset(sax12, select = c(boys,Freq))
## boys Freq
## 12 0 3
## 24 1 24
## 35 2 104
## 45 3 286
## 54 4 670
## 62 5 1033
## 69 6 1343
## 75 7 1112
## 80 8 829
## 84 9 478
## 87 10 181
## 89 11 45
## 90 12 7
- Use xtabs() with a formula, Freq ~ boys, to create the one-way table.
xtabs(Freq ~ boys,data = sax12)
## boys
## 0 1 2 3 4 5 6 7 8 9 10 11 12
## 3 24 104 286 670 1033 1343 1112 829 478 181 45 7
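The one-way table should hold the same counts as Saxony, even though the dimension is named boys rather than nMales. A minimal sketch of that comparison, with both sets of counts re-entered from the printouts above so that it runs without vcd or vcdExtra (`saxony_counts` is a hypothetical name):

```r
# boys and Freq columns re-entered from the sax12 printout
sax12 <- data.frame(boys = 0:12,
                    Freq = c(3, 24, 104, 286, 670, 1033, 1343,
                             1112, 829, 478, 181, 45, 7))
boys.tab <- xtabs(Freq ~ boys, data = sax12)

# Counts re-entered from the Saxony printout
saxony_counts <- c(3, 24, 104, 286, 670, 1033, 1343,
                   1112, 829, 478, 181, 45, 7)

all(as.vector(boys.tab) == saxony_counts)  # TRUE: same counts, cell by cell
names(dimnames(boys.tab))                  # "boys"
sum(boys.tab)                              # 6115 families of size 12
```

With the packages loaded, the equivalent check would compare the xtabs() result against Saxony directly after dropping attributes with as.vector().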