Exercise 2.4 The data set DanishWelfare in vcd gives a 4-way, 3 × 4 × 3 × 5 table as a data frame in frequency form, containing the variable Freq and four factors, Alcohol, Income, Status, and Urban. The variable Alcohol can be considered as the response variable, and the others as possible predictors.
Read the data and find the total number of cases represented in this table.
data("DanishWelfare", package = "vcd")
summary(DanishWelfare)
## Freq Alcohol Income Status Urban
## Min. : 0.00 <1 :60 0-50 :45 Widow :60 Copenhagen :36
## 1st Qu.: 3.00 1-2:60 50-100 :45 Married :60 SubCopenhagen:36
## Median : 12.00 >2 :60 100-150:45 Unmarried:60 LargeCity :36
## Mean : 28.58 >150 :45 City :36
## 3rd Qu.: 35.25 Country :36
## Max. :255.00
sum(DanishWelfare$Freq)
## [1] 5144
See the structure of the data and change the variables Alcohol and Income into ordered variables.
DanishWelfare <- transform(DanishWelfare, Alcohol = as.ordered(Alcohol), Income = as.ordered(Income))
DanishWelfare
## Freq Alcohol Income Status Urban
## 1 1 <1 0-50 Widow Copenhagen
## 2 4 <1 0-50 Widow SubCopenhagen
## 3 1 <1 0-50 Widow LargeCity
## 4 8 <1 0-50 Widow City
## 5 6 <1 0-50 Widow Country
## 6 14 <1 0-50 Married Copenhagen
## 7 8 <1 0-50 Married SubCopenhagen
## 8 41 <1 0-50 Married LargeCity
## 9 100 <1 0-50 Married City
## 10 175 <1 0-50 Married Country
## 11 6 <1 0-50 Unmarried Copenhagen
## 12 1 <1 0-50 Unmarried SubCopenhagen
## 13 2 <1 0-50 Unmarried LargeCity
## 14 6 <1 0-50 Unmarried City
## 15 9 <1 0-50 Unmarried Country
## 16 8 <1 50-100 Widow Copenhagen
## 17 2 <1 50-100 Widow SubCopenhagen
## 18 7 <1 50-100 Widow LargeCity
## 19 14 <1 50-100 Widow City
## 20 5 <1 50-100 Widow Country
## 21 42 <1 50-100 Married Copenhagen
## 22 51 <1 50-100 Married SubCopenhagen
## 23 62 <1 50-100 Married LargeCity
## 24 234 <1 50-100 Married City
## 25 255 <1 50-100 Married Country
## 26 7 <1 50-100 Unmarried Copenhagen
## 27 5 <1 50-100 Unmarried SubCopenhagen
## 28 9 <1 50-100 Unmarried LargeCity
## 29 20 <1 50-100 Unmarried City
## 30 27 <1 50-100 Unmarried Country
## 31 2 <1 100-150 Widow Copenhagen
## 32 3 <1 100-150 Widow SubCopenhagen
## 33 1 <1 100-150 Widow LargeCity
## 34 5 <1 100-150 Widow City
## 35 2 <1 100-150 Widow Country
## 36 21 <1 100-150 Married Copenhagen
## 37 30 <1 100-150 Married SubCopenhagen
## 38 23 <1 100-150 Married LargeCity
## 39 87 <1 100-150 Married City
## 40 77 <1 100-150 Married Country
## 41 3 <1 100-150 Unmarried Copenhagen
## 42 2 <1 100-150 Unmarried SubCopenhagen
## 43 1 <1 100-150 Unmarried LargeCity
## 44 12 <1 100-150 Unmarried City
## 45 4 <1 100-150 Unmarried Country
## 46 42 <1 >150 Widow Copenhagen
## 47 29 <1 >150 Widow SubCopenhagen
## 48 17 <1 >150 Widow LargeCity
## 49 95 <1 >150 Widow City
## 50 46 <1 >150 Widow Country
## 51 24 <1 >150 Married Copenhagen
## 52 30 <1 >150 Married SubCopenhagen
## 53 50 <1 >150 Married LargeCity
## 54 167 <1 >150 Married City
## 55 232 <1 >150 Married Country
## 56 33 <1 >150 Unmarried Copenhagen
## 57 24 <1 >150 Unmarried SubCopenhagen
## 58 15 <1 >150 Unmarried LargeCity
## 59 64 <1 >150 Unmarried City
## 60 68 <1 >150 Unmarried Country
## 61 3 1-2 0-50 Widow Copenhagen
## 62 0 1-2 0-50 Widow SubCopenhagen
## 63 1 1-2 0-50 Widow LargeCity
## 64 4 1-2 0-50 Widow City
## 65 2 1-2 0-50 Widow Country
## 66 15 1-2 0-50 Married Copenhagen
## 67 7 1-2 0-50 Married SubCopenhagen
## 68 15 1-2 0-50 Married LargeCity
## 69 25 1-2 0-50 Married City
## 70 48 1-2 0-50 Married Country
## 71 2 1-2 0-50 Unmarried Copenhagen
## 72 3 1-2 0-50 Unmarried SubCopenhagen
## 73 9 1-2 0-50 Unmarried LargeCity
## 74 9 1-2 0-50 Unmarried City
## 75 7 1-2 0-50 Unmarried Country
## 76 1 1-2 50-100 Widow Copenhagen
## 77 1 1-2 50-100 Widow SubCopenhagen
## 78 3 1-2 50-100 Widow LargeCity
## 79 8 1-2 50-100 Widow City
## 80 4 1-2 50-100 Widow Country
## 81 39 1-2 50-100 Married Copenhagen
## 82 59 1-2 50-100 Married SubCopenhagen
## 83 68 1-2 50-100 Married LargeCity
## 84 172 1-2 50-100 Married City
## 85 143 1-2 50-100 Married Country
## 86 12 1-2 50-100 Unmarried Copenhagen
## 87 3 1-2 50-100 Unmarried SubCopenhagen
## 88 11 1-2 50-100 Unmarried LargeCity
## 89 20 1-2 50-100 Unmarried City
## 90 23 1-2 50-100 Unmarried Country
## 91 5 1-2 100-150 Widow Copenhagen
## 92 4 1-2 100-150 Widow SubCopenhagen
## 93 1 1-2 100-150 Widow LargeCity
## 94 9 1-2 100-150 Widow City
## 95 4 1-2 100-150 Widow Country
## 96 32 1-2 100-150 Married Copenhagen
## 97 68 1-2 100-150 Married SubCopenhagen
## 98 43 1-2 100-150 Married LargeCity
## 99 128 1-2 100-150 Married City
## 100 86 1-2 100-150 Married Country
## 101 6 1-2 100-150 Unmarried Copenhagen
## 102 10 1-2 100-150 Unmarried SubCopenhagen
## 103 5 1-2 100-150 Unmarried LargeCity
## 104 21 1-2 100-150 Unmarried City
## 105 15 1-2 100-150 Unmarried Country
## 106 26 1-2 >150 Widow Copenhagen
## 107 34 1-2 >150 Widow SubCopenhagen
## 108 14 1-2 >150 Widow LargeCity
## 109 48 1-2 >150 Widow City
## 110 24 1-2 >150 Widow Country
## 111 43 1-2 >150 Married Copenhagen
## 112 76 1-2 >150 Married SubCopenhagen
## 113 70 1-2 >150 Married LargeCity
## 114 198 1-2 >150 Married City
## 115 136 1-2 >150 Married Country
## 116 36 1-2 >150 Unmarried Copenhagen
## 117 23 1-2 >150 Unmarried SubCopenhagen
## 118 48 1-2 >150 Unmarried LargeCity
## 119 89 1-2 >150 Unmarried City
## 120 64 1-2 >150 Unmarried Country
## 121 2 >2 0-50 Widow Copenhagen
## 122 0 >2 0-50 Widow SubCopenhagen
## 123 2 >2 0-50 Widow LargeCity
## 124 1 >2 0-50 Widow City
## 125 0 >2 0-50 Widow Country
## 126 1 >2 0-50 Married Copenhagen
## 127 2 >2 0-50 Married SubCopenhagen
## 128 2 >2 0-50 Married LargeCity
## 129 7 >2 0-50 Married City
## 130 7 >2 0-50 Married Country
## 131 3 >2 0-50 Unmarried Copenhagen
## 132 0 >2 0-50 Unmarried SubCopenhagen
## 133 1 >2 0-50 Unmarried LargeCity
## 134 5 >2 0-50 Unmarried City
## 135 1 >2 0-50 Unmarried Country
## 136 3 >2 50-100 Widow Copenhagen
## 137 0 >2 50-100 Widow SubCopenhagen
## 138 2 >2 50-100 Widow LargeCity
## 139 1 >2 50-100 Widow City
## 140 3 >2 50-100 Widow Country
## 141 14 >2 50-100 Married Copenhagen
## 142 21 >2 50-100 Married SubCopenhagen
## 143 14 >2 50-100 Married LargeCity
## 144 38 >2 50-100 Married City
## 145 35 >2 50-100 Married Country
## 146 2 >2 50-100 Unmarried Copenhagen
## 147 0 >2 50-100 Unmarried SubCopenhagen
## 148 3 >2 50-100 Unmarried LargeCity
## 149 12 >2 50-100 Unmarried City
## 150 13 >2 50-100 Unmarried Country
## 151 2 >2 100-150 Widow Copenhagen
## 152 1 >2 100-150 Widow SubCopenhagen
## 153 1 >2 100-150 Widow LargeCity
## 154 1 >2 100-150 Widow City
## 155 0 >2 100-150 Widow Country
## 156 20 >2 100-150 Married Copenhagen
## 157 31 >2 100-150 Married SubCopenhagen
## 158 10 >2 100-150 Married LargeCity
## 159 36 >2 100-150 Married City
## 160 21 >2 100-150 Married Country
## 161 0 >2 100-150 Unmarried Copenhagen
## 162 2 >2 100-150 Unmarried SubCopenhagen
## 163 3 >2 100-150 Unmarried LargeCity
## 164 9 >2 100-150 Unmarried City
## 165 7 >2 100-150 Unmarried Country
## 166 21 >2 >150 Widow Copenhagen
## 167 13 >2 >150 Widow SubCopenhagen
## 168 5 >2 >150 Widow LargeCity
## 169 20 >2 >150 Widow City
## 170 8 >2 >150 Widow Country
## 171 23 >2 >150 Married Copenhagen
## 172 47 >2 >150 Married SubCopenhagen
## 173 21 >2 >150 Married LargeCity
## 174 53 >2 >150 Married City
## 175 36 >2 >150 Married Country
## 176 38 >2 >150 Unmarried Copenhagen
## 177 20 >2 >150 Unmarried SubCopenhagen
## 178 13 >2 >150 Unmarried LargeCity
## 179 39 >2 >150 Unmarried City
## 180 26 >2 >150 Unmarried Country
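An ordered factor is created with ordered() or as.ordered(); sorting the rows with order() does not change a factor's class. The following is a minimal sketch on a toy data frame. The name toy and its counts are made up for illustration and are not taken from DanishWelfare.

```r
# Toy frequency-form data (hypothetical values); the level names mimic Alcohol
toy <- data.frame(
  Alcohol = factor(c("<1", "1-2", ">2"), levels = c("<1", "1-2", ">2")),
  Freq    = c(10, 20, 5)
)

# as.ordered() keeps the existing level order and marks the factor as ordered
toy$Alcohol <- as.ordered(toy$Alcohol)

is.ordered(toy$Alcohol)          # TRUE
levels(toy$Alcohol)              # "<1" "1-2" ">2"
toy$Alcohol[3] > toy$Alcohol[1]  # comparisons follow the level order: TRUE
str(toy)                         # Alcohol is reported as Ord.factor
```

Because as.ordered() reuses the current level order, it is worth checking levels() first when the levels were created alphabetically.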
Convert this data frame to table form, DanishWelfare.tab, a 4-way array containing the frequencies with appropriate variable names and level names (hint: review xtabs()).
DanishWelfare.tab <- xtabs(Freq ~ Alcohol + Income + Status + Urban, data = DanishWelfare)
DanishWelfare.tab
## , , Status = Widow, Urban = Copenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 1 8 2 42
## 1-2 3 1 5 26
## >2 2 3 2 21
##
## , , Status = Married, Urban = Copenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 14 42 21 24
## 1-2 15 39 32 43
## >2 1 14 20 23
##
## , , Status = Unmarried, Urban = Copenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 6 7 3 33
## 1-2 2 12 6 36
## >2 3 2 0 38
##
## , , Status = Widow, Urban = SubCopenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 4 2 3 29
## 1-2 0 1 4 34
## >2 0 0 1 13
##
## , , Status = Married, Urban = SubCopenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 8 51 30 30
## 1-2 7 59 68 76
## >2 2 21 31 47
##
## , , Status = Unmarried, Urban = SubCopenhagen
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 1 5 2 24
## 1-2 3 3 10 23
## >2 0 0 2 20
##
## , , Status = Widow, Urban = LargeCity
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 1 7 1 17
## 1-2 1 3 1 14
## >2 2 2 1 5
##
## , , Status = Married, Urban = LargeCity
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 41 62 23 50
## 1-2 15 68 43 70
## >2 2 14 10 21
##
## , , Status = Unmarried, Urban = LargeCity
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 2 9 1 15
## 1-2 9 11 5 48
## >2 1 3 3 13
##
## , , Status = Widow, Urban = City
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 8 14 5 95
## 1-2 4 8 9 48
## >2 1 1 1 20
##
## , , Status = Married, Urban = City
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 100 234 87 167
## 1-2 25 172 128 198
## >2 7 38 36 53
##
## , , Status = Unmarried, Urban = City
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 6 20 12 64
## 1-2 9 20 21 89
## >2 5 12 9 39
##
## , , Status = Widow, Urban = Country
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 6 5 2 46
## 1-2 2 4 4 24
## >2 0 3 0 8
##
## , , Status = Married, Urban = Country
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 175 255 77 232
## 1-2 48 143 86 136
## >2 7 35 21 36
##
## , , Status = Unmarried, Urban = Country
##
## Income
## Alcohol 0-50 50-100 100-150 >150
## <1 9 27 4 68
## 1-2 7 23 15 64
## >2 1 13 7 26
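For data in frequency form, the count variable belongs on the left-hand side of the xtabs() formula. With a one-sided formula, xtabs() counts rows instead, and every cell of a complete frequency-form data frame comes out as 1. A small sketch of the difference, using a made-up data frame (toy, A, B and the counts are hypothetical):

```r
# Toy frequency-form data: one row per cell, counts in Freq (values made up)
toy <- data.frame(
  A    = factor(c("a1", "a1", "a2", "a2")),
  B    = factor(c("b1", "b2", "b1", "b2")),
  Freq = c(5, 10, 15, 20)
)

# One-sided formula: counts rows, so every cell is 1
rows.tab <- xtabs(~ A + B, data = toy)

# Two-sided formula: sums Freq within each cell
freq.tab <- xtabs(Freq ~ A + B, data = toy)

sum(rows.tab)       # 4, the number of rows in toy
sum(freq.tab)       # 50, the number of cases
dimnames(freq.tab)  # variable names and level names are carried over
```

Comparing sum() of the table with sum() of the Freq column is a quick check that the conversion kept all the cases.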
The variable Urban has 5 categories. Find the total frequencies in each of these.
xtabs(Freq ~ Urban, data = DanishWelfare)
## Urban
## Copenhagen SubCopenhagen LargeCity City Country
## 552 614 594 1765 1619
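Total frequencies per category are sums of Freq, not counts of rows: summary() applied to a factor only reports how many rows carry each level. The sketch below shows three equivalent ways to get the totals. The data frame toy and its counts are made up for illustration.

```r
# Toy frequency-form data (hypothetical values)
toy <- data.frame(
  Urban  = factor(c("City", "City", "Country", "Country")),
  Status = factor(c("Married", "Widow", "Married", "Widow")),
  Freq   = c(30, 5, 40, 10)
)
toy.tab <- xtabs(Freq ~ Status + Urban, data = toy)

# Three equivalent ways to total Freq over the levels of Urban
by.table <- margin.table(toy.tab, 2)           # from the table; Urban is dimension 2
by.xtabs <- xtabs(Freq ~ Urban, data = toy)    # from the data frame
by.frame <- tapply(toy$Freq, toy$Urban, sum)   # base R grouping

by.table            # City 35, Country 50
summary(toy$Urban)  # City 2, Country 2: rows per level, not cases
```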
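Exercise 2.5 The data set UKSoccer in vcd is a two-way table of the goals scored by the home and away teams in each game.
Verify that the total number of games represented in this table is 380.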
data("UKSoccer")
summary(UKSoccer)
## Number of cases in table: 380
## Number of factors: 2
## Test for independence of all factors:
## Chisq = 18.699, df = 16, p-value = 0.2846
## Chi-squared approximation may be incorrect
Express each of the marginal totals as proportions. The result should look like the following:
row_sum <- margin.table(UKSoccer, 1)  # totals by Home goals
col_sum <- margin.table(UKSoccer, 2)  # totals by Away goals
prop.table(row_sum)
## Home
## 0 1 2 3 4
## 0.20000000 0.37368421 0.23684211 0.11842105 0.07105263
prop.table(col_sum)
## Away
## 0 1 2 3 4
## 0.36842105 0.35789474 0.14473684 0.10000000 0.02894737
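The order of operations matters here: prop.table() applied to a margin gives marginal proportions, while prop.table() applied to the full table with a margin argument gives conditional proportions. A minimal sketch on a made-up 2 × 2 table (the counts are hypothetical; only the dimension names mimic UKSoccer):

```r
# Toy 2 x 2 table of counts (values made up), filled column by column
toy <- as.table(matrix(c(10, 30, 20, 40), nrow = 2,
                       dimnames = list(Home = c("0", "1"), Away = c("0", "1"))))

row.prop <- prop.table(margin.table(toy, 1))  # marginal proportions for Home
col.prop <- prop.table(margin.table(toy, 2))  # marginal proportions for Away
cond     <- prop.table(toy, 1)                # proportions within each Home row

row.prop  # 0.3 0.7
col.prop  # 0.4 0.6
cond      # each row sums to 1
```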
Exercise 2.6 The one-way frequency table Saxony in vcd records the frequencies of families with 0, 1, 2, ..., 12 male children, among 6115 families with 12 children.
Another data set, Geissler, in the vcdExtra package, gives the complete tabulation of all combinations of boys and girls in families with a given total number of children (size). The task here is to create an equivalent table, Saxony12, from the Geissler data.
(a) Use subset() to create a data frame, sax12, containing the Geissler observations in families with size==12.
data("Geissler", package = "vcdExtra")
sax12 <- subset(Geissler, size == 12)
sax12
## boys girls size Freq
## 12 0 12 12 3
## 24 1 11 12 24
## 35 2 10 12 104
## 45 3 9 12 286
## 54 4 8 12 670
## 62 5 7 12 1033
## 69 6 6 12 1343
## 75 7 5 12 1112
## 80 8 4 12 829
## 84 9 3 12 478
## 87 10 2 12 181
## 89 11 1 12 45
## 90 12 0 12 7
(b) Select the columns for boys and Freq.
library(dplyr)  # select() comes from dplyr
sax12Select <- select(sax12, Freq, boys)
sax12Select
## Freq boys
## 12 3 0
## 24 24 1
## 35 104 2
## 45 286 3
## 54 670 4
## 62 1033 5
## 69 1343 6
## 75 1112 7
## 80 829 8
## 84 478 9
## 87 181 10
## 89 45 11
## 90 7 12
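Since the stated task is to build a one-way table Saxony12 equivalent to Saxony, one possible final step is xtabs() with Freq on the left-hand side. This is a sketch, not necessarily the intended solution: it re-enters the boys and Freq values printed above so that it runs without vcdExtra or dplyr.

```r
# boys and Freq re-entered from the printed sax12Select output above
sax12Select <- data.frame(
  Freq = c(3, 24, 104, 286, 670, 1033, 1343, 1112, 829, 478, 181, 45, 7),
  boys = 0:12
)

# Sum Freq within each value of boys to get a one-way frequency table
Saxony12 <- xtabs(Freq ~ boys, data = sax12Select)

Saxony12
sum(Saxony12)  # 6115 families, the total stated in the exercise
```

With vcd loaded, the result could then be compared against Saxony, for example by checking that the two tables hold the same counts.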