Exercise 2.4 The data set DanishWelfare in vcd gives a 4-way, 3 × 4 × 3 × 5 table as a data frame in frequency form, containing the variable Freq and four factors, Alcohol, Income, Status, and Urban. The variable Alcohol can be considered as the response variable, and the others as possible predictors.

Read the data and find the total number of cases represented in this table.

library(vcd)  # DanishWelfare and UKSoccer live in the vcd package
data("DanishWelfare")
summary(DanishWelfare)
##       Freq        Alcohol      Income         Status             Urban   
##  Min.   :  0.00   <1 :60   0-50   :45   Widow    :60   Copenhagen   :36  
##  1st Qu.:  3.00   1-2:60   50-100 :45   Married  :60   SubCopenhagen:36  
##  Median : 12.00   >2 :60   100-150:45   Unmarried:60   LargeCity    :36  
##  Mean   : 28.58            >150   :45                  City         :36  
##  3rd Qu.: 35.25                                        Country      :36  
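##  Max.   :255.00
# The data are in frequency form, so the number of cases is the sum of Freq,
# not the number of rows (180).
sum(DanishWelfare$Freq)
## [1] 5144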

See the structure of the data and change the variables Alcohol and Income into ordered variables.

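DanishWelfare <- transform(DanishWelfare, Alcohol = as.ordered(Alcohol), Income = as.ordered(Income))
DanishWelfare  # print the data; as.ordered() changes the factor type, not the row order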
##     Freq Alcohol  Income    Status         Urban
## 1      1      <1    0-50     Widow    Copenhagen
## 2      4      <1    0-50     Widow SubCopenhagen
## 3      1      <1    0-50     Widow     LargeCity
## 4      8      <1    0-50     Widow          City
## 5      6      <1    0-50     Widow       Country
## 6     14      <1    0-50   Married    Copenhagen
## 7      8      <1    0-50   Married SubCopenhagen
## 8     41      <1    0-50   Married     LargeCity
## 9    100      <1    0-50   Married          City
## 10   175      <1    0-50   Married       Country
## 11     6      <1    0-50 Unmarried    Copenhagen
## 12     1      <1    0-50 Unmarried SubCopenhagen
## 13     2      <1    0-50 Unmarried     LargeCity
## 14     6      <1    0-50 Unmarried          City
## 15     9      <1    0-50 Unmarried       Country
## 16     8      <1  50-100     Widow    Copenhagen
## 17     2      <1  50-100     Widow SubCopenhagen
## 18     7      <1  50-100     Widow     LargeCity
## 19    14      <1  50-100     Widow          City
## 20     5      <1  50-100     Widow       Country
## 21    42      <1  50-100   Married    Copenhagen
## 22    51      <1  50-100   Married SubCopenhagen
## 23    62      <1  50-100   Married     LargeCity
## 24   234      <1  50-100   Married          City
## 25   255      <1  50-100   Married       Country
## 26     7      <1  50-100 Unmarried    Copenhagen
## 27     5      <1  50-100 Unmarried SubCopenhagen
## 28     9      <1  50-100 Unmarried     LargeCity
## 29    20      <1  50-100 Unmarried          City
## 30    27      <1  50-100 Unmarried       Country
## 31     2      <1 100-150     Widow    Copenhagen
## 32     3      <1 100-150     Widow SubCopenhagen
## 33     1      <1 100-150     Widow     LargeCity
## 34     5      <1 100-150     Widow          City
## 35     2      <1 100-150     Widow       Country
## 36    21      <1 100-150   Married    Copenhagen
## 37    30      <1 100-150   Married SubCopenhagen
## 38    23      <1 100-150   Married     LargeCity
## 39    87      <1 100-150   Married          City
## 40    77      <1 100-150   Married       Country
## 41     3      <1 100-150 Unmarried    Copenhagen
## 42     2      <1 100-150 Unmarried SubCopenhagen
## 43     1      <1 100-150 Unmarried     LargeCity
## 44    12      <1 100-150 Unmarried          City
## 45     4      <1 100-150 Unmarried       Country
## 46    42      <1    >150     Widow    Copenhagen
## 47    29      <1    >150     Widow SubCopenhagen
## 48    17      <1    >150     Widow     LargeCity
## 49    95      <1    >150     Widow          City
## 50    46      <1    >150     Widow       Country
## 51    24      <1    >150   Married    Copenhagen
## 52    30      <1    >150   Married SubCopenhagen
## 53    50      <1    >150   Married     LargeCity
## 54   167      <1    >150   Married          City
## 55   232      <1    >150   Married       Country
## 56    33      <1    >150 Unmarried    Copenhagen
## 57    24      <1    >150 Unmarried SubCopenhagen
## 58    15      <1    >150 Unmarried     LargeCity
## 59    64      <1    >150 Unmarried          City
## 60    68      <1    >150 Unmarried       Country
## 61     3     1-2    0-50     Widow    Copenhagen
## 62     0     1-2    0-50     Widow SubCopenhagen
## 63     1     1-2    0-50     Widow     LargeCity
## 64     4     1-2    0-50     Widow          City
## 65     2     1-2    0-50     Widow       Country
## 66    15     1-2    0-50   Married    Copenhagen
## 67     7     1-2    0-50   Married SubCopenhagen
## 68    15     1-2    0-50   Married     LargeCity
## 69    25     1-2    0-50   Married          City
## 70    48     1-2    0-50   Married       Country
## 71     2     1-2    0-50 Unmarried    Copenhagen
## 72     3     1-2    0-50 Unmarried SubCopenhagen
## 73     9     1-2    0-50 Unmarried     LargeCity
## 74     9     1-2    0-50 Unmarried          City
## 75     7     1-2    0-50 Unmarried       Country
## 76     1     1-2  50-100     Widow    Copenhagen
## 77     1     1-2  50-100     Widow SubCopenhagen
## 78     3     1-2  50-100     Widow     LargeCity
## 79     8     1-2  50-100     Widow          City
## 80     4     1-2  50-100     Widow       Country
## 81    39     1-2  50-100   Married    Copenhagen
## 82    59     1-2  50-100   Married SubCopenhagen
## 83    68     1-2  50-100   Married     LargeCity
## 84   172     1-2  50-100   Married          City
## 85   143     1-2  50-100   Married       Country
## 86    12     1-2  50-100 Unmarried    Copenhagen
## 87     3     1-2  50-100 Unmarried SubCopenhagen
## 88    11     1-2  50-100 Unmarried     LargeCity
## 89    20     1-2  50-100 Unmarried          City
## 90    23     1-2  50-100 Unmarried       Country
## 91     5     1-2 100-150     Widow    Copenhagen
## 92     4     1-2 100-150     Widow SubCopenhagen
## 93     1     1-2 100-150     Widow     LargeCity
## 94     9     1-2 100-150     Widow          City
## 95     4     1-2 100-150     Widow       Country
## 96    32     1-2 100-150   Married    Copenhagen
## 97    68     1-2 100-150   Married SubCopenhagen
## 98    43     1-2 100-150   Married     LargeCity
## 99   128     1-2 100-150   Married          City
## 100   86     1-2 100-150   Married       Country
## 101    6     1-2 100-150 Unmarried    Copenhagen
## 102   10     1-2 100-150 Unmarried SubCopenhagen
## 103    5     1-2 100-150 Unmarried     LargeCity
## 104   21     1-2 100-150 Unmarried          City
## 105   15     1-2 100-150 Unmarried       Country
## 106   26     1-2    >150     Widow    Copenhagen
## 107   34     1-2    >150     Widow SubCopenhagen
## 108   14     1-2    >150     Widow     LargeCity
## 109   48     1-2    >150     Widow          City
## 110   24     1-2    >150     Widow       Country
## 111   43     1-2    >150   Married    Copenhagen
## 112   76     1-2    >150   Married SubCopenhagen
## 113   70     1-2    >150   Married     LargeCity
## 114  198     1-2    >150   Married          City
## 115  136     1-2    >150   Married       Country
## 116   36     1-2    >150 Unmarried    Copenhagen
## 117   23     1-2    >150 Unmarried SubCopenhagen
## 118   48     1-2    >150 Unmarried     LargeCity
## 119   89     1-2    >150 Unmarried          City
## 120   64     1-2    >150 Unmarried       Country
## 121    2      >2    0-50     Widow    Copenhagen
## 122    0      >2    0-50     Widow SubCopenhagen
## 123    2      >2    0-50     Widow     LargeCity
## 124    1      >2    0-50     Widow          City
## 125    0      >2    0-50     Widow       Country
## 126    1      >2    0-50   Married    Copenhagen
## 127    2      >2    0-50   Married SubCopenhagen
## 128    2      >2    0-50   Married     LargeCity
## 129    7      >2    0-50   Married          City
## 130    7      >2    0-50   Married       Country
## 131    3      >2    0-50 Unmarried    Copenhagen
## 132    0      >2    0-50 Unmarried SubCopenhagen
## 133    1      >2    0-50 Unmarried     LargeCity
## 134    5      >2    0-50 Unmarried          City
## 135    1      >2    0-50 Unmarried       Country
## 136    3      >2  50-100     Widow    Copenhagen
## 137    0      >2  50-100     Widow SubCopenhagen
## 138    2      >2  50-100     Widow     LargeCity
## 139    1      >2  50-100     Widow          City
## 140    3      >2  50-100     Widow       Country
## 141   14      >2  50-100   Married    Copenhagen
## 142   21      >2  50-100   Married SubCopenhagen
## 143   14      >2  50-100   Married     LargeCity
## 144   38      >2  50-100   Married          City
## 145   35      >2  50-100   Married       Country
## 146    2      >2  50-100 Unmarried    Copenhagen
## 147    0      >2  50-100 Unmarried SubCopenhagen
## 148    3      >2  50-100 Unmarried     LargeCity
## 149   12      >2  50-100 Unmarried          City
## 150   13      >2  50-100 Unmarried       Country
## 151    2      >2 100-150     Widow    Copenhagen
## 152    1      >2 100-150     Widow SubCopenhagen
## 153    1      >2 100-150     Widow     LargeCity
## 154    1      >2 100-150     Widow          City
## 155    0      >2 100-150     Widow       Country
## 156   20      >2 100-150   Married    Copenhagen
## 157   31      >2 100-150   Married SubCopenhagen
## 158   10      >2 100-150   Married     LargeCity
## 159   36      >2 100-150   Married          City
## 160   21      >2 100-150   Married       Country
## 161    0      >2 100-150 Unmarried    Copenhagen
## 162    2      >2 100-150 Unmarried SubCopenhagen
## 163    3      >2 100-150 Unmarried     LargeCity
## 164    9      >2 100-150 Unmarried          City
## 165    7      >2 100-150 Unmarried       Country
## 166   21      >2    >150     Widow    Copenhagen
## 167   13      >2    >150     Widow SubCopenhagen
## 168    5      >2    >150     Widow     LargeCity
## 169   20      >2    >150     Widow          City
## 170    8      >2    >150     Widow       Country
## 171   23      >2    >150   Married    Copenhagen
## 172   47      >2    >150   Married SubCopenhagen
## 173   21      >2    >150   Married     LargeCity
## 174   53      >2    >150   Married          City
## 175   36      >2    >150   Married       Country
## 176   38      >2    >150 Unmarried    Copenhagen
## 177   20      >2    >150 Unmarried SubCopenhagen
## 178   13      >2    >150 Unmarried     LargeCity
## 179   39      >2    >150 Unmarried          City
## 180   26      >2    >150 Unmarried       Country
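As a minimal sketch of what converting to an ordered variable means, the toy factor below (hypothetical values, not taken from DanishWelfare) shows that as.ordered() keeps the declared level order and makes comparisons between levels possible. The names income, income_ord and below are assumptions introduced only for this illustration.

```r
# Toy example: hypothetical income bands, not the DanishWelfare data.
income <- factor(c("0-50", ">150", "50-100", "0-50"),
                 levels = c("0-50", "50-100", "100-150", ">150"))

is.ordered(income)        # a plain factor carries no ordering

income_ord <- as.ordered(income)
is.ordered(income_ord)    # now an ordered factor
levels(income_ord)        # level order is kept exactly as declared above

# Ordered factors support comparisons against a level.
below <- income_ord < "100-150"
below
str(income_ord)
```

The level order comes from the levels argument, so it is worth checking levels() before converting; as.ordered() does not sort the labels itself.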

Convert this data frame to table form, DanishWelfare.tab, a 4-way array containing the frequencies with appropriate variable names and level names (hint: review xtabs()).

# Freq must be on the left-hand side; a one-sided formula only counts rows (all 1s).
DanishWelfare.tab <- xtabs(Freq ~ Alcohol + Income + Status + Urban, data = DanishWelfare)
DanishWelfare.tab
## , , Status = Widow, Urban = Copenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     1      8       2   42
##     1-2    3      1       5   26
##     >2     2      3       2   21
## 
## , , Status = Married, Urban = Copenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1    14     42      21   24
##     1-2   15     39      32   43
##     >2     1     14      20   23
## 
## , , Status = Unmarried, Urban = Copenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     6      7       3   33
##     1-2    2     12       6   36
##     >2     3      2       0   38
## 
## , , Status = Widow, Urban = SubCopenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     4      2       3   29
##     1-2    0      1       4   34
##     >2     0      0       1   13
## 
## , , Status = Married, Urban = SubCopenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     8     51      30   30
##     1-2    7     59      68   76
##     >2     2     21      31   47
## 
## , , Status = Unmarried, Urban = SubCopenhagen
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     1      5       2   24
##     1-2    3      3      10   23
##     >2     0      0       2   20
## 
## , , Status = Widow, Urban = LargeCity
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     1      7       1   17
##     1-2    1      3       1   14
##     >2     2      2       1    5
## 
## , , Status = Married, Urban = LargeCity
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1    41     62      23   50
##     1-2   15     68      43   70
##     >2     2     14      10   21
## 
## , , Status = Unmarried, Urban = LargeCity
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     2      9       1   15
##     1-2    9     11       5   48
##     >2     1      3       3   13
## 
## , , Status = Widow, Urban = City
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     8     14       5   95
##     1-2    4      8       9   48
##     >2     1      1       1   20
## 
## , , Status = Married, Urban = City
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1   100    234      87  167
##     1-2   25    172     128  198
##     >2     7     38      36   53
## 
## , , Status = Unmarried, Urban = City
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     6     20      12   64
##     1-2    9     20      21   89
##     >2     5     12       9   39
## 
## , , Status = Widow, Urban = Country
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     6      5       2   46
##     1-2    2      4       4   24
##     >2     0      3       0    8
## 
## , , Status = Married, Urban = Country
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1   175    255      77  232
##     1-2   48    143      86  136
##     >2     7     35      21   36
## 
## , , Status = Unmarried, Urban = Country
## 
##        Income
## Alcohol 0-50 50-100 100-150 >150
##     <1     9     27       4   68
##     1-2    7     23      15   64
##     >2     1     13       7   26
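The role of the left-hand side of the xtabs() formula can be seen on a small example. The sketch below uses a hypothetical frequency-form data frame (toy, with made-up counts) and contrasts a one-sided formula, which counts rows, with Freq on the left-hand side, which sums the frequencies in each cell.

```r
# Toy frequency-form data (hypothetical counts, not DanishWelfare).
toy <- data.frame(
  Alcohol = factor(c("<1", "<1", ">2", ">2"), levels = c("<1", ">2")),
  Urban   = factor(c("City", "Country", "City", "Country")),
  Freq    = c(10, 20, 3, 7)
)

# One-sided formula: counts the ROWS of the data frame (one row per cell here).
rows_tab <- xtabs(~ Alcohol + Urban, data = toy)

# Freq on the left-hand side: sums the frequencies within each cell.
freq_tab <- xtabs(Freq ~ Alcohol + Urban, data = toy)

rows_tab
freq_tab
sum(freq_tab)   # total number of cases, the same as sum(toy$Freq)
```

For data already in frequency form, a table consisting only of 1s is usually a sign that the count variable was left out of the formula.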

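The variable Urban has 5 categories. Find the total frequencies in each of these.

# summary(DanishWelfare$Urban) would only count rows (36 per level); sum Freq instead.
xtabs(Freq ~ Urban, data = DanishWelfare)
## Urban
##    Copenhagen SubCopenhagen     LargeCity          City       Country 
##           552           614           594          1765          1619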

Exercise 2.5 The data set UKSoccer in vcd gives the distributions of number of goals scored by the 20 teams in the 1995/96 season of the Premier League of the UK Football Association.

Verify that the total number of games represented in this table is 380.

data("UKSoccer")
summary(UKSoccer)
## Number of cases in table: 380 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 18.699, df = 16, p-value = 0.2846
##  Chi-squared approximation may be incorrect

Express each of the marginal totals as proportions. The result should look like the following:

row_sum <- margin.table(UKSoccer, 1)  # totals by home goals
col_sum <- margin.table(UKSoccer, 2)  # totals by away goals
prop.table(row_sum)
## Home
##          0          1          2          3          4 
## 0.20000000 0.37368421 0.23684211 0.11842105 0.07105263
prop.table(col_sum)
## Away
##          0          1          2          3          4 
## 0.36842105 0.35789474 0.14473684 0.10000000 0.02894737
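The same two steps, margin.table() followed by prop.table(), can be checked by hand on a small table. The sketch below uses a hypothetical 2 × 3 table of counts (goals, with made-up values) so that the marginal totals and proportions are easy to verify.

```r
# Toy 2 x 3 table of counts (hypothetical values, not UKSoccer).
goals <- as.table(matrix(c(4, 6, 10, 5, 10, 5), nrow = 2,
                         dimnames = list(Home = c("0", "1"),
                                         Away = c("0", "1", "2"))))

home_tot <- margin.table(goals, 1)   # row totals: 24 and 16
away_tot <- margin.table(goals, 2)   # column totals: 10, 15 and 15

home_prop <- prop.table(home_tot)    # each total divided by the grand total
away_prop <- prop.table(away_tot)

home_prop
away_prop
```

Each set of marginal proportions sums to 1, which is a quick check that the right margin was used.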

Exercise 2.6 The one-way frequency table Saxony in vcd records the frequencies of families with 0, 1, 2, ..., 12 male children, among 6115 families with 12 children.

Another data set, Geissler, in the vcdExtra package, gives the complete tabulation of all combinations of boys and girls in families with a given total number of children (size). The task here is to create an equivalent table, Saxony12, from the Geissler data.

(a) Use subset() to create a data frame, sax12, containing the Geissler observations in families with size==12.

library(vcdExtra)  # Geissler is provided by vcdExtra
data("Geissler")
sax12 <- subset(Geissler, size == 12)
sax12
##    boys girls size Freq
## 12    0    12   12    3
## 24    1    11   12   24
## 35    2    10   12  104
## 45    3     9   12  286
## 54    4     8   12  670
## 62    5     7   12 1033
## 69    6     6   12 1343
## 75    7     5   12 1112
## 80    8     4   12  829
## 84    9     3   12  478
## 87   10     2   12  181
## 89   11     1   12   45
## 90   12     0   12    7

(b) Select the columns for boys and Freq.

sax12Select <- dplyr::select(sax12, Freq, boys)  # select() comes from the dplyr package
sax12Select
##    Freq boys
## 12    3    0
## 24   24    1
## 35  104    2
## 45  286    3
## 54  670    4
## 62 1033    5
## 69 1343    6
## 75 1112    7
## 80  829    8
## 84  478    9
## 87  181   10
## 89   45   11
## 90    7   12

(c) Use xtabs() with a formula, Freq ~ boys, to create the one-way table.

Saxony12 <- xtabs(Freq ~ boys, data = sax12Select)
Saxony12
## boys
##    0    1    2    3    4    5    6    7    8    9   10   11   12 
##    3   24  104  286  670 1033 1343 1112  829  478  181   45    7
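Steps (a) to (c) can also be sketched in base R alone, since subset() accepts a select argument for choosing columns. The example below runs on a hypothetical stand-in for Geissler (fam, with made-up counts), so the names and numbers are assumptions for illustration only.

```r
# Toy stand-in for Geissler (hypothetical counts).
fam <- data.frame(boys  = c(0, 1, 2, 0, 1),
                  girls = c(2, 1, 0, 1, 0),
                  size  = c(2, 2, 2, 1, 1),
                  Freq  = c(5, 12, 6, 9, 10))

# (a) + (b): keep families of size 2 and only the boys and Freq columns.
fam2 <- subset(fam, size == 2, select = c(boys, Freq))

# (c): one-way table of frequencies by number of boys.
fam2_tab <- xtabs(Freq ~ boys, data = fam2)

fam2_tab
sum(fam2_tab)   # total number of families of size 2
```

Comparing sum() of the resulting table with the known number of families (6115 for Saxony) is a simple way to confirm that the subset and the formula are right.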