Chapter 9: Analysis of Two-Way Tables

Below are the answers and commands that correspond to the handout from class.

Example 1: Exercise and Adequate Sleep

1

(a)

Load the data:

sleep<-read.file("/home/emesekennedy/Data/Ch9/sleep.txt")
## Reading data with read.table()

Require the reshape2 package:

require(reshape2)
## Loading required package: reshape2

Create a two-way table:

acast(sleep, EnoughSleep~Exercise)
## Using Count as value column: use value.var to override.
##     High Low
## No   148 242
## Yes  151 115

Save the two-way table as sleeptable:

sleeptable<-acast(sleep, EnoughSleep~Exercise)
## Using Count as value column: use value.var to override.
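If the data file is not available, the same table can be rebuilt in base R. The following is a minimal sketch that assumes sleep.txt is in long format with the columns EnoughSleep, Exercise, and Count; the data frame below is re-typed from the counts printed above, not read from the file.

```r
# Assumed long-format layout of sleep.txt, re-typed from the table above
sleep <- data.frame(
  EnoughSleep = c("No", "No", "Yes", "Yes"),
  Exercise    = c("High", "Low", "High", "Low"),
  Count       = c(148, 242, 151, 115)
)

# xtabs() sums Count within each EnoughSleep x Exercise cell
sleeptable <- xtabs(Count ~ EnoughSleep + Exercise, data = sleep)
sleeptable
```

Unlike acast, xtabs names the value column explicitly on the left of the formula, so no "Using Count as value column" message appears.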

(b)

Add the optional arguments fun.aggregate=sum and margins=T to display the row and column sums:

acast(sleep, EnoughSleep~Exercise,fun.aggregate=sum, margins=T)
## Using Count as value column: use value.var to override.
##       High Low (all)
## No     148 242   390
## Yes    151 115   266
## (all)  299 357   656
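As a cross-check, the same margins can be obtained in base R with addmargins. This sketch re-types the counts into a table by hand (an assumption made so the block runs on its own) rather than using the acast result.

```r
# Counts re-typed from the two-way table (filled column by column)
sleeptable <- as.table(matrix(
  c(148, 151, 242, 115), nrow = 2,
  dimnames = list(EnoughSleep = c("No", "Yes"),
                  Exercise    = c("High", "Low"))
))

# addmargins() appends a "Sum" row and a "Sum" column
with_margins <- addmargins(sleeptable)
with_margins
```

The margins are labeled Sum here instead of (all), but the totals should agree with the acast output.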

(c)

Look at the joint distribution:

prop.table(sleeptable)
##          High       Low
## No  0.2256098 0.3689024
## Yes 0.2301829 0.1753049

(d)

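Look at the conditional distributions. With margin 1, each row sums to 1 (the distribution of Exercise within each level of EnoughSleep); with margin 2, each column sums to 1 (the distribution of EnoughSleep within each level of Exercise):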

prop.table(sleeptable, 1)
##          High       Low
## No  0.3794872 0.6205128
## Yes 0.5676692 0.4323308
prop.table(sleeptable, 2)
##          High       Low
## No  0.4949833 0.6778711
## Yes 0.5050167 0.3221289
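To see what prop.table is doing, the conditional distributions can also be computed by dividing each cell by its row or column total. This is only an illustrative sketch; the table is re-typed by hand so the block is self-contained.

```r
# Counts re-typed from the two-way table (filled column by column)
sleeptable <- matrix(
  c(148, 151, 242, 115), nrow = 2,
  dimnames = list(EnoughSleep = c("No", "Yes"),
                  Exercise    = c("High", "Low"))
)

# Divide each row by its row total: distribution of Exercise given EnoughSleep
row_cond <- sweep(sleeptable, 1, rowSums(sleeptable), "/")

# Divide each column by its column total: distribution of EnoughSleep given Exercise
col_cond <- sweep(sleeptable, 2, colSums(sleeptable), "/")

rowSums(row_cond)  # each row of row_cond sums to 1
colSums(col_cond)  # each column of col_cond sums to 1
```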

2

(a)

Create a bar graph grouped by the variable Exercise:

barchart(Count~EnoughSleep, groups=Exercise, data=sleep, auto.key=T)

(b)

Create a bar graph grouped by the variable EnoughSleep:

barchart(Count~Exercise, groups=EnoughSleep, data=sleep, auto.key=T)

3

After looking at the two-way tables and bar charts, it appears that there is an association between adequate sleep and exercise. People who don’t get enough sleep tend to exercise less (about 62% of them are in the low-exercise group), while people who get enough sleep tend to exercise more (about 57% of them are in the high-exercise group).

4

(a)

Compute expected cell counts:

Yes-High:

266*299/656
## [1] 121.2409

Yes-Low:

266*357/656
## [1] 144.7591

No-High:

390*299/656
## [1] 177.7591

No-Low:

390*357/656
## [1] 212.2409
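The four computations above all follow the rule (row total)(column total)/(table total). As a sketch, they can be done in one step with outer; the table is re-typed by hand here so the block runs on its own.

```r
# Counts re-typed from the two-way table (filled column by column)
sleeptable <- matrix(
  c(148, 151, 242, 115), nrow = 2,
  dimnames = list(EnoughSleep = c("No", "Yes"),
                  Exercise    = c("High", "Low"))
)

# Expected count in each cell = row total * column total / grand total
expected <- outer(rowSums(sleeptable), colSums(sleeptable)) / sum(sleeptable)
expected
```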

(b)

Compute the \(\chi^2\) value, the sum of \((\text{observed}-\text{expected})^2/\text{expected}\) over the four cells:

(151-121.24)^2/121.24+(115-144.76)^2/144.76+(148-177.76)^2/177.76+(242-212.24)^2/212.24
## [1] 22.57833

The \(\chi^2\) value is approximately 22.58.
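The hand computation uses expected counts rounded to two decimals. A minimal sketch of the same sum with unrounded expected counts follows; the table is re-typed by hand so the block is self-contained.

```r
# Observed counts re-typed from the two-way table (filled column by column)
observed <- matrix(
  c(148, 151, 242, 115), nrow = 2,
  dimnames = list(EnoughSleep = c("No", "Yes"),
                  Exercise    = c("High", "Low"))
)

# Unrounded expected counts under the hypothesis of no association
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chisq_value <- sum((observed - expected)^2 / expected)
chisq_value
```

The result should differ from 22.57833 only in the third decimal place, since the only change is the rounding of the expected counts.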

(c)

Plot the distribution of the \(\chi^2\) statistic:

plotDist("chisq", df=1)

The default density curve is hard to read because the \(\chi^2\) density with 1 degree of freedom rises steeply near 0. Let’s plot it as a histogram instead:

plotDist("chisq", df=1, kind="histogram")

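(d)

Compute the \(P\)-value, the area to the right of 22.58 under the \(\chi^2\) distribution with 1 degree of freedom:

1-xpchisq(22.58, df=1)
## [1] 2.015721e-06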

So the \(P\)-value is approximately \(2\times10^{-6}=0.000002\).
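The same upper-tail area can be computed in base R, without the mosaic function xpchisq, by asking pchisq for the upper tail directly. This is a sketch of an equivalent call, not the command used in class.

```r
# Upper-tail probability P(X >= 22.58) for a chi-square with 1 degree of freedom
p_value <- pchisq(22.58, df = 1, lower.tail = FALSE)
p_value
```

Using lower.tail = FALSE avoids subtracting a number very close to 1 from 1, which matters more as \(P\)-values get smaller.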

(e)

Carry out the significance test using the command chisq.test:

chisq.test(sleeptable)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  sleeptable
## X-squared = 21.825, df = 1, p-value = 2.987e-06

Notice that these values differ slightly from what we computed. That is because, for 2 by 2 tables, chisq.test applies Yates' continuity correction by default, which makes the test slightly more conservative. We can turn this off by using the option correct=F:

chisq.test(sleeptable, correct=F)
## 
##  Pearson's Chi-squared test
## 
## data:  sleeptable
## X-squared = 22.577, df = 1, p-value = 2.019e-06

This matches the values we found by hand, up to rounding.
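To see where the corrected value 21.825 comes from, here is a sketch of the continuity-corrected statistic computed by hand: 0.5 is subtracted from each absolute difference between observed and expected counts before squaring. The table is re-typed by hand so the block is self-contained.

```r
# Observed counts re-typed from the two-way table (filled column by column)
sleeptable <- matrix(
  c(148, 151, 242, 115), nrow = 2,
  dimnames = list(EnoughSleep = c("No", "Yes"),
                  Exercise    = c("High", "Low"))
)

# chisq.test() stores the expected counts in the $expected component
uncorrected <- chisq.test(sleeptable, correct = FALSE)
expected <- uncorrected$expected

# Yates' correction: shrink each |observed - expected| by 0.5 before squaring
yates_value <- sum((abs(sleeptable - expected) - 0.5)^2 / expected)
yates_value

# Compare with the statistic reported by the default (corrected) test
chisq.test(sleeptable)$statistic
```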

(f)

The \(P\)-value is very small (about 0.000002), so we reject the null hypothesis that there is no association between adequate sleep and exercise. We have very strong evidence of an association between the two.