Question 1

a.

## X-squared 
##     11.46

## [1] 0.003246
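The p-value above can be checked by hand. This is a minimal sketch, assuming the data form a 2x3 table (education by support, oppose, unsure), so that df = (2 - 1)(3 - 1) = 2. For df = 2 the chi-squared upper tail has the closed form exp(-x/2). The function name is hypothetical, and the statistic is entered as the rounded value printed above, so the result matches only approximately.

```python
import math

def chi2_upper_tail_df2(x2: float) -> float:
    """Upper-tail probability P(X >= x2) for a chi-squared variable with df = 2."""
    return math.exp(-x2 / 2.0)

x2 = 11.46              # Pearson statistic as printed above (rounded)
df = (2 - 1) * (3 - 1)  # assumed 2x3 table: (rows - 1) * (columns - 1)
p_value = chi2_upper_tail_df2(x2)
print(df, p_value)
```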

b.

\(H_0\): \(p_1\)=\(p_2\)

\(p_1\)=proportion of college graduates supporting offshore drilling

\(p_2\)=proportion of non-college graduates supporting offshore drilling

## [1] 0.5007

c.

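The two p-values do not agree (0.003246 versus 0.5007), but this is not surprising because the two tests address different hypotheses. The Pearson test of independence uses the full 2x3 table, so it detects a difference between college graduates and non-college graduates in any of the three responses (support, oppose, unsure). The two-sample proportion test compares only the proportion supporting offshore drilling, treating each group’s count of supporters as binomial. The groups can therefore have similar support rates, giving a large p-value in the proportion test, while still differing in how the remaining responses split between oppose and unsure, giving a small p-value in the chi-squared test.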

d.

##     support oppose unsure
## yes  0.3702  2.588 -3.161
## no  -0.3702 -2.588  3.161

Based on the standardized residuals, college graduates (row yes) opposed offshore drilling more often than expected under independence (2.588) and were unsure less often than expected (-3.161). For non-college graduates the pattern is reversed: less opposition and more uncertainty than expected. The residuals for support (0.3702 and -0.3702) are small, so the two groups do not differ noticeably in their level of support.
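The residuals above follow the usual definition of a standardized Pearson residual: the difference between observed and expected counts divided by its estimated standard error, sqrt(expected × (1 − row proportion) × (1 − column proportion)). The sketch below is illustrative only. The counts are hypothetical, not the survey data from this question, and the function name is an assumption.

```python
import math

def standardized_residuals(table):
    """Standardized Pearson residuals for a two-way table of counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    out = []
    for i, row in enumerate(table):
        res_row = []
        for j, count in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            # Standard error of (observed - expected) under independence
            se = math.sqrt(expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
            res_row.append((count - expected) / se)
        out.append(res_row)
    return out

# Hypothetical counts, NOT the data from this question.
# Rows: college graduate yes / no; columns: support, oppose, unsure.
counts = [[50, 40, 10],
          [90, 60, 50]]
res = standardized_residuals(counts)
for row in res:
    print([round(r, 3) for r in row])
```

In any table with two rows, the residuals in the second row are the negatives of those in the first, which is the pattern seen in the output above.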

Question 2

a.

\(H_0\): Proportion of particular words used in Austen’s text = Proportion of particular words used in Imitator’s text

\(H_a\): Proportion of particular words used in Austen’s text \(\neq\) Proportion of particular words used in Imitator’s text

The Pearson statistic and p-value are:

## 
##  Pearson's Chi-squared test
## 
## data:  words
## X-squared = 46, df = 15, p-value = 0.00006

## X-squared 
##     45.58

## [1] 0.00006205

The likelihood-ratio statistic and p-value are:

## [1] 44.32

## [1] 0.00009783
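Both p-values can be reproduced approximately from the printed statistics using only the standard library. The sketch below assumes the 6x4 table (six words by four texts), so df = (6 - 1)(4 - 1) = 15 as in the `chisq.test` output. `chi2_sf` is a hypothetical helper that uses the finite series for the chi-squared upper tail with integer degrees of freedom. The statistics are entered as the rounded values shown above, so small differences in the last digits are expected.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series in odd powers of sqrt(x)
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))           # two-sided normal tail at sqrt(x)
    phi = math.exp(-half) / math.sqrt(2.0 * math.pi)  # standard normal density at sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * phi * total

df = (6 - 1) * (4 - 1)
p_pearson = chi2_sf(45.58, df)  # Pearson statistic as printed (rounded)
p_lr = chi2_sf(44.32, df)       # likelihood-ratio statistic as printed (rounded)
print(df, p_pearson, p_lr)
```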

b.

We reject the null hypothesis that word and text are independent, since both p-values are less than alpha = 0.05. The relative frequencies of the six words are therefore not the same across the texts, and we would conclude that the imitator was unsuccessful in copying Austen’s style.

c.

##              ss       emma  austen imitator
## a       -1.6120 -0.1853720  2.3228 -0.08492
## an      -0.7389 -1.5889795 -1.2154  4.23349
## this     0.1744  0.5108918 -0.5075 -0.37267
## that     2.1618  1.6656960 -1.1233 -3.75319
## with    -0.6755  0.0002633 -1.2305  2.09339
## without  1.7044 -1.7099560  1.2670 -1.18893

Standardized residuals larger than about 2 in absolute value point to the cells that depart most from independence. The word “a” is used more often than expected in Austen’s Sanditon (2.3228), and “that” is used more often than expected in the ss text (2.1618). The largest residuals are in the imitator’s column: “an” (4.23349) and “with” (2.09339) are used more often than expected, and “that” (-3.75319) is used less often than expected. “Without” is used less often than expected in ‘Emma’ (-1.7099560), although this residual is below 2 in absolute value. The remaining residuals are small and show no clear pattern, but the large residuals in the imitator’s column demonstrate the imitator’s inability to mimic Austen’s style exactly.

d.

The Pearson and likelihood-ratio statistics for the Austen table are:

## 
##  Pearson's Chi-squared test
## 
## data:  data.table.Austen
## X-squared = 12, df = 10, p-value = 0.3

## [1] 12.59

The Pearson and likelihood-ratio statistics for the Austen versus imitator table are:

## 
##  Pearson's Chi-squared test
## 
## data:  pooled
## X-squared = 33, df = 5, p-value = 0.000004

## [1] 31.74
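The likelihood-ratio statistics in parts a and d fit together, up to rounding. Assuming the `pooled` table combines the three Austen texts into one column and compares it with the imitator (which is what its df = 5 suggests), the statistic for the full table partitions as:

\[
G^2_{\text{full}} = G^2_{\text{Austen}} + G^2_{\text{pooled}}, \qquad 12.59 + 31.74 = 44.33 \approx 44.32, \qquad df:\ 10 + 5 = 15.
\]

Most of the departure from independence therefore comes from the Austen versus imitator comparison (p-value = 0.000004), while the three Austen texts are consistent with one another (p-value = 0.3).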

Question 3

##           Cancer Controlled Not Controlled
## Surgery                  21              2
## Radiation                15              3

## [1] 0.4762

θ = 0.4762: the estimated odds of cancer being controlled with radiation therapy are 0.4762 times the odds with surgery.
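The value of θ follows directly from the table above. A minimal sketch, with radiation in the numerator (the `odds_ratio` helper is a hypothetical name):

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a/b) / (c/d) for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Counts from the table above: (cancer controlled, not controlled)
radiation = (15, 3)
surgery = (21, 2)

theta = odds_ratio(*radiation, *surgery)
print(round(theta, 4))      # odds of control, radiation relative to surgery
print(round(1 / theta, 4))  # the same comparison with surgery in the numerator
```

Reversing the order of the rows gives the reciprocal, so the odds of control with surgery are 2.1 times the odds with radiation.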

2.30

## [1] 0.3808

The p-value for Fisher’s exact test is 0.3808. This is not statistically significant, so we would not reject the null hypothesis that surgery and radiation are equally effective at controlling cancer.

2.31

## [1] 0.6384

The two-sided exact p-value from Fisher’s exact test is 0.6384, which is greater than 0.05, so there is not enough evidence to reject the null hypothesis that the odds ratio equals 1.

## [1] 0.8947

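The one-sided value computed above is 0.8947, which equals the ordinary exact lower-tail probability \(P(n_{11} \le 21)\) rather than a mid p-value; the mid p-value subtracts half the probability of the observed table from the ordinary one-sided p-value. A benefit of the mid p-value is that it is less conservative than the ordinary exact p-value.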
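The exact p-values in this question can be reproduced from the hypergeometric distribution of the surgery/controlled cell, conditional on the table margins. The sketch below uses only the counts in the table above. The variable names are hypothetical, and the direction of the one-sided alternative (odds ratio greater than 1 for surgery versus radiation) is an assumption.

```python
from math import comb

# Observed table: rows = surgery, radiation; columns = controlled, not controlled.
n11, n12, n21, n22 = 21, 2, 15, 3
row1, col1, n = n11 + n12, n11 + n21, n11 + n12 + n21 + n22

def prob(k):
    """Hypergeometric probability that the surgery/controlled cell equals k."""
    return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

# Values of the cell that are possible given the fixed margins
support = range(max(0, row1 - (n - col1)), min(row1, col1) + 1)
p_obs = prob(n11)

p_upper = sum(prob(k) for k in support if k >= n11)  # one-sided, assumed alternative
p_lower = sum(prob(k) for k in support if k <= n11)  # one-sided, opposite direction
# Two-sided: total probability of tables no more likely than the observed one
p_two_sided = sum(prob(k) for k in support if prob(k) <= p_obs * (1 + 1e-7))
# Mid p-value: count only half the probability of the observed table
mid_p_upper = p_upper - 0.5 * p_obs

print(round(p_upper, 4), round(p_two_sided, 4), round(p_lower, 4), round(mid_p_upper, 4))
```

Under these assumptions the upper-tail and two-sided values agree with 0.3808 and 0.6384 above, the lower-tail probability agrees with 0.8947, and the mid p-value for the upper tail is smaller than the ordinary upper-tail p-value.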

Question 4

a.

b.

Partial Tables

Odds.white

## [1] 2.796

Odds.black

## [1] 3.286

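Controlling for the race of the defendant, the odds of receiving the death penalty were higher when the victim was white than when the victim was black: the conditional odds ratios are 2.796 and 3.286, both greater than 1.

c.

Marginal tables

##       yes  no
## white  30 184
## black   6 106

Odds Ratio

## [1] 2.88

Ignoring the race of the defendant, the odds of receiving the death penalty were 2.88 times as high when the victim was white as when the victim was black.

This does not show Simpson’s Paradox. The conditional odds ratios in the partial tables (2.796 and 3.286) and the marginal odds ratio (2.88) are all greater than 1, so the association has the same direction whether or not the race of the defendant is controlled for. Simpson’s Paradox would require the marginal association to point in the opposite direction from the conditional associations.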
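One way to check for a reversal is to compare the marginal odds ratio with the conditional ones. The sketch below recomputes the marginal odds ratio from the marginal table and copies the two conditional odds ratios as printed in part b. It assumes those values are oriented the same way as the marginal one (white row over black row); the names are hypothetical.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Marginal table from part c: rows = white, black; columns = death penalty yes, no.
marginal = [[30, 184], [6, 106]]
marginal_or = odds_ratio(marginal)

# Conditional odds ratios as printed in part b (copied, not recomputed here).
conditional_ors = [2.796, 3.286]

# A reversal requires the marginal odds ratio to fall on the opposite side of 1
# from every conditional odds ratio.
reversal = all((c_or > 1) != (marginal_or > 1) for c_or in conditional_ors)
print(round(marginal_or, 4), reversal)
```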

STAT 226: Homework #4

Jessica Hanlon

Tuesday, January 30th by 11am