#### December 2nd, 2018

Chapter 6 - Inference for Categorical Data

Practice: 6.5, 6.27, 6.43

Graded: 6.6, 6.28, 6.44

*6.6 2010 Healthcare Law. On June 28, 2012 the U.S. Supreme Court upheld the much debated 2010 healthcare law, declaring it constitutional. A Gallup poll released the day after this decision indicates that 46% of 1,012 Americans agree with this decision. At a 95% confidence level, this sample has a 3% margin of error. Based on this information, determine if the following statements are true or false, and explain your reasoning.

1. We are 95% confident that between 43% and 49% of Americans in this sample support the decision of the U.S. Supreme Court on the 2010 healthcare law.

False. A confidence interval estimates the population proportion, not the sample proportion. We already know that exactly 46% of this sample supports the decision, so there is no uncertainty to quantify.

1. We are 95% confident that between 43% and 49% of Americans support the decision of the U.S. Supreme Court on the 2010 healthcare law.

True. With a 3% margin of error at the 95% confidence level, the interval is 46% ± 3%, so we are 95% confident that the population proportion of Americans who support the decision lies between 43% and 49%.
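As a rough check on the reported margin of error, the interval can be rebuilt from the sample proportion and sample size given in the problem. This is a minimal sketch that assumes a simple random sample and the usual normal approximation; the variable names are illustrative.

```r
# Sample proportion and sample size from the problem statement
p_hat <- 0.46
n <- 1012

# Standard error of a single sample proportion
se <- sqrt(p_hat * (1 - p_hat) / n)

# Margin of error at 95% confidence (qnorm(0.975) is about 1.96)
me <- qnorm(0.975) * se

# Confidence interval: point estimate plus or minus the margin of error
ci <- p_hat + c(-1, 1) * me
round(c(me = me, lower = ci[1], upper = ci[2]), 3)
```

The margin of error comes out close to 0.03, consistent with the 3% quoted by the poll.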

1. If we considered many random samples of 1,012 Americans, and we calculated the sample proportions of those who support the decision of the U.S. Supreme Court, 95% of those sample proportions will be between 43% and 49%.

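False. The confidence level describes the intervals, not the sample proportions: about 95% of the intervals built this way from random samples of 1,012 Americans would contain the true population proportion. The interval from 43% to 49% is centered on this sample's proportion rather than on the true proportion, so there is no guarantee that 95% of sample proportions fall inside it.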
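A small simulation can illustrate the distinction. The sketch below assumes hypothetical true proportions (0.46 and 0.44, chosen only for illustration, since the real value is unknown) and an arbitrary seed.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 1012
reps <- 10000

# Part 1: coverage of the 95% intervals when the (assumed) true proportion is 0.46
p_true <- 0.46
p_hats <- rbinom(reps, n, p_true) / n
me <- qnorm(0.975) * sqrt(p_hats * (1 - p_hats) / n)
coverage <- mean(p_hats - me <= p_true & p_true <= p_hats + me)

# Part 2: share of sample proportions inside the fixed range 43% to 49%
# when the (assumed) true proportion is 0.44 instead
p_hats_alt <- rbinom(reps, n, 0.44) / n
frac_inside <- mean(p_hats_alt >= 0.43 & p_hats_alt <= 0.49)

c(coverage = coverage, frac_inside = frac_inside)
```

The coverage stays near 95% by construction, while the share of sample proportions landing between 43% and 49% depends on where the true proportion actually sits.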

1. The margin of error at a 90% confidence level would be higher than 3%.

False. The critical z value for a 90% confidence level (about 1.645) is lower than the one for 95% (1.96), so with the same standard error the margin of error will be lower than 3%.
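To make the comparison concrete, the two margins of error can be computed side by side. This is a sketch under the same normal-approximation assumption, using the sample values from the problem.

```r
# Standard error from the poll's sample proportion and size
se <- sqrt(0.46 * (1 - 0.46) / 1012)

# Two-sided critical values for 90% and 95% confidence
z <- c("90%" = qnorm(0.95), "95%" = qnorm(0.975))

# Margin of error at each confidence level
me <- z * se
round(me, 4)
```

Because the standard error is the same in both cases, the margin of error shrinks in direct proportion to the critical value.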

*6.28 Sleep deprivation, CA vs. OR, Part I. According to a report on sleep deprivation by the Centers for Disease Control and Prevention, the proportion of California residents who reported insu

SE <- sqrt( ((0.08 * (1 - 0.08)) / 11545) +  ((0.088 * (1 - 0.088)) / 4691))
#z for 95% CI is 1.96

ME <- 1.96 * SE

round(0.088-0.08-ME,4)
## [1] -0.0015
round(0.088-0.08+ME,4)
## [1] 0.0175

Since the 95% confidence interval (-0.0015, 0.0175) includes 0, we cannot reject the null hypothesis of no difference, and we conclude that the rates of sleep deprivation in California and Oregon are not significantly different.
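The same calculation can be wrapped in a small helper, together with the success-failure check that the normal approximation relies on. This is only a sketch: `diff_prop_ci` is a hypothetical helper name, and the threshold of 10 is the usual rule of thumb rather than anything stated in the answer.

```r
# Hypothetical helper: normal-approximation CI for p1 - p2 (independent samples)
diff_prop_ci <- function(p1, n1, p2, n2, conf = 0.95) {
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  z <- qnorm(1 - (1 - conf) / 2)
  (p1 - p2) + c(lower = -1, upper = 1) * z * se
}

# Success-failure check: expected successes and failures in each sample
counts <- c(0.08 * 11545, (1 - 0.08) * 11545,
            0.088 * 4691, (1 - 0.088) * 4691)
conditions_ok <- all(counts >= 10)

# Same proportions and sample sizes as in the calculation above
ci <- diff_prop_ci(0.088, 4691, 0.08, 11545)
round(ci, 4)
```

The helper uses `qnorm` rather than the rounded value 1.96, so the endpoints can differ from the hand calculation in the later decimal places.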

*6.44 Barking deer. Microhabitat factors associated with forage and bed sites of barking deer in Hainan Island, China were examined from 2001 to 2002. In this region woods make up 4.8% of the land, cultivated grass plot makes up 14.7% and deciduous forests makes up 39.6%. Of the 426 sites where the deer forage, 4 were categorized as woods, 16 as cultivated grassplot, and 61 as deciduous forests. The table below summarizes these data.

| Woods | Cultivated grassplot | Deciduous forests | Other | Total |
|---|---|---|---|---|
| 4 | 16 | 67 | 345 | 426 |

1. Write the hypotheses for testing if barking deer prefer to forage in certain habitats over others.

H0: Barking deer forage in each habitat in proportion to its share of the land.

H1: Barking deer forage in some habitats more (or less) often than their share of the land would predict.

1. What type of test can we use to answer this research question?

We can use a chi-squared goodness of fit test.

1. Check if the assumptions and conditions required for this test are satisfied.

The assumptions and conditions are satisfied:

- Independence: we assume the forage sites are independent of one another.
- Expected counts: every habitat has at least 5 expected cases. Woods, the smallest category, has 0.048 × 426 ≈ 20.4 expected cases.
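The expected-count condition can be checked for all four habitats at once. A minimal sketch, using the land shares from the problem and taking "other" as the remainder; the vector names are illustrative.

```r
# Share of land in each habitat; "other" is whatever remains
land_share <- c(woods = 0.048, grassplot = 0.147, deciduous = 0.396)
land_share["other"] <- 1 - sum(land_share)

# Expected number of forage sites per habitat under H0 (426 sites in total)
expected <- land_share * 426
round(expected, 1)

# TRUE when every expected count is at least 5
all(expected >= 5)
```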

1. Do these data provide convincing evidence that barking deer prefer to forage in certain habitats over others? Conduct an appropriate hypothesis test to answer this research question.

df <- 4-1

#Proportion of "other"
1-0.048-0.147-0.396
## [1] 0.409
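#Observed counts as given in the table. The problem text lists 61 deciduous sites,
#which is the value that makes the counts sum to 426; using 61 gives an even larger statistic.
observed <- c(4, 16, 67, 345)
expected <- c(0.048, 0.147, 0.396, 0.409) * 426
chi <- sum((observed - expected)^2 / expected)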

p_Val <- pchisq(chi, df, lower.tail = FALSE)
p_Val
## [1] 1.144396e-59

Since the p-value (about 1.1e-59) is far below 0.05, we reject the null hypothesis. The data provide convincing evidence that barking deer do not forage in proportion to the land available in each habitat, that is, they prefer some habitats over others.
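As a cross-check, base R's `chisq.test` can run the same goodness-of-fit test. One caveat: the problem text gives 61 deciduous sites while the table and the hand calculation use 67. The sketch below assumes 61, because that value makes the counts sum to 426 and `chisq.test` derives the expected counts from the observed total. The statistic therefore differs somewhat from the hand calculation, but the conclusion is the same.

```r
# Observed forage sites, using 61 deciduous sites (assumption, see note above)
observed <- c(woods = 4, grassplot = 16, deciduous = 61, other = 345)

# Null proportions: share of land in each habitat
land_share <- c(0.048, 0.147, 0.396, 0.409)

# Chi-squared goodness-of-fit test against the land shares
fit <- chisq.test(x = observed, p = land_share)
fit$statistic   # test statistic
fit$parameter   # degrees of freedom
fit$p.value     # p-value
```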