HW6

6.6

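FALSE. The confidence interval estimates the true population proportion with 95% confidence; the sample proportion is already known exactly.
TRUE. We are 95% confident that the interval captures the true population proportion.
TRUE. If the sample is representative of the true population, about 95% of sample proportions from other samples of the same size should fall within the margin of error of the true proportion.
FALSE. It would be slightly smaller, because a 90% confidence level uses a smaller critical value (1.645 instead of 1.96). Calculations say around 2.6%.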
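
The effect of the confidence level on the margin of error can be checked with a short sketch. The inputs below (a sample proportion of 46% and a sample size of 1,012) are assumed to be the exercise's figures, since they are not restated in this write-up, and `moe_for` is a hypothetical helper name.

```r
#Assumed inputs (not restated in this write-up): sample proportion and sample size
p_hat <- 0.46
n <- 1012

#Hypothetical helper: margin of error for a proportion at a given confidence level
moe_for <- function(conf_level, p, n) {
  z <- qnorm(1 - (1 - conf_level)/2)  #Two-sided critical value
  z * sqrt(p*(1-p)/n)
}

moe_95 <- moe_for(0.95, p_hat, n)
moe_90 <- moe_for(0.90, p_hat, n)

print(round(c(moe_95, moe_90), 3))
```

Under these assumptions the 90% margin of error comes out below the 95% one, which is the comparison the last answer relies on.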

6.12

SAMPLE STATISTIC because the 48% describes the people who were surveyed, not the whole population; it is used to estimate the population parameter.
With a margin of error of about 3%, we are 95% confident that the percentage of people in the US who believe that marijuana should be legal falls between 45% and 51%.
For this data, the sampling method and the conditions it meets matter most. If the sample was a simple random sample with at least 10 approvals and 10 disapprovals, then the sample proportion is approximately normally distributed and the interval should reflect the true population.
NO. The confidence interval reaches up to 51%, so majority approval is possible, but most of the interval lies below 50%, so it does not establish that a majority of the population approves legalization.
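
The interval and the success-failure check for this exercise can be reproduced with a minimal sketch. The sample proportion of .48 is the one used in 6.20; the sample size of 1,259 is an assumption, since it is not restated in this write-up.

```r
p_hat <- 0.48   #Sample proportion, as used in 6.20
n <- 1259       #Assumed sample size (not restated in this write-up)
z <- 1.96       #Critical value for 95% confidence

#Success-failure condition: at least 10 expected approvals and disapprovals
successes <- n*p_hat
failures <- n*(1-p_hat)
conditions_met <- successes >= 10 && failures >= 10

#Standard error and confidence interval for the proportion
se <- sqrt(p_hat*(1-p_hat)/n)
ci <- c(p_hat - z*se, p_hat + z*se)

print(conditions_met)
print(round(ci, 3))
```

With these inputs the interval straddles 50%, so it contains values on both sides of a majority.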

6.20

At least 2398 people need to be surveyed for a margin of error of 2%, since the computed value of 2397.158 has to be rounded up.

p <- .48
moe <- .02
ci <- 1.96

n <- (ci^2)*(p*(1-p))/(moe^2)
n

## [1] 2397.158
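
Because a sample size has to be a whole number, and rounding down would leave the margin of error slightly above the target, the result is rounded up. A minimal sketch reusing the values above (`n_required` and `moe_achieved` are names introduced here for illustration):

```r
p <- .48
moe <- .02
z <- 1.96

#Round up so the margin of error does not exceed the target
n_required <- ceiling((z^2)*(p*(1-p))/(moe^2))
print(n_required)

#Margin of error actually achieved with that sample size
moe_achieved <- z*sqrt(p*(1-p)/n_required)
print(moe_achieved)
```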

6.28

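We are 95% confident that the difference in proportions of Californians and Oregonians who are sleep deprived (p1 - p2 in the code below) is between about -0.1 and 1.7 percentage points. Because this interval contains 0, the data do not show a clear difference between the two states.

p1 <- .088
p2 <- .08
#Point estimate of the difference in proportions
p_hat <- p1 - p2
#Critical z value for 95% confidence
ci <- 1.96

n1 <- 4691
n2 <- 11545

#Variance of each sample proportion
var1 <- (p1*(1-p1))/n1
var2 <- (p2*(1-p2))/n2

#Standard error of the difference
se <- sqrt(var1+var2)

ci_low <- p_hat - (se*ci)
ci_high <- p_hat + (se*ci)

#Round to 3 decimals so the sign of the lower bound is not lost
print(paste0("CI is between ", round(ci_low, 3), " and ", round(ci_high, 3)))

## [1] "CI is between -0.001 and 0.017"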
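
Coarse rounding can hide whether an interval crosses zero. The sketch below wraps the same calculation in a small function and checks that directly; `diff_prop_ci` is a hypothetical name, not a function from any package, and it uses `qnorm` for the critical value instead of the rounded 1.96.

```r
#Hypothetical helper: CI for a difference of two independent proportions
diff_prop_ci <- function(p1, n1, p2, n2, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level)/2)
  se <- sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
  c(low = (p1 - p2) - z*se, high = (p1 - p2) + z*se)
}

ci_diff <- diff_prop_ci(.088, 4691, .08, 11545)
print(round(ci_diff, 4))

#TRUE if zero is a plausible value for the difference
contains_zero <- ci_diff[["low"]] < 0 && ci_diff[["high"]] > 0
print(contains_zero)
```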

6.44

\(H_0:\) There is no difference among habitats in where the barking deer prefer to forage.

\(H_a:\) The barking deer prefer to forage in some habitats over others.

A chi-square goodness of fit test
All expected counts are greater than 5. Each site was recorded based solely on where the deer foraged, not on where the previous site was, so we can treat the observations as independent.
With a p-value very close to zero and a chi-square test statistic of about 277, there is substantial evidence to reject the null hypothesis.

#Chi-square contributions, (observed - expected)^2 / expected, for each habitat

#Woods
Z1 <- (4-20.45)^2/(20.45)

#Grassplot
Z2 <- (16-62.62)^2/(62.62)

#Forest
Z3 <- (67-168.7)^2/(168.7)

#Other
Z4 <- (345-174.2)^2/(174.2)


#Chi-square test statistic
X <- Z1 + Z2 + Z3 + Z4

#Degrees of Freedom
DF <- 3

1-pchisq(X, df=DF)

## [1] 0
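
The same test statistic can be computed in vectorized form, which also makes the condition check and the degrees of freedom explicit. This is a sketch using the observed and expected counts from the code above; the vector names are introduced here for illustration.

```r
observed <- c(woods = 4, grassplot = 16, forest = 67, other = 345)
expected <- c(woods = 20.45, grassplot = 62.62, forest = 168.7, other = 174.2)

#Condition check: every expected count should be at least 5
print(all(expected >= 5))

#Contribution of each habitat and the overall test statistic
contributions <- (observed - expected)^2/expected
chi_sq <- sum(contributions)

#Degrees of freedom: number of categories minus 1
df <- length(observed) - 1

#Upper-tail probability computed directly
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
print(round(contributions, 2))
print(round(chi_sq, 1))
print(p_value)
```

Using `lower.tail = FALSE` returns the tiny upper-tail probability itself, whereas `1-pchisq(...)` loses it to floating point and prints 0.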

6.48

Chi-squared test for two-way tables

\(H_0:\) There is no association between coffee intake and depression

\(H_a:\) There is an association between coffee intake and depression

total <- 50739
dep <- 2607/total
happy <- 48132/total

print(paste0(round(dep,2)*100,"% of women are depressed"))

## [1] "5% of women are depressed"

print(paste0(round(happy,2)*100,"% of women are not depressed"))

## [1] "95% of women are not depressed"

#Use the unrounded proportion; rounding it to .05 understates the expected count
z <- 6617*dep
print(paste0("Expected value is ", round(z, 2)))

## [1] "Expected value is 339.99"

x <- (373-z)^2/z
print(paste0("Contribution is ", round(x, 2)))

## [1] "Contribution is 3.21"

p <- pchisq(20.93, df=4, lower.tail = F)
print(paste0("p-value is ", p))

## [1] "p-value is 0.000326950725917055"
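
The degrees of freedom for a two-way table are (rows - 1) x (columns - 1). The sketch below assumes a table with 2 depression outcomes and 5 coffee intake categories, a layout that is not restated in this write-up but is consistent with df = 4, and compares the test statistic against the 5% critical value.

```r
#Assumed table layout: 2 depression outcomes by 5 coffee intake categories
n_rows <- 2
n_cols <- 5
df <- (n_rows - 1)*(n_cols - 1)

#Critical value at the 5% significance level
crit <- qchisq(0.95, df = df)

#Test statistic used above
chi_sq <- 20.93

print(df)
print(round(crit, 2))
print(chi_sq > crit)
```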

The p-value was very small, therefore we can reject the null hypothesis: the data provide strong evidence of an association between coffee intake and depression.

Even though, as a group, there is an association between coffee and depression, an association alone does not show that coffee prevents depression. The observed cases of depression for 2-6 cups/week were actually higher than the expected cases, which should be taken into consideration.

Chad Smith

November 5, 2017
