False. We are 100% confident of the responses of the sample (subject to any errors in surveying, etc.). 46% of the 1,012 Americans surveyed agreed with the decision.
True. The margin of error of 3% is added to and subtracted from the statistic of 46%, so we are 95% confident that the actual population parameter is between 43% and 49%.
True. This is what we mean by a 95% confidence interval.
False. The margin of error is obtained by multiplying the z score by the standard error. The z score for a 90% confidence level is lower than for a 95% confidence level, so for the same standard error the margin of error will be lower at the 90% confidence level.
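The effect of the confidence level on the margin of error can be checked numerically. The following is a minimal sketch, assuming the usual standard error for a single proportion and using the 46% and 1,012 figures from the answers above; the variable names (`p.hat`, `poll.se`, `me.90`, `me.95`) are illustrative and not part of the original solution.

```r
# Sample proportion and sample size taken from the answers above.
p.hat <- 0.46
n <- 1012

# Standard error of a single sample proportion (assumes a simple random sample).
poll.se <- sqrt(p.hat * (1 - p.hat) / n)

# Critical z scores: a two-sided interval leaves half of the excluded area in each tail.
z.90 <- qnorm(0.95)   # about 1.645
z.95 <- qnorm(0.975)  # about 1.96

# Margin of error = z score * standard error.
me.90 <- z.90 * poll.se
me.95 <- z.95 * poll.se
round(c(me.90 = me.90, me.95 = me.95), 4)
```

With the same standard error, the 95% margin comes out at roughly 3 percentage points and the 90% margin is smaller.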
CA.phat <- .08
OR.phat <- .088
diff.phat <- abs(CA.phat - OR.phat)
CA.n <- 11545
OR.n <- 4691
slp.se <- sqrt(((CA.phat * (1 - CA.phat) / CA.n) + ((OR.phat * (1 - OR.phat)) / OR.n)))
slp.z <- 1.96
slp.me <- slp.z * slp.se
slp.lo <- diff.phat - slp.me
slp.up <- diff.phat + slp.me
slp.ci <- c(slp.lo, slp.up)
round(slp.ci, 5)
## [1] -0.0015 0.0175
The 95% confidence interval for the difference between the CA and OR populations contains 0, so we cannot conclude that the populations are different from one another.
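Written out, the calculation has the following form. This is only a restatement of the code for this answer, using the unpooled standard error and $z^{*} = 1.96$; the subscripted symbols are notation introduced here for illustration.

$$
(\hat{p}_{OR} - \hat{p}_{CA}) \pm z^{*}\sqrt{\frac{\hat{p}_{CA}(1-\hat{p}_{CA})}{n_{CA}} + \frac{\hat{p}_{OR}(1-\hat{p}_{OR})}{n_{OR}}}
\approx 0.008 \pm 1.96 \times 0.00485 \approx (-0.0015,\ 0.0175)
$$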
| Woods | Cultivated grassplot | Deciduous forests | Other | Total |
|---|---|---|---|---|
| 4 | 16 | 61 | 345 | 426 |
H0: barking deer don’t exhibit preferences for certain habitats over others for foraging.
HA: barking deer do exhibit those preferences.
We can use a chi-squared test.
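We can safely assume that each case contributes a count to the table independently of the other cases. Each cell should also have at least 5 expected cases to meet the sample size / distribution condition. Only 4 foraging sites were observed in woods, but the condition applies to expected counts, and the smallest expected count (woods, 0.048 × 426 ≈ 20.4) is well above 5, so the condition is met.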
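The sample-size condition is stated in terms of expected counts, so it can be checked directly from the habitat shares. A minimal sketch, using the same land proportions and sample size as the calculation for this exercise; the names `habitat.share`, `n.sites`, and `habitat.exp` are hypothetical.

```r
# Share of land in each habitat; "other" is whatever remains.
habitat.share <- c(woods = 0.048, grassplot = 0.147, forests = 0.396)
habitat.share <- c(habitat.share, other = 1 - sum(habitat.share))
n.sites <- 426

# Expected counts under H0: foraging sites are spread in proportion to land area.
habitat.exp <- habitat.share * n.sites
round(habitat.exp, 1)

# TRUE when every expected count is at least 5.
all(habitat.exp >= 5)
```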
deer.obs <- c(4, 16, 61, 345) # Observed distribution of foraging habitats
deer.obs.total <- 426
deer.exp <- c(.048, .147, .396, (1 - .048 - .147 - .396)) * deer.obs.total # Expected distribution of foraging habitats based on land distribution
deer.k <- length(deer.obs) # Number of habitat groups
deer.df <- deer.k - 1 # Degrees of freedom for chi-square test
# Test statistic: sum of (observed - expected)^2 / expected over all habitats.
deer.chi <- sum((deer.obs - deer.exp)^2 / deer.exp)
# Calculate the p-value using the pchisq function and the computed test statistic.
deer.pval <- pchisq(deer.chi, df = deer.df, lower.tail = FALSE)
deer.pval
## [1] 2.799724e-61
As the p-value is very small, we reject the null hypothesis and conclude that barking deer do prefer to forage in certain habitats over others.
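As a cross-check, the same goodness-of-fit test can be run with base R's `chisq.test()`, passing the null proportions through its `p` argument. This is a sketch rather than part of the original solution; `deer.counts`, `land.share`, and `deer.test` are illustrative names.

```r
# Observed foraging sites and the share of land in each habitat.
deer.counts <- c(4, 16, 61, 345)
land.share <- c(0.048, 0.147, 0.396)
land.share <- c(land.share, 1 - sum(land.share))  # remaining share is "Other"

# Goodness-of-fit test: p gives the proportions expected under H0.
deer.test <- chisq.test(x = deer.counts, p = land.share)

deer.test$statistic  # chi-squared test statistic
deer.test$parameter  # degrees of freedom (k - 1 = 3)
deer.test$p.value    # upper-tail p-value
```

The statistic and p-value should agree with the manual calculation up to rounding.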
Caffeinated coffee consumption

| Clinical depression | ≤ 1 cup/week | 2-6 cups/week | 1 cup/day | 2-3 cups/day | ≥ 4 cups/day | Total |
|---|---|---|---|---|---|---|
| Yes | 670 | 373 | 905 | 564 | 95 | 2,607 |
| No | 11,545 | 6,244 | 16,329 | 11,726 | 2,288 | 48,132 |
| Total | 12,215 | 6,617 | 17,234 | 12,290 | 2,383 | 50,739 |
**A chi-squared test for a two-way table would be appropriate because … ANSWER**
H0: The proportion of women who are depressed does not vary based on coffee consumption.
HA: The proportion of women who are depressed varies based on coffee consumption.
ydprs <- c(670, 373, 905, 564, 95)
ydprs.total <- sum(ydprs)
ndprs <- c(11545, 6244, 16329, 11726, 2288)
ndprs.total <- sum(ndprs)
cfe <- ydprs + ndprs
cfe.total <- ydprs.total + ndprs.total
ydprs.prop <- ydprs.total / cfe.total
ndprs.prop <- ndprs.total / cfe.total
paste0(round((100 * ydprs.prop), 1), "% of women suffer from depression")
## [1] "5.1% of women suffer from depression"
paste0(round((100 * ndprs.prop), 1), "% of women do not suffer from depression")
## [1] "94.9% of women do not suffer from depression"
ydprs.2_6cupwk.exp <- cfe[2] * ydprs.prop
paste0("The expected count is ", round(ydprs.2_6cupwk.exp, 2))
## [1] "The expected count is 339.99"
ydprs.2_6cupwk.contrib <- (ydprs[2] - ydprs.2_6cupwk.exp)^2 / ydprs.2_6cupwk.exp
paste0("The contribution to the test statistic is ", round(ydprs.2_6cupwk.contrib, 2))
## [1] "The contribution to the test statistic is 3.21"
cfe.chi <- 20.93
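cfe.k <- length(cfe) # Number of coffee consumption groups (columns)
cfe.df <- cfe.k - 1 # Degrees of freedom: (rows - 1) * (columns - 1) = (2 - 1) * (5 - 1)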
cfe.pval <- pchisq(cfe.chi, df = cfe.df, lower.tail = FALSE)
cfe.pval
## [1] 0.0003269507
Given the p-value of .0003 is less than .05, we reject the null hypothesis and conclude there is a relationship between caffeinated coffee consumption and depression.
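The test statistic of 20.93 and the expected counts can be reproduced from the table itself with base R's `chisq.test()`. A minimal sketch, assuming the counts in the table for this exercise; `coffee` and `coffee.test` are illustrative names, and the column labels are plain-text stand-ins for the table headers.

```r
# Two-way table: rows are depression status, columns are coffee consumption.
coffee <- matrix(
  c(670, 373, 905, 564, 95,
    11545, 6244, 16329, 11726, 2288),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    depression = c("Yes", "No"),
    consumption = c("<=1 cup/week", "2-6 cups/week", "1 cup/day",
                    "2-3 cups/day", ">=4 cups/day")
  )
)

# Chi-squared test of independence for the two-way table.
coffee.test <- chisq.test(coffee)

coffee.test$statistic           # chi-squared test statistic
coffee.test$parameter           # degrees of freedom: (2 - 1) * (5 - 1) = 4
coffee.test$p.value             # upper-tail p-value
round(coffee.test$expected, 2)  # expected count = row total * column total / table total
```

The `expected` matrix includes the expected count for depressed women drinking 2-6 cups per week, which should match the value computed by hand.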
Yes, I agree - statistical significance is not the same as clinical significance. While this study identifies a correlation, it does not imply causation.