a. False: the sample proportion is known to be exactly 46%. The 95% confidence interval of 43% to 49% is an estimate for the US population proportion, not for the sample.
b. True: this matches the definition of a confidence interval. The sample is less than 10% of the population.
c. False: the confidence level says that 95% of intervals built this way will contain the population proportion. It is not a statement about where other sample proportions will fall.
d. False: the margin of error at a 90% confidence level would be smaller, since lowering the confidence level narrows the interval.
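The claim in part d can be checked numerically. The sketch below assumes the 3% margin of error implied by the 43% to 49% interval and rescales it by the ratio of critical values; `me.95`, `me.90`, `z.95`, and `z.90` are illustrative names, not part of the original answer.

```r
# Margin of error implied by the 95% interval (43% to 49% around 46%)
me.95 <- 0.03

# Critical values for 95% and 90% confidence
z.95 <- qnorm(0.975)
z.90 <- qnorm(0.95)

# The standard error is unchanged, so the margin scales with z
me.90 <- me.95 * z.90 / z.95
me.90

# TRUE when the 90% margin is the smaller of the two
me.90 < me.95
```

Because only the critical value changes, the sample size is not needed for this comparison.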
a. 48% is a sample statistic. It was derived from a sample of 1,259 US residents, not from the whole US population.
b.
# 95% Confidence Interval -> alpha of 0.05, z = 1.96
z <- 1.96
p <- .48 # proportion is 0.48
n <- 1259
me <- z * sqrt(p*(1-p)/n)
ci.lower <- (p - me) * 100
ci.upper <- (p + me) * 100
ci.lower
## [1] 45.24028
ci.upper
## [1] 50.75972
The 95% confidence interval for the proportion of US residents who think marijuana should be made legal is from 45.24% to 50.76%.
c. True, provided the observations are independent (a random sample of less than 10% of the population) and the sample size is sufficient (0.48 × 1259 ≈ 604 expected successes and ≈ 655 expected failures, both at least 10).
d. False. The 95% confidence interval runs from 45.24% to 50.76%, so most of it lies below 50% and the data do not justify claiming a majority.
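The conditions in part c and the majority claim in part d can both be checked in a few lines. This is a minimal sketch that reuses the values above; it uses `qnorm(0.975)` in place of the rounded 1.96, and `successes`, `failures`, and `majority.supported` are names introduced here for illustration.

```r
p <- 0.48
n <- 1259

# Success-failure condition: both expected counts should be at least 10
successes <- n * p
failures <- n * (1 - p)
c(successes, failures) >= 10

# 95% interval for the proportion
me <- qnorm(0.975) * sqrt(p * (1 - p) / n)
ci <- c(lower = p - me, upper = p + me)
ci

# A majority claim would need the whole interval to sit above 50%
majority.supported <- unname(ci["lower"] > 0.5)
majority.supported
```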
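# Target margin of error = 2%
mj.p <- .48
se <- .02/1.96
# Standard error = sqrt(p * (1-p) / n), solved for n
mj.n <- (mj.p * (1-mj.p))/(se^2)
mj.n
## [1] 2397.158
Rounding up, a sample of at least 2,398 US residents is needed to limit the margin of error to 2%.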
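The same calculation can be wrapped in a small helper that rounds up to a whole respondent. This is a sketch only: `sample.size` is a hypothetical function name, and it uses `qnorm()` for the critical value instead of the rounded 1.96.

```r
# Required n for a target margin of error, rounded up to a whole respondent
sample.size <- function(p, me, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)      # two-sided critical value
  ceiling(p * (1 - p) * (z / me)^2)   # solve me = z * sqrt(p(1-p)/n) for n
}

sample.size(p = 0.48, me = 0.02)
```

Rounding up with `ceiling()` matters because rounding down would leave the margin of error slightly above the target.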
ca.n <- 11545
ca.p <- 0.08
or.n <- 4691
or.p <- 0.088
z <- 1.96 # For 95% Confidence Interval
se.ca <- sqrt((ca.p)*(1-ca.p)/ca.n)
me.ca <- z * se.ca # Margin of Error at 95% confidence interval
se.or <- sqrt((or.p)*(1-or.p)/or.n)
me.or <- z * se.or # Margin of Error at 95% confidence interval
round(me.ca * 100, 2)
## [1] 0.49
# Now to calculate the confidence interval
ca.lower <- ca.p - me.ca
ca.upper <- ca.p + me.ca
or.lower <- or.p - me.or
or.upper <- or.p + me.or
ca.lower
## [1] 0.07505122
ca.upper
## [1] 0.08494878
or.lower
## [1] 0.07989296
or.upper
## [1] 0.09610704
se <- sqrt((ca.p)*(1-ca.p)/ca.n + (or.p)*(1-or.p)/or.n) # Calculating a new SE for the differences
state.me <- z * se
state.me
## [1] 0.009498128
#95% CI
diff <- or.p - ca.p
diff.lower <- diff - state.me
diff.upper <- diff + state.me
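diff.lower
## [1] -0.001498128
diff.upper
## [1] 0.01749813
The 95% confidence interval for the difference in proportions (or.p minus ca.p) runs from about -0.15% to 1.75%. Because the interval contains 0, the data do not show a significant difference between the two groups.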
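The difference-of-proportions interval can also be written as one reusable function. This is a minimal sketch under the same independence assumptions as above; `diff.prop.ci` and `contains.zero` are hypothetical names, and `qnorm()` replaces the rounded 1.96.

```r
# CI for a difference of two independent proportions (p2 - p1)
diff.prop.ci <- function(p1, n1, p2, n2, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  d <- p2 - p1
  c(lower = d - z * se, upper = d + z * se)
}

ci <- diff.prop.ci(p1 = 0.08, n1 = 11545, p2 = 0.088, n2 = 4691)
ci

# TRUE when the interval contains 0, i.e. no significant difference
contains.zero <- unname(ci["lower"] < 0 & ci["upper"] > 0)
contains.zero
```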
a. Hypotheses
H0: Barking deer have no preference for certain habitats; the observed counts follow the habitat proportions.
Ha: Barking deer prefer some habitats over others.
b. A chi-squared goodness-of-fit test can be used here.
c. Check conditions for inference
1. Independence: the deer do not influence each other.
2. Sample size: the smallest expected count is 0.048 × 426 = 20.448, which is at least 5.
deer <- c(4, 16, 61, 345)
percnt <- c(0.048, 0.147, 0.396, 0.409)
chisq.test(x = deer, p = percnt)
##
## Chi-squared test for given probabilities
##
## data: deer
## X-squared = 284.06, df = 3, p-value < 2.2e-16
Since the p-value is less than 0.05, we reject H0 and conclude that barking deer do prefer some habitats over others.
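The statistic reported by `chisq.test()` can be reproduced by hand, which also makes the expected-count condition explicit. A minimal sketch using the same `deer` and `percnt` vectors; `expected`, `chi.sq`, and `p.value` are names introduced here.

```r
deer <- c(4, 16, 61, 345)
percnt <- c(0.048, 0.147, 0.396, 0.409)

# Expected counts under H0: habitat share times total observed deer
expected <- sum(deer) * percnt
expected

# Every expected count should be at least 5
all(expected >= 5)

# Chi-squared statistic and upper-tail p-value, df = categories - 1
chi.sq <- sum((deer - expected)^2 / expected)
p.value <- pchisq(chi.sq, df = length(deer) - 1, lower.tail = FALSE)
chi.sq
p.value
```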
library(visualize)
visualize.chisq(stat = 284.06, df = 3, section = "upper")
a. Chi-squared test for a two-way table (test of independence)
b. Hypotheses
H0: Coffee consumption and depression are not related.
Ha: Coffee consumption and depression are related.
c.
deprs <- 2607
not.deprs <- 48132
total <- deprs + not.deprs
deprs/total * 100
## [1] 5.138059
not.deprs/total * 100
## [1] 94.86194
d.Ans:
6617 * 2607 / 50739
## [1] 339.9854
(373 - 339.9854)^2 / 339.9854
## [1] 3.205914
e.Ans: p.value = 0.0003267
test.stat <- data.frame(Yes = c(670, 373, 905, 564, 95),
                        No = c(11545, 6244, 16329, 11726, 2288))
chisq.test(test.stat)
##
## Pearson's Chi-squared test
##
## data: test.stat
## X-squared = 20.932, df = 4, p-value = 0.0003267
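The hand computation in part d and the p-value in part e can be cross-checked against the full table. The sketch below rebuilds every expected count from the row and column totals; `expected`, `contrib`, `chi.sq`, `df`, and `p.value` are names introduced here for illustration.

```r
test.stat <- data.frame(Yes = c(670, 373, 905, 564, 95),
                        No = c(11545, 6244, 16329, 11726, 2288))

# Expected count for each cell: row total * column total / grand total
expected <- outer(rowSums(test.stat), colSums(test.stat)) / sum(test.stat)
expected[2, "Yes"]   # the cell worked out by hand in part d

# Per-cell contributions, summed into the test statistic
contrib <- (as.matrix(test.stat) - expected)^2 / expected
chi.sq <- sum(contrib)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(test.stat) - 1) * (ncol(test.stat) - 1)
p.value <- pchisq(chi.sq, df = df, lower.tail = FALSE)
c(chi.sq = chi.sq, df = df, p.value = p.value)
```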
library(visualize)
visualize.chisq(stat = 20.932, df = 4, section = "upper")
f. Since the p-value (0.0003267) is less than 0.05, we reject the null hypothesis (H0).
g. I agree that it is too early to make this recommendation. The chi-squared test only shows that there is an association in the study, not what that association is. Correlation does not necessarily mean there’s causation.