6.6 2010 Healthcare Law
- No. We know with certainty that 46% of the people in this sample support the decision; a confidence interval is a statement about the population, not the sample.
- Yes. The confidence interval is built from our sample data, and we are 95% confident that the population proportion lies within it.
- No. About 95% of the confidence intervals built from repeated samples will contain the population proportion. The interval does not tell us how many other samples' proportions would fall inside or outside this sample's interval.
- False. As we decrease the confidence level, the margin of error decreases.
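The link between confidence level and margin of error in the last answer can be checked numerically. The following is a minimal sketch, not part of the original solution: `margin_of_error` is a hypothetical helper, the 46% comes from the first answer above, and the sample size of 1,012 is an assumption that is not stated in these notes.

```r
p_hat <- 0.46  # sample proportion quoted in the first answer
n <- 1012      # ASSUMPTION: sample size, not stated in this write-up

# Hypothetical helper: margin of error of a one-proportion interval
margin_of_error <- function(conf_level, p, n) {
  z <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  z * sqrt(p * (1 - p) / n)
}

me_95 <- margin_of_error(0.95, p_hat, n)
me_90 <- margin_of_error(0.90, p_hat, n)

# The lower confidence level gives the smaller margin of error
print(round(c(me_95 = me_95, me_90 = me_90), 4))
```

Only the critical value changes between the two calls, so the margin of error shrinks in step with it.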
6.12 Legalization of marijuana, Part I.
- It is a sample statistic. It is the proportion observed in our sample, although it can be used to estimate the population parameter.
N = 1,259
z = 1.96
PE = \(\hat{p}\) = 0.48
SE = \(\sqrt {\frac {\hat{p}(1-\hat{p})}{N}}\) = \(\sqrt {\frac {.48(1-.48)}{1,259}}\)
CI = point estimate \(\pm\) z * SE
Our confidence interval is roughly 45% to 51%. In context, we are 95% confident that between 45% and 51% of the population supports legalization, so the public is close to evenly split.
se <- sqrt(.48*(1-.48)/1259)  # the standard error uses the sample proportion
paste("SE =",round(se,3))
## [1] "SE = 0.014"
upper <- .48+ 1.96*se
lower <- .48-1.96*se
paste("our confidence interval is",round(lower,3),"-",round(upper,3))
## [1] "our confidence interval is 0.452 - 0.508"
- C. To use the normal model, two conditions must be met:
- Independence: the sample is less than 10% of the population.
- Success-failure: the sample must contain at least 10 successes and 10 failures. Here 1,259 × 0.48 ≈ 604 and 1,259 × 0.52 ≈ 655.
- Both conditions are met, so I consider the normal model valid in this example.
- D. Since 50% is within our confidence interval, a majority is possible, but the interval also includes values below 50%. Stating it as a fact is misleading.
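The condition check in part C and the reasoning in part D can be written out as a short sketch. The variable names below are illustrative and not part of the original solution; the inputs are the sample size and proportion used above.

```r
n <- 1259
p_hat <- 0.48
z <- 1.96

# Success-failure condition: at least 10 expected successes and failures
successes <- n * p_hat
failures <- n * (1 - p_hat)
success_failure_ok <- successes >= 10 && failures >= 10

# 95% confidence interval for the population proportion
se <- sqrt(p_hat * (1 - p_hat) / n)
ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)

# A "majority" claim is only supported if the whole interval is above 0.5
majority_established <- ci[["lower"]] > 0.5

print(success_failure_ok)
print(majority_established)
```

Because the lower bound sits below 0.5, the interval is compatible with both a minority and a slim majority.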
6.20 Legalize Marijuana, Part II
\(n = \frac{z^2 \cdot .48 \cdot .52}{ME^2}\), with ME = .02
- answer = 2398, rounding up so that the margin of error does not exceed 2%
n=1.96^2*.48*.52/.02^2
paste (n)
## [1] "2397.1584"
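To see how the required sample size reacts to the target margin of error, the same formula can be wrapped in a function. This is a sketch only; `required_n` is a hypothetical helper name, and the 1% margin is an added illustration rather than part of the exercise.

```r
# Hypothetical helper: smallest n that keeps the margin of error at or below `me`
required_n <- function(p, me, z = 1.96) {
  ceiling(z^2 * p * (1 - p) / me^2)  # round up to a whole respondent
}

n_2pct <- required_n(0.48, 0.02)  # the margin of error used above
n_1pct <- required_n(0.48, 0.01)  # illustrative tighter margin

print(c(n_2pct = n_2pct, n_1pct = n_1pct))
```

Halving the margin of error roughly quadruples the required sample, since the margin enters the formula squared.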
6.28 Sleep deprivation, CA vs. OR, Part I.
N_1 = 11,545
P_1 = .08
N_2 = 4,691
P_2 = .088
z = 1.96
PE = \(P_2-P_1\) = .008
SE = \(\sqrt {\frac {P_1(1-P_1)}{N_1}+\frac {P_2(1-P_2)}{N_2}}\) = \(\sqrt {\frac {.08(1-.08)}{11,545}+\frac {.088(1-.088)}{4,691}}\)
CI = point estimate \(\pm\) z * SE
N_1 <- 11545
P_1 <- .08
N_2 <- 4691
P_2 <- .088
z <- 1.96
se <- sqrt(P_1*(1-P_1)/N_1 + P_2*(1-P_2)/N_2)
pe <- P_2 - P_1  # point estimate: difference in sample proportions
upper <- pe + z*se
lower <- pe - z*se
paste("our confidence interval is",round(lower*100,3),"% to",round(upper*100,3),"%")
## [1] "our confidence interval is -0.15 % to 1.75 %"
- Contextually, the interval contains 0, so random sampling variation alone can account for the difference in proportions observed between these samples.
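The interval and the "contains 0" check can be packaged for reuse. The sketch below assumes the same large-sample formula used above; `diff_prop_ci` is a hypothetical helper, and it uses `qnorm` for the critical value instead of the rounded 1.96.

```r
# Hypothetical helper: confidence interval for p_b - p_a from two independent samples
diff_prop_ci <- function(p_a, n_a, p_b, n_b, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  se <- sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
  estimate <- p_b - p_a
  c(lower = estimate - z * se, upper = estimate + z * se)
}

ci <- diff_prop_ci(0.08, 11545, 0.088, 4691)

# If 0 lies inside the interval, the data are consistent with no difference
contains_zero <- ci[["lower"]] < 0 && ci[["upper"]] > 0

print(round(ci, 4))
print(contains_zero)
```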
6.44 Barking Deer
| type | Woods | grass | Deciduous | Other |
|---|---|---|---|---|
| actual count | 4 | 16 | 67 | 345 |
| null pct | 4.8 | 14.7 | 39.6 | 40.9 |
| null count | 20 | 63 | 169 | 174 |
percents <- c(.048,.147,.396,(1-.048-.147-.396))
null <- round(percents*426,0)
print (null)
## [1] 20 63 169 174
A. \(H_0\): Barking deer do not prefer to forage in certain habitats.
\(H_A\): Barking deer prefer to forage in certain habitats.
- B. Chi-square goodness of fit test
C. Conditions are as follows:
Sampling method is simple random sampling: True
The variable under study is categorical: True
The expected number of sample observations in each level of the variable is at least 5: True
\(Z_i = \frac{observed_i - null_i}{\sqrt{null_i}}\)
\(\chi^2 = Z_1^2 + Z_2^2 + Z_3^2 + Z_4^2\)
percents <- c(.048,.147,.396,(1-.048-.147-.396))
actual <- c(4,16,67,345)
null <- round(percents*426,0)
# goodness-of-fit test: observed counts against the null proportions
short_hand_x_squared <- chisq.test(actual, p = percents)
long_hand_x_squared <- sum((actual-null)^2/null)
paste("with predefined function",round(short_hand_x_squared$statistic,2),"chi^2 by hand",round(long_hand_x_squared,2))
## [1] "with predefined function 276.61 chi^2 by hand 277.48"
- The two values differ slightly because chisq.test uses the unrounded expected counts, while the by-hand calculation uses expected counts rounded to whole deer. Either way the statistic is very large for 3 degrees of freedom, so we reject the null hypothesis: the data indicate that barking deer prefer to forage in certain habitats.
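To see which habitats drive the statistic and how extreme it is, the per-habitat standardized differences and the p-value can be computed directly. This is a sketch using the observed and rounded null counts from the table above; the variable names are illustrative.

```r
habitats <- c("Woods", "grass", "Deciduous", "Other")
actual <- c(4, 16, 67, 345)
null <- c(20, 63, 169, 174)  # rounded expected counts from the table

# Standardized difference for each habitat
z <- (actual - null) / sqrt(null)
names(z) <- habitats

# Chi-square statistic is the sum of squared standardized differences
chi_sq <- sum(z^2)
df <- length(actual) - 1

# Upper-tail probability under the chi-square distribution
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

print(round(z, 2))
print(p_value)
```

The sign of each entry shows the direction of the departure: a negative value means fewer deer were observed in that habitat than the null proportions predict.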
6.48 Coffee and Depression
A. Chi-square test for two-way tables
B. \(H_0\): There is no association between coffee intake and depression.
\(H_A\): There is an association between coffee intake and depression.
- C. Overall depression rate = 5.14% of the women suffer from depression
2607/50739
## [1] 0.05138059
- D. expected count = 340, contribution to the chi-square statistic = 3.21
expected <- 6617*2607/50739  # group total times the unrounded overall depression rate
actual <- 373
chi_sq <- (actual-expected)^2/expected
paste("expected count is",round(expected,2),"and the chi square contribution is",round(chi_sq,2))
## [1] "expected count is 339.99 and the chi square contribution is 3.21"
- E. Using the chi-square probability table, the p-value is less than .001; R gives approximately .0003.
C <- 5
R <- 2
df <- (R-1)*(C-1)
chi2 <- 20.93
p_value <- 1 - pchisq(chi2, df)
- F. We reject the null hypothesis and conclude that there is an association between coffee intake and depression.
- G. As this was an observational study, causation can’t be concluded. Therefore I agree it is too early to recommend that women load up on coffee.