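# Wald interval for y = 1 success in n = 15 trials
# (note: this assignment masks R's built-in constant pi)
pi=1/15
u=pi+1.96*sqrt((pi*(1-pi))/15)
l=pi-1.96*sqrt((pi*(1-pi))/15)
c(l,u)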
## [1] -0.05956933 0.19290266
# Adjusted Wald interval: add half a success and half a failure
pi_b=(1+.5)/(15+1)
u=pi_b+1.96*sqrt((pi_b*(1-pi_b))/16) # divide by 16 because the added half success and half failure make n = 15 + 1
l=pi_b-1.96*sqrt((pi_b*(1-pi_b))/16)
c(l,u)
c(l,u)
## [1] -0.04907549 0.23657549
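# Logit-based interval: Wald interval for phi = log(pi/(1-pi)), then back-transform
phi=log(pi/(1-pi))
iphi=15*pi*(1-pi) # Fisher information for phi
l=phi-1.96*1/sqrt(iphi)
u=phi+1.96*1/sqrt(iphi)
# inverse logit maps the endpoints back to the probability scale
plow=exp(l)/(1+exp(l))
pup=exp(u)/(1+exp(u))
c(plow,pup)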
## [1] 0.009305044 0.351998845
library(binom)
y <- 1
n <- 15
binom.confint(y, n, conf.level = 0.95, methods = "lrt")
## method x n mean lower upper
## 1 lrt 1 15 0.06666667 0.003926124 0.2621293
Of these intervals, which one is preferred, and why? We choose the LR interval (part d) because:
1. It is the narrowest of the intervals that stay within the valid range of 0 to 1; both Wald-type intervals have negative lower limits.
2. Schafer's notes say that statisticians tend to prefer the LR method.
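The LR limits reported by `binom.confint` can be checked by inverting the likelihood-ratio statistic directly. The following is a minimal sketch, assuming the interval is the set of values of p whose LR statistic falls below the 95% chi-square cutoff with 1 degree of freedom; the helper names `loglik` and `lr_gap` are illustrative and not part of the `binom` package.

```r
# Reproduce the likelihood-ratio interval for y = 1, n = 15 by root-finding.
y <- 1
n <- 15
p_hat <- y / n

# Binomial log-likelihood (the constant choose(n, y) cancels in the ratio)
loglik <- function(p) y * log(p) + (n - y) * log(1 - p)

# LR statistic minus the chi-square cutoff; its roots are the interval endpoints
cutoff <- qchisq(0.95, df = 1)
lr_gap <- function(p) 2 * (loglik(p_hat) - loglik(p)) - cutoff

# One root on each side of the MLE
lower <- uniroot(lr_gap, interval = c(1e-8, p_hat), tol = 1e-10)$root
upper <- uniroot(lr_gap, interval = c(p_hat, 1 - 1e-8), tol = 1e-10)$root
c(lower, upper)
```

The two roots should agree closely with the `lower` and `upper` columns of the `binom.confint` output.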
• the number of Loyola students you see with orange hair has a Poisson distribution with mean 8, and that independently,
• the number of Loyola students you see with green hair has a Poisson distribution with mean 7, and that independently,
• the number of Loyola students you see with purple hair has a Poisson distribution with mean 5.
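The sum of independent Poisson counts is again Poisson, with mean equal to the sum of the individual means. For a Poisson distribution, mean = variance, so here both equal 8 + 7 + 5 = 20.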
The number of Loyola students you see at the main entrance of Loyola from 9am to 10am has a Poisson distribution with \(\lambda=20\).
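The claim that the total is Poisson with mean 20 can be checked numerically. This is a small sketch, assuming component means of 8, 7, and 5 (taken from the sum above); `conv_pmf` is a hypothetical helper that convolves two probability mass functions on the values 0 to 60.

```r
# Component means, taken from the sum 8 + 7 + 5
means <- c(8, 7, 5)
k <- 0:60  # values at which the pmfs are compared

# pmf of the sum of two independent counts, by discrete convolution:
# P(S = s) = sum over i = 0..s of f(i) * g(s - i)
conv_pmf <- function(f, g) {
  sapply(seq_along(f), function(s) sum(f[1:s] * rev(g[1:s])))
}

pmf_two <- conv_pmf(dpois(k, means[1]), dpois(k, means[2]))
pmf_sum <- conv_pmf(pmf_two, dpois(k, means[3]))

# Largest absolute difference from the Poisson(20) pmf
max_diff <- max(abs(pmf_sum - dpois(k, sum(means))))
max_diff
```

The difference should be zero up to floating-point error, which is consistent with treating the total as a single Poisson count.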
dpois(15, 20) # P(X = 15)
## [1] 0.05164885
ppois(15, 20) # P(X <= 15)
## [1] 0.1565131
# Normal approximation to P(X <= 15), with continuity correction (15.5)
zstar <- (15.5 - 20) / sqrt(20)
pnorm(zstar)
## [1] 0.1571523
# Given 10 students in total, the three counts are multinomial with probabilities 8/20, 7/20, 5/20
dmultinom(c(5,2,3), size = 10, prob = c(8/20,7/20,5/20), log = FALSE)
## [1] 0.049392
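The multinomial probability above follows from conditioning independent Poisson counts on their total. As a sketch, assuming means of 8, 7, and 5 for the three counts, the conditional probability can be computed from the Poisson densities and compared with `dmultinom`:

```r
means <- c(8, 7, 5)    # assumed Poisson means of the three counts
counts <- c(5, 2, 3)   # observed split of the 10 students

# Joint probability of the three independent Poisson counts
joint <- prod(dpois(counts, means))

# Probability that the total count is 10 (the total is Poisson with mean 20)
total <- dpois(sum(counts), sum(means))

# Conditional probability of the split, given the total
conditional <- joint / total

# Same quantity from the multinomial pmf with probabilities mean_i / 20
multinom <- dmultinom(counts, size = sum(counts), prob = means / sum(means))
c(conditional, multinom)
```

Both numbers should match the value printed by `dmultinom` above.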
# Estimated allele frequency: (2 * 15 homozygotes + 33 heterozygotes) / (2 * 96 alleles)
p=(2*15+33)/192;p
## [1] 0.328125
pi1=p^2;pi1
## [1] 0.107666
pi2=2*p*(1-p);pi2
## [1] 0.440918
pi3=(1-p)^2;pi3
## [1] 0.451416
obs.data <- c(15, 33, 48)
gof.test <- chisq.test(obs.data, p = c(pi1,pi2,pi3));gof.test
##
## Chi-squared test for given probabilities
##
## data: obs.data
## X-squared = 4.6623, df = 2, p-value = 0.09718
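Since the p-value, 0.0972, is greater than 0.05, we fail to reject the null hypothesis and cannot conclude that the data depart from the Hardy-Weinberg theory. Note that chisq.test treats the supplied probabilities as known and therefore uses df = 2.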
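Because p was estimated from the same data, one degree of freedom is arguably lost, which is why the LR test that follows uses df = 1. A minimal sketch of the Pearson test with that adjustment, assuming df = (number of cells) - 1 - (number of estimated parameters), is given here; the variable names are illustrative.

```r
obs <- c(15, 33, 48)
n <- sum(obs)

# Allele frequency estimated from the genotype counts
p_hat <- (2 * obs[1] + obs[2]) / (2 * n)

# Expected counts under Hardy-Weinberg proportions
expected <- n * c(p_hat^2, 2 * p_hat * (1 - p_hat), (1 - p_hat)^2)

# Pearson statistic, same value as reported by chisq.test
x2 <- sum((obs - expected)^2 / expected)

# Degrees of freedom: 3 cells - 1 - 1 estimated parameter
df <- length(obs) - 1 - 1
p_value <- pchisq(x2, df = df, lower.tail = FALSE)
c(x2 = x2, df = df, p_value = p_value)
```

With df = 1 the Pearson p-value drops below 0.05, so under this adjustment the Pearson test points the same way as the LR test.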
#Likelihood-ratio Test
gsq <- 2*(15*log(15/(pi1*96)) + 33*log(33/(pi2*96)) + 48*log(48/(pi3*96)));gsq
## [1] 4.555382
pv=1-pchisq(gsq,1);pv # df = 3 - 1 - 1 = 1, since p was estimated from the data
## [1] 0.03281543
Since the p-value, 0.0328, is less than 0.05, we reject the null hypothesis and conclude that the data do not follow the Hardy-Weinberg theory.
gof.test$residuals
## [1] 1.4507395 -1.4337712 0.7085007
sqrt(2/3)
## [1] 0.8164966
AA was the farthest from what we expected and contributed the most to the \(\chi^2\) statistic, while aa was the closest to what we expected and contributed the least to the \(\chi^2\) statistic.
None of the residuals appears to be much larger than the \(\sqrt{(k-1)/k}\) threshold, which is \(\sqrt{2/3} \approx 0.82\) for k = 3 cells. Therefore, all three cells appear to fit the model.
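The residual comparison can be laid out explicitly. The sketch below recomputes the Pearson residuals by hand and divides each by the threshold; the label `Aa` for the middle genotype is an assumption, since only AA and aa are named above.

```r
# Genotype counts; the label "Aa" for the middle cell is assumed
obs <- c(AA = 15, Aa = 33, aa = 48)
n <- sum(obs)
k <- length(obs)

# Expected counts under Hardy-Weinberg with the estimated allele frequency
p_hat <- (2 * obs[["AA"]] + obs[["Aa"]]) / (2 * n)
expected <- n * c(AA = p_hat^2, Aa = 2 * p_hat * (1 - p_hat), aa = (1 - p_hat)^2)

# Pearson residuals: (observed - expected) / sqrt(expected)
pearson_res <- (obs - expected) / sqrt(expected)

# Rule-of-thumb threshold sqrt((k - 1) / k)
threshold <- sqrt((k - 1) / k)

round(pearson_res, 4)
round(abs(pearson_res) / threshold, 2)  # ratio of each residual to the threshold
sum(pearson_res^2)                      # squared residuals add up to X-squared
```

The squared residuals sum to the Pearson statistic, so the ratios show how much each cell contributes relative to the threshold.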