Thirty recreational basketball players were asked to shoot two free throws. Data on whether they made or missed each shot are shown in the table below. The question of interest is whether the probability of making a shot on the first attempt is different from the probability of making a shot on the second attempt.
# 2x2 table of paired outcomes: rows = first shot, columns = second shot
mat <- matrix(c(4, 5, 14, 7), ncol = 2)
rownames(mat) <- c("Made First", "Missed First")
colnames(mat) <- c("Made Second", "Missed Second")
addmargins(mat)
##              Made Second Missed Second Sum
## Made First             4            14  18
## Missed First           5             7  12
## Sum                    9            21  30
Use McNemar’s test to answer this question.
\(H_0: P_{made.both} + P_{made.only.first} = P_{made.both} + P_{made.only.second}\), or more simply \(H_0: P_{made.only.first} = P_{made.only.second}\)
\(H_a: P_{made.only.first} \neq P_{made.only.second}\)
mcn = mcnemar.test(mat); mcn
##
## McNemar's Chi-squared test with continuity correction
##
## data: mat
## McNemar's chi-squared = 3.3684, df = 1, p-value = 0.06646
With a chi-squared statistic of 3.3684 and a p-value of 0.06646 > \(\alpha = 0.05\), we do not have enough evidence to reject the null hypothesis. Therefore we cannot conclude that the probability of making a shot on the first attempt differs from the probability of making a shot on the second attempt.
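As an arithmetic check (a sketch added here, not part of the original solution; the names b12 and b21 are just illustrative), McNemar's continuity-corrected statistic can be computed directly from the two discordant cells:
b12 = mat["Made First", "Missed Second"]         # made first but missed second (14)
b21 = mat["Missed First", "Made Second"]         # missed first but made second (5)
stat = (abs(b12 - b21) - 1)^2 / (b12 + b21)      # continuity-corrected McNemar statistic
stat; pchisq(stat, df = 1, lower.tail = FALSE)   # matches the 3.3684 and 0.06646 reported above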
(GRAD ONLY) Ignore the pairing of the data and analyze these data using a permutation \(\chi^2\) test. Compare this p-value with the p-value from the previous part. Comment.
\(H_0:\) There is no association between the first and second free-throw shots
\(H_a:\) There is an association between the first and second free-throw shots
fish = fisher.test(mat); fish
##
## Fisher's Exact Test for Count Data
##
## data: mat
## p-value = 0.4181
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.05986525 2.61005193
## sample estimates:
## odds ratio
## 0.4131166
The p-value from Fisher's exact test, 0.4181, is larger than the p-value from McNemar's test, 0.0665. Neither is significant at \(\alpha = 0.05\), but McNemar's test is the better choice here because the data are paired and we expect some correlation between a player's two shots. What is unusual about this setting is that no treatment is applied between the first and second shots.
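For the permutation \(\chi^2\) test that the prompt mentions, one option is the Monte Carlo p-value built into chisq.test; this is only a sketch (it was not run as part of the original analysis, so no output is shown here):
set.seed(1)
chisq.test(mat, simulate.p.value = TRUE, B = 10000)   # chi-squared test with a simulated (permutation-style) p-value, ignoring the pairing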
The data below are the eosinophil counts taken from blood samples of 40 healthy rabbits. Obtain bootstrap estimates of the MSE and standard error of the following statistics.
The sample mean
eosinophil <- c(55,140,91,122,111,185,203,101,76,145,95,101,196,45,299,226,65,70,196,72,121,171,151,113,112,67,276,125,100,81,122,71,158,78,162,128,96,79,67,119)
theta.hat = mean(eosinophil)   # observed sample mean
set.seed(1)
n = 40
nsim = 1000
theta.boots = rep(NA, nsim)
for (i in 1:nsim){
  boots.Sample = eosinophil[sample(1:n, n, replace = T)]   # resample with replacement
  theta.boots[i] = mean(boots.Sample)                      # recompute the statistic
}
MSE.Boot.mean = mean((theta.boots-theta.hat)^2); MSE.Boot.mean
## [1] 76.40564
se.boot.mean = sd(theta.boots); se.boot.mean
## [1] 8.742941
The Standard Deviation
set.seed(1)
theta.hat = sd(eosinophil)
for (i in 1:nsim){
  boots.Sample = eosinophil[sample(1:n, n, replace = T)]
  theta.boots[i] = sd(boots.Sample)
}
MSE.Boot.sd = mean((theta.boots-theta.hat)^2); MSE.Boot.sd
## [1] 64.87702
se.boot.sd = sd(theta.boots); se.boot.sd
## [1] 7.937986
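A side note added for clarity (not in the original write-up): the bootstrap MSE computed above decomposes into squared bootstrap bias plus bootstrap variance, which is why the MSE and \(\widehat{se}_{boot}^{\,2}\) are close whenever the bootstrap bias is small. With \(B\) bootstrap replicates \(\hat\theta^{*}_{b}\) and \(\bar\theta^{*} = \frac{1}{B}\sum_b \hat\theta^{*}_{b}\),
\[\widehat{MSE}_{boot} = \frac{1}{B}\sum_{b=1}^{B}\left(\hat\theta^{*}_{b} - \hat\theta\right)^2 = \left(\bar\theta^{*} - \hat\theta\right)^2 + \frac{1}{B}\sum_{b=1}^{B}\left(\hat\theta^{*}_{b} - \bar\theta^{*}\right)^2 \approx \widehat{bias}_{boot}^{\,2} + \widehat{se}_{boot}^{\,2}\]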
The 95th percentile.
set.seed(1)
nsim = 1000
theta.hat = quantile(eosinophil, probs = .95)
perc.boots = rep(NA,nsim)
for (i in 1:nsim){
  boots.Sample = eosinophil[sample(1:n, n, replace = T)]
  perc.boots[i] = quantile(boots.Sample, probs = .95)
}
MSE.Boot.95p = mean((perc.boots-theta.hat)^2); MSE.Boot.95p
## [1] 1322.626
sqrt(MSE.Boot.95p)
## [1] 36.36793
#bias = mean(perc.boots)-theta.hat
se.boot.95p = sd(perc.boots); se.boot.95p
## [1] 36.24879
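The same three estimates could also be obtained with the boot package; the code below is an equivalent sketch, not the approach used above.
library(boot)
set.seed(1)
boot.mean = boot(eosinophil, function(d, i) mean(d[i]), R = 1000)
boot.sd   = boot(eosinophil, function(d, i) sd(d[i]), R = 1000)
boot.q95  = boot(eosinophil, function(d, i) quantile(d[i], probs = .95), R = 1000)
sd(boot.mean$t[, 1]); sd(boot.sd$t[, 1]); sd(boot.q95$t[, 1])   # bootstrap standard errors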
Simulation Study 1:
• Generate a random sample of size n = 15 from a normal distribution with mean 5 and variance 36.
• Calculate \(\bar{X}\) from your sample.
• What is the true value of \(var(\bar{X})\)?
set.seed(1)
n = 15
x.samp = rnorm(n, 5, 6)   # sample of size 15 from Normal(mean 5, variance 36), i.e. sd = 6
xbar.hat = mean(x.samp); xbar.hat
## [1] 5.605057
var.xbar.true = 36/n; var.xbar.true   # theoretical var(Xbar) = sigma^2 / n
## [1] 2.4
Calculate \(var(\bar{X})\) using the bootstrap approach based on your random sample of data. Compare the theoretical \(var(\bar{X})\) to the bootstrap estimate of \(var(\bar{X})\).
nsim = 1000
mean.boots = rep(NA, nsim)
for (i in 1:nsim){
  sample.boots = x.samp[sample(1:n, n, replace = T)]   # resample the original data
  mean.boots[i] = mean(sample.boots)
}
c(var(mean.boots), var.xbar.true)
## [1] 2.367037 2.400000
The bootstrap estimate of \(var(\bar{X})\), about 2.37, is very close to the theoretical value of \(36/15 = 2.4\).
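As an optional sanity check (not part of the original solution), \(var(\bar{X})\) can also be approximated by repeatedly drawing fresh samples of size 15 from the true Normal(5, 36) distribution:
set.seed(2)   # arbitrary seed chosen for this check
fresh.means = replicate(10000, mean(rnorm(15, 5, 6)))
var(fresh.means)   # should be close to 36/15 = 2.4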
(GRAD STUDENTS ONLY) Simulation Study 2:
• Generate a random sample of size n = 100 from a random variable X such that X|B = 1 ∼ Normal(20, 5), X|B = 0 ∼ Normal(10, 10), and B ∼ Binomial(1, p = 0.75).
• Calculate \(\bar{X}\) from your sample.
• What is the true value of \(var(\bar{X})\)?
Using the law of total variance (equivalently, \(Var(X) = E[X^2] - (E[X])^2\) computed from the conditional moments) together with \(Var(\bar{X}) = Var(X)/n\):
\[Var(\bar{X}) = \frac{[0.75(20^2 + 5) + 0.25(10^2 + 10)] - (0.75 \cdot 20 + 0.25 \cdot 10)^2}{100} = \frac{25}{100} = 0.25\]
true.var = ((0.75*(20^2 + 5) + 0.25*(10^2 + 10)) - (0.75*20 + 0.25*10)^2)/100; true.var   # divide by this study's n = 100
## [1] 0.25
Dr. Lou helped us with this.
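The same value can be verified with the law of total variance written out in full (a check added here, not part of the original write-up):
\[Var(X) = E[Var(X \mid B)] + Var(E[X \mid B]) = (0.75 \cdot 5 + 0.25 \cdot 10) + 0.75 \cdot 0.25 \cdot (20 - 10)^2 = 6.25 + 18.75 = 25,\]
so \(Var(\bar{X}) = 25/100 = 0.25\).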
• Calculate \(var(\bar{X})\) using the bootstrap approach based on your random sample of data.
• Compare the theoretical \(var(\bar{X})\) to the bootstrap estimate of \(var(\bar{X})\).
set.seed(1)
n = 100
B = rbinom(n, 1, p = .75)          # mixture component indicator
x = rep(NA, n)
for (i in 1:n){
  if (B[i] == 1){
    x[i] = rnorm(1, 20, sqrt(5))   # Normal with mean 20, variance 5
  } else {
    x[i] = rnorm(1, 10, sqrt(10))  # Normal with mean 10, variance 10
  }
}
plot(density(x))
mean(x)
## [1] 17.29198
set.seed(1)
nsim = 1000
theta.boots = rep(NA,nsim)
for (i in 1:nsim){
  boots.Sample = x[sample(1:n, n, replace = T)]
  theta.boots[i] = mean(boots.Sample)
}
c(var(theta.boots), true.var)
## [1] 0.2144304 0.2500000
The bootstrap estimate of \(var(\bar{X})\), about 0.21, is reasonably close to the theoretical value of 0.25.
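As with Simulation Study 1, an optional Monte Carlo check of the theoretical value (not part of the original solution) is to repeatedly draw fresh samples of size 100 from the mixture:
set.seed(3)   # arbitrary seed chosen for this check
sim.means = replicate(10000, {
  b = rbinom(100, 1, 0.75)
  mean(ifelse(b == 1, rnorm(100, 20, sqrt(5)), rnorm(100, 10, sqrt(10))))
})
var(sim.means)   # should be close to 25/100 = 0.25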