488 Homework 4

(1). Thirty recreational basketball players were asked to shoot two free throws. Data on whether they made or missed their shots are shown in the table below. The question of interest is whether the probability of making a shot on the first attempt differs from the probability of making a shot on the second attempt.

Use McNemar’s test to answer this question.

mat<-matrix(c(4,5,14,7),ncol=2)
rownames(mat)<-c("Made First","Missed First")
colnames(mat)<-c("Made Second","Missed Second")
mcnemar.test(mat)

## 
##  McNemar's Chi-squared test with continuity correction
## 
## data:  mat
## McNemar's chi-squared = 3.3684, df = 1, p-value = 0.06646

Since p = 0.06646 exceeds \(\alpha\) = 0.05, we fail to reject the null hypothesis and do not have enough evidence to suggest a difference between the probability of making the first free throw and the probability of making the second.
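
The statistic reported by `mcnemar.test` depends only on the two discordant cells of the table. As a check, here is a minimal sketch that recomputes it by hand from the same `mat`; the names `b`, `c_`, `stat`, and `p_val` are introduced here for illustration only.

```r
# Rebuild the table from the homework so this block runs on its own
mat <- matrix(c(4, 5, 14, 7), ncol = 2)
rownames(mat) <- c("Made First", "Missed First")
colnames(mat) <- c("Made Second", "Missed Second")

# Discordant pairs: players whose two attempts had different outcomes
b  <- mat["Made First", "Missed Second"]   # 14
c_ <- mat["Missed First", "Made Second"]   # 5

# McNemar statistic with continuity correction: (|b - c| - 1)^2 / (b + c)
stat  <- (abs(b - c_) - 1)^2 / (b + c_)
p_val <- pchisq(stat, df = 1, lower.tail = FALSE)

round(stat, 4)   # 3.3684, matching the mcnemar.test output
p_val
```

The concordant cells (4 and 7) do not enter the statistic, which is why the test is driven entirely by the 14 versus 5 split.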

(GRAD ONLY) Ignore the pairing of the data and analyze this data using a permutation \(\chi^{2}\) test. Compare this p-value with the p-value from the previous part. Comment.

fisher.test(mat)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  mat
## p-value = 0.4181
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.05986525 2.61005193
## sample estimates:
## odds ratio 
##  0.4131166

Since p = 0.4181 exceeds \(\alpha\) = 0.05, we fail to reject the null hypothesis and do not have enough evidence to suggest a difference between the probability of making the first free throw and the probability of making the second.
Both tests gave p-values above 0.05, so we fail to reject the null hypothesis in each case. However, the p-value from McNemar’s test (0.06646) is much smaller and close to the 0.05 cutoff, even though the continuity correction makes that test conservative.
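
Note that `fisher.test(mat)` applied to the paired table assesses whether the outcome of the second shot is associated with the outcome of the first. One reading of "ignore the pairing" (an assumption here, not confirmed by the assignment text) treats the 30 first attempts and the 30 second attempts as two independent samples. Below is a sketch of a Monte Carlo permutation \(\chi^{2}\) test under that reading, using the `simulate.p.value` option of `chisq.test`. The object names `unpaired` and `perm` and the choice `B = 10000` are illustrative.

```r
# Rebuild the paired table from the homework
mat <- matrix(c(4, 5, 14, 7), ncol = 2)
rownames(mat) <- c("Made First", "Missed First")
colnames(mat) <- c("Made Second", "Missed Second")

# Attempt-by-outcome table built from the margins of mat (assumed reading)
unpaired <- rbind(
  First  = c(Made = sum(mat["Made First", ]),  Missed = sum(mat["Missed First", ])),
  Second = c(Made = sum(mat[, "Made Second"]), Missed = sum(mat[, "Missed Second"]))
)
unpaired

# Chi-squared statistic with a p-value simulated from tables that share
# the observed margins (no continuity correction is applied in this mode)
set.seed(1234)
perm <- chisq.test(unpaired, simulate.p.value = TRUE, B = 10000)
perm$p.value
```

This p-value, rather than the Fisher p-value on the paired table, would be the one to compare with McNemar's result if this reading is the intended one.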

(2). The data below are the eosinophil counts taken from blood samples of 40 healthy rabbits. Obtain bootstrap estimates of the MSE and standard error of the sample mean, the standard deviation, and the 95th percentile.

eosinophil <- c(55,140,91,122,111,185,203,101,
                76,145,95,101,196,45,299,226,
                65,70,196,72,121,171,151,113,
                112,67,276,125,100,81,122,71,
                158,78,162,128,96,79,67,119)

n<-length(eosinophil)

Sample Mean

set.seed(1234)
nsim<-1000

theta.hat<-mean(eosinophil)

theta.boots<-rep(NA,nsim)
for (i in 1:nsim){
  boots.sample<-eosinophil[sample(1:n,n,replace=TRUE)]
  theta.boots[i]<-mean(boots.sample)
  }

# MSE
mean((theta.boots-theta.hat)^2)

## [1] 84.1727

# Standard Error
sd(theta.boots)

## [1] 9.154394

Standard Deviation

set.seed(1234)
nsim<-1000

theta.hat<-sd(eosinophil)

theta.boots<-rep(NA,nsim)
for (i in 1:nsim){
  boots.sample<-eosinophil[sample(1:n,n,replace=TRUE)]
  theta.boots[i]<-sd(boots.sample)
  }

# MSE
mean((theta.boots-theta.hat)^2)

## [1] 70.37959

# Standard Error
sd(theta.boots)

## [1] 8.313093

\(95^{th}\) Percentile

set.seed(1234)
nsim<-1000

theta.hat<-quantile(eosinophil, probs=0.95)

pct.boots<-rep(NA,nsim)
for (i in 1:nsim){
  boots.sample<-eosinophil[sample(1:n,n,replace=TRUE)]
  pct.boots[i]<-quantile(boots.sample, probs=0.95)
  }

# MSE
mean((pct.boots-theta.hat)^2)

## [1] 1393.55

# Standard Error
sd(pct.boots)

## [1] 36.85756
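
The three blocks above repeat the same resampling loop with a different statistic each time. A minimal sketch of a reusable helper follows. `boot_summary` and its arguments are hypothetical names introduced here, not part of the original solution. It also returns the bootstrap bias, since the MSE computed this way decomposes into squared bias plus the (population-form) variance of the bootstrap replicates.

```r
eosinophil <- c(55,140,91,122,111,185,203,101,
                76,145,95,101,196,45,299,226,
                65,70,196,72,121,171,151,113,
                112,67,276,125,100,81,122,71,
                158,78,162,128,96,79,67,119)

# Hypothetical helper: bootstrap MSE, standard error, and bias of stat(x)
boot_summary <- function(x, stat, nsim = 1000) {
  n <- length(x)
  theta_hat <- unname(stat(x))
  # Each replicate resamples n observations with replacement
  theta_boots <- replicate(nsim, unname(stat(sample(x, n, replace = TRUE))))
  c(mse  = mean((theta_boots - theta_hat)^2),
    se   = sd(theta_boots),
    bias = mean(theta_boots) - theta_hat)
}

set.seed(1234)
res_mean <- boot_summary(eosinophil, mean)
res_sd   <- boot_summary(eosinophil, sd)
res_p95  <- boot_summary(eosinophil, function(v) quantile(v, probs = 0.95))
rbind(mean = res_mean, sd = res_sd, p95 = res_p95)
```

Because the seed is set once here rather than before each statistic, the numbers will not match the outputs above exactly.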

(3). Simulation Study 1:

Generate a random sample of size n = 15 from a normal distribution with mean 5 and variance 36.

set.seed(1234)
n<-15
xbar<-rnorm(n, 5, sqrt(36))  # note: xbar holds the raw sample; its mean is computed below

Calculate \(\bar{X}\) from your sample.

mean(xbar)

## [1] 2.976218

What is the true value of var(\(\bar{X}\))?

36/n

## [1] 2.4

Calculate var(\(\bar{X}\)) using the bootstrap approach based on your random sample of data.

set.seed(1234)
nsim<-1000

theta.boots<-rep(NA,nsim)

for (i in 1:nsim){
  boots.sample<-xbar[sample(1:n,n,replace=TRUE)]
  theta.boots[i]<-mean(boots.sample)
  }

var(theta.boots)

## [1] 1.844354

Compare the theoretical var(\(\bar{X}\)) to the bootstrap estimate var(\(\bar{X}\))

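The bootstrap estimate of var(\(\bar{X}\)), 1.84, is smaller than the theoretical value of 2.4. The bootstrap works from a single sample of n = 15, so its estimate reflects that sample's variance rather than the true variance of 36.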
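
To see where the gap comes from, the sketch below computes two reference values. The first is the ideal bootstrap variance of the mean, which needs no resampling: it is the plug-in variance of the sample divided by n. The second is a Monte Carlo check of the true value 36/n using fresh samples from the normal distribution. The names `x`, `boot_exact`, `xbars`, and `mc_var`, and the 10000 repetitions, are illustrative choices.

```r
set.seed(1234)
n <- 15
x <- rnorm(n, 5, sqrt(36))   # same call as the homework sample

# Ideal (infinite-resample) bootstrap variance of the sample mean:
# the empirical distribution has variance ((n - 1) / n) * var(x)
boot_exact <- ((n - 1) / n) * var(x) / n

# Monte Carlo check of the true value: draw many fresh samples of size n
xbars  <- replicate(10000, mean(rnorm(n, 5, sqrt(36))))
mc_var <- var(xbars)

c(boot_exact = boot_exact, mc_var = mc_var, true = 36 / n)
```

The factor (n - 1)/n means the bootstrap variance is biased slightly downward on average, in addition to varying from sample to sample.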

(4). (GRAD STUDENTS ONLY) Simulation Study 2:

Generate a random sample of size n = 100 from a random variable X such that:

\[ X \mid B=1 \sim \text{Normal}(20, 5) \]

\[ X \mid B=0 \sim \text{Normal}(10, 10) \]

\[ B \sim \text{Binomial}(1, p=0.75) \]

Calculate \(\bar{X}\) from your sample.

set.seed(1234)

N<-100
B<-rbinom(N, 1, 0.75)

X <- rep(NA, N)
for (i in 1:N) {
  if (B[i] == 1) {
    X[i] <- rnorm(1, 20, sqrt(5))    # component B = 1: mean 20, variance 5
  } else {
    X[i] <- rnorm(1, 10, sqrt(10))   # component B = 0: mean 10, variance 10
  }
}

# Sample mean
mean(X)

## [1] 18.30675
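
The loop above can also be written without explicit iteration, because `rnorm` accepts vectors for its `mean` and `sd` arguments. This is a sketch of that alternative. `X_vec` is a name introduced here, and the result is not claimed to reproduce the sample mean shown above.

```r
set.seed(1234)
N <- 100
B <- rbinom(N, 1, 0.75)

# rnorm() is vectorised over mean and sd, so each draw uses the
# component selected by the matching element of B
X_vec <- rnorm(N,
               mean = ifelse(B == 1, 20, 10),
               sd   = sqrt(ifelse(B == 1, 5, 10)))

mean(X_vec)
```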

What is the true value of var(\(\bar{X}\))?

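# Mixture components: weight, mean, and variance of each
p.success=0.75; mu.success=20; var.success=5;
p.fail=0.25; mu.fail=10; var.fail=10;

# Var(X) = E[X^2] - (E[X])^2, divided by n = 100 to get Var(X-bar)
((p.success*(mu.success^2 + var.success) + p.fail* (mu.fail^2 + var.fail)) - (p.success*mu.success + p.fail*mu.fail)^2)/100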

## [1] 0.25

True value of var(\(\bar{X}\))=0.25
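
The value 0.25 can also be reached through the law of total variance, treating the second parameter of each normal component as a variance (as the code above does with `sqrt(5)` and `sqrt(10)`):

\[ \text{Var}(X) = E[\text{Var}(X \mid B)] + \text{Var}(E[X \mid B]) = (0.75 \cdot 5 + 0.25 \cdot 10) + 0.75 \cdot 0.25 \cdot (20 - 10)^{2} = 6.25 + 18.75 = 25 \]

\[ \text{Var}(\bar{X}) = \frac{\text{Var}(X)}{n} = \frac{25}{100} = 0.25 \]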

Calculate var(\(\bar{X}\)) using the bootstrap approach based on your random sample of data.

set.seed(1234)
nsim<-1000

theta.boots<-rep(NA,nsim)
for (i in 1:nsim){
  boots.sample<-X[sample(1:N,N,replace=TRUE)]
  theta.boots[i]<-mean(boots.sample)
  }

var(theta.boots)

## [1] 0.1885444

Compare the theoretical var(\(\bar{X}\)) to the bootstrap estimate var(\(\bar{X}\)).

The bootstrap estimate of var(\(\bar{X}\)), 0.1885, is smaller than the theoretical value of 0.25.

Sofia Bello

3/22/2019