Problem 2: What’s wrong?

  1. The standard deviation of the bootstrap distribution will be approximately the same as the standard deviation of the original sample.

Hopefully the standard deviation of the bootstrap distribution is quite different from the standard deviation of the original sample: the standard deviation of the bootstrap distribution estimates the standard error of the statistic, so for a sample mean it should be close to \(s/\sqrt{n}\), which is much smaller than \(s\). (The bootstrap distribution will also tend to look roughly normal even when the original sample is skewed.)
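A quick numeric illustration of this in R (the skewed sample here is made up purely for demonstration):

set.seed(1)
x = rexp(40, rate = 1/10)   #a skewed made-up sample of size 40

boot_means = replicate(5000, mean(sample(x, length(x), replace = TRUE)))

sd(x)                    #SD of the original sample
sd(boot_means)           #SD of the bootstrap distribution of the mean
sd(x)/sqrt(length(x))    #roughly what sd(boot_means) should be close to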

  2. The bootstrap distribution is created by resampling without replacement from the original sample.

Not true, because in bootstrapping observations are drawn with replacement.

  3. When generating the resamples, it is best to use a sample size smaller than the size of the original sample.

Each bootstrap resample should be the same size as the original sample, as in the sketch below.
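Both of the last two answers come down to how a single resample is drawn: the same size as the original sample, with replacement. A minimal sketch with a made-up sample:

x = c(42, 62, 41, 28)
sample(x, size = length(x), replace = TRUE)   #one bootstrap resample: same size, with replacement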

  4. The bootstrap distribution is created by resampling with replacement from the population.

The bootstrap distribution is created by resampling with replacement from the original sample.

Problem 3: A small-sample permutation test

  a. Calculate the difference in means between the two groups. This is the observed value of the statistic.

\(\bar{x}_t = (57 + 61)/2 = 59\)

\(\bar{x}_c = (42 + 62 + 41 + 28)/4 = 43.25\)

\(\bar{x}_t - \bar{x}_c = 15.75\)

  b. Resample

\(\bar{x}_t = (41 + 42)/2 = 41.5\)

\(\bar{x}_c = (57 + 62 + 61 + 28)/4 = 52\)

\(\bar{x}_t - \bar{x}_c = -10.5\)
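This resample is one random relabeling of the six observations. In R, a single permutation draw could be generated by shuffling the pooled values without replacement and splitting them back into groups of 2 and 4, e.g.:

pooled = c(57, 61, 42, 62, 41, 28)
shuffled = sample(pooled)                   #shuffles all six values (no replacement)
mean(shuffled[1:2]) - mean(shuffled[3:6])   #treatment mean minus control mean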

  c. Repeat part (b) 20 times. Make a histogram of the distribution of these 20 values.
treatment = c(57, 61)
control = c(42, 62, 41, 28)

set.seed(1)

#Resamples each group with replacement and returns nsim resampled
#differences in means (data1 mean minus data2 mean)
bootStrapCI2<-function(data1, data2, nsim){
  
  n1<-length(data1)
  n2<-length(data2)
  
  #preallocate the vector of resampled differences
  bootCI2<-numeric(nsim)
  
  for(i in 1:nsim){
    #draw indices with replacement, separately within each group
    bootSamp1<-sample(1:n1, n1, replace=TRUE)
    bootSamp2<-sample(1:n2, n2, replace=TRUE)
    #difference in means for this resample
    bootCI2[i]<-mean(data1[bootSamp1])-mean(data2[bootSamp2])
  }
  
  return(bootCI2)
}


reading = bootStrapCI2(treatment, control, 20)
reading
##  [1] 12.25  5.50  0.00 17.25  5.00 22.75 -3.00 22.50 17.00  8.75 14.00 -3.00
## [13] 14.00 16.00 24.00 13.75 19.00 12.25 10.75 21.25
hist(reading)

  d. What proportion of the 20 statistic values were equal to or greater than the original value in part (a)?

It looks like 8 of the 20 values, a proportion of 0.4, were equal to or greater than the original value of 15.75 from part (a).
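This proportion can also be computed directly from the reading vector rather than counted off the histogram:

mean(reading >= 15.75)
## [1] 0.4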

  e. For this small data set, there are only 15 possible permutations of the data. As a result, we can calculate the exact p-value by counting the number of permutations with a statistic value greater than or equal to the original value and then dividing by 15. What is the exact p-value here? How close was your estimate?
#Label the six observations (F is skipped since it is shorthand for FALSE in R)
A = 57
B = 61
C = 42
D = 62
E = 41
G = 28


#Treatment means
mt1 = mean(c(A, B))
mt2 = mean(c(A, C))
mt3 = mean(c(A, D))
mt4 = mean(c(A, E))
mt5 = mean(c(A, G))
mt6 = mean(c(B, C))
mt7 = mean(c(B, D))
mt8 = mean(c(B, E))
mt9 = mean(c(B, G))
mt10 = mean(c(C, D))
mt11 = mean(c(C, E))
mt12 = mean(c(C, G))
mt13 = mean(c(D, E))
mt14 = mean(c(D, G))
mt15 = mean(c(E, G))

#Control means
mc1 = mean(c(C, D, E, G))
mc2 = mean(c(B, D, E, G))
mc3 = mean(c(B, C, E, G))
mc4 = mean(c(B, C, D, G))
mc5 = mean(c(B, C, D, E))
mc6 = mean(c(A, D, E, G))
mc7 = mean(c(A, C, E, G))
mc8 = mean(c(A, C, D, G))
mc9 = mean(c(A, C, D, E))
mc10 = mean(c(A, B, E, G))
mc11 = mean(c(A, B, D, G))
mc12 = mean(c(A, B, D, E))
mc13 = mean(c(A, B, C, G))
mc14 = mean(c(A, B, C, E))
mc15 = mean(c(A, B, C, D))

#Difference in means (treatment - control)
m1 = mt1 - mc1
m2 = mt2 - mc2
m3 = mt3 - mc3
m4 = mt4 - mc4
m5 = mt5 - mc5
m6 = mt6 - mc6
m7 = mt7 - mc7
m8 = mt8 - mc8
m9 = mt9 - mc9
m10 = mt10 - mc10
m11 = mt11 - mc11
m12 = mt12 - mc12
m13 = mt13 - mc13
m14 = mt14 - mc14
m15 = mt15 - mc15

difference_in_means = c(m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12, m13, m14, m15)
difference_in_means
##  [1]  15.75   1.50  16.50   0.75  -9.00   4.50  19.50   3.75  -6.00   5.25
## [11] -10.50 -20.25   4.50  -5.25 -21.00
hist(difference_in_means)

Since 3 of the 15 differences in means were greater than or equal to the observed difference of 15.75, the exact p-value for this data is 3/15 = 0.2. My estimated p-value of 0.4 from part (d) was fairly far off.
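For reference, the 15 permutations can be enumerated much more compactly with combn(), which assigns each treatment pair without replacement. (Note that bootStrapCI2 in part (c) resampled with replacement rather than permuting, which, along with using only 20 resamples, is part of why the estimate missed.) A sketch:

vals = c(57, 61, 42, 62, 41, 28)
all_diffs = combn(6, 2, function(idx) mean(vals[idx]) - mean(vals[-idx]))
mean(all_diffs >= 15.75)   #exact permutation p-value
## [1] 0.2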

Problem 4: Bootstrap Comparison of Tree Diameters

  a. Use a side-by-side boxplot or faceted histograms to examine the data graphically (splitting by region). Does it appear reasonable to use standard t procedures? (Hint: Recall assumptions for t-procedures such as approximate normality and/or sufficiently large sample size)
ggplot(nspines, aes(x = ns, y = dbh, color = ns)) + geom_boxplot()

ggplot(nspines, aes(x = dbh, color = ns)) + geom_histogram(binwidth = 5) + facet_wrap(~ns)

The sample sizes are only barely large enough (exactly 30 for each region), and the distributions themselves don't look very normal, so standard t procedures are borderline here.
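A normal quantile plot per region would be another way to judge the normality assumption; a sketch, using the same nspines data:

qqnorm(nspines$dbh[nspines$ns == "n"], main = "North")
qqline(nspines$dbh[nspines$ns == "n"])
qqnorm(nspines$dbh[nspines$ns == "s"], main = "South")
qqline(nspines$dbh[nspines$ns == "s"])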

  b. Calculate the observed statistic (North sample mean - South sample mean)
nspines_grouped = nspines %>% group_by(ns) %>% summarize(Mean = mean(dbh))
nspines_grouped
## # A tibble: 2 x 2
##   ns     Mean
##   <fct> <dbl>
## 1 n      23.7
## 2 s      34.5
nmean = 23.7
smean = 34.5

test_statistic = nmean - smean
test_statistic
## [1] -10.8
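The means in the tibble above are rounded; the unrounded statistic could also be computed directly (it matches the group means shown in the t.test output in part (f)):

with(nspines, mean(dbh[ns == "n"]) - mean(dbh[ns == "s"]))
## [1] -10.83333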
  c. Bootstrap the difference in means (at least 1000 times) and look at the bootstrap distribution (include the histogram).
north_data = nspines %>% filter(ns == "n")
south_data = nspines %>% filter(ns == "s")

#The specific seed value is arbitrary - I simply chose one so that the resampling results are reproducible each time this is run.
set.seed(1)

north_south_pines = bootStrapCI2(north_data$dbh, south_data$dbh, 1000)
hist(north_south_pines)

  d. There are two ways we discussed for calculating a bootstrap confidence interval: the quantile method and the hybrid method. Calculate both types of confidence intervals.
#Quantile Method 
quantile(north_south_pines, c(0.025, 0.975))
##      2.5%     97.5% 
## -18.88783  -3.27825
#Hybrid Method: observed statistic plus/minus t* times the bootstrap SE
bootSE = sd(north_south_pines)   #SE estimated from the bootstrap distribution
(mean(north_data$dbh)-mean(south_data$dbh))+c(-1,1)*qt(0.975, df=14)*bootSE
## [1] -19.675967  -1.990699
  e. Comment on whether the conditions for the hybrid method are met. Do you believe this interval would be reliable?

For the hybrid method to be reliable, the main conditions are:

  1. the bootstrap distribution should be symmetric/close to a normal distribution, and
  2. the center of the bootstrap distribution should be close to the observed value (i.e., the bias is small).

Looking at the histogram of the bootstrap distribution, it does look roughly normal, and its center appears to be close to the observed value of -10.8, so the bias seems small. I think the conditions are met and the interval should be fairly reliable. (A quick numeric check is sketched below.)
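Both conditions can be checked numerically with the objects computed above:

mean(north_south_pines)                    #center of the bootstrap distribution
mean(north_south_pines) - test_statistic   #estimated bias relative to the observed statistic; should be near 0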

  f. Compare the bootstrap results with the usual two-sample t confidence interval. How do the intervals differ? Which would you use?
t.test(north_data$dbh, south_data$dbh)
## 
##  Welch Two Sample t-test
## 
## data:  north_data$dbh and south_data$dbh
## t = -2.6286, df = 55.725, p-value = 0.01106
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -19.090199  -2.576468
## sample estimates:
## mean of x mean of y 
##  23.70000  34.53333

The hybrid method gives the widest interval and the quantile method the narrowest, with the t interval in between. I personally prefer the quantile method here because it makes fewer assumptions about the shape of the sampling distribution and its interval is the narrowest.
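To make the comparison concrete, the widths of the three intervals (using the endpoints printed above) are:

diff(c(-18.88783, -3.27825))      #quantile method
## [1] 15.60958
diff(c(-19.090199, -2.576468))    #two-sample t
## [1] 16.51373
diff(c(-19.675967, -1.990699))    #hybrid method
## [1] 17.68527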