Bootstrapping Sample Size

Exercise 1

Use the first two years of monthly observations from the co2 time series as your pilot study sample. What is the point estimate of the sample variance?

library(boot)
pilot.data <- co2[1:24]
var(pilot.data)

## [1] 3.463971

Exercise 1 Response

The point estimate of the sample variance for the 24-month pilot sample is 3.463971. This value serves as the pilot estimate of the variance in the sample size calculations that follow.
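As a quick check on what var() reports, the sample variance can also be computed by hand with the usual n - 1 denominator. This is a minimal sketch; the names n and manual.var are introduced here for illustration only.

```r
# Same pilot sample as above: the first 24 monthly values of co2
pilot.data <- co2[1:24]

# Unbiased sample variance: sum of squared deviations divided by (n - 1)
n <- length(pilot.data)
manual.var <- sum((pilot.data - mean(pilot.data))^2) / (n - 1)

# Compare the hand computation with the built-in function
all.equal(manual.var, var(pilot.data))
```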

Exercise 2

Calculate a 95% bootstrap confidence interval type=bca for the pilot.data. Save the upper confidence limit to working variable var.upper. How large is the upper confidence limit in comparison to the point estimate?

# Statistic for boot(): the data come first, the resampled indices second
Boot.Link <- function(theData, theIndices) {
  return( var(theData[theIndices]) )
}
Bco2var <- boot(pilot.data, Boot.Link, R=1000)
( varCI <- boot.ci(Bco2var, type="bca") )

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = Bco2var, type = "bca")
## 
## Intervals : 
## Level       BCa          
## 95%   ( 2.299,  5.803 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

var.upper <- varCI$bca[5]

Exercise 2 Response

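The 95% BCa bootstrap confidence interval for the variance ranges from 2.299 to 5.803. The upper confidence limit, 5.803, is about 1.68 times the point estimate of 3.464, so the pilot sample leaves considerable uncertainty about the true variance. The interval is also asymmetric: the upper limit lies farther from the point estimate than the lower limit does.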
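The comparison between the upper confidence limit and the point estimate can be computed directly rather than read off the printed output. The sketch below repeats the bootstrap so that it runs on its own. The seed and the names boot.var, b, ci, var.point, and ratio are assumptions added for illustration, so the limits it produces will differ slightly from the run shown above.

```r
library(boot)

pilot.data <- co2[1:24]
var.point <- var(pilot.data)

# Statistic in the form boot() expects: data first, resampled indices second
boot.var <- function(d, i) var(d[i])

set.seed(1)  # assumed seed, used only to make this sketch repeatable
b <- boot(pilot.data, boot.var, R = 1000)
ci <- boot.ci(b, type = "bca")

# The bca row holds: level, two order-statistic positions, lower limit, upper limit
var.upper <- ci$bca[5]

# Upper confidence limit relative to the point estimate
ratio <- var.upper / var.point
ratio
```

A ratio well above 1 indicates that a sample size based on the point estimate alone may be too optimistic.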

Exercise 3

Use the power.t.test() function to estimate the sample sizes for a two-sample t test comparing mean CO2 levels. Use a delta of 5 ppm, sig.level (alpha) of 0.05, and power of 0.95. Make two estimates: one with the point estimate of the sample variance, and another with the upper confidence limit. In each case, how large should each sample be?

power.t.test(n=NULL, delta=5, sd=sqrt(var(pilot.data)), 
             sig.level=0.05, power=0.95, alternative = "two.sided")

## 
##      Two-sample t test power calculation 
## 
##               n = 4.814728
##           delta = 5
##              sd = 1.861175
##       sig.level = 0.05
##           power = 0.95
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

power.t.test(n=NULL, delta=5, sd=sqrt(var.upper), 
             sig.level=0.05, power=0.95, alternative = "two.sided")

## 
##      Two-sample t test power calculation 
## 
##               n = 7.155238
##           delta = 5
##              sd = 2.409001
##       sig.level = 0.05
##           power = 0.95
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

Exercise 3 Response

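With a delta of 5 ppm, a significance level of 0.05, and power of 0.95, the point estimate of the variance (sd = 1.861) gives n = 4.81, so each group needs 5 observations after rounding up. The upper confidence limit (sd = 2.409) gives n = 7.16, so each group needs 8 observations. Planning with the upper limit is the more conservative choice because it allows for the pilot sample having underestimated the variance.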
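Because power.t.test() returns a fractional n, the per-group sample size has to be rounded up to a whole number. The sketch below wraps that step in a small helper. The function name n.per.group is hypothetical, and the value 5.803 is the upper BCa limit taken from the Exercise 2 output rather than recomputed.

```r
pilot.data <- co2[1:24]

# Hypothetical helper: per-group n for a two-sample, two-sided t test,
# rounded up to the next whole observation
n.per.group <- function(variance, delta = 5, sig.level = 0.05, power = 0.95) {
  res <- power.t.test(delta = delta, sd = sqrt(variance),
                      sig.level = sig.level, power = power)
  ceiling(res$n)
}

n.point <- n.per.group(var(pilot.data))  # using the point estimate of the variance
n.upper <- n.per.group(5.803)            # using the upper BCa limit from Exercise 2

c(point = n.point, upper = n.upper)
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the 0.95 target.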