Abstract
Bootstrapping involves sampling with replacement from a population. It is an elegant and accurate method that does not require that the sample size is large, or that the distributions are Normal. These examples and resampling strategies are from ‘Resampling Stats Illustrations’ of Peter Bruce. See http://www.statistics101.net/PeterBruce_05-illus.pdf.What is the probability that at least 2 people in a room of 25 share the same birthday?
Approach: generate an urn with 365 numbers. Pull 25 out of it, with replacement. Check for duplicates; a score of >= 1 duplicates counts as 1.
pacman::p_load(data.table, foreach, magrittr, ggplot2, bootES)
reps = 1000
urn = 1:365
reps.found = foreach(1:reps, .combine = c) %do% {
s = sample(urn, 25, replace=T)
score = sum(duplicated(s)) >= 1
} %>% sum
p = reps.found/reps
cat("probability = ", p)
## probability = 0.572
In the 1988 NBA series, Larry Bird got only 20 baskets out of 57 shots. Normally he is a 48% shooter. Is he in a slump?
Appproach: generate an urn with 100 numbers. Pull 57 numbers out of it, with replacement; these are the shots. Count the numbers <= 48; this is what he would normally score. Out of this series, count the number of shots that are <= 20.
urn = 1:100
reps = 1000
l = foreach(1:reps, .combine = c) %do% {
s = sample(urn, 57, replace = T)
normal.count = sum(s <= 48)
}
low.count = sum(l <= 20)
p = low.count/reps
cat("probability =", p)
## probability = 0.024
If you consider a normal series of 48% success rate in series of 57 shots, you wil find less than 5% cases where the success rate is <= 20%. So this looks indeed like a slump.
Out of 1500 voters, 840 would vote for Bush and 660 for Dukakis. What is the confidence interval for the Bush voters?
Approach: generate an urn filled with balls 1:100. Sample 1500 out of this urn with replacement. Results <= 56 are Bush voters (840/1500). Calculate percentiles.
urn = 1:100
l = foreach(1:reps, .combine=c) %do% {
s = sample(urn, 1500, replace=T)
bush = (840/1500) * 100
score = sum(s < bush)/1500
}
p = quantile(l, c(0.025, 0.975)) %>% round(3)
cat("95% confidence interval =", p)
## 95% confidence interval = 0.535 0.586
An agricultural lab experiments with a new pig ration on 12 pigs. After 4 weeks, the pigs have on average gained 508. The individual weight gains are included below (“urn”).
urn = c(496,544,464,416,512,560,608,544,480,466,512,496)
reps = 1000
l = foreach(1:reps, .combine = c) %do% {
s = sample(urn, 12, replace = T)
m = mean(s)
}
p = quantile(l, c(0.025,0.975)) %>% round
cat("95% confidence interval =", p )
## 95% confidence interval = 480 535
This confidence is slightly narrower than conventional; a better estimation can be obtained using the bootES package.
bootES(urn, R=1000, ci.type="bca")
##
## 95.00% bca Confidence Interval, 1000 replicates
## Stat CI (Low) CI (High) bias SE
## 508.167 478.894 536.000 0.914 14.237
We have 2 brands of batteries and tested how long they lasted. Brand A is somewhat better (delta = 1.3). Is this significant?
Approach: If there is no difference between brand A and brand B, then we can consider them to be part of the same population (= null hypothesis). We can put them all together in one urn and draw repeated samples with replacement out of it.
A = c(30,32,31,28,31,29,29,24,30,31)
B = c(28,28,32,31,24,23,31,27,27,31)
m1 = mean(A)
m2 = mean(B)
delta = m1-m2
urn = c(A,B)
reps = 1000
l = foreach(1:reps, .combine = c) %do% {
s1 = sample(urn, length(A), replace = T)
s2 = sample(urn, length(A), replace = T)
ms1 = mean(s1)
ms2 = mean(s2)
deltas = ms1 - ms2
}
prob = length(l[l > delta])/reps
cat("Probability = ", prob)
## Probability = 0.139
The probability of drawing random samples out of the combined population that has a difference larger than 1.3 is larger than 5%. The difference is therefore not significant.
An airline wants to know whether the no-show rates for business routes is different from that of vacation routes. It has taken a sample of 1000 for both groups. How significant is the difference?
Approach: if the difference is not significant (null hypothesis) then we can take artificial samples and label them “business” and “vacation” respectively. We test how often no shows occur by comparing the samples against the average no-show. Then, we see how often the difference in the artificial sample is larger than the actual no-show difference between vacation and business.
n = 1000 # size of groups
noshow.business = 384 # actual no-shows from sample
noshow.vacation = 341
delta = noshow.business - noshow.vacation
null.hyp.est = (noshow.business + noshow.vacation) / (2 * n) # best estimate for difference
avg = ceiling(null.hyp.est * 1000)
reps = 1000
l = foreach(1:reps, .combine = c) %do% {
business = sample(1:1000, n, replace = T)
vacation = sample(1:1000, n, replace = T)
bnoshow = business %between% c(1,avg) %>% abs %>% sum
vnoshow = vacation %between% c(1,avg) %>% abs %>% sum
diff = bnoshow - vnoshow
}
prob = length(l[l >= delta]) / reps
cat("Probability = ", prob)
## Probability = 0.023
The difference is smaller than 5%, which is significant.