Synopsis

The purpose of this file is to provide bootstrap examples for reference. It covers nonparametric bootstrapping.

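First, let us discuss why the bootstrap is useful. “When doing statistical inference, it generally relies on the sampling distribution as well as a standard error.” A sample mean \(\bar{X}\) typically deviates from the true population mean \(\mu\), and the standard error quantifies how much \(\bar{X}\) would vary from sample to sample, so it measures the precision of the estimate. By a proper definition: “Standard Error is the standard deviation of the sampling distribution of a statistic.” The bootstrap approximates that sampling distribution by resampling the observed data with replacement, which makes it possible to estimate a standard error without assuming a particular population distribution.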
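For the sample mean specifically, the standard error has a well-known closed form. As a brief sketch, assume \(n\) independent observations with population standard deviation \(\sigma\) and sample standard deviation \(s\) (the symbols \(n\), \(\sigma\), \(s\), \(B\), and \(\bar{X}^{*}_{b}\) are introduced here for illustration and do not appear in the text above):

\[
\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}
\]

The bootstrap estimate replaces this formula with the standard deviation of the statistic across \(B\) resamples, where \(\bar{X}^{*}_{b}\) is the mean of the \(b\)-th resample and \(\bar{X}^{*}_{\cdot}\) is the average of those means:

\[
\widehat{\mathrm{SE}}_{\mathrm{boot}} = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\bar{X}^{*}_{b} - \bar{X}^{*}_{\cdot}\right)^{2}}
\]

For the mean the two should roughly agree. The second form is the more general one, since it applies equally to statistics that have no simple formula.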

Bootstrapping

# First we need to set up the following:
# 1. data = x (the father.son dataset is provided by the UsingR package)
library(UsingR)
data(father.son)
x<-father.son$sheight

# 2. resampling size = n
n<-length(father.son$sheight)

# 3. Iterations for resampling = B
B<-10000
# Next we build the bootstrap resamples and compute the mean of each one
# 1. create a B x n matrix of resamples drawn with replacement (one resample per row)
resamples<-matrix(sample(x,n*B,replace=TRUE),B,n)

# 2. compute the mean of each row (one bootstrap mean per resample)
xBar<-apply(resamples,1,mean)
# Next we draw statistical inferences from the bootstrap distribution
# 1. Estimated standard error of the mean
sd(xBar)
## [1] 0.08517796
# 2. 95% percentile confidence interval for the mean
quantile(xBar,c(0.025,0.975))
##     2.5%    97.5% 
## 68.51840 68.84896
# difference between the mean of the bootstrap means and the original sample mean
abs(mean(xBar) - mean(father.son$sheight))
## [1] 0.0006997477
# finally visualize the bootstrap distribution, the mean of the bootstrap means, and the sample mean
hist(xBar,breaks=40,freq=FALSE)
lines(density(xBar),col="red")
abline(v=mean(xBar),col="green",lwd=3) # mean of the bootstrap means
abline(v=mean(father.son$sheight),col="blue",lwd=3) # original sample mean
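
The steps above can also be wrapped in a small reusable function. The following is only a sketch under stated assumptions: `bootstrap_stat` is a hypothetical helper name, and `heights` is simulated data (with arbitrary mean and standard deviation chosen for illustration) so the block runs without the UsingR package. It is not the `father.son` data and will not reproduce the numbers shown above.

```r
# Hypothetical helper (not part of the original analysis): bootstrap any
# statistic of a numeric vector and return its estimate, SE, and percentile CI.
bootstrap_stat <- function(x, stat = mean, B = 2000, level = 0.95) {
  n <- length(x)
  # Each replicate resamples n values with replacement, then applies `stat`
  reps <- replicate(B, stat(sample(x, n, replace = TRUE)))
  alpha <- 1 - level
  list(
    estimate = stat(x),
    se = sd(reps),
    ci = quantile(reps, c(alpha / 2, 1 - alpha / 2))
  )
}

set.seed(1)                                     # fixed seed so the run is repeatable
heights <- rnorm(1000, mean = 68.7, sd = 2.8)   # simulated stand-in data (assumption)

boot_mean <- bootstrap_stat(heights, mean)
analytic_se <- sd(heights) / sqrt(length(heights))

# For the mean, the bootstrap SE should be close to s / sqrt(n)
print(c(bootstrap = boot_mean$se, analytic = analytic_se))

# The same helper works for statistics with no simple SE formula
boot_median <- bootstrap_stat(heights, median)
print(boot_median$se)
```

Using `replicate` instead of one large `B` by `n` matrix trades a little speed for lower memory use, and passing the statistic as an argument means the median (or any other summary) can be bootstrapped without changing the resampling code.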