Mean Ages

Next we will take the same data and explore the bootstrapping method. Many of you were introduced to this in Intro to Stats. I’ll repeat my hypothesis test that the mean ages of men and women Olympic athletes are different. If we denote the mean ages of men and women Olympians by \(\mu_M\) and \(\mu_W\) respectively, our null and alternative hypotheses are:

\[ \begin{align} H_0:\ \mu_M=\mu_W\\ H_A:\ \mu_M\neq\mu_W. \end{align} \] I have added the library boot to the packages I load at the beginning of the document, and I’ll call set.seed. There is nothing special about the number you pick; if you change it you’ll get a different, but still repeatable, sequence of random draws. Any Douglas Adams fans know why I use 42? In order to run the boot command you have to pass it a statistic function that takes two arguments: the data and a vector of indices. boot draws the indices by resampling with replacement, and your function computes the statistic on that slice of the dataset. Mimic what I do below and you should be golden!

set.seed(42)
samp_mean <- function(x, i) {
  mean(x[i])
}
ages <- na.omit(data$Age)  # needed to get rid of the NA's
results <- boot(ages, samp_mean, R = 100)
plot(results)
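
To demystify what boot is doing, here is a minimal sketch that performs the same resampling by hand; manual_boot is my own illustration, reusing the ages and samp_mean objects from above, and its standard deviation should land near the std. error that boot reports.

manual_boot <- replicate(100, {
  i <- sample(seq_along(ages), replace = TRUE)  # one resample's row indices
  samp_mean(ages, i)                            # statistic on that resample
})
sd(manual_boot)  # compare to sd(results$t), the bootstrap standard error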

boot.ci(results, type="perc")
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = results, type = "perc")
## 
## Intervals : 
## Level     Percentile     
## 95%   (25.54, 25.58 )  
## Calculations and Intervals on Original Scale
## Some percentile intervals may be unstable

Let’s compute a \(p\) value for the following hypothesis test \[ H_0: \mu = 25.5\\ H_A: \mu > 25.5 \] The Student’s \(t\) \(p\)-value is computed with the following code.

pt((25.5 - results$t0)/sd(results$t),length(results$t)-1)
## [1] 7.747508e-07
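
To unpack that one-liner, here is the same computation spelled out (se_boot and t_stat are my own names): standardize the observed mean against the bootstrap standard error, then take the upper tail of a \(t\) distribution with \(R-1=99\) degrees of freedom, matching the call above.

se_boot <- sd(results$t)                 # bootstrap estimate of the standard error
t_stat <- (results$t0 - 25.5) / se_boot  # observed t statistic
pt(t_stat, df = length(results$t) - 1, lower.tail = FALSE)  # upper-tail p value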

We can also compute the percentile \(p\) value based on the bootstrap by doing

mean(25.5 > results$t)
## [1] 0
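
In long form this is simply counting the replicates that land below the hypothesized mean, and a quick histogram shows why the count comes out to zero in this run:

sum(results$t < 25.5) / length(results$t)  # same quantity as mean(25.5 > results$t)
hist(results$t, main = 'Bootstrap means', xlab = 'Age')  # no replicates fall below 25.5 here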

Clearly it is a small value here! Keep in mind, though, that with 100 replicates the smallest nonzero \(p\) value the bootstrap can resolve is \(1/100\), so the 0 really means \(p < 0.01\); the large sample size drives the true value far below anything reportable at this resolution.
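
If we want a number we can actually report, the standard fix is more replicates. A minimal sketch (results_big is my own name; the exact value will vary with the seed, and the run takes longer):

results_big <- boot(ages, samp_mean, R = 10000)  # resolution is now 1/10000
mean(25.5 > results_big$t)                       # reportable down to 1e-04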

Sex and Age

Let’s repeat this but first divide the data based on Sex.

menages <- na.omit(data[which(data$Sex == 'M'), 'Age'])
womenages <- na.omit(data[which(data$Sex == 'F'), 'Age'])
bootM <- boot(menages, samp_mean, R = 100)
bootW <- boot(womenages, samp_mean, R = 100)
boot.ci(bootM, type = 'perc')
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = bootM, type = "perc")
## 
## Intervals : 
## Level     Percentile     
## 95%   (26.24, 26.30 )  
## Calculations and Intervals on Original Scale
## Some percentile intervals may be unstable
boot.ci(bootW, type = 'perc')
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = bootW, type = "perc")
## 
## Intervals : 
## Level     Percentile     
## 95%   (23.69, 23.77 )  
## Calculations and Intervals on Original Scale
## Some percentile intervals may be unstable

We notice that the two confidence intervals do not overlap (the center of each falls outside the other interval), so we will reject the null hypothesis: the mean ages of men and women are different. The \(p\) value of obtaining the women’s mean age in a men’s bootstrap is

mean(bootM$t < bootW$t0)
## [1] 0

Here the $t component returned the bootstrap distribution (one value per replicate) and $t0 returned the statistic computed on the original sample; the gap between mean($t) and $t0 is the bias that boot reports.
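
A quick way to poke at those pieces:

bootM$t0                  # the statistic on the original sample: mean(menages)
head(bootM$t)             # the first few bootstrap replicates of that statistic
mean(bootM$t) - bootM$t0  # the bootstrap's bias estimate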

Sex and Age: Looking at the Difference

So the above analysis will work but will not give me an estimate of where the difference of the mean ages of men and women actually lies. Because of this, I will be hard pressed to recover a \(p\) value and a good estimate of the difference in the ages. To do that (and what I should have done in the first place!), I am going to repeat this bootstrap but look at the difference of the means when people are randomly assigned gender. This is one of the difficulties of computing \(p\) values: you really must assume the null hypothesis is true, namely that gender does not play a role in age.

diff2 <- function(d1, i) {
  d <- d1
  d$group <- d$Sex[i]  # randomly re-assign group labels under the null
  Mean <- tapply(X = d$Age, INDEX = d$group, mean)
  Diff <- Mean[1] - Mean[2]
  Diff
}

d1 <- na.omit(data[c('Sex','Age')])
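
To see why this amounts to assuming the null, here is one null draw done by hand (perm_means is my own illustration): sample() shuffles the Sex labels while the ages stay put, the same scrambling that the index vector i performs inside diff2.

perm_means <- tapply(d1$Age, sample(d1$Sex), mean)  # group means under shuffled labels
perm_means[1] - perm_means[2]                       # one difference drawn under the null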

diffboot <- boot(data = d1, statistic = diff2, R = 100)
print(diffboot)
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = d1, statistic = diff2, R = 100)
## 
## 
## Bootstrap Statistics :
##      original   bias    std. error
## t1* -2.544681 2.545117  0.02842457

So the actual difference between the mean ages was -2.544681, and under the null hypothesis there should be no difference; that is exactly what I found, since the bias of 2.545117 means the bootstrap replicates center near 0. For the \(p\) value, I’ll ask what fraction of the replicates in fact fall outside of the observed range.

mean(abs(diffboot$t) > abs(diffboot$t0))
## [1] 0

A \(p\) value of zero is extreme, but remember it really means \(p < 0.01\) given the 100 replicates; and honestly the difference in ages here is very extreme because of how big the sample actually is!

A Non-Parametric Statistic

Let’s look at a non-parametric statistic. I am going to ask whether the median age of winter athletes differs from that of summer athletes.

winterAge <- na.omit(data[which(data$Season == 'Winter'), 'Age'])
summerAge <- na.omit(data[which(data$Season == 'Summer'), 'Age'])
median(winterAge)
## [1] 24
median(summerAge)
## [1] 24

Well, they both have the same sample median, so I doubt they will be different, but let’s test it anyway. \[ H_0: m_S=m_W\\ H_A: m_S\neq m_W \]

samp_median <- function(x, i) {
  median(x[i])
}

bootW <- boot(winterAge, samp_median, R = 100)  # careful: this reuses the name bootW from the women's bootstrap above
boot.ci(bootW, type = 'perc')
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = bootW, type = "perc")
## 
## Intervals : 
## Level     Percentile     
## 95%   (24, 25 )  
## Calculations and Intervals on Original Scale
## Some percentile intervals may be unstable
bootS <- boot(summerAge, samp_median, R = 100)
boot.ci(bootS, type = 'perc')
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = bootS, type = "perc")
## 
## Intervals : 
## Level     Percentile     
## 95%   (24, 25 )  
## Calculations and Intervals on Original Scale
## Some percentile intervals may be unstable

Since the confidence intervals are identical, we will fail to reject the null hypothesis. Thus we do not have evidence to suggest that the median ages of winter and summer athletes are different.

I should also compute the \(p\) value here. Again I’m comparing groups, so to get an accurate \(p\) value I need to assume the null hypothesis is true: the season has no effect on age.

diff3 <- function(d2, i) {
  d <- d2
  d$group <- d$Season[i]  # randomly re-assign group labels under the null
  Median <- tapply(X = d$Age, INDEX = d$group, median)
  Diff <- Median[1] - Median[2]
  Diff
}

d2 <- na.omit(data[c('Season','Age')])

diffmedian = boot(data = d2, statistic = diff3, R = 100)
print(diffmedian)
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = d2, statistic = diff3, R = 100)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*        0   -0.36   0.5029208
mean(abs(diffmedian$t) > abs(diffmedian$t0))
## [1] 0.38

With a \(p\) value above my rejection standard (I normally go with 5%), I will fail to reject my null hypothesis.

I should also note that in science a fail-to-reject result is almost never reported. This is sometimes called the ‘file drawer problem’: studies that fail to reject the null hypothesis end up unpublished, sitting in a drawer. This is an issue because if an experiment on a true null hypothesis is repeated 20 times at the 5% level, on average 19 will fail to reject, while the one that rejects by chance is the one that gets published. This is closely related to the reproducibility crisis, and it is a big problem in the social science literature!
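
A quick simulation sketch of that arithmetic: run 20 ‘experiments’ in which the null really is true (here both samples come from the same normal distribution, my own toy setup) and count the false rejections at the 5% level.

null_p <- replicate(20, t.test(rnorm(50), rnorm(50))$p.value)  # 20 true-null experiments
sum(null_p < 0.05)  # on average 0.05 * 20 = 1 rejects by chance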

Actually, let’s repeat this with the medians for men and women. I am not happy with my \(p\) values yet and I can use the practice.

\[ H_0: m_M = m_W\\ H_A: m_M\neq m_W \]

diff4 <- function(d1, i) {
  d <- d1
  d$group <- d$Sex[i]  # randomly re-assign group labels under the null
  Median <- tapply(X = d$Age, INDEX = d$group, median)
  Diff <- Median[1] - Median[2]
  Diff
}

diffBootMedianAge <- boot(data = d1, statistic = diff4, R = 100)
print(diffBootMedianAge)
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = d1, statistic = diff4, R = 100)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*       -2    2.15   0.4578165
mean(abs(diffBootMedianAge$t) > abs(diffBootMedianAge$t0))
## [1] 0

Another stinking 0 (again, really \(p < 0.01\) with 100 replicates), but I am confident these are really different, so yes, let’s say the median ages of men and women are truly different!