About My Data
This data set was collected by S.G. Sani, E.D. Kizilkanat, and N. Boyan in 2005. It is featured in “Stature Estimation Based on Hand Length and Foot Length,” Clinical Anatomy. The data contain the variables stature (height), hand length, and foot length for 80 males and 75 females. The data are simulated to have the same means, SDs, and correlations as the original study. For this data set, I chose to focus on how hand length and height are correlated.
handfoot <- read.csv("https://raw.githubusercontent.com/jaidenneff/sta321/main/stature_hand_foot%20(1).csv", header = TRUE)
# head(handfoot)
# dim(handfoot)
pairs(handfoot, main ="Pair-wise Association: Scatter Plot")
The pairwise scatter plot shows that height, hand length, and foot length all have positive linear relationships with one another.
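The visual impression from the scatter plot matrix can be backed up with numbers. The sketch below computes pairwise Pearson correlations with `cor()`. It runs on a small simulated data frame, `toy`, used here as a stand-in for `handfoot`; the column name `footLen` and every simulated value are assumptions for illustration, not the real measurements.

```r
# Hypothetical stand-in for handfoot: simulated values, not the real data.
set.seed(321)
n <- 155
height  <- rnorm(n, mean = 1700, sd = 90)
handLen <- 0.12 * height + rnorm(n, sd = 8)
footLen <- 0.15 * height + rnorm(n, sd = 10)   # footLen is an assumed column name
toy <- data.frame(height, handLen, footLen)

# Pairwise Pearson correlations; values close to 1 indicate a
# strong positive linear association between the two variables.
cor.mat <- cor(toy)
print(round(cor.mat, 3))
```

With the real data, the same call would presumably be applied to the numeric columns of `handfoot` only, since `cor()` does not accept non-numeric columns.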
Handlength <- handfoot$handLen
Height <- handfoot$height
plot(Height, Handlength, pch = 21, col ="red",
main = "Relationship between Height and Hand Length")
Looking more closely at just hand length and height, the scatter plot shows a strong positive linear relationship between the two variables.
vec.id <- 1:length(Handlength) # vector of observation ID
boot.id <- sample(vec.id, length(Handlength), replace = TRUE) # bootstrap obs ID.
boot.handlength <- Handlength[boot.id] # bootstrap hand length
boot.height <- Height[boot.id] # corresponding bootstrap height
B <- 1000 # number of bootstrap replicates
# define empty vectors to store the bootstrap coefficients
boot.beta0 <- NULL
boot.beta1 <- NULL
## bootstrap regression loop
vec.id <- 1:length(Handlength)
for(i in 1:B){
  boot.id <- sample(vec.id, length(Handlength), replace = TRUE) # bootstrap obs ID
  boot.handlength <- Handlength[boot.id] # bootstrap hand length
  boot.height <- Height[boot.id] # corresponding bootstrap height
  ## regress bootstrap hand length on bootstrap height
  boot.reg <- lm(boot.handlength ~ boot.height)
  boot.beta0[i] <- coef(boot.reg)[1] # bootstrap intercept
  boot.beta1[i] <- coef(boot.reg)[2] # bootstrap slope
}
## 95% bootstrap confidence intervals
boot.beta0.ci <- quantile(boot.beta0, c(0.025, 0.975), type = 2)
boot.beta1.ci <- quantile(boot.beta1, c(0.025, 0.975), type = 2)
boot.coef <- data.frame(rbind(boot.beta0.ci, boot.beta1.ci))
names(boot.coef) <- c("2.5%", "97.5%")
library(knitr) # provides kable()
kable(boot.coef, caption="Bootstrap confidence intervals of regression coefficients.")
| | 2.5% | 97.5% |
|---|---|---|
| boot.beta0.ci | -29.9115743 | 13.5786098 |
| boot.beta1.ci | 0.1106715 | 0.1368198 |
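For this regression model, we bootstrap the sample by resampling observations with replacement, refit the regression of hand length on height for each bootstrap sample, and take percentile confidence intervals for the regression coefficients. From the table, the 95% bootstrap confidence interval for the intercept is (-29.91, 13.58), which contains zero. The 95% bootstrap confidence interval for the slope is (0.1107, 0.1368), which excludes zero and is consistent with a positive relationship between height and hand length. Because the resampling is random and no seed is set, these limits change slightly from run to run.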
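As a minimal sketch of the same case-resampling idea, the loop can be wrapped in a function and run with a fixed seed so the intervals are reproducible, then compared with the normal-theory intervals from `confint()`. The function name `boot.cases` and the simulated `Height` and `Handlength` values are assumptions made for illustration; they are not the real measurements.

```r
# Simulated stand-in for the real measurements (assumed values).
set.seed(321)
n <- 155
Height <- rnorm(n, mean = 1700, sd = 90)
Handlength <- 0.12 * Height + rnorm(n, sd = 8)

# Case-resampling bootstrap: resample (x, y) pairs together and refit.
# boot.cases is a hypothetical helper name, not part of any package.
boot.cases <- function(x, y, B = 1000) {
  coefs <- matrix(NA_real_, nrow = B, ncol = 2,
                  dimnames = list(NULL, c("intercept", "slope")))
  for (i in seq_len(B)) {
    id <- sample(seq_along(y), replace = TRUE)  # bootstrap observation IDs
    xb <- x[id]
    yb <- y[id]
    coefs[i, ] <- coef(lm(yb ~ xb))
  }
  # percentile intervals, one row per coefficient
  t(apply(coefs, 2, quantile, probs = c(0.025, 0.975)))
}

boot.ci <- boot.cases(Height, Handlength)

# Normal-theory intervals from the same model, for comparison
lm.ci <- confint(lm(Handlength ~ Height))
print(boot.ci)
print(lm.ci)
```

When the two sets of intervals are close, the bootstrap and the usual linear model assumptions are telling the same story; a large gap would be a reason to inspect the residuals.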
m0 = lm(Height~Handlength)
E = resid(m0) # Original residuals
a.hat = coef(m0)[1]
b.hat = coef(m0)[2]
##
B = 1000 # generating 1000 bootstrap samples
bt.alpha = rep(0, B)
bt.beta = bt.alpha
for(i in 1:B){
bt.e = sample(E, replace = TRUE) # bootstrap residuals
y.hat = a.hat + b.hat*Handlength + bt.e # bootstrap height (the response)
## bootstrap SLR
bt.m = lm(y.hat ~ Handlength)
bt.alpha[i] = coef(bt.m)[1]
bt.beta[i] = coef(bt.m)[2]
}
alpha.CI = quantile(bt.alpha, c(0.025, 0.975))
beta.CI = quantile(bt.beta, c(0.025, 0.975))
##
per.025 = c(alpha.CI[1],beta.CI[1]) # lower CI for alpha and beta
per.975 = c(alpha.CI[2],beta.CI[2]) # upper CI for alpha and beta
After running this model, I found its 95% confidence intervals to be more useful than those from the first model. The 95% residual-bootstrap confidence interval for alpha, the intercept of the regression of height on hand length, is (342.98, 556.11). The 95% confidence interval for beta, the slope on hand length, is (5.62, 6.69). Because this model treats height as the response, these values are more useful for prediction and comparison, such as estimating height from hand length, than the intercept and slope of the first model, which describe hand length as a function of height.
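To illustrate the predictive use mentioned above, the sketch below reuses the residual bootstrap to get a percentile interval for the mean height at one chosen hand length and compares it with the interval from `predict()`. The simulated data, the value `x.new`, and the names `bt.pred` and `pred.ci` are assumptions for illustration only.

```r
# Simulated stand-in for the real measurements (assumed values).
set.seed(321)
n <- 155
Handlength <- rnorm(n, mean = 190, sd = 12)
Height <- 450 + 6.2 * Handlength + rnorm(n, sd = 50)

m0 <- lm(Height ~ Handlength)
E <- resid(m0)        # original residuals
fit0 <- fitted(m0)    # fitted heights from the original model
x.new <- 200          # hypothetical hand length, same units as the data

B <- 1000
bt.pred <- numeric(B)
for (i in seq_len(B)) {
  y.star <- fit0 + sample(E, replace = TRUE)  # bootstrap heights
  bt.m <- lm(y.star ~ Handlength)             # refit on the bootstrap response
  bt.pred[i] <- coef(bt.m)[1] + coef(bt.m)[2] * x.new
}

# Percentile interval for the mean height at x.new
pred.ci <- quantile(bt.pred, c(0.025, 0.975))

# Normal-theory interval for the same quantity, for comparison
lm.ci <- predict(m0, newdata = data.frame(Handlength = x.new),
                 interval = "confidence")
print(pred.ci)
print(lm.ci)
```

This interval describes the average height at that hand length, not the height of a single person; an interval for an individual would be wider.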