Predict son's height using father's height

Guanglai Li

https://goo.gl/BWSYvf

Background

The data
In 1903, Pearson and Lee published the paper "On the Laws of Inheritance in Man: I. Inheritance of Physical Characters", in which they studied the relationship between fathers' and sons' heights. For this purpose, they measured the heights of 1078 father-son pairs.

Goal of the project
In this project, we build a predictive model from this data set to predict a son's height from his father's height.

Examining the data

Pearson's father-son height data set is included in the UsingR package as father.son. The data has two columns: fheight for the father's height and sheight for the son's height. Fathers' heights range from 59.01 to 75.43 inches and sons' heights from 58.51 to 78.36 inches.

library(UsingR)
data(father.son)
summary(father.son)
##     fheight         sheight     
##  Min.   :59.01   Min.   :58.51  
##  1st Qu.:65.79   1st Qu.:66.93  
##  Median :67.77   Median :68.62  
##  Mean   :67.69   Mean   :68.68  
##  3rd Qu.:69.60   3rd Qu.:70.47  
##  Max.   :75.43   Max.   :78.36
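
Beyond the summary, a quick look at the number of pairs, any missing values, and the correlation between the two columns helps confirm that a straight-line model is a reasonable starting point. The following is a minimal sketch, assuming the UsingR package is installed; the helper name describe_pairs is hypothetical and not part of the original analysis.

```r
# Hypothetical helper: row count, number of incomplete rows, and Pearson
# correlation for a data frame with columns fheight and sheight.
describe_pairs <- function(df) {
  list(
    n_pairs     = nrow(df),
    n_missing   = sum(!complete.cases(df)),
    correlation = cor(df$fheight, df$sheight, use = "complete.obs")
  )
}

# Run on the real data only if UsingR is available (assumption: it is installed).
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(father.son, package = "UsingR")
  print(describe_pairs(father.son))
}
```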

Linear regression model

A simple linear regression of son's height (sheight) on father's height (fheight) is fit to the father.son data. The model is summarized below.

# Response on the left of ~, predictor on the right: son's height modeled from father's height
LM <- lm(sheight ~ fheight, data=father.son)
summary(LM)
## 
## Call:
## lm(formula = sheight ~ fheight, data = father.son)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8772 -1.5144 -0.0079  1.6285  8.9685 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 33.88660    1.83235   18.49   <2e-16 ***
## fheight      0.51409    0.02705   19.01   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.437 on 1076 degrees of freedom
## Multiple R-squared:  0.2513, Adjusted R-squared:  0.2506 
## F-statistic: 361.2 on 1 and 1076 DF,  p-value: < 2.2e-16
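
The printed summary can also be queried programmatically, which is convenient when the slope and its uncertainty are needed elsewhere. The sketch below assumes son's height is the response, in line with the project goal; the helper name slope_with_ci is hypothetical.

```r
# Hypothetical helper: slope estimate with a confidence interval for a
# simple regression of sheight on fheight.
slope_with_ci <- function(df, level = 0.95) {
  fit <- lm(sheight ~ fheight, data = df)
  ci  <- confint(fit, "fheight", level = level)
  c(estimate = unname(coef(fit)["fheight"]),
    lower    = ci[1, 1],
    upper    = ci[1, 2])
}

# Run on the real data only if UsingR is available (assumption: it is installed).
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(father.son, package = "UsingR")
  print(slope_with_ci(father.son))
}
```

The slope is the expected change in son's height, in inches, for each additional inch of father's height.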

Making predictions with the model

In the following figure, the scattered circles are the measured data and the blue line is the fitted regression line.

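# Scatter plot of the measured pairs: father's height on x, son's height on y
plot(father.son$fheight, father.son$sheight, xlab="father's height (inches)", ylab="son's height (inches)")
# Overlay the fitted regression line in blue
abline(LM, col='blue')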

[Figure: son's height versus father's height, with the fitted line in blue]

The fitted model is used to predict a son's height from his father's height. Please visit the accompanying interactive Shiny application to make predictions.
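
Predictions can also be made directly in R with predict(). The following is a minimal sketch, not the code behind the Shiny application: it assumes the model is fit with son's height as the response, and the helper name predict_son_height is hypothetical.

```r
# Hypothetical helper: predicted son's height (inches) with a prediction
# interval, given a model fit as sheight ~ fheight.
predict_son_height <- function(fit, father_heights, level = 0.95) {
  new_data <- data.frame(fheight = father_heights)
  # Returns a matrix with columns fit (point prediction), lwr, and upr
  predict(fit, newdata = new_data, interval = "prediction", level = level)
}

# Run on the real data only if UsingR is available (assumption: it is installed).
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(father.son, package = "UsingR")
  fit <- lm(sheight ~ fheight, data = father.son)
  print(predict_son_height(fit, c(65, 70, 75)))
}
```

A prediction interval is used rather than a confidence interval because the question is about an individual son's height, not the average height of sons of fathers of a given height.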