0.1 SUMMARY

Predicting a child’s future height is crucial for health monitoring, nutritional guidance, and early intervention to address potential medical issues. It assists in tailoring physical activities, psychologically preparing children for growth outcomes, and planning medical treatments. Additionally, it aids in genetic counseling and educational or social planning, ensuring the child receives appropriate support at different developmental stages. Overall, it helps parents and healthcare providers make informed decisions to support the child’s well-being and development.

This project aims to provide an app that estimates a child’s future height from the parents’ heights and the child’s sex.

We empirically test the fitted formula to check that its predictions fall within a reasonable margin of error.

##      father          mother       childHeight   
##  Min.   :1.575   Min.   :1.473   Min.   :1.422  
##  1st Qu.:1.727   1st Qu.:1.600   1st Qu.:1.626  
##  Median :1.753   Median :1.626   Median :1.689  
##  Mean   :1.758   Mean   :1.628   Mean   :1.695  
##  3rd Qu.:1.803   3rd Qu.:1.673   3rd Qu.:1.770  
##  Max.   :1.994   Max.   :1.791   Max.   :2.007

0.2 INTRODUCTION

Several techniques exist for predicting a child’s future height. We use a variant of the midparent height method.
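For reference, the classic midparent rule averages the parents’ heights after adding 13 cm for a boy or subtracting 13 cm for a girl. The Python sketch below illustrates that textbook rule only. It is not the variant fitted in this report, and the function name is hypothetical.

```python
def midparent_height(father_m: float, mother_m: float, male: bool) -> float:
    """Classic midparent estimate of adult height, in metres.

    Textbook rule: (father + mother + 0.13) / 2 for boys,
                   (father + mother - 0.13) / 2 for girls.
    """
    adjustment = 0.13 if male else -0.13
    return (father_m + mother_m + adjustment) / 2


# Example: father 1.80 m, mother 1.65 m
son_estimate = midparent_height(1.80, 1.65, male=True)
daughter_estimate = midparent_height(1.80, 1.65, male=False)
```

The regression variant used in this report replaces the fixed averaging and the fixed 13 cm adjustment with coefficients estimated from data.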

We then test the fitted formula to check that its results fall within a reasonable margin of error (\(\alpha = 10\%\)).

0.3 DATA

We use a well-known historical dataset (GaltonFamilies) containing the heights and sex of 934 children from 204 families, together with the heights of their parents.
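The GaltonFamilies data as distributed in the R package HistData records heights in inches, whereas the summary statistics in this report are in metres, so a unit conversion was presumably applied first. A minimal Python sketch of that conversion follows. The helper name is hypothetical and this is not necessarily how the report performed it.

```python
INCH_TO_M = 0.0254  # one inch is exactly 0.0254 m by definition


def inches_to_metres(height_in: float) -> float:
    """Convert a height from inches to metres."""
    return height_in * INCH_TO_M


# Example: a height of 62 inches expressed in metres, rounded to millimetres
example_m = round(inches_to_metres(62.0), 3)
```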

The graphs show a clear positive correlation between parents’ heights and their children’s heights.

0.4 METHODOLOGY

We use multiple linear regression to estimate the coefficients of the following equation:

\[ HC = b + a_1 \cdot HF + a_2 \cdot HM + a_3 \cdot S \] where:

  -   HC: Child height (m)
  -   HF: Father height (m)
  -   HM: Mother height (m)
  -   S: Dichotomous sex variable (1 if male, 0 if female)
  

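The coefficients are chosen to minimize the sum of squared residuals over the 934 children, where \(RH_i\) is the real (observed) height of child \(i\) and \(EH_i\) is the height estimated by the equation above:

\[ \min_{b,\,a_1,\,a_2,\,a_3} \sum_{i=1}^{934} (RH_i - EH_i)^2 \]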
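The least-squares step can be sketched in Python with NumPy as follows. The data here are synthetic and noise-free, generated from made-up coefficients purely to show that the solver recovers them. They are not the Galton data, and the function and variable names are hypothetical.

```python
import numpy as np


def fit_ols(father, mother, is_male, child):
    """Return [b, a_1, a_2, a_3] minimizing the sum of squared residuals."""
    # Design matrix: intercept column, father height, mother height, sex dummy
    X = np.column_stack([np.ones(len(child)), father, mother, is_male])
    coef, *_ = np.linalg.lstsq(X, child, rcond=None)
    return coef


# Synthetic illustration (NOT the Galton data)
rng = np.random.default_rng(0)
n = 200
father = rng.normal(1.76, 0.06, n)
mother = rng.normal(1.63, 0.06, n)
is_male = rng.integers(0, 2, n).astype(float)

true_coef = np.array([0.40, 0.40, 0.30, 0.13])  # made-up values
child = (true_coef[0] + true_coef[1] * father
         + true_coef[2] * mother + true_coef[3] * is_male)

coef = fit_ols(father, mother, is_male, child)
```

With real, noisy data the recovered coefficients would only approximate any underlying relationship, which is why the residual standard error is reported alongside them.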

Solving this optimization problem yields the following prediction equation (heights in metres):

\[ HC = 0.4196 + 0.3928 \cdot HF + 0.3176 \cdot HM + 0.1324 \cdot S \]

## 
## Call:
## lm(formula = GaltonFamilies$childHeight ~ GaltonFamilies$father + 
##     GaltonFamilies$mother + GaltonFamilies$gender, data = GaltonFamilies)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.241927 -0.037218  0.002395  0.037744  0.231650 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               0.419639   0.069271   6.058    2e-09 ***
## GaltonFamilies$father     0.392843   0.028677  13.699   <2e-16 ***
## GaltonFamilies$mother     0.317610   0.031000  10.245   <2e-16 ***
## GaltonFamilies$gendermale 0.132461   0.003602  36.775   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05498 on 930 degrees of freedom
## Multiple R-squared:  0.6354, Adjusted R-squared:  0.6342 
## F-statistic: 540.3 on 3 and 930 DF,  p-value: < 2.2e-16
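As a worked example, the fitted coefficients from the regression output can be applied directly. The Python sketch below uses hypothetical function names. The rough 90% band is an assumption added for illustration: it uses only the residual standard error with a normal approximation and ignores uncertainty in the coefficients, so it is not a formal prediction interval and not necessarily the validation procedure used in this report.

```python
# Coefficients and residual standard error copied from the regression output
INTERCEPT = 0.419639
COEF_FATHER = 0.392843
COEF_MOTHER = 0.317610
COEF_MALE = 0.132461
RESIDUAL_SE = 0.05498


def predict_child_height(father_m: float, mother_m: float, male: bool) -> float:
    """Predicted adult height of the child, in metres."""
    sex = 1.0 if male else 0.0
    return (INTERCEPT + COEF_FATHER * father_m
            + COEF_MOTHER * mother_m + COEF_MALE * sex)


def rough_90_band(prediction: float) -> tuple[float, float]:
    """Approximate 90% band: prediction +/- 1.645 residual standard errors."""
    half_width = 1.645 * RESIDUAL_SE
    return prediction - half_width, prediction + half_width


son = predict_child_height(1.80, 1.65, male=True)
daughter = predict_child_height(1.80, 1.65, male=False)
low, high = rough_90_band(son)
```

For a father of 1.80 m and a mother of 1.65 m, this gives about 1.78 m for a son and about 1.65 m for a daughter, with a rough band of about 0.09 m on either side.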

0.5 APP

We developed a publicly accessible, free-to-use app that estimates a child’s height from the parents’ heights and the child’s sex.

https://guillermomh.shinyapps.io/my_shiny_app/

0.6 FURTHER RESEARCH

The estimation equation still has considerable room for improvement. Progress is limited mainly by the lack of datasets that record children’s weight, height, age, and sex together with the heights of their parents.

The regression model could be improved by adding further factors, such as the nutrition level or ethnicity of the children whose adult height we are trying to predict.