Predicting a child’s future height is crucial for health monitoring, nutritional guidance, and early intervention to address potential medical issues. It assists in tailoring physical activities, psychologically preparing children for growth outcomes, and planning medical treatments. Additionally, it aids in genetic counseling and educational or social planning, ensuring the child receives appropriate support at different developmental stages. Overall, it helps parents and healthcare providers make informed decisions to support the child’s well-being and development.
This project aims to provide an app that estimates a child’s future adult height from genetics (the parents’ heights) and the child’s sex.
We test the resulting formula empirically to check that its predictions fall within a reasonable margin of error.
##      father          mother       childHeight
##  Min.   :1.575   Min.   :1.473   Min.   :1.422
##  1st Qu.:1.727   1st Qu.:1.600   1st Qu.:1.626
##  Median :1.753   Median :1.626   Median :1.689
##  Mean   :1.758   Mean   :1.628   Mean   :1.695
##  3rd Qu.:1.803   3rd Qu.:1.673   3rd Qu.:1.770
##  Max.   :1.994   Max.   :1.791   Max.   :2.007
There are several techniques to predict a child’s future height:
- Mid-parental height method
- Tanner method
- RWT (Roche-Wainer-Thissen) method
- Khamis-Roche method
We use a variant of the mid-parental height method.
We then test the resulting formula to check that its predictions fall within a reasonable margin of error (\(\alpha\) = 10%).
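For reference, the classic mid-parental method listed above is commonly stated as the average of the parents' heights, shifted up by 6.5 cm for boys and down by 6.5 cm for girls (that is, 13 cm added to or subtracted from the sum before halving). The Python sketch below illustrates that textbook form only; it is not the variant fitted in this project, and the function name and the use of metres are assumptions made for the example.

```python
def midparental_height(father_m: float, mother_m: float, is_male: bool) -> float:
    """Classic mid-parental estimate of adult height, in metres.

    Assumption: the commonly cited 13 cm sex adjustment, applied to the
    sum of the parents' heights before averaging.
    """
    sex_adjustment_m = 0.13
    adjustment = sex_adjustment_m if is_male else -sex_adjustment_m
    return (father_m + mother_m + adjustment) / 2


if __name__ == "__main__":
    # Parents at the sample means reported in the summary table above.
    print(round(midparental_height(1.758, 1.628, is_male=True), 3))
    print(round(midparental_height(1.758, 1.628, is_male=False), 3))
```

The regression approach used here replaces these fixed weights with coefficients estimated from data.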
We use a well-known historical dataset (GaltonFamilies) containing the heights and sex of 934 children from 204 families, together with the heights of their parents.
The graphs show a clear correlation between parents’ heights and their children’s heights.
We use multiple linear regression to estimate the coefficients of the following equation:
\[ HC=b+a_1\cdot HF+a_2\cdot HM+a_3\cdot S \] where:
- HC: child’s height
- HF: father’s height
- HM: mother’s height
- S: dichotomous sex variable (1 if male, 0 if female)
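We choose the coefficients that minimize the sum of squared residuals, where \(RH_i\) is the real (observed) height of child \(i\) and \(EH_i\) is the height estimated by the equation above:
\[ \min_{b,\,a_1,\,a_2,\,a_3}\sum_{i=1}^{934}(RH_i-EH_i)^2 \]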
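The minimization above has a closed-form solution through the normal equations \(X^TX\beta = X^Ty\). The Python sketch below shows the idea with the standard library only. It is an illustration under stated assumptions, not the code used for this project (the reported fit comes from R's `lm`): the helper names are hypothetical, and the six data rows are synthetic, generated from known coefficients so that the fit can be checked.

```python
def solve_linear_system(a, y):
    """Solve a @ x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, targets):
    """Return [b, a1, a2, a3] minimizing the sum of squared residuals."""
    x = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic rows (HF, HM, S) in metres; NOT taken from GaltonFamilies.
ROWS = [(1.70, 1.60, 1), (1.80, 1.65, 0), (1.75, 1.55, 1),
        (1.85, 1.70, 0), (1.65, 1.62, 0), (1.78, 1.58, 1)]
TRUE_COEFFS = [0.40, 0.40, 0.30, 0.13]  # made-up b, a1, a2, a3
TARGETS = [TRUE_COEFFS[0] + TRUE_COEFFS[1] * hf + TRUE_COEFFS[2] * hm
           + TRUE_COEFFS[3] * s for hf, hm, s in ROWS]

if __name__ == "__main__":
    # With noise-free targets the fit should recover TRUE_COEFFS.
    print([round(c, 4) for c in fit_ols(ROWS, TARGETS)])
```

In practice a numerical library or R's `lm` is preferable, since they use more stable decompositions than solving the normal equations directly.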
Solving this optimization problem gives the coefficients of the following prediction equation (heights in metres): \[ HC=0.4196+0.3928\cdot HF+0.3176\cdot HM+0.1324\cdot S \]
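As a worked example, the prediction equation can be written as a small function. The Python sketch below uses the rounded coefficients from the equation above; the function and constant names are hypothetical, and heights are assumed to be in metres, as in the summary table.

```python
# Rounded coefficients from the fitted prediction equation.
INTERCEPT = 0.4196
COEF_FATHER = 0.3928
COEF_MOTHER = 0.3176
COEF_MALE = 0.1324


def predict_child_height(father_m: float, mother_m: float, is_male: bool) -> float:
    """Estimate a child's adult height (metres) from the parents' heights and sex."""
    sex = 1 if is_male else 0  # S in the equation: 1 if male, 0 if female
    return (INTERCEPT + COEF_FATHER * father_m
            + COEF_MOTHER * mother_m + COEF_MALE * sex)


if __name__ == "__main__":
    # Parents at the sample means (father 1.758 m, mother 1.628 m).
    print(round(predict_child_height(1.758, 1.628, is_male=True), 3))   # son
    print(round(predict_child_height(1.758, 1.628, is_male=False), 3))  # daughter
```

For parents at the sample means this gives roughly 1.76 m for a son and 1.63 m for a daughter; the difference between the two is the sex coefficient, 0.1324 m.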
##
## Call:
## lm(formula = GaltonFamilies$childHeight ~ GaltonFamilies$father +
## GaltonFamilies$mother + GaltonFamilies$gender, data = GaltonFamilies)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.241927 -0.037218  0.002395  0.037744  0.231650
##
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)
## (Intercept)               0.419639   0.069271   6.058    2e-09 ***
## GaltonFamilies$father     0.392843   0.028677  13.699   <2e-16 ***
## GaltonFamilies$mother     0.317610   0.031000  10.245   <2e-16 ***
## GaltonFamilies$gendermale 0.132461   0.003602  36.775   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05498 on 930 degrees of freedom
## Multiple R-squared: 0.6354, Adjusted R-squared: 0.6342
## F-statistic: 540.3 on 3 and 930 DF, p-value: < 2.2e-16
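Several figures in this summary can be cross-checked by hand. The Python sketch below recomputes the t value for the father coefficient (estimate divided by standard error) and the adjusted R-squared from the reported R-squared, using 934 observations and 3 predictors. It also derives a rough 95% band of about ±1.96 residual standard errors; that band is an approximation added for this example (it ignores uncertainty in the coefficients), not a figure from the original analysis. Function names are hypothetical.

```python
def t_statistic(estimate: float, std_error: float) -> float:
    """t value as reported by the summary: estimate / standard error."""
    return estimate / std_error


def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def approx_95_half_width(residual_se: float) -> float:
    """Rough 95% band around a prediction (assumes normal residuals)."""
    return 1.96 * residual_se


if __name__ == "__main__":
    print(round(t_statistic(0.392843, 0.028677), 3))    # father coefficient
    print(round(adjusted_r_squared(0.6354, 934, 3), 4))
    print(round(approx_95_half_width(0.05498), 3))      # metres
```

Under these assumptions, a typical prediction is accurate to within roughly ±0.11 m.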
We developed a publicly accessible, free-to-use app that estimates children’s adult height from genetics and sex.
The estimation equation still has considerable room for improvement, mainly because few datasets record children’s weight, height, age, and sex together with the heights of their parents.
The regression model could also be improved by adding further factors, such as nutrition level or ethnicity, for the children whose adult height we are trying to predict.