Project - Presentation using RStudio Presenter - Son's Height Predictor

Partha Majumdar
Mon Sep 15 10:25:05 2014

Context

This presentation accompanies the Shiny project “Son's Height Predictor”.

“Son's Height Predictor” is available at https://partha6369.shinyapps.io/DSCourseDPProject/ (enter username “testuser” and password “abcd” to use the application).

The presentation explains the following.

  • The data used for the Tool.
  • The algorithm used for making the prediction.
  • Possible pitfalls of the algorithm.

The data used for the Tool

The Tool uses the data provided in the “UsingR” library and stored as “father.son”.

library(UsingR)
summary(father.son)
    fheight        sheight    
 Min.   :59.0   Min.   :58.5  
 1st Qu.:65.8   1st Qu.:66.9  
 Median :67.8   Median :68.6  
 Mean   :67.7   Mean   :68.7  
 3rd Qu.:69.6   3rd Qu.:70.5  
 Max.   :75.4   Max.   :78.4  

The algorithm used

The tool uses a linear regression model to predict a son's height from his father's height.

If Y is the son's height and X is the father's height, the model is

Y = beta0 + beta1 * X (The function “lm” was used to determine “beta0” and “beta1”.)
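As a rough illustration of what “lm” computes for this one-predictor model, the sketch below derives beta0 and beta1 from the closed-form least-squares formulas and compares them with the fitted coefficients. The variable names (x, y, beta0, beta1, fit) are chosen here for illustration and are not taken from the original application code.

```r
library(UsingR)  # provides the father.son data frame

x <- father.son$fheight  # father's height (inches)
y <- father.son$sheight  # son's height (inches)

# Closed-form least-squares estimates for simple linear regression
beta1 <- cor(x, y) * sd(y) / sd(x)   # slope
beta0 <- mean(y) - beta1 * mean(x)   # intercept

# The same coefficients as estimated by lm()
fit <- lm(sheight ~ fheight, data = father.son)

# Compare the two sets of estimates (names dropped so only values are checked);
# they should agree up to floating-point rounding
all.equal(c(beta0, beta1), unname(coef(fit)))
```

The slope formula shows that the prediction is pulled toward the mean of the sons' heights whenever the correlation between the two heights is below 1.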

Below we predict the height of a son whose father is 65 inches tall.

fit <- lm(sheight ~ fheight, data = father.son)  # fit the model to the father.son data
coef(fit)[1] + coef(fit)[2] * 65
(Intercept) 
       67.3 
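The same point prediction can also be obtained with “predict”, which can additionally report a prediction interval to indicate how much an individual son's height may vary around the fitted value. This is a minimal sketch; the 95% level and the names new_father and pred are assumptions made for the example, not part of the original tool.

```r
library(UsingR)  # provides the father.son data frame

fit <- lm(sheight ~ fheight, data = father.son)

# New observation: a father who is 65 inches tall
new_father <- data.frame(fheight = 65)

# Point prediction plus a 95% prediction interval (columns: fit, lwr, upr)
pred <- predict(fit, newdata = new_father,
                interval = "prediction", level = 0.95)
pred
```

The “fit” column corresponds to the value computed manually from the coefficients, while “lwr” and “upr” bound the range in which a single son's height is expected to fall under the model's assumptions.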

Possible pitfalls of the algorithm

  • It is unclear whether a model fitted to this data can be applied universally across the globe.
  • It is not known whether the data was collected from all parts of the world or only from the United States of America or the United Kingdom.
  • This can be checked by collecting data from the countries where the algorithm will be applied; for example, my interest is in applying it to the Indian population.
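One possible way to carry out such a check, once a new sample has been collected, is to compare the model's prediction error on that sample with its error on the original data. The helper rmse_on_sample below is hypothetical and assumes the new sample uses the same column names (fheight, sheight) and units (inches) as father.son; no new data is included here.

```r
library(UsingR)  # provides the father.son data frame

# Hypothetical helper: root-mean-square prediction error (inches) of a fitted
# model on a data frame with columns fheight and sheight
rmse_on_sample <- function(fit, new_data) {
  predicted <- predict(fit, newdata = new_data)
  sqrt(mean((new_data$sheight - predicted)^2))
}

fit <- lm(sheight ~ fheight, data = father.son)

# Baseline: error on the data the model was fitted to
baseline_rmse <- rmse_on_sample(fit, father.son)
baseline_rmse

# For a newly collected sample (assumed name: new_sample), an error much larger
# than the baseline would suggest the model does not transfer well:
# rmse_on_sample(fit, new_sample)
```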