RStudio Pitch Presentation

d2i2k
April 5, 2015

Galton Family Data (Slide 1)

The GaltonFamilies dataset (HistData package) lists the individual observations for 934 adult children in 205 families. Sir Francis Galton (1886) based his account of regression toward the mean on these data. He wrote that “the average regression of the offspring is a constant fraction of their respective mid-parental deviations.” For height, Galton estimated this regression coefficient to be about two-thirds (2/3).

Galton, F. (1886). “Regression towards mediocrity in hereditary stature”. The Journal of the Anthropological Institute of Great Britain and Ireland 15: 246-263.
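As a small worked illustration of Galton's statement, the sketch below applies the two-thirds fraction to a few mid-parental deviations. The deviation values are hypothetical and chosen only for the arithmetic; they are not taken from the dataset.

```r
# Galton's estimate of the regression fraction for height
galton_fraction <- 2 / 3

# Hypothetical mid-parental deviations from the population mean, in inches
midparent_deviation <- c(-3, 0, 3)

# Expected deviation of the offspring under Galton's constant-fraction rule
expected_child_deviation <- galton_fraction * midparent_deviation
print(expected_child_deviation)  # -2  0  2
```

Parents who are 3 inches above average are thus expected to have children only about 2 inches above average, which is the sense in which heights "regress" toward the mean.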

Scatterplot of Galton Family Data (Slide 2)

Scatterplot of the Galton family data with the height of the son or daughter in inches on the ordinate (y-axis) and parental mid-height in inches on the abscissa (x-axis).

[Figure: scatterplot of child height (inches) against parental mid-height (inches)]
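The original plotting code is not shown on the slide. A minimal sketch that would produce a comparable scatterplot in base R, assuming the HistData package is installed, might look like this (axis labels and title are illustrative choices):

```r
# Assumes HistData is installed: install.packages("HistData")
library(HistData)
data(GaltonFamilies)

# Child height (y-axis) against parental mid-height (x-axis), both in inches
plot(childHeight ~ midparentHeight, data = GaltonFamilies,
     xlab = "Parental mid-height (inches)",
     ylab = "Child height (inches)",
     main = "Galton family data")
```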

Linear Regression Models for Galton Family Data (Slide 3)

Gender-specific linear regression models fitted to the Galton family data, with the height of the son or daughter as the dependent variable and parental mid-height as the independent variable. The first model is for daughters, the second for sons.

lm(formula = childHeight ~ midparentHeight, data = subset(GaltonFamilies, 
    gender == "female"))

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     18.33348    3.60497   5.086 5.38e-07 ***
midparentHeight  0.66075    0.05202  12.701  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.024 on 451 degrees of freedom
Multiple R-squared:  0.2634,  Adjusted R-squared:  0.2618 
F-statistic: 161.3 on 1 and 451 DF,  p-value: < 2.2e-16

lm(formula = childHeight ~ midparentHeight, data = subset(GaltonFamilies, 
    gender == "male"))

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     19.91346    4.08943   4.869 1.52e-06 ***
midparentHeight  0.71327    0.05912  12.064  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.3 on 479 degrees of freedom
Multiple R-squared:  0.2331,    Adjusted R-squared:  0.2314 
F-statistic: 145.6 on 1 and 479 DF,  p-value: < 2.2e-16
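The two summaries above can be reproduced from the model calls they report. The sketch below assumes the HistData package is installed; the object names fit_female and fit_male are illustrative.

```r
# Assumes HistData is installed: install.packages("HistData")
library(HistData)
data(GaltonFamilies)

# One simple linear regression per gender, using the calls shown in the output
fit_female <- lm(childHeight ~ midparentHeight,
                 data = subset(GaltonFamilies, gender == "female"))
fit_male <- lm(childHeight ~ midparentHeight,
               data = subset(GaltonFamilies, gender == "male"))

# Coefficient tables, residual standard error, R-squared and F-statistic
print(summary(fit_female))
print(summary(fit_male))
```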

Scatterplot Matrix of Galton Family Data (Slide 4)

Multiple scatterplots of the Galton family data with the height of the son or daughter in inches on the ordinate (y-axis) and parental mid-height in inches on the abscissa (x-axis).

[Figure: multiple scatterplots of child height (inches) against parental mid-height (inches)]
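The slide does not show how the panels were drawn. One possible base R sketch, assuming the panels split the data by the child's gender and that the HistData package is installed, is:

```r
# Assumes HistData is installed: install.packages("HistData")
library(HistData)
data(GaltonFamilies)

# Assumption: one panel per gender; the original slide may have used another layout
genders <- unique(as.character(GaltonFamilies$gender))

old_par <- par(mfrow = c(1, length(genders)))  # panels side by side
for (g in genders) {
  plot(childHeight ~ midparentHeight,
       data = subset(GaltonFamilies, gender == g),
       xlab = "Parental mid-height (inches)",
       ylab = "Child height (inches)",
       main = g)
}
par(old_par)  # restore the previous graphics settings
```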

Multiple Regression Model for Galton Family Data (Slide 5)

Multiple regression model fitted to the Galton family data with child height as the dependent variable and with parental mid-height and the child's gender as the independent variables. The estimated common slope (regression coefficient) is 0.687, with a 95% confidence interval of (0.61, 0.76) that covers Galton's estimate of two-thirds.

lm(formula = childHeight ~ midparentHeight + gender, data = GaltonFamilies)

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     16.51410    2.73392    6.04 2.22e-09 ***
midparentHeight  0.68702    0.03944   17.42  < 2e-16 ***
gendermale       5.21511    0.14216   36.69  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.17 on 931 degrees of freedom
Multiple R-squared:  0.6332,    Adjusted R-squared:  0.6324 
F-statistic: 803.6 on 2 and 931 DF,  p-value: < 2.2e-16
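The 95% confidence interval quoted for the common slope can be checked from the reported estimate, standard error, and residual degrees of freedom alone. The sketch below uses only base R and the numbers printed in the summary; the variable names are illustrative.

```r
# Values copied from the midparentHeight row of the summary above
estimate  <- 0.68702
std_error <- 0.03944
df_resid  <- 931

# Two-sided 95% interval: estimate +/- t quantile * standard error
t_crit <- qt(0.975, df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error

print(round(ci, 2))              # 0.61 0.76
print(ci[1] < 2/3 && 2/3 < ci[2])  # TRUE: the interval covers two-thirds
```

With the fitted model object available, calling confint() on it should give the same interval for midparentHeight.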