The data below are the heights of fathers and their sons, based on an experiment by Karl Pearson in 1900. Heights are rounded to the nearest 0.1 inch. In my experience, the most frequently cited example of a normal distribution has always been heights. Out of curiosity, I decided to see whether that is the case.
HYPOTHESIS: Like the heights themselves, the differences in height between each son and his father should also be normally distributed.
Figure 1: Based on the histogram, the frequencies appear to follow a normal distribution. The mean sits near the center of the histogram, where the frequencies are highest.
Figure 2: Another way to judge the data’s normality is a Q-Q plot, which compares the quantiles of the data with the quantiles expected under a normal distribution. Visually, most of the data points follow the red line, though a few outliers exist.
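To make the Q-Q comparison concrete, here is a minimal sketch of how the plotted points can be computed. It assumes the reference normal distribution uses the sample's own mean and standard deviation, and it uses the plotting positions (i + 0.5) / n, which is one common convention among several. The function name `qq_points` and the sample values are hypothetical and are not Pearson's data.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Assumption: the reference normal takes the sample's own mean and sd.
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    return [(reference.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]

# Hypothetical heights in inches, for illustration only (not Pearson's data).
heights = [63.2, 65.0, 66.1, 66.8, 67.4, 67.9, 68.3, 68.9, 69.7, 70.6, 72.1]

for theoretical, observed in qq_points(heights):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the data are close to normal, the two columns track each other and the points fall near a straight line when plotted.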
Figure 3: Using the standard deviation and mean from each collection of heights, the density histogram gives the relative likelihood of a point (height) falling within a specific range. Plotted against our observed data, the probabilities of obtaining any given value appear to follow those of a normal distribution.
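As a rough sketch of what a density histogram computes, the snippet below scales each bin count by n times the bin width, so the bar areas sum to 1, and compares each bar with the normal density at the bin midpoint. The bin edges, the helper name `density_histogram`, and the sample values are assumptions chosen for illustration, not the settings behind the figure.

```python
from statistics import NormalDist, mean, stdev

def density_histogram(sample, low, width, bins):
    """Per-bin densities: count / (n * width). Areas sum to 1 when every
    value falls inside [low, low + bins * width)."""
    counts = [0] * bins
    for x in sample:
        k = int((x - low) // width)
        if 0 <= k < bins:
            counts[k] += 1
    n = len(sample)
    return [c / (n * width) for c in counts]

# Hypothetical heights in inches, for illustration only (not Pearson's data).
heights = [63.2, 65.0, 66.1, 66.8, 67.4, 67.9, 68.3, 68.9, 69.7, 70.6, 72.1]
low, width, bins = 62.0, 2.0, 6  # assumed binning: 62 to 74 inches in 2-inch bins

densities = density_histogram(heights, low, width, bins)
fitted = NormalDist(mean(heights), stdev(heights))

for k, d in enumerate(densities):
    mid = low + (k + 0.5) * width
    print(f"{mid:5.1f}  observed={d:.3f}  normal={fitted.pdf(mid):.3f}")
```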
Figure 4: Sons’ and fathers’ heights plotted against each other, with fathers’ heights as the X variable and sons’ heights as the Y variable. Note that the correlation is not strong, and a simple linear relationship alone is unlikely to explain a son’s height, because other variables, such as genes from the sons’ mothers, also contribute.
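One way to put a number on the strength of the relationship in a scatter plot like this is the Pearson correlation coefficient. The sketch below computes it from its definition; the helper name `pearson_r` and the paired values are hypothetical and are not taken from the data set.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical father/son pairs in inches, for illustration only.
fathers = [65.0, 66.5, 67.2, 68.0, 69.1, 70.3]
sons = [66.8, 66.0, 69.5, 67.9, 68.4, 71.2]

r = pearson_r(fathers, sons)
print(f"r = {r:.3f}")
```

Values near 0 indicate a weak linear relationship, while values near 1 or -1 indicate a strong one.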
Figure 5: When we plot the differences between each father’s height and his son’s, the densities closely resemble those of a normal distribution.
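The height differences are computed pair by pair. The sketch below assumes the sign convention son minus father (the opposite convention only flips the sign) and uses hypothetical pairs rather than the actual data. Because heights are recorded to the nearest 0.1 inch, the differences are rounded to one decimal place to remove floating-point noise.

```python
from statistics import mean, stdev

# Hypothetical father/son pairs in inches, for illustration only.
fathers = [65.0, 66.5, 67.2, 68.0, 69.1, 70.3]
sons = [66.8, 66.0, 69.5, 67.9, 68.4, 71.2]

# Assumed sign convention: son minus father, one difference per pair.
differences = [round(s - f, 1) for f, s in zip(fathers, sons)]

print(differences)
print(f"mean difference = {mean(differences):.2f} in")
print(f"sd of differences = {stdev(differences):.2f} in")
```

The resulting list can then go through the same histogram, density, and Q-Q checks used for the heights themselves.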
Figure 6: The height differences follow the Q-Q line closely.
Heights were plotted as a histogram with a density curve and appear to be normally distributed, as expected. Like the unadjusted heights, the height differences appear to follow the bell curve, and plotting them against a Q-Q line supports my hypothesis.