Normal distributions

Author

Manish Gyawali

First, we calculate the mean and standard deviation of the original sample, which are as follows:

Mean = 66.37; standard deviation = 3.85

Let’s also look at the distribution:

Central Limit Theorem

Now we come to the main part of the assignment. First, let us compute the mean and standard deviation of the sample means for several sample sizes \(N\), both below and above \(30\). Note that the values for the original sample are in the top row of the table.

N      mean    sd
orig   66.37   3.85
5      66.61   1.75
10     66.13   1.20
30     66.39   0.76
50     66.48   0.54
100    66.35   0.36

It is clear that the mean stays roughly constant while the standard deviation shrinks as the sample size grows, as predicted by theory. Let’s check whether the values tally: for \(n = 50\), theory predicts that the standard deviation of the sample mean should be \(\sigma/\sqrt{n}\), or \(3.85/\sqrt{50} \approx 0.54\), which matches the value shown in the table.
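The same check can be repeated for every row of the table. The following is a minimal sketch that uses only the numbers reported above; the names `observed_sd` and `predicted_sd` are introduced here for illustration and are not part of the original analysis.

```python
import math

SIGMA = 3.85  # standard deviation of the original sample (top row of the table)

# Observed standard deviations of the sample means, copied from the table.
observed_sd = {5: 1.75, 10: 1.20, 30: 0.76, 50: 0.54, 100: 0.36}

def predicted_sd(n, sigma=SIGMA):
    """Standard deviation of the sample mean predicted by theory: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Pair each theoretical value (rounded to two decimals) with the observed one.
comparison = {n: (round(predicted_sd(n), 2), obs) for n, obs in observed_sd.items()}

for n, (pred, obs) in comparison.items():
    print(f"n = {n:>3}: predicted {pred:.2f}, observed {obs:.2f}")
```

The predicted values are roughly 1.72, 1.22, 0.70, 0.54 and 0.385, so every observed value lies within about 0.06 of the theoretical one. Differences of that size are plausible sampling noise when only a limited number of samples is drawn for each \(n\).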

Visualization

Unstandardized values

Now let’s visualize the results using histograms. The sample size is labelled at the top of each plot. We can see that the distribution becomes more tightly centered around the mean as the sample size increases, as predicted by theory. Beginning with a sample size of \(5\), we notice that there is a lot of dispersion even though the distribution looks roughly normal. Further, these are not standardized values, so the plots are centered near the sample mean rather than at zero.

Also notice that in the last two graphs, where \(N\) is 50 or 100, the mean is extremely close to the theoretical value (the red dashed line).
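The tightening described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the original data set is not available here, so the population is assumed to be normal with the reported mean and standard deviation, and 100 samples are assumed per sample size. The helper names `simulate_sample_means` and `text_histogram` are hypothetical, and a text histogram stands in for the plots.

```python
import random
import statistics

# Assumptions for illustration only: a normal population with the original
# sample's mean and standard deviation, and 100 samples per sample size.
MU, SIGMA = 66.37, 3.85
N_SAMPLES = 100
SAMPLE_SIZES = [5, 10, 30, 50, 100]

def simulate_sample_means(n, n_samples=N_SAMPLES, rng=None):
    """Draw n_samples samples of size n and return the list of their means."""
    rng = rng or random.Random()
    return [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(n_samples)
    ]

def text_histogram(values, lo=60.0, hi=73.0, bins=13):
    """Count values in equal-width bins on [lo, hi); values outside are ignored."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {n: simulate_sample_means(n, rng=rng) for n in SAMPLE_SIZES}

for n, means in results.items():
    print(f"n = {n}: mean {statistics.fmean(means):.2f}, sd {statistics.stdev(means):.2f}")
    for i, count in enumerate(text_histogram(means)):
        print(f"  {60 + i:>2} to {61 + i:<2} {'#' * count}")
```

With these settings the bars should bunch more closely around 66.37 as \(n\) grows, which is the pattern the histograms show. The exact counts depend on the seed and on the assumed population, so they will not match the original plots.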

Standardized values

The following plots are standardized, which can be verified by the fact that they are all centered at zero. After standardization, the plots don’t look very different from each other. That is natural, because standardizing removes the effect of sample size on the spread.

Case II: Standardized normal sums

Now we calculate the value of \(\sum_{i=1}^{n} X_i\) for \(100\) random samples at each of several sample sizes. We standardize each sum using \(Z = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}\) and plot the resulting values.

First we visualize:

B. Table

N      mean    sd
orig   66.37   3.85
5      0.02    1.11
10     -0.14   1.09
30     -0.05   1.04
50     0.04    0.98
100    0.01    1.08
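Standardized sums with mean near \(0\) and standard deviation near \(1\), as in the table, are what theory predicts: dividing the numerator and denominator of \(Z\) by \(n\) gives \(Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\), the standardized sample mean. The sketch below illustrates this with a simulation. It assumes a normal population with the reported mean and standard deviation, since the original data are not available here, and the function name `standardized_sum` is hypothetical.

```python
import math
import random
import statistics

MU, SIGMA = 66.37, 3.85  # mean and standard deviation of the original sample
N_SAMPLES = 100          # number of random samples per sample size, as in the text

def standardized_sum(sample, mu=MU, sigma=SIGMA):
    """Return Z = (sum(X_i) - n*mu) / (sigma * sqrt(n)) for one sample."""
    n = len(sample)
    return (sum(sample) - n * mu) / (sigma * math.sqrt(n))

rng = random.Random(2)  # fixed seed so the run is repeatable
summary = {}
for n in [5, 10, 30, 50, 100]:
    # Assumption: each sample is drawn from a normal population.
    z = [
        standardized_sum([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(N_SAMPLES)
    ]
    summary[n] = (statistics.fmean(z), statistics.stdev(z))
    print(f"n = {n:>3}: mean {summary[n][0]:.2f}, sd {summary[n][1]:.2f}")
```

With only 100 samples per size, the means and standard deviations fluctuate around \(0\) and \(1\) by roughly a tenth, which is about the amount of variation seen in the table.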